DOI : 10.17577/IJERTCONV14IS060034- Open Access

- Authors : Sachin Acharya, Muhammad Owais A (Assistant Professor), R Ram, Rajasekar G, Raghavendra Biradar
- Paper ID : IJERTCONV14IS060034
- Volume & Issue : Volume 14, Issue 06, ACSCON – 2026
- Published (First Online) : 15-06-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
Credit Card Fraud Detection
Sachin Acharya
Dept. of Computer Science & Engineering
ACS College of Engineering Bangalore, India acharya4023@gmail.com
Muhammad Owais A Assistant Professor
Dept. of Computer Science & Engineering
ACS College of Engineering Bangalore, India owais8825811761@gmail.com
R Ram
Dept. of Computer Science & Engineering
ACS College of Engineering Bangalore, India ramrreddy2482@gmail.com
Rajasekar G
Dept. of Computer Science & Engineering
ACS College of Engineering Bangalore, India rajeshseemalamudi2003@gmail.com
Raghavendra Biradar
Dept. of Computer Science & Engineering
ACS College of Engineering Bangalore, India rpbiradar27@gmail.com
ABSTRACT
Using machine learning to identify credit card fraud involves a number of challenges. A significant one is class imbalance: fraudulent transactions constitute a very small percentage of all transactions. This can bias models towards the majority class and lead to ineffective fraud detection. Oversampling, undersampling, and synthetic data generation (e.g., SMOTE) are used to mitigate this problem. Feature engineering is used to select and transform attributes of the transactional data for better model performance. Since any delay in detection can lead to large financial losses, real-time detection is required. By analyzing transaction data and testing several ML algorithms, we aim to develop an effective, accurate real-time model with a low false-positive rate. The outcome will help banks use ML-based solutions to combat fraud. Ultimately, machine learning in fraud detection can enhance financial security and boost customer trust in the banking sector.
Fig. 1. Credit Card Fraud Detection Using Machine Learning.
II. RELATED WORKS
Credit card fraud detection has been explored at great length over the last two years, with numerous techniques employed to increase the efficiency and effectiveness of fraud detection mechanisms. Early solutions relied on statistical and rule-based mechanisms that depended on predefined thresholds and patterns to identify suspicious transactions. While these techniques helped detect well-known fraud patterns, they could not keep up with the newer and more refined methods used by fraudsters. Emphasis therefore shifted to newer techniques such as machine learning, which can process enormous amounts of transaction data and detect sophisticated, non-linear patterns characteristic of fraud.
Supervised learning algorithms have been widely used for credit card fraud detection because they can classify transactions as fraudulent or genuine based on labeled datasets [4]. Research has demonstrated that logistic regression, SVM, and decision trees are effective algorithms for detecting fraud. Ensemble algorithms such as random forests and gradient boosting have been shown to be very accurate because they aggregate the strengths of multiple models. However, one of the largest challenges for supervised learning is class imbalance in fraud datasets, where legitimate transactions significantly outnumber fraudulent ones. To address this, techniques such as oversampling, undersampling, and synthetic data creation (e.g., SMOTE) have been employed to balance the dataset and improve model performance.
Unsupervised learning techniques have also been shown to detect unknown patterns of fraud without relying on labeled data. Clustering methods such as k-means and DBSCAN have been used to group similar transactions and separate out outliers that could indicate fraud. Algorithms such as isolation forests and autoencoders have proven effective at detecting rare and unusual transactions that do not match usual behavior. These unsupervised algorithms are particularly useful for discovering new types of fraud that are not based on well-known patterns, and they therefore complement supervised learning algorithms.
Deep learning, which is a machine learning method [5], has been found to be a highly effective approach to detecting credit card fraud because it can handle high-dimensional and complex data. Neural networks used to process sequences of transactions and extract important features for fraud detection include recurrent neural networks (RNNs) and convolutional neural networks (CNNs). Research has established that deep learning models are capable of state-of-the-art performance on fraud detection problems, particularly when augmented with methods such as transfer learning and attention mechanisms. Yet the computational complexity and resource needs of deep learning models pose a problem.
There have also been recent experiments on integrating real-time fraud detection systems into banking infrastructure. Stream processing platforms such as Apache Kafka and Apache Flink have been used to process transaction data in real time and to deploy machine learning models for real-time fraud detection.
(XGBoost, LightGBM, CatBoost) are the most popular algorithms for fraud classification. They are trained on labeled historical transactions with known fraud outcomes, so they can distinguish genuine from fraudulent transactions.
Fig. 3. K-nearest neighbor classification
Fig. 4. Decision Tree
B. Module List and Descriptions
Data Collection and Preprocessing: This module gathers credit card transaction data from banking statements and open sources. Preprocessing operations include handling missing values, removing duplicates, transforming categorical variables into numerical variables, and normalizing numeric features. In addition, methods such as the Synthetic Minority Over-sampling Technique (SMOTE) are used to address class imbalance and provide an effective training set.
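The interpolation idea behind SMOTE can be sketched in a few lines. This is a simplified, standard-library illustration and not the authors' implementation: the function name `smote_like_oversample`, the two features, and the sample rows are hypothetical. Each synthetic row is placed on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

# Hypothetical fraud rows: (amount, distance from home)
fraud = [(900.0, 40.0), (1200.0, 55.0), (1000.0, 35.0)]
new_rows = smote_like_oversample(fraud, n_new=4)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic rows do not leak into the evaluation data.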
Data Set: The dataset was obtained from Kaggle. It has 4,850 records and 11 fields, with a size of around 319 KB, and covers credit card transactions, with one record per transaction. The fields describe the transaction and the cardholder. The "Unnamed: 0" column is likely an index or auto-generated ID for each row. The "cc_num" column holds the credit card number or its masked form, and "category" holds the category of the transaction, e.g., grocery shopping, gas, online shopping. The "amt" column captures the amount spent on each transaction, and the "gender" column captures the gender of the cardholder. The "is_fraud" column is a binary flag, with 1 representing a fraudulent transaction and 0 representing a valid one. The "age" column contains the age of the cardholder, and the "trans_month" and "trans_year" columns detail the date of the transaction. Lastly, the "lat_dis" and "long_dis" columns represent the geographic distance (in latitude and longitude) between the transaction location and the cardholder's known location, which may help spot suspicious activity or fraud through location irregularities.
Feature Selection and Engineering: This module identifies and selects the transaction attributes most useful for fraud detection. Significant attributes, including transaction amount, location, timing, device information, and user behavioral patterns, are used to improve model accuracy. Dimensionality reduction and better computing efficiency are achieved through techniques like Principal Component Analysis (PCA) and Recursive Feature Elimination (RFE).
Model Training and Classification: Various machine learning algorithms, including Decision Trees, Random Forest, and Support Vector Machines (SVM), are implemented in this module. Supervised learning is used for labeled transaction data, while unsupervised techniques identify anomalies without predefined fraud labels.
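As an illustration of the column layout described above, the following sketch parses a tiny in-memory CSV and computes the fraud rate. The three rows are invented for the example and are not taken from the Kaggle file; the sketch also assumes the card-number column is spelled `cc_num` and that the unnamed index column has an empty header, as is typical when pandas writes an index to CSV.

```python
import csv
import io

# Made-up rows that only mimic the column layout described in the text.
SAMPLE_CSV = """,cc_num,category,amt,gender,is_fraud,age,trans_month,trans_year,lat_dis,long_dis
0,1111,grocery,42.10,F,0,34,1,2020,0.12,0.08
1,2222,online shopping,910.55,M,1,51,1,2020,3.40,2.95
2,1111,gas,60.00,F,0,34,2,2020,0.05,0.11
"""

def load_transactions(handle):
    """Parse rows and convert numeric fields; the index column is dropped."""
    rows = []
    for raw in csv.DictReader(handle):
        rows.append({
            "cc_num": raw["cc_num"],
            "category": raw["category"],
            "amt": float(raw["amt"]),
            "gender": raw["gender"],
            "is_fraud": int(raw["is_fraud"]),
            "age": int(raw["age"]),
            "trans_month": int(raw["trans_month"]),
            "trans_year": int(raw["trans_year"]),
            "lat_dis": float(raw["lat_dis"]),
            "long_dis": float(raw["long_dis"]),
        })
    return rows

rows = load_transactions(io.StringIO(SAMPLE_CSV))
# Share of rows flagged as fraud; on real data this shows the class imbalance.
fraud_rate = sum(r["is_fraud"] for r in rows) / len(rows)
```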
Anomaly Detection and Fraud Identification: This module utilizes unsupervised learning algorithms like Isolation Forest, One-Class SVM, and Clustering algorithms (K-Means, DBSCAN) to identify abnormal transaction patterns that could suggest fraudulent activity. These models assist in detecting new fraud patterns which could escape supervised models.
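One simple way to turn clustering into an anomaly detector is to fit K-Means on past transactions and flag new ones that lie far from every cluster centre. The sketch below assumes that approach; the paper does not specify how its clustering output is converted into a fraud flag, and the data, starting centroids, and distance threshold are hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm with caller-supplied starting centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to its cluster mean (keep it if the cluster is empty)
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else old
            for cl, old in zip(clusters, centroids)
        ]
    return centroids

def flag_outliers(points, centroids, threshold):
    """Return points whose distance to the nearest centroid exceeds threshold."""
    return [p for p in points
            if min(math.dist(p, c) for c in centroids) > threshold]

# Hypothetical history of (amount, distance) pairs used to fit the clusters
history = [(20, 1), (22, 2), (19, 1), (200, 3), (210, 2), (205, 4)]
centres = kmeans(history, centroids=[(20, 1), (200, 3)])

# New transactions are scored against the fitted centres
incoming = [(21, 1), (950, 60)]
suspicious = flag_outliers(incoming, centres, threshold=100)
```

In practice the features would usually be scaled first, since raw amounts otherwise dominate the distance.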
Real-time Fraud Detection and Alert System: The framework integrates stream data processing platforms to identify fraud in real time. When a transaction is suspected of being fraudulent, an alert is triggered and subsequent verification processes, like multi-factor authentication, are invoked to block unauthorized transactions.
Performance Evaluation and Optimization: The last module assesses model performance on measures like accuracy, precision, recall, F1-score, and ROC-AUC curves. Hyperparameter tuning methods like Grid Search and Random Search are employed to optimize the model and enhance fraud detection.
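The alerting step can be sketched as a generator that scores each incoming transaction and emits an alert when the score crosses a threshold. Everything here is illustrative: `Transaction`, `Alert`, `toy_score`, the 0.5 threshold, and the `require_mfa` action are hypothetical names, and the scoring function is a stand-in for a trained model. In a deployment the `stream` argument would be fed by a stream-processing consumer rather than a list.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

@dataclass
class Transaction:
    txn_id: str
    amount: float
    distance_km: float

@dataclass
class Alert:
    txn_id: str
    score: float
    action: str

def toy_score(txn: Transaction) -> float:
    """Stand-in for a trained model: larger, more distant purchases score higher."""
    return min(1.0, 0.5 * txn.amount / 1000 + 0.5 * txn.distance_km / 100)

def monitor(stream: Iterable[Transaction],
            score_fn: Callable[[Transaction], float],
            threshold: float = 0.5) -> Iterator[Alert]:
    """Yield an alert for every transaction whose score reaches the threshold."""
    for txn in stream:
        score = score_fn(txn)
        if score >= threshold:
            # Hypothetical follow-up: ask for a second authentication factor
            yield Alert(txn.txn_id, score, "require_mfa")

stream = [Transaction("t1", 25.0, 2.0), Transaction("t2", 900.0, 80.0)]
alerts = list(monitor(stream, toy_score))
```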
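For reference, the standard definitions of these measures in terms of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) are given below, with fraud taken as the positive class. The paper does not state which averaging convention it uses, so these are the per-class binary forms.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
F_1              &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```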
IV. RESULTS
The machine learning approach proposed for detecting credit card fraud in banking is made up of several stages aimed at accurate and efficient fraud detection.
Data Acquisition & Preprocessing: The procedure begins with the collection of transactional data from bank statements or publicly available datasets. The data is preprocessed by handling missing values, removing duplicates, and encoding categorical features. Since fraud detection datasets are usually highly imbalanced, techniques like SMOTE and undersampling are used to balance the dataset so that the model does not lean towards non-fraudulent transactions.
Feature Selection & Engineering: Relevant transaction attributes, including transaction value, frequency, device usage, geographical location, and time-based patterns, are derived. Feature scaling and transformation methods, like Min-Max scaling or Standardization, are used to maintain consistency in data distribution.
Model Selection & Training: The dataset is split into training and testing sets, and machine learning models are trained. Transactions are classified with supervised learning models such as Logistic Regression, Decision Trees, Random Forest, and SVM. Anomaly detection is performed with unsupervised learning algorithms such as Autoencoders and Isolation Forest.
Fig. 5. Accuracy Graph
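The two scaling methods named above can be written directly from their definitions. This is a minimal standard-library sketch; the function names and the sample amounts are illustrative, and population standard deviation is assumed for standardization.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Map values linearly onto [0, 1]; constant columns become all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

amounts = [10.0, 20.0, 30.0, 40.0]   # hypothetical transaction amounts
scaled = min_max_scale(amounts)
z_scores = standardize(amounts)
```

The scaling parameters (minimum, maximum, mean, standard deviation) are usually estimated on the training set and then reused unchanged on the test set.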
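With so few fraud cases, a purely random split can leave the test set with almost no positives. The sketch below splits each class separately so both sets keep the original fraud ratio. Stratification is an assumption made for this example, since the paper does not say how its split was drawn, and the 25% test fraction and row layout are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(rows, label_of, test_fraction=0.25, seed=42):
    """Split rows into train/test while keeping each class's share similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    train, test = [], []
    for members in by_class.values():
        members = members[:]          # copy so the caller's list is untouched
        rng.shuffle(members)
        n_test = max(1, round(len(members) * test_fraction))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

# Hypothetical rows of (transaction id, is_fraud): 16 legitimate, 4 fraudulent
rows = [(i, 0) for i in range(16)] + [(100 + i, 1) for i in range(4)]
train, test = stratified_split(rows, label_of=lambda r: r[1])
```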
Anomaly Detection & Fraud Identification: The trained models categorize transactions as fraudulent or legitimate using learned patterns. Blended approaches that combine supervised and unsupervised techniques improve accuracy and fraud discovery. Ensemble learning approaches like Boosting and Stacking further increase detection performance.
Real-Time Detection & Alert Generation: A real-time fraud detection system is implemented in a banking system, where streaming data is scanned in real time for suspicious transactions. When a suspicious transaction is detected, the system initiates automatic alerts and security measures, such as OTP verification or account locking, to avert fraud.
Model Evaluation & Performance Optimization: The quality of the fraud detection models is measured by Precision, Recall, F1-score, Confusion Matrix, and the AUC-ROC curve. Hyperparameter tuning methods such as Grid Search, Bayesian Optimization, and Neural Architecture Search (NAS) are applied to improve model performance.
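Grid Search amounts to trying every candidate setting and keeping the one with the best validation score. The sketch below tunes the neighbour count `k` of a small K-nearest-neighbour classifier by F1-score on a held-out set. The choice of KNN, the candidate values, and the data are assumptions for illustration; the paper does not list which hyperparameters were searched.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training rows closest to the query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def f1_for_k(train, valid, k):
    """F1-score of the fraud class (label 1) on the validation rows."""
    tp = fp = fn = 0
    for features, label in valid:
        pred = knn_predict(train, features, k)
        tp += pred == 1 and label == 1
        fp += pred == 1 and label == 0
        fn += pred == 0 and label == 1
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def grid_search_k(train, valid, candidates):
    """Score every candidate k and return the best one with all scores."""
    scores = {k: f1_for_k(train, valid, k) for k in candidates}
    return max(scores, key=scores.get), scores

# Hypothetical scaled features with five legitimate rows and two fraud rows
train = [((0.1, 0.1), 0), ((0.2, 0.1), 0), ((0.15, 0.2), 0),
         ((0.1, 0.25), 0), ((0.2, 0.2), 0),
         ((0.9, 0.9), 1), ((0.85, 0.95), 1)]
valid = [((0.12, 0.12), 0), ((0.88, 0.9), 1), ((0.18, 0.15), 0)]
best_k, scores = grid_search_k(train, valid, candidates=[1, 3, 5])
```

With only two fraud rows in the training data, `k=5` lets the majority class outvote them, which is why the larger setting scores worse here.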
| S.No | Classifier | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|---|
| 1 | Decision Tree classifier | 1 | 1 | 1 | 1 |
| 2 | Random Forest classifier | 1 | 1 | 1 | 1 |
| 3 | KNN classifier | 0.996 | 0.994 | 0.997 | 0.996 |
| 4 | SVM classifier | 0.738 | 0.369 | 0.5 | 0.424 |
| 5 | Naive Bayes classifier | 0.335 | 0.641 | 0.549 | 0.31 |
TABLE I. ALGORITHM CLASSIFIER
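The SVM row is worth a closer look: precision is exactly half the accuracy and recall is exactly 0.5. That pattern is what macro-averaged metrics produce when a classifier assigns every sample to a single class. The sketch below reproduces the arithmetic on a hypothetical test set in which 73.8% of samples belong to the predicted class. It is an observation about the numbers, not a statement of how the authors computed them, and the function `macro_metrics` and the class counts are assumptions.

```python
def macro_metrics(y_true, y_pred, labels=(0, 1)):
    """Accuracy plus macro-averaged precision, recall, and F1 over labels."""
    precisions, recalls, f1s = [], [], []
    for cls in labels:
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical test set: 738 samples of one class, 262 of the other
y_true = [0] * 738 + [1] * 262
y_pred = [0] * 1000           # the classifier predicts class 0 for everything
acc, prec, rec, f1 = macro_metrics(y_true, y_pred)
```

Under these assumptions the accuracy is 0.738, macro precision 0.369, macro recall 0.5, and macro F1 roughly 0.42, which is close to the reported row.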
Fig. 6. ROC Curve
[2] E. Ileberi, Y. Sun, and Z. Wang, "A machine learning based credit card fraud detection using the GA algorithm for feature selection," J. Big Data, vol. 9, no. 1, p. 24, Dec. 2022, doi: 10.1186/s40537-022-00573-8.
V. CONCLUSION
Machine learning-based credit card fraud detection in banking has proven an effective tool for detecting and preventing fraudulent transactions in real time. By combining supervised and unsupervised learning, banks can identify sophisticated fraud patterns that rule-based systems are unable to detect. Techniques such as anomaly detection, ensemble learning, and feature engineering enhance the accuracy and reliability of fraud detection systems. In addition, real-time processing and automatic alerts allow financial institutions to take prompt action against suspicious activities, minimizing financial losses and boosting customer trust. As fraudulent techniques evolve ever faster, robust and adaptable machine learning algorithms will become even more crucial in the coming years. Explainable AI, federated learning, blockchain protection, and quantum computing are expected to make fraud detection more user-friendly while supporting privacy and transparency. As banks continue to adopt and consolidate new technology, they can develop an even more secure and efficient payment process, reducing fraud risk and further increasing overall transaction security.
ACKNOWLEDGMENTS
The authors thank the HOD, Department of Electronics and Communication Engineering, and the management of Vel Tech High Tech Dr. Rangarajan Dr. Sakunthala Engineering College, Chennai, for their continuous encouragement.
REFERENCES
[3] I. D. Mienye and N. Jere, "Deep Learning for Credit Card Fraud Detection: A Review of Algorithms, Challenges, and Solutions," IEEE Access, vol. 12, pp. 96893-96910, 2024, doi: 10.1109/ACCESS.2024.3426955.
[4] V. N. Dornadula and S. Geetha, "Credit Card Fraud Detection using Machine Learning Algorithms," Procedia Comput. Sci., vol. 165, pp. 631-641, 2019, doi: 10.1016/j.procs.2020.01.057.
[5] A. R. Khalid, N. Owoh, O. Uthmani, M. Ashawa, J. Osamor, and J. Adejoh, "Enhancing Credit Card Fraud Detection: An Ensemble Machine Learning Approach," Big Data Cogn. Comput., vol. 8, no. 1, p. 6, Jan. 2024, doi: 10.3390/bdcc8010006.
[6] R. Sailusha, V. Gnaneswar, R. Ramesh, and G. R. Rao, "Credit Card Fraud Detection Using Machine Learning," in 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS), IEEE, May 2020, pp. 1264-1270, doi: 10.1109/ICICCS48265.2020.9121114.
[7] Z. Zaffar, F. Sohrab, J. Kanniainen, and M. Gabbouj, "Credit Card Fraud Detection with Subspace Learning-based One-Class Classification," in 2023 IEEE Symposium Series on Computational Intelligence (SSCI), IEEE, Dec. 2023, pp. 407-412, doi: 10.1109/SSCI52147.2023.10372038.
[8] M. A. Talukder, R. Hassen, M. A. Uddin, M. N. Uddin, and U. K. Acharjee, "Securing transactions: a hybrid dependable ensemble machine learning model using IHT-LR and grid search," Cybersecurity, vol. 7, no. 1, p. 32, Nov. 2024, doi: 10.1186/s42400-024-00221-z.
[9] D. Rzayeva and S. Malekzadeh, "A Combination of Deep Neural Networks and K-Nearest Neighbors for Credit Card Fraud Detection," May 2022. [Online]. Available: http://arxiv.org/abs/2205.15300.
[10] D. H. M. de Souza and C. J. Bordin, "Ensemble and Mixed Learning Techniques for Credit Card Fraud Detection," Dec. 2021. [Online]. Available: http://arxiv.org/abs/2112.02627.
[11] F. K. Alarfaj, I. Malik, H. U. Khan, N. Almusallam, M. Ramzan, and M. Ahmed, "Credit Card Fraud Detection Using State-of-the-Art Machine Learning and Deep Learning Algorithms," IEEE Access, vol. 10, pp. 39700-39715, 2022, doi: 10.1109/ACCESS.2022.3166891.
[12] S. Khatri, A. Arora, and A. P. Agrawal, "Supervised Machine Learning Algorithms for Credit Card Fraud Detection: A Comparison," in 2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence), IEEE, Jan. 2020, pp. 680-683, doi: 10.1109/Confluence47617.2020.9057851.
[1] S. Patil, V. Nemade, and P. K. Soni, "Predictive Modelling For Credit Card Fraud Detection Using Data Analytics," Procedia Comput. Sci., vol. 132, pp. 385-395, 2018, doi: 10.1016/j.procs.2018.05.199.
