Credit_Card_Fraud_Classification

Status: active, Format: ARFF, License: CC BY 4.0, Visibility: public, Uploaded 23-11-2024 by Yayun Li
# Credit Card Fraud Detection Dataset

## Context

It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase.

## Content

The dataset contains transactions made by credit cards in September 2013 by European cardholders. It covers transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly imbalanced: the positive class (frauds) accounts for 0.172% of all transactions.

## Dataset Characteristics

- Total Transactions: 284,807
- Fraudulent Transactions: 492 (0.172%)
- Time Period: Two days in September 2013
- Region: European credit card holders
- Type: Binary Classification (Fraud Detection)

## Class Distribution

- Legitimate Transactions: 284,315 (99.828%)
- Fraudulent Transactions: 492 (0.172%)

## Important Notes

- The dataset is highly imbalanced
- All transactions are anonymized
- Contains only numeric input variables
- Features are anonymized for confidentiality
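To make the imbalance concrete, the sketch below uses only the counts stated in the description and scores a hypothetical baseline that labels every transaction as legitimate. The baseline is an illustration, not part of the dataset or of any published method; it suggests why plain accuracy is a poor yardstick here and why recall on the fraud class is worth tracking.

```python
# Counts taken from the description above.
total = 284_807
fraud = 492
legit = total - fraud

# Hypothetical baseline: predict "legitimate" for every transaction.
true_negatives = legit   # every legitimate transaction is labelled correctly
false_negatives = fraud  # every fraud is missed
true_positives = 0

accuracy = (true_positives + true_negatives) / total
recall = true_positives / (true_positives + false_negatives)

print(f"legitimate transactions: {legit}")
print(f"baseline accuracy: {accuracy:.4%}")
print(f"baseline fraud recall: {recall:.0%}")
```

The baseline is right on more than 99.8% of transactions while catching no fraud at all, so a useful model has to be judged on how many of the 492 frauds it recovers.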

31 features

| Feature | Type | Unique values | Missing |
| --- | --- | --- | --- |
| class (target) | nominal | 2 | 1 |
| time | numeric | 6442 | 0 |
| v1 | numeric | 9756 | 0 |
| v2 | numeric | 9756 | 0 |
| v3 | numeric | 9755 | 1 |
| v4 | numeric | 9755 | 1 |
| v5 | numeric | 9755 | 1 |
| v6 | numeric | 9755 | 1 |
| v7 | numeric | 9755 | 1 |
| v8 | numeric | 9755 | 1 |
| v9 | numeric | 9755 | 1 |
| v10 | numeric | 9755 | 1 |
| v11 | numeric | 9755 | 1 |
| v12 | numeric | 9755 | 1 |
| v13 | numeric | 9755 | 1 |
| v14 | numeric | 9755 | 1 |
| v15 | numeric | 9755 | 1 |
| v16 | numeric | 9755 | 1 |
| v17 | numeric | 9755 | 1 |
| v18 | numeric | 9755 | 1 |
| v19 | numeric | 9755 | 1 |
| v20 | numeric | 9755 | 1 |
| v21 | numeric | 9755 | 1 |
| v22 | numeric | 9755 | 1 |
| v23 | numeric | 9755 | 1 |
| v24 | numeric | 9755 | 1 |
| v25 | numeric | 9755 | 1 |
| v26 | numeric | 9755 | 1 |
| v27 | numeric | 9755 | 1 |
| v28 | numeric | 9755 | 1 |
| amount | numeric | 3769 | 1 |

19 properties

| Property | Value |
| --- | --- |
| Number of instances (rows) of the dataset. | 9965 |
| Number of attributes (columns) of the dataset. | 31 |
| Number of distinct values of the target attribute (if it is nominal). | 3 |
| Number of missing values in the dataset. | 28 |
| Number of instances with at least one value missing. | 1 |
| Number of numeric attributes. | 30 |
| Number of nominal attributes. | 1 |
| Percentage of binary attributes. | 3.23 |
| Percentage of instances having missing values. | 0.01 |
| Percentage of missing values. | 0.01 |
| Average class difference between consecutive instances. | 0.99 |
| Percentage of numeric attributes. | 96.77 |
| Number of attributes divided by the number of instances. | 0 |
| Percentage of nominal attributes. | 3.23 |
| Percentage of instances belonging to the most frequent class. | 99.61 |
| Number of instances belonging to the most frequent class. | 9926 |
| Percentage of instances belonging to the least frequent class. | 0.01 |
| Number of instances belonging to the least frequent class. | 1 |
| Number of binary attributes. | 1 |
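Several of the percentage properties follow from the raw counts listed alongside them. The sketch below recomputes them as a consistency check. The formulas are assumptions inferred from the property names (for example, that the percentage of missing values is taken over all rows times columns cells), and the helper `pct` is introduced here for illustration only.

```python
# Raw counts as listed in the properties above.
rows, cols = 9965, 31
numeric_cols, nominal_cols, binary_cols = 30, 1, 1
missing_values, rows_with_missing = 28, 1
majority_rows, minority_rows = 9926, 1

def pct(part, whole):
    """Percentage of `whole` made up by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)

derived = {
    "Percentage of binary attributes": pct(binary_cols, cols),
    "Percentage of numeric attributes": pct(numeric_cols, cols),
    "Percentage of nominal attributes": pct(nominal_cols, cols),
    "Percentage of instances having missing values": pct(rows_with_missing, rows),
    # Assumption: the denominator is the total number of cells.
    "Percentage of missing values": pct(missing_values, rows * cols),
    "Percentage of instances belonging to the most frequent class": pct(majority_rows, rows),
    "Percentage of instances belonging to the least frequent class": pct(minority_rows, rows),
    "Number of attributes divided by the number of instances": round(cols / rows, 2),
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

Under these assumed formulas the recomputed values agree with the listed ones at two decimal places.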

1 task

0 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: class
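The task's estimation procedure is 10-fold cross-validation on the `class` target. With so few positive rows, it helps if each fold receives its share of every class. The sketch below shows one way to build such folds with the standard library only. Stratification is an assumption here (the task line does not say whether the folds are stratified), the function `stratified_folds` is hypothetical, and the labels are toy data rather than the real dataset.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Assign each row index to one of k folds, spreading every class evenly."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # continue dealing where the previous class stopped
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices round-robin across the folds.
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)
    return folds

# Toy labels (hypothetical, not the real data): 990 legitimate rows, 10 frauds.
labels = [0] * 990 + [1] * 10
folds = stratified_folds(labels, k=10)
fraud_per_fold = [sum(labels[i] for i in fold) for fold in folds]

for n, fold in enumerate(folds):
    test_idx = set(fold)
    train_idx = [i for i in range(len(labels)) if i not in test_idx]
    print(f"fold {n}: train={len(train_idx)} test={len(fold)} frauds in test={fraud_per_fold[n]}")
```

Each fold serves once as the test set while the other nine form the training set. In the toy example every fold holds 100 rows with exactly one fraud, so no test fold is left without a positive case.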