Data
OpenML
Is_fraud
Status: active
Format: ARFF
License: CC BY 4.0
Visibility: public
Uploaded 06-12-2024 by Anna Wiewer
A fraud detection dataset for binary classification. The target variable is 'is_fraud', indicating whether a transaction is fraudulent.
21 features
is_fraud (target): nominal, 2 unique values, 0 missing
cc_num: numeric, 668 unique values, 0 missing
merchant: numeric, 476 unique values, 0 missing
category: numeric, 14 unique values, 0 missing
amt: numeric, 4147 unique values, 0 missing
gender: numeric, 2 unique values, 0 missing
state: numeric, 48 unique values, 0 missing
zip: numeric, 664 unique values, 0 missing
lat: numeric, 663 unique values, 0 missing
long: numeric, 664 unique values, 0 missing
city_pop: numeric, 619 unique values, 0 missing
job: numeric, 349 unique values, 0 missing
unix_time: numeric, 5135 unique values, 0 missing
merch_lat: numeric, 5227 unique values, 0 missing
merch_long: numeric, 5227 unique values, 0 missing
trans_year: numeric, 1 unique value, 0 missing
trans_month: numeric, 2 unique values, 0 missing
trans_day: numeric, 4 unique values, 0 missing
trans_hour: numeric, 24 unique values, 0 missing
trans_minute: numeric, 60 unique values, 0 missing
trans_second: numeric, 60 unique values, 0 missing
19 properties
NumberOfInstances: 5227. Number of instances (rows) of the dataset.
NumberOfFeatures: 21. Number of attributes (columns) of the dataset.
NumberOfClasses: 2. Number of distinct values of the target attribute (if it is nominal).
NumberOfMissingValues: 0. Number of missing values in the dataset.
NumberOfInstancesWithMissingValues: 0. Number of instances with at least one value missing.
NumberOfNumericFeatures: 20. Number of numeric attributes.
NumberOfSymbolicFeatures: 1. Number of nominal attributes.
PercentageOfBinaryFeatures: 4.76. Percentage of binary attributes.
PercentageOfInstancesWithMissingValues: 0. Percentage of instances having missing values.
PercentageOfMissingValues: 0. Percentage of missing values.
AutoCorrelation: 0.99. Average class difference between consecutive instances.
PercentageOfNumericFeatures: 95.24. Percentage of numeric attributes.
Dimensionality: 0 (21 / 5227 ≈ 0.004, displayed rounded). Number of attributes divided by the number of instances.
PercentageOfSymbolicFeatures: 4.76. Percentage of nominal attributes.
MajorityClassPercentage: 99.69. Percentage of instances belonging to the most frequent class.
MajorityClassSize: 5211. Number of instances belonging to the most frequent class.
MinorityClassPercentage: 0.31. Percentage of instances belonging to the least frequent class.
MinorityClassSize: 16. Number of instances belonging to the least frequent class.
NumberOfBinaryFeatures: 1. Number of binary attributes.
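Several of the properties listed here are simple ratios of the raw counts, so they can be re-derived as a sanity check. The sketch below is illustrative only: the variable names are made up for this example, and the baseline accuracy is an added observation that follows from the class sizes, not a figure reported on the page.

```python
# Counts copied from the property list; all names here are illustrative.
n_instances = 5227
n_features = 21
majority_size = 5211
minority_size = 16
n_binary_features = 1


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


majority_pct = pct(majority_size, n_instances)   # MajorityClassPercentage
minority_pct = pct(minority_size, n_instances)   # MinorityClassPercentage
binary_pct = pct(n_binary_features, n_features)  # PercentageOfBinaryFeatures
dimensionality = n_features / n_instances        # Dimensionality, unrounded

# A model that always predicts the majority class is correct on every
# majority instance, so its accuracy equals the majority-class share.
baseline_accuracy = majority_size / n_instances

print(majority_pct, minority_pct, binary_pct)  # 99.69 0.31 4.76
print(round(dimensionality, 4))                # 0.004
print(round(baseline_accuracy, 4))             # 0.9969
```

Because a constant prediction already reaches about 99.69% accuracy, plain accuracy says little on this dataset; measures that focus on the minority class, such as precision and recall, are usually more informative for data this imbalanced.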
1 task
Supervised Classification on Is_fraud (0 runs)
- estimation_procedure: 10-fold Crossvalidation
- target_feature: is_fraud
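With only 16 minority-class instances, how the 10 folds are formed matters: an unstratified split could leave some folds with no minority examples at all. The sketch below is a simplified, standard-library illustration of stratified fold assignment on synthetic labels that match the class counts on this page. It is not OpenML's actual splitting procedure, and `stratified_fold_ids` is a hypothetical helper written for this example. It also assumes the minority class is the fraudulent one, which the page does not state explicitly.

```python
from collections import Counter

N_FOLDS = 10

# Synthetic labels matching the class counts on the page.
# Assumption: 1 marks the (minority) fraud class, 0 the majority class.
labels = [0] * 5211 + [1] * 16


def stratified_fold_ids(labels, n_folds):
    """Assign a fold id to each instance, dealing each class out round-robin.

    Hypothetical helper: no shuffling is done, so in practice the rows
    would be shuffled within each class first.
    """
    seen_per_class = {}
    fold_ids = []
    for y in labels:
        k = seen_per_class.get(y, 0)
        fold_ids.append(k % n_folds)
        seen_per_class[y] = k + 1
    return fold_ids


fold_ids = stratified_fold_ids(labels, N_FOLDS)
fold_sizes = Counter(fold_ids)
fraud_per_fold = Counter(f for f, y in zip(fold_ids, labels) if y == 1)

for fold in range(N_FOLDS):
    print(fold, fold_sizes[fold], fraud_per_fold[fold])
```

Under this assignment every test fold holds either one or two minority instances (16 spread over 10 folds), so per-fold scores for the minority class will be coarse and highly variable even when the split is stratified.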