OpenML
credit-g

Status: active. Format: ARFF. Visibility: public. Uploaded 27-01-2023 by Young Lee.
  • Computer Systems Machine Learning


This dataset classifies people described by a set of attributes as good or bad credit risks.

This dataset comes with a cost matrix:

              Predicted good   Predicted bad
Actual good   0                1
Actual bad    5                0

It is worse to classify a customer as good when they are bad (cost 5) than to classify a customer as bad when they are good (cost 1).
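As a rough illustration of how the cost matrix above could be applied, the following sketch sums the misclassification cost over a set of predictions. The label strings "good" and "bad" and the helper name `total_cost` are assumptions made for this example, not something the dataset page specifies; the class counts (700 and 300) come from the dataset properties.

```python
# Cost matrix from the dataset description, indexed as COST[actual][predicted].
# The label strings "good" and "bad" are assumed for illustration.
COST = {
    "good": {"good": 0, "bad": 1},
    "bad": {"good": 5, "bad": 0},
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(COST[a][p] for a, p in zip(actual, predicted))


# Class counts taken from the dataset properties: 700 good, 300 bad.
actual = ["good"] * 700 + ["bad"] * 300

# Two trivial classifiers: accept everyone, or reject everyone.
cost_all_good = total_cost(actual, ["good"] * 1000)  # 300 bad customers accepted
cost_all_bad = total_cost(actual, ["bad"] * 1000)    # 700 good customers rejected

print(cost_all_good, cost_all_bad)  # 1500 700
```

Under this cost matrix, always predicting the majority class (good) costs more than always predicting bad, even though it gives the higher accuracy. That is why accuracy alone can be misleading on this dataset.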

21 features

Feature                  Type      Unique values   Missing
class (target)           string    2               0
duration                 numeric   33              0
credit_amount            numeric   921             0
installment_commitment   numeric   4               0
residence_since          numeric   4               0
age                      numeric   53              0
existing_credits         numeric   4               0
num_dependents           numeric   2               0
checking_status          nominal   4               0
credit_history           nominal   5               0
purpose                  nominal   10              0
savings_status           nominal   5               0
employment               nominal   5               0
personal_status          nominal   4               0
other_parties            nominal   3               0
property_magnitude       nominal   4               0
other_payment_plans      nominal   3               0
housing                  nominal   3               0
job                      nominal   4               0
own_telephone            nominal   2               0
foreign_worker           nominal   2               0

19 properties

Number of instances (rows) of the dataset: 1000
Number of attributes (columns) of the dataset: 21
Number of distinct values of the target attribute (if it is nominal): 2
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 7
Number of nominal attributes: 13
Number of instances belonging to the most frequent class: 700
Percentage of instances belonging to the least frequent class: 30
Number of instances belonging to the least frequent class: 300
Number of binary attributes: 2
Percentage of binary attributes: 9.52
Percentage of instances having missing values: 0
Average class difference between consecutive instances: 1
Percentage of missing values: 0
Number of attributes divided by the number of instances: 0.02
Percentage of numeric attributes: 33.33
Percentage of instances belonging to the most frequent class: 70
Percentage of nominal attributes: 61.9
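The percentage and ratio properties above follow from the raw counts. The sketch below recomputes them as a sanity check; the variable names and the `pct` helper are assumptions made for this example. Note that the numeric and nominal counts sum to 20 rather than 21, because the feature listing types the target `class` as string.

```python
# Raw counts as reported in the properties list.
n_instances = 1000
n_attributes = 21
n_numeric = 7
n_nominal = 13
n_binary = 2
n_majority = 700
n_minority = 300


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


derived = {
    "pct_binary": pct(n_binary, n_attributes),
    "pct_numeric": pct(n_numeric, n_attributes),
    "pct_nominal": pct(n_nominal, n_attributes),
    "pct_majority": pct(n_majority, n_instances),
    "pct_minority": pct(n_minority, n_instances),
    "attributes_per_instance": round(n_attributes / n_instances, 2),
}

for name, value in derived.items():
    print(f"{name}: {value}")
```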

1 task

0 runs. estimation_procedure: 10-fold Crossvalidation; evaluation_measure: predictive_accuracy; target_feature: class
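To make the task definition concrete, here is a minimal sketch of 10-fold cross-validation scored by predictive accuracy, using a majority-class baseline on the class counts alone. It assumes the folds are stratified by class, which the task line does not state. The function names, the label strings, and the round-robin fold assignment are all hypothetical choices for this example, not OpenML's own splitting procedure.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, k=10):
    """Assign each index to one of k folds, round-robin within each class."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for j, i in enumerate(indices):
            folds[j % k].append(i)
    return folds


def majority_baseline_accuracy(labels, k=10):
    """Mean accuracy over k folds of always predicting the training majority."""
    scores = []
    for test_idx in stratified_folds(labels, k):
        held_out = set(test_idx)
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        majority = Counter(train_labels).most_common(1)[0][0]
        correct = sum(labels[i] == majority for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Assumed label strings; counts come from the dataset properties.
labels = ["good"] * 700 + ["bad"] * 300
accuracy = majority_baseline_accuracy(labels)
print(accuracy)
```

With 70 good and 30 bad instances in each held-out fold, this baseline scores about 70% accuracy, so any model submitted to the task should be judged against that figure rather than against 50%.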