credit-g_copy

Status: active | Format: ARFF | Publicly available | Visibility: public | Uploaded 21-06-2022 by Laurens Krudde

Author: Dr. Hans Hofmann

Source: [UCI](https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)) - 1994

Please cite: [UCI](https://archive.ics.uci.edu/ml/citation_policy.html)

German Credit dataset

This dataset classifies people described by a set of attributes as good or bad credit risks.

This dataset comes with a cost matrix:

```
       Good  Bad  (predicted)
Good     0    1   (actual)
Bad      5    0
```

It is worse to class a customer as good when they are bad (5), than it is to class a customer as bad when they are good (1).

### Attribute description

1. Status of existing checking account, in Deutsche Mark.
2. Duration in months
3. Credit history (credits taken, paid back duly, delays, critical accounts)
4. Purpose of the credit (car, television,...)
5. Credit amount
6. Status of savings account/bonds, in Deutsche Mark.
7. Present employment, in number of years.
8. Installment rate in percentage of disposable income
9. Personal status (married, single,...) and sex
10. Other debtors / guarantors
11. Present residence since X years
12. Property (e.g. real estate)
13. Age in years
14. Other installment plans (banks, stores)
15. Housing (rent, own,...)
16. Number of existing credits at this bank
17. Job
18. Number of people being liable to provide maintenance for
19. Telephone (yes,no)
20. Foreign worker (yes,no)
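
The cost matrix in the description can be used both to score predictions and to choose a decision rule. The following is a minimal sketch and not part of the original dataset description; the names `COST`, `total_cost`, and `min_cost_label` are hypothetical, and the probability `p_bad` is assumed to come from some classifier that is not shown here.

```python
# Cost matrix from the description: keys are (actual, predicted) labels.
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,   # a good customer classed as bad
    ("bad", "good"): 5,   # a bad customer classed as good
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the cost-matrix entries over paired actual/predicted labels."""
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


def min_cost_label(p_bad):
    """Return the label with the lower expected cost, given P(actual = bad)."""
    cost_if_predict_good = COST[("bad", "good")] * p_bad
    cost_if_predict_bad = COST[("good", "bad")] * (1.0 - p_bad)
    return "bad" if cost_if_predict_bad < cost_if_predict_good else "good"


if __name__ == "__main__":
    print(total_cost(["good", "bad", "bad"], ["bad", "good", "bad"]))
    print(min_cost_label(0.10), min_cost_label(0.20))
```

Under this matrix, predicting "bad" is cheaper whenever 1 - p < 5p, that is, when the estimated probability of a bad risk exceeds 1/6. This is why a plain 0.5 threshold tends to be too lenient for this dataset.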

21 features

| Feature | Type | Unique values | Missing |
|---|---|---|---|
| class (target) | nominal | 2 | 0 |
| checking_status | nominal | 4 | 0 |
| duration | numeric | 33 | 0 |
| credit_history | nominal | 5 | 0 |
| purpose | nominal | 10 | 0 |
| credit_amount | numeric | 921 | 0 |
| savings_status | nominal | 5 | 0 |
| employment | nominal | 5 | 0 |
| installment_commitment | numeric | 4 | 0 |
| personal_status | nominal | 4 | 0 |
| other_parties | nominal | 3 | 0 |
| residence_since | numeric | 4 | 0 |
| property_magnitude | nominal | 4 | 0 |
| age | numeric | 53 | 0 |
| other_payment_plans | nominal | 3 | 0 |
| housing | nominal | 3 | 0 |
| existing_credits | numeric | 4 | 0 |
| job | nominal | 4 | 0 |
| num_dependents | numeric | 2 | 0 |
| own_telephone | nominal | 2 | 0 |
| foreign_worker | nominal | 2 | 0 |
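
Before preprocessing, it can help to group the features by type, since numeric and nominal columns usually need different handling. The sketch below hard-codes the feature list from this page; the name `FEATURES` and the grouping variables are hypothetical. Treating "binary" as nominal features with exactly two values is an assumption, not a documented definition.

```python
# (type, number of unique values) for each feature, copied from the feature list.
FEATURES = {
    "class": ("nominal", 2),
    "checking_status": ("nominal", 4),
    "duration": ("numeric", 33),
    "credit_history": ("nominal", 5),
    "purpose": ("nominal", 10),
    "credit_amount": ("numeric", 921),
    "savings_status": ("nominal", 5),
    "employment": ("nominal", 5),
    "installment_commitment": ("numeric", 4),
    "personal_status": ("nominal", 4),
    "other_parties": ("nominal", 3),
    "residence_since": ("numeric", 4),
    "property_magnitude": ("nominal", 4),
    "age": ("numeric", 53),
    "other_payment_plans": ("nominal", 3),
    "housing": ("nominal", 3),
    "existing_credits": ("numeric", 4),
    "job": ("nominal", 4),
    "num_dependents": ("numeric", 2),
    "own_telephone": ("nominal", 2),
    "foreign_worker": ("nominal", 2),
}

numeric = [name for name, (kind, _) in FEATURES.items() if kind == "numeric"]
nominal = [name for name, (kind, _) in FEATURES.items() if kind == "nominal"]
# Assumption: "binary" means a nominal feature with exactly two values.
binary_nominal = [
    name for name, (kind, unique) in FEATURES.items()
    if kind == "nominal" and unique == 2
]

if __name__ == "__main__":
    print(len(FEATURES), len(numeric), len(nominal), binary_nominal)
```

Under that assumption the grouping yields 7 numeric, 14 nominal, and 3 binary features, which agrees with the counts reported in the dataset properties.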

19 properties

| Property | Value |
|---|---|
| Number of instances (rows) of the dataset. | 1000 |
| Number of attributes (columns) of the dataset. | 21 |
| Number of distinct values of the target attribute (if it is nominal). | 2 |
| Number of missing values in the dataset. | 0 |
| Number of instances with at least one value missing. | 0 |
| Number of numeric attributes. | 7 |
| Number of nominal attributes. | 14 |
| Percentage of binary attributes. | 14.29 |
| Percentage of instances having missing values. | 0 |
| Percentage of missing values. | 0 |
| Average class difference between consecutive instances. | 0.57 |
| Percentage of numeric attributes. | 33.33 |
| Number of attributes divided by the number of instances. | 0.02 |
| Percentage of nominal attributes. | 66.67 |
| Percentage of instances belonging to the most frequent class. | 70 |
| Number of instances belonging to the most frequent class. | 700 |
| Percentage of instances belonging to the least frequent class. | 30 |
| Number of instances belonging to the least frequent class. | 300 |
| Number of binary attributes. | 3 |
Number of binary attributes.

0 tasks
