HMEQ_Data

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 23-03-2022 by Dustin Carrion.
Tags: Computer Systems, Machine Learning


Context

The consumer credit department of a bank wants to automate the decision-making process for approving home equity lines of credit. To do this, they will follow the recommendations of the Equal Credit Opportunity Act to create an empirically derived and statistically sound credit scoring model. The model will be based on data collected from recent applicants who were granted credit through the current loan underwriting process. It will be built with predictive modeling tools, but it must be interpretable enough to provide a reason for any adverse action (rejection).

Content

The Home Equity dataset (HMEQ) contains baseline and loan performance information for 5,960 recent home equity loans. The target (BAD) is a binary variable indicating whether an applicant eventually defaulted or was seriously delinquent. This adverse outcome occurred in 1,189 cases (20%). For each applicant, 12 input variables were recorded.

Inspiration

Can you predict which clients will default on their loans?
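The class balance stated in the description can be checked with a few lines of arithmetic. The sketch below uses only the two counts quoted above (5,960 loans, 1,189 adverse outcomes); the variable names are illustrative, and the "majority baseline" is simply the accuracy of a trivial model that always predicts BAD = 0, included as a reference point rather than as part of the original description.

```python
# Counts quoted in the dataset description.
total_loans = 5960
bad_loans = 1189  # BAD = 1: defaulted or seriously delinquent

good_loans = total_loans - bad_loans
bad_rate = bad_loans / total_loans

# Accuracy of a trivial model that always predicts BAD = 0.
majority_baseline = good_loans / total_loans

print(f"BAD = 1: {bad_loans} ({bad_rate:.2%})")
print(f"BAD = 0: {good_loans} ({majority_baseline:.2%})")
```

The adverse rate works out to about 19.95%, which the description rounds to 20%. Any scoring model built on this data should therefore be judged against a baseline of roughly 80% accuracy, not 50%.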

13 features

Name     Type     Unique values  Missing
BAD      numeric              2        0
LOAN     numeric            540        0
MORTDUE  numeric           5053      518
VALUE    numeric           5381      112
REASON   string               2      252
JOB      string               6      279
YOJ      numeric             99      515
DEROG    numeric             11      708
DELINQ   numeric             14      580
CLAGE    numeric           5314      308
NINQ     numeric             16      510
CLNO     numeric             62      222
DEBTINC  numeric           4693     1267
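To see which features are most affected by missing data, the per-feature missing counts listed above can be turned into percentages of the 5,960 rows. This is a minimal sketch that hard-codes the counts from the feature listing and does not load the dataset itself.

```python
# Missing-value counts per feature, copied from the feature listing.
n_rows = 5960
missing = {
    "BAD": 0, "LOAN": 0, "MORTDUE": 518, "VALUE": 112,
    "REASON": 252, "JOB": 279, "YOJ": 515, "DEROG": 708,
    "DELINQ": 580, "CLAGE": 308, "NINQ": 510, "CLNO": 222,
    "DEBTINC": 1267,
}

# Share of rows in which each feature is missing, in percent.
missing_pct = {name: 100 * count / n_rows for name, count in missing.items()}

# Rank features from most to least affected.
ranked = sorted(missing_pct.items(), key=lambda item: item[1], reverse=True)
for name, pct in ranked:
    print(f"{name:<8} {pct:5.2f}%")

total_missing = sum(missing.values())
print("Total missing values:", total_missing)
```

DEBTINC stands out, with roughly one in five values missing, so how it is imputed or flagged is likely to matter more than for any other input. The per-feature counts sum to 5,271, matching the dataset-level total reported in the properties.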

19 properties

Number of instances (rows) of the dataset: 5960
Number of attributes (columns) of the dataset: 13
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 5271
Number of instances with at least one value missing: 2596
Number of numeric attributes: 11
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 84.62
Percentage of instances belonging to the most frequent class: not reported
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 43.56
Average class difference between consecutive instances: not reported
Percentage of missing values: 6.8
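The percentage properties follow directly from the raw counts reported alongside them. The sketch below recomputes them as a consistency check; the variable names are illustrative, and the formulas are the natural readings of the property descriptions, not a confirmed account of how the hosting site computes them.

```python
# Raw counts reported in the properties.
n_rows, n_cols = 5960, 13
n_numeric = 11
n_missing_values = 5271
n_rows_with_missing = 2596

# Percentage of numeric attributes.
pct_numeric = 100 * n_numeric / n_cols

# Percentage of missing values over all cells (rows x columns).
pct_missing_values = 100 * n_missing_values / (n_rows * n_cols)

# Percentage of instances having at least one missing value.
pct_rows_with_missing = 100 * n_rows_with_missing / n_rows

# Number of attributes divided by the number of instances.
dimensionality = n_cols / n_rows

print(round(pct_numeric, 2))
print(round(pct_missing_values, 2))
print(round(pct_rows_with_missing, 2))
print(round(dimensionality, 2))
```

Under these assumptions the recomputed values agree with the reported 84.62, 6.8, and 43.56. The attributes-to-instances ratio is about 0.0022, which shows as 0 at two decimal places.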

0 tasks
