Loan_Status_Classification

Status: active | Format: ARFF | License: CC BY 4.0 | Visibility: public | Uploaded 23-11-2024 by Yayun Li
# Loan Approval Classification Dataset

## Data Source

This dataset is a synthetic version inspired by the original Credit Risk dataset on Kaggle and enriched with additional variables based on historical loan approval data. SMOTENC was used to simulate new data points to enlarge the databank. The dataset is structured for both categorical and continuous features.

## Metadata

The dataset contains 45,000 records and 14 variables, each described below:

### Personal Information Features:

1. person_age: Age of the person
2. person_gender: Gender of the person
3. person_education: Highest education level
4. person_income: Annual income
5. person_emp_exp: Years of employment experience
6. person_home_ownership: Home ownership status (e.g., rent, own, mortgage)

### Loan Information Features:

7. loan_amnt: Loan amount requested
8. loan_intent: Purpose of the loan
9. loan_int_rate: Loan interest rate
10. loan_percent_income: Loan amount as a percentage of annual income

### Credit Information Features:

11. cb_person_cred_hist_length: Length of credit history in years
12. credit_score: Credit score of the person
13. previous_loan_defaults_on_file: Indicator of previous loan defaults
14. loan_status: Loan approval status: 1 = approved, 0 = rejected (target variable)

## Target Variable

The target variable 'loan_status' is binary:

- 1 = approved
- 0 = rejected
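The column layout described above can be written down as a small Python sketch. This is only an illustration, not part of the dataset's tooling: the grouping constants and the `split_row` helper are hypothetical names, and the example record uses made-up placeholder values.

```python
# Column groups as listed in the description (names copied from the page).
PERSONAL = [
    "person_age", "person_gender", "person_education",
    "person_income", "person_emp_exp", "person_home_ownership",
]
LOAN = ["loan_amnt", "loan_intent", "loan_int_rate", "loan_percent_income"]
CREDIT = [
    "cb_person_cred_hist_length", "credit_score",
    "previous_loan_defaults_on_file",
]
TARGET = "loan_status"

FEATURES = PERSONAL + LOAN + CREDIT
ALL_COLUMNS = FEATURES + [TARGET]


def split_row(row):
    """Split one record (a dict keyed by column name) into (features, label).

    Hypothetical helper: checks that all 14 columns are present and that
    the target is 1 (approved) or 0 (rejected).
    """
    missing = [c for c in ALL_COLUMNS if c not in row]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    label = int(row[TARGET])
    if label not in (0, 1):
        raise ValueError(f"loan_status must be 0 or 1, got {label!r}")
    return [row[c] for c in FEATURES], label


# Placeholder record for illustration only; values are not from the dataset.
example = dict.fromkeys(FEATURES, 0)
example[TARGET] = 1
features, label = split_row(example)
```

Keeping the target name in one constant makes it harder to leak `loan_status` into the feature vector by accident.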

14 features

| Feature | Type | Unique values | Missing |
|---|---|---|---|
| loan_status (target) | nominal | 2 | 0 |
| person_age | numeric | 60 | 0 |
| person_gender | numeric | 2 | 0 |
| person_education | numeric | 5 | 0 |
| person_income | numeric | 33989 | 0 |
| person_emp_exp | numeric | 63 | 0 |
| person_home_ownership | numeric | 4 | 0 |
| loan_amnt | numeric | 4483 | 0 |
| loan_intent | numeric | 6 | 0 |
| loan_int_rate | numeric | 1302 | 0 |
| loan_percent_income | numeric | 64 | 0 |
| cb_person_cred_hist_length | numeric | 29 | 0 |
| credit_score | numeric | 340 | 0 |
| previous_loan_defaults_on_file | numeric | 2 | 0 |

19 properties

| Property | Value |
|---|---|
| Number of instances (rows) of the dataset. | 45000 |
| Number of attributes (columns) of the dataset. | 14 |
| Number of distinct values of the target attribute (if it is nominal). | 2 |
| Number of missing values in the dataset. | 0 |
| Number of instances with at least one value missing. | 0 |
| Number of numeric attributes. | 13 |
| Number of nominal attributes. | 1 |
| Percentage of binary attributes. | 7.14 |
| Percentage of instances having missing values. | 0 |
| Average class difference between consecutive instances. | 0.78 |
| Percentage of missing values. | 0 |
| Number of attributes divided by the number of instances. | 0 |
| Percentage of numeric attributes. | 92.86 |
| Percentage of instances belonging to the most frequent class. | 77.78 |
| Percentage of nominal attributes. | 7.14 |
| Number of instances belonging to the most frequent class. | 35000 |
| Percentage of instances belonging to the least frequent class. | 22.22 |
| Number of instances belonging to the least frequent class. | 10000 |
| Number of binary attributes. | 1 |
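Several of the percentages above follow directly from the raw counts on this page. The sketch below recomputes them as a consistency check. The variable names and the `pct` helper are illustrative, and rounding to two decimals is an assumption about how the page displays values.

```python
# Raw counts copied from the properties listed on this page.
n_instances = 45000
n_attributes = 14
n_numeric = 13
n_nominal = 1
n_binary = 1
majority = 35000  # instances in the most frequent class
minority = 10000  # instances in the least frequent class


def pct(part, whole):
    """Percentage rounded to two decimals (assumed display precision)."""
    return round(100 * part / whole, 2)


derived = {
    "pct_numeric": pct(n_numeric, n_attributes),
    "pct_nominal": pct(n_nominal, n_attributes),
    "pct_binary": pct(n_binary, n_attributes),
    "pct_majority": pct(majority, n_instances),
    "pct_minority": pct(minority, n_instances),
    # Attributes per instance; small enough to display as 0 after rounding.
    "dimensionality": n_attributes / n_instances,
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

The majority-class share is also the accuracy a classifier would reach by always predicting the most frequent class, so it is a useful baseline when reading results on this dataset.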

1 task

0 runs - estimation_procedure: 10-fold cross-validation - target_feature: loan_status
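To make the estimation procedure concrete, here is a minimal standard-library sketch of a 10-fold split. The page states only "10-fold" cross-validation. The stratification by class, the shuffling seed, and the `stratified_kfold` helper are assumptions made for illustration. The labels are synthetic placeholders that reproduce only the class counts reported on this page.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Return k lists of row indices with near-equal class proportions.

    Hypothetical helper: shuffles the indices of each class, then deals
    them round-robin into the k folds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Synthetic labels matching only the reported class counts (35000 / 10000).
labels = ["majority"] * 35000 + ["minority"] * 10000
folds = stratified_kfold(labels, k=10)

for i, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [j for j in range(len(labels)) if j not in held_out]
    n_minority = sum(labels[j] == "minority" for j in test_idx)
    print(f"fold {i}: train={len(train_idx)} test={len(test_idx)} "
          f"minority_in_test={n_minority}")
```

With these counts every fold holds 4500 rows, so each model is trained on 40500 rows and evaluated on the remaining 4500.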