Loan-Predication

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 24-03-2022 by Dustin Carrion.
Tags: Computer Systems, Machine Learning


Among all industries, the insurance domain makes the largest use of analytics and data science methods. This data set gives a taste of working with data from insurance companies: what challenges are faced, what strategies are used, and which variables influence the outcome. This is a classification problem. The data has 614 rows and 13 columns (the Loan_ID identifier, the Loan_Status target, and 11 predictors).

Problem: A company wants to automate the loan eligibility process in real time, based on the customer details provided when filling out an online application form. These details include Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History, and others. To automate this process, the company has posed the problem of identifying the customer segments that are eligible for a loan, so that it can target those customers specifically. Only a partial data set is provided here.
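As a rough illustration of the application-form record described here, the sketch below maps one row, keyed by this dataset's column names, onto a typed record. The class name `LoanApplication`, the helper functions, and the sample values are hypothetical and not part of the dataset; the only format detail relied on is that ARFF marks a missing value with `?`.

```python
from dataclasses import dataclass
from typing import Optional

# "?" is the ARFF missing-value marker; an empty string is treated the same way.
MISSING = {"", "?"}


def _text(value: str) -> Optional[str]:
    """Return the stripped string, or None if the value is missing."""
    value = value.strip()
    return None if value in MISSING else value


def _number(value: str) -> Optional[float]:
    """Return the value as a float, or None if it is missing."""
    text = _text(value)
    return None if text is None else float(text)


@dataclass
class LoanApplication:  # hypothetical record type, for illustration only
    gender: Optional[str]
    married: Optional[str]
    dependents: Optional[str]
    education: Optional[str]
    self_employed: Optional[str]
    applicant_income: Optional[float]
    coapplicant_income: Optional[float]
    loan_amount: Optional[float]
    loan_amount_term: Optional[float]
    credit_history: Optional[float]
    property_area: Optional[str]

    @classmethod
    def from_row(cls, row: dict) -> "LoanApplication":
        """Build a record from a dict keyed by the dataset's column names."""
        return cls(
            gender=_text(row["Gender"]),
            married=_text(row["Married"]),
            dependents=_text(row["Dependents"]),
            education=_text(row["Education"]),
            self_employed=_text(row["Self_Employed"]),
            applicant_income=_number(row["ApplicantIncome"]),
            coapplicant_income=_number(row["CoapplicantIncome"]),
            loan_amount=_number(row["LoanAmount"]),
            loan_amount_term=_number(row["Loan_Amount_Term"]),
            credit_history=_number(row["Credit_History"]),
            property_area=_text(row["Property_Area"]),
        )


# Made-up row for illustration; these values are not taken from the dataset.
example_row = {
    "Gender": "?", "Married": "Yes", "Dependents": "0",
    "Education": "Graduate", "Self_Employed": "No",
    "ApplicantIncome": "5000", "CoapplicantIncome": "0",
    "LoanAmount": "?", "Loan_Amount_Term": "360",
    "Credit_History": "1", "Property_Area": "Urban",
}
application = LoanApplication.from_row(example_row)
```

Keeping every field optional reflects that several columns in this dataset contain missing values; a real pipeline would decide per field whether to impute or reject.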

12 features

Loan_Status (target): string, 2 unique values, 0 missing
Loan_ID (ignore): string, 614 unique values, 0 missing
Gender: string, 2 unique values, 13 missing
Married: string, 2 unique values, 3 missing
Dependents: string, 4 unique values, 15 missing
Education: string, 2 unique values, 0 missing
Self_Employed: string, 2 unique values, 32 missing
ApplicantIncome: numeric, 505 unique values, 0 missing
CoapplicantIncome: numeric, 287 unique values, 0 missing
LoanAmount: numeric, 203 unique values, 22 missing
Loan_Amount_Term: numeric, 10 unique values, 14 missing
Credit_History: numeric, 2 unique values, 50 missing
Property_Area: string, 3 unique values, 0 missing
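Seven of the listed features have missing values. One common way to handle them, sketched here as an assumption rather than a method prescribed by the dataset, is to fill categorical columns with the most frequent value and continuous columns with the median. The function name `impute`, the choice to treat the two-valued Credit_History as categorical, and the sample rows are all hypothetical.

```python
from statistics import median, mode

# Features with missing values, and how this sketch chooses to fill them.
# Credit_History is numeric but has only 2 unique values, so it is
# treated as categorical here (an assumption, not a dataset rule).
INCOMPLETE = {
    "Gender": "categorical",
    "Married": "categorical",
    "Dependents": "categorical",
    "Self_Employed": "categorical",
    "Credit_History": "categorical",
    "LoanAmount": "continuous",
    "Loan_Amount_Term": "continuous",
}


def impute(rows):
    """Return copies of the rows with None replaced by the mode or median."""
    filled = [dict(row) for row in rows]
    for column, kind in INCOMPLETE.items():
        observed = [row[column] for row in rows if row.get(column) is not None]
        if not observed:
            continue  # nothing to learn a fill value from
        fill = mode(observed) if kind == "categorical" else median(observed)
        for row in filled:
            if row.get(column) is None:
                row[column] = fill
    return filled


# Made-up rows for illustration; these are not records from the dataset.
sample = [
    {"Gender": "Male", "LoanAmount": 120.0},
    {"Gender": None, "LoanAmount": None},
    {"Gender": "Male", "LoanAmount": 100.0},
    {"Gender": "Female", "LoanAmount": 140.0},
]
completed = impute(sample)
```

In practice the fill values should be learned on the training split only and then reused on held-out data, so that no information leaks across the split.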

19 properties

Number of instances (rows) of the dataset: 614
Number of attributes (columns) of the dataset: 12
Number of distinct values of the target attribute (if it is nominal): 2
Number of missing values in the dataset: 149
Number of instances with at least one value missing: 134
Number of numeric attributes: 5
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0.02
Percentage of numeric attributes: 41.67
Percentage of instances belonging to the most frequent class: 68.73
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: 422
Percentage of instances belonging to the least frequent class: 31.27
Number of instances belonging to the least frequent class: 192
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 21.82
Average class difference between consecutive instances: 1
Percentage of missing values: 2.02
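The ratio and percentage properties follow from the raw counts in the same list. The short check below recomputes them; the variable names are illustrative, and the denominator for the percentage of missing values is assumed to be instances times the 12 counted attributes (Loan_ID excluded), which is the choice that reproduces the listed 2.02.

```python
# Raw counts copied from the property list.
n_instances = 614
n_attributes = 12
n_numeric = 5
n_missing_values = 149
n_rows_with_missing = 134
majority_count = 422
minority_count = 192


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


derived = {
    "attributes_per_instance": round(n_attributes / n_instances, 2),
    "pct_numeric_attributes": pct(n_numeric, n_attributes),
    "pct_majority_class": pct(majority_count, n_instances),
    "pct_minority_class": pct(minority_count, n_instances),
    "pct_rows_with_missing": pct(n_rows_with_missing, n_instances),
    # Assumption: every cell of the 614 x 12 table counts toward the total.
    "pct_missing_values": pct(n_missing_values, n_instances * n_attributes),
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

The majority-class share of 68.73 is also the accuracy of a trivial classifier that always predicts the most frequent class, which makes it a useful floor for any model trained on this data.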

0 tasks
