dataset_china

Status: active. Format: ARFF. License: CC BY 4.0. Visibility: public. Uploaded 13-12-2024 by Sebastian Silva Ruiz.

Financial dataset for an AutoML benchmark. Name: dataset_china. Target: five_categories.

28 features

five_categories (target): string, 5 unique values, 0 missing
customer_id: numeric, 27522 unique values, 0 missing
type_of_loan_business: numeric, 22 unique values, 0 missing
guarantee_the_balance: numeric, 20191 unique values, 304 missing
account_connection_amount: numeric, 24789 unique values, 304 missing
security_guarantee_amount: numeric, 20209 unique values, 304 missing
five-level_classification: numeric, 11 unique values, 0 missing
whether_interest_is_owed: numeric, 2 unique values, 0 missing
whether_self-service_loan: numeric, 2 unique values, 0 missing
type_of_guarantee: numeric, 18 unique values, 0 missing
safety_coefficient: numeric, 8 unique values, 304 missing
collateral_value_(yuan): numeric, 11577 unique values, 322 missing
guarantee_method: numeric, 5 unique values, 0 missing
date_code: numeric, 3 unique values, 0 missing
approval_deadline: numeric, 37 unique values, 0 missing
whether_devalue_account: numeric, 2 unique values, 0 missing
industry_category: numeric, 20 unique values, 0 missing
down_payment_amount: numeric, 10036 unique values, 620 missing
whether_personal_business_loan: numeric, 2 unique values, 0 missing
whether_interest_is_owed_(regulatory_standard): numeric, 2 unique values, 0 missing
repayment_type: numeric, 3 unique values, 0 missing
installment_repayment_method_(numerical_type): numeric, 2 unique values, 757 missing
installment_repayment_method_(discrete_type): numeric, 3 unique values, 0 missing
installment_repayment_cycle_(numerical_type): numeric, 3 unique values, 0 missing
repayment_cycle_(discrete_type): numeric, 3 unique values, 0 missing
number_of_houses: numeric, 4 unique values, 4571 missing
month_property_costs: numeric, 3344 unique values, 1286 missing
family_monthly_income: numeric, 2940 unique values, 20 missing
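
As a quick consistency check, the per-feature missing counts in the listing above can be summed and compared with the dataset-level total of 8792 missing values reported in the properties. The sketch below is only illustrative: the counts are transcribed by hand from the listing, features with 0 missing are omitted, and the names `missing_per_feature` and `total_missing` are not part of the dataset.

```python
# Missing-value counts transcribed by hand from the feature listing.
# Features with 0 missing values are omitted.
missing_per_feature = {
    "guarantee_the_balance": 304,
    "account_connection_amount": 304,
    "security_guarantee_amount": 304,
    "safety_coefficient": 304,
    "collateral_value_(yuan)": 322,
    "down_payment_amount": 620,
    "installment_repayment_method_(numerical_type)": 757,
    "number_of_houses": 4571,
    "month_property_costs": 1286,
    "family_monthly_income": 20,
}

# Sum over all features that have at least one missing value.
total_missing = sum(missing_per_feature.values())
print(total_missing)  # 8792
```

Ten of the 28 features contain missing values, and their counts add up to the reported total.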

19 properties

Number of instances (rows) of the dataset: 27522
Number of attributes (columns) of the dataset: 28
Number of distinct values of the target attribute (if it is nominal): 5
Number of missing values in the dataset: 8792
Number of instances with at least one value missing: 4610
Number of numeric attributes: 27
Number of nominal attributes: 0
Percentage of instances belonging to the least frequent class: 0.37
Number of instances belonging to the least frequent class: 103
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 16.75
Average class difference between consecutive instances: 1
Percentage of missing values: 1.14
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 96.43
Percentage of instances belonging to the most frequent class: 92.71
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: 25516
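
Several of the properties above are ratios of the raw counts, so they can be recomputed as a sanity check. The following is a minimal sketch that assumes the page rounds to two decimal places. The helper `pct` and all variable names are illustrative, and the input numbers are copied from the properties list.

```python
# Raw counts copied from the properties list.
n_instances = 27522
n_attributes = 28
n_numeric_attributes = 27
n_missing_values = 8792
n_instances_with_missing = 4610
minority_class_size = 103
majority_class_size = 25516


def pct(part, whole):
    """Return part/whole as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Missing cells relative to all cells (rows x columns).
pct_missing_values = pct(n_missing_values, n_instances * n_attributes)
# Rows with at least one missing value relative to all rows.
pct_instances_with_missing = pct(n_instances_with_missing, n_instances)
# Class balance.
pct_minority_class = pct(minority_class_size, n_instances)
pct_majority_class = pct(majority_class_size, n_instances)
# Share of numeric columns.
pct_numeric_attributes = pct(n_numeric_attributes, n_attributes)
# Dimensionality: attributes per instance (about 0.001, shown as 0).
dimensionality = round(n_attributes / n_instances, 2)

print(pct_missing_values)          # 1.14
print(pct_instances_with_missing)  # 16.75
print(pct_minority_class)          # 0.37
print(pct_majority_class)          # 92.71
print(pct_numeric_attributes)      # 96.43
print(dimensionality)              # 0.0
```

The recomputed values agree with the listed ones. The class percentages also show how imbalanced the target is: about 92.71% of instances fall in the most frequent of the five classes.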

0 tasks
