heloc

Status: active. Format: ARFF. Visibility: public. Uploaded 03-01-2023 by Leo Grin.

Dataset used in the tabular data benchmark (https://github.com/LeoGrin/tabular-benchmark) and transformed in the same way as in that benchmark. It belongs to the "classification on numerical features" benchmark.

Original source: https://www.kaggle.com/datasets/averkiyoliabev/home-equity-line-of-creditheloc?select=heloc_dataset_v1+%281%29.csv

Please credit the original source if you use this dataset.

23 features

RiskPerformance (target): nominal, 2 unique values, 0 missing
ExternalRiskEstimate: numeric, 61 unique values, 0 missing
MSinceOldestTradeOpen: numeric, 526 unique values, 0 missing
MSinceMostRecentTradeOpen: numeric, 111 unique values, 0 missing
AverageMInFile: numeric, 237 unique values, 0 missing
NumSatisfactoryTrades: numeric, 74 unique values, 0 missing
NumTrades60Ever2DerogPubRec: numeric, 19 unique values, 0 missing
NumTrades90Ever2DerogPubRec: numeric, 17 unique values, 0 missing
PercentTradesNeverDelq: numeric, 72 unique values, 0 missing
MSinceMostRecentDelq: numeric, 87 unique values, 0 missing
MaxDelq2PublicRecLast12M: numeric, 10 unique values, 0 missing
NumTotalTrades: numeric, 88 unique values, 0 missing
NumTradesOpeninLast12M: numeric, 19 unique values, 0 missing
PercentInstallTrades: numeric, 95 unique values, 0 missing
MSinceMostRecentInqexcl7days: numeric, 28 unique values, 0 missing
NumInqLast6M: numeric, 26 unique values, 0 missing
NumInqLast6Mexcl7days: numeric, 26 unique values, 0 missing
NetFractionRevolvingBurden: numeric, 127 unique values, 0 missing
NetFractionInstallBurden: numeric, 139 unique values, 0 missing
NumRevolvingTradesWBalance: numeric, 31 unique values, 0 missing
NumInstallTradesWBalance: numeric, 20 unique values, 0 missing
NumBank2NatlTradesWHighUtilization: numeric, 19 unique values, 0 missing
PercentTradesWBalance: numeric, 94 unique values, 0 missing
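
When working with a local copy of this dataset, it can help to confirm that the columns match the feature list above. The following is a minimal sketch, not part of the OpenML page: the column names are copied from the list, while the helper name `check_columns` is hypothetical and introduced only for illustration.

```python
# Column names copied from the feature list above.
TARGET = "RiskPerformance"
NUMERIC_FEATURES = [
    "ExternalRiskEstimate",
    "MSinceOldestTradeOpen",
    "MSinceMostRecentTradeOpen",
    "AverageMInFile",
    "NumSatisfactoryTrades",
    "NumTrades60Ever2DerogPubRec",
    "NumTrades90Ever2DerogPubRec",
    "PercentTradesNeverDelq",
    "MSinceMostRecentDelq",
    "MaxDelq2PublicRecLast12M",
    "NumTotalTrades",
    "NumTradesOpeninLast12M",
    "PercentInstallTrades",
    "MSinceMostRecentInqexcl7days",
    "NumInqLast6M",
    "NumInqLast6Mexcl7days",
    "NetFractionRevolvingBurden",
    "NetFractionInstallBurden",
    "NumRevolvingTradesWBalance",
    "NumInstallTradesWBalance",
    "NumBank2NatlTradesWHighUtilization",
    "PercentTradesWBalance",
]


def check_columns(columns):
    """Hypothetical helper: compare column names against the listed schema.

    Returns a pair (missing, unexpected), each a sorted list of names.
    """
    expected = set(NUMERIC_FEATURES) | {TARGET}
    present = set(columns)
    return sorted(expected - present), sorted(present - expected)
```

For example, passing the header row of a downloaded file to `check_columns` returns two empty lists when all 23 listed columns are present and no others.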

19 properties

Number of instances (rows) of the dataset: 10000
Number of attributes (columns) of the dataset: 23
Number of distinct values of the target attribute (if it is nominal): 2
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 22
Number of nominal attributes: 1
Percentage of binary attributes: 4.35
Percentage of instances having missing values: 0
Percentage of missing values: 0
Average class difference between consecutive instances: 1
Percentage of numeric attributes: 95.65
Number of attributes divided by the number of instances: 0
Percentage of nominal attributes: 4.35
Percentage of instances belonging to the most frequent class: 50
Number of instances belonging to the most frequent class: 5000
Percentage of instances belonging to the least frequent class: 50
Number of instances belonging to the least frequent class: 5000
Number of binary attributes: 1
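
Several of the properties above are simple ratios of the listed counts. The sketch below recomputes them as a sanity check; the variable and function names are illustrative assumptions, and the two-decimal rounding is inferred from the values shown, not documented by the page.

```python
# Counts taken from the properties listed above.
n_rows = 10000
n_cols = 23
n_numeric = 22
n_nominal = 1
n_binary = 1
n_majority = 5000
n_minority = 5000


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals (assumed rounding)."""
    return round(100 * part / whole, 2)


derived = {
    "pct_numeric": pct(n_numeric, n_cols),    # 22 / 23
    "pct_nominal": pct(n_nominal, n_cols),    # 1 / 23
    "pct_binary": pct(n_binary, n_cols),      # 1 / 23
    "pct_majority": pct(n_majority, n_rows),  # 5000 / 10000
    "pct_minority": pct(n_minority, n_rows),  # 5000 / 10000
    # Attributes per instance; the page displays this value as 0.
    "dimensionality": n_cols / n_rows,
}
```

The attributes-to-instances ratio works out to 0.0023, which is consistent with the listed value of 0 if that figure is rounded for display.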

2 tasks

0 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: RiskPerformance
0 runs - estimation_procedure: 4-fold Crossvalidation - target_feature: RiskPerformance
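
To give a sense of what the two estimation procedures imply for this dataset, the sketch below computes the test-fold sizes for k-fold cross-validation on 10000 rows, together with a plain accuracy function. It is an illustration only: how OpenML actually generates the splits (shuffling, stratification, repeats) is not stated on this page, and the function names are hypothetical.

```python
def fold_sizes(n_rows, k):
    """Sizes of the k test folds when n_rows are split as evenly as possible."""
    base, extra = divmod(n_rows, k)
    # The first `extra` folds receive one additional row each.
    return [base + 1 if i < extra else base for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


ten_fold = fold_sizes(10000, 10)   # ten test folds of 1000 rows each
four_fold = fold_sizes(10000, 4)   # four test folds of 2500 rows each
```

Under 10-fold cross-validation each model is trained on 9000 rows and evaluated on 1000; under 4-fold it is trained on 7500 and evaluated on 2500. Because the two classes are balanced (5000 each), a constant prediction would score an accuracy of 0.5 over the full dataset.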