Click

Status: active. Format: ARFF. Publicly available (visibility: public). Uploaded 04-06-2023 by Matthias Feurer.
This dataset is a subset of the [KDDCup 2012 track 2](https://www.kaggle.com/competitions/kddcup2012-track2/) data created by Manu Joseph and Harsh Raj for the paper

> Joseph, M., & Raj, H. (2022).
> GATE: Gated Additive Tree Ensemble for Tabular Classification and Regression.
> arXiv preprint arXiv:2207.08548v4.

We retrieved the data from [Dropbox](https://www.dropbox.com/s/ry6zsr6qtuz8l5z/click_set_1_.pickle?dl=1).

Note: please read the Kaggle dataset description carefully before using this dataset. This dataset mostly contains IDs that should be looked up in other files; those files are not on OpenML and were not used as part of the benchmark in the paper mentioned above.

12 features

| Feature | Type | Unique values | Missing values |
|---|---|---|---|
| target (target) | nominal | 2 | 0 |
| impression | numeric | 872 | 0 |
| url_hash | numeric | 1602 | 0 |
| ad_id | numeric | 1829 | 0 |
| advertiser_id | numeric | 1587 | 0 |
| depth | numeric | 3 | 0 |
| position | numeric | 3 | 0 |
| query_id | numeric | 1074 | 0 |
| keyword_id | numeric | 1787 | 0 |
| title_id | numeric | 1429 | 0 |
| description_id | numeric | 1564 | 0 |
| user_id | numeric | 40 | 0 |

19 properties

| Property | Value |
|---|---|
| Number of instances (rows) of the dataset. | 1000000 |
| Number of attributes (columns) of the dataset. | 12 |
| Number of distinct values of the target attribute (if it is nominal). | 2 |
| Number of missing values in the dataset. | 0 |
| Number of instances with at least one value missing. | 0 |
| Number of numeric attributes. | 11 |
| Number of nominal attributes. | 1 |
| Percentage of binary attributes. | 8.33 |
| Percentage of instances having missing values. | 0 |
| Average class difference between consecutive instances. | 0.5 |
| Percentage of missing values. | 0 |
| Number of attributes divided by the number of instances. | 0 |
| Percentage of numeric attributes. | 91.67 |
| Percentage of instances belonging to the most frequent class. | 50 |
| Percentage of nominal attributes. | 8.33 |
| Number of instances belonging to the most frequent class. | 500000 |
| Percentage of instances belonging to the least frequent class. | 50 |
| Number of instances belonging to the least frequent class. | 500000 |
| Number of binary attributes. | 1 |
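
The percentage and ratio properties above can be re-derived from the raw counts. The sketch below does that arithmetic as a sanity check. The variable names and the `pct` helper are illustrative and not part of OpenML, and rounding to two decimals is an assumption about how the page displays values.

```python
# Raw counts copied from the properties listed above.
n_instances = 1_000_000
n_attributes = 12
n_numeric = 11
n_nominal = 1
n_binary = 1
majority_class_size = 500_000
minority_class_size = 500_000


def pct(part, whole):
    """Percentage rounded to two decimals (assumed display precision)."""
    return round(100 * part / whole, 2)


pct_numeric = pct(n_numeric, n_attributes)
pct_nominal = pct(n_nominal, n_attributes)
pct_binary = pct(n_binary, n_attributes)
pct_majority = pct(majority_class_size, n_instances)
pct_minority = pct(minority_class_size, n_instances)

# Attributes divided by instances; far below two-decimal precision,
# which is presumably why the page lists it as 0.
dimensionality = n_attributes / n_instances

print(pct_numeric, pct_nominal, pct_binary, pct_majority, pct_minority)
print(dimensionality)
```

The only nominal attribute is the binary `target`, so the nominal and binary percentages coincide, and the two classes are exactly balanced.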

1 task

0 runs - estimation_procedure: 4-fold Crossvalidation - target_feature: target
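
To illustrate what a 4-fold cross-validation over the `target` column involves, here is a minimal standard-library sketch of a stratified split. Whether OpenML stratifies the folds for this task is an assumption here, `stratified_k_fold` is a hypothetical helper rather than an OpenML function, and the labels are a toy stand-in for the real, balanced target.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=4, seed=0):
    """Return k lists of row indices with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's rows round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Toy stand-in for the balanced binary target (not the real data).
labels = [0, 1] * 20
folds = stratified_k_fold(labels, k=4)

for i, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [j for j in range(len(labels)) if j not in held_out]
    positives = sum(labels[j] for j in test_idx)
    print(f"fold {i}: train={len(train_idx)} test={len(test_idx)} "
          f"positives_in_test={positives}")
```

Each fold serves once as the test set while the remaining three form the training set. On the full dataset of 1000000 rows this would give 250000 test rows per fold.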