{ "data_id": "44158", "name": "KDDCup09_upselling", "exact_name": "KDDCup09_upselling", "version": 4, "version_label": null, "description": "Dataset used in the tabular data benchmark https:\/\/github.com\/LeoGrin\/tabular-benchmark, \n transformed in the same way. This dataset belongs to the \"classification on categorical and\n numerical features\" benchmark. Original description: \n \n**Author**: \n**Source**: Unknown - Date unknown \n**Please cite**: \n\nDatasets from ACM KDD Cup (http:\/\/www.sigkdd.org\/kddcup\/index.php)\n\nKDD Cup 2009\nhttp:\/\/www.kddcup-orange.com\n\nConverted to ARFF format by TunedIT\nCustomer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large marketing databases from the French Telecom company Orange to predict the propensity of customers to switch provider (churn), buy new products or services (appetency), or buy upgrades or add-ons proposed to them to make the sale more profitable (up-selling).\nThe most practical way, in a CRM system, to build knowledge on customer is to produce scores. A score (the output of a model) is an evaluation for all instances of a target variable to explain (i.e. churn, appetency or up-selling). Tools which produce scores allow to project, on a given population, quantifiable information. The score is computed using input variables which describe instances. Scores are then used by the information system (IS), for example, to personalize the customer relationship. An industrial customer analysis platform able to build prediction models with a very large number of input variables has been developed by Orange Labs. This platform implements several processing methods for instances and variables selection, prediction and indexation based on an efficient model combined with variable selection regularization and model averaging method. The main characteristic of this platform is its ability to scale on very large datasets with hundreds of thousands of instances and thousands of variables. The rapid and robust detection of the variables that have most contributed to the output prediction can be a key factor in a marketing application.\nUp-selling (wikipedia definition): Up-selling is a sales technique whereby a salesman attempts to have the customer purchase more expensive\nitems, upgrades, or other add-ons in an attempt to make a more profitable sale.\nUp-selling usually involves marketing more profitable services or products, but up-selling can also be simply exposing the customer\nto other options he or she may not have considered previously.\nUp-selling can imply selling something additional, or selling something that is more profitable or otherwise preferable for the seller instead of the original sale.\nThe training set contains 50,000 examples.\nThe first predictive 190 variables are numerical and the last 40 predictive variables are categorical.\nThe last target variable is binary {-1,1}.", "format": "arff", "uploader": "Leo Grin", "uploader_id": 26324, "visibility": "public", "creator": null, "contributor": "\"Leo Grin\"", "date": "2022-07-10 10:35:25", "update_comment": null, "last_update": "2022-07-10 10:35:25", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/api.openml.org\/data\/download\/22103283\/dataset", "kaggle_url": null, "default_target_attribute": "UPSELLING", "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "KDDCup09_upselling", "Dataset used in the tabular data benchmark https:\/\/github.com\/LeoGrin\/tabular-benchmark, transformed in the same way. This dataset belongs to the \"classification on categorical and numerical features\" benchmark. Original description: Datasets from ACM KDD Cup (http:\/\/www.sigkdd.org\/kddcup\/index.php) KDD Cup 2009 http:\/\/www.kddcup-orange.com Converted to ARFF format by TunedIT Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the oppor " ], "weight": 5 }, "qualities": { "NumberOfInstances": 5032, "NumberOfFeatures": 46, "NumberOfClasses": 2, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 34, "NumberOfSymbolicFeatures": 12, "PercentageOfBinaryFeatures": 8.695652173913043, "PercentageOfInstancesWithMissingValues": 0, "PercentageOfMissingValues": 0, "AutoCorrelation": 0.9998012323593719, "PercentageOfNumericFeatures": 73.91304347826086, "Dimensionality": 0.009141494435612083, "PercentageOfSymbolicFeatures": 26.08695652173913, "MajorityClassPercentage": 50, "MajorityClassSize": 2516, "MinorityClassPercentage": 50, "MinorityClassSize": 2516, "NumberOfBinaryFeatures": 4 }, "tags": [], "features": [ { "name": "UPSELLING", "index": "45", "type": "nominal", "distinct": "2", "missing": "0", "target": "1", "distr": [ [ "-1", "1" ], [ [ "2516", "0" ], [ "0", "2516" ] ] ] }, { "name": "Var6", "index": "0", "type": "numeric", "distinct": "707", "missing": "0", "min": "0", "max": "68439", "mean": "1347", "stdev": "2230" }, { "name": "Var13", "index": "1", "type": "numeric", "distinct": "1137", "missing": "0", "min": "0", "max": "41688", "mean": "1154", "stdev": "2390" }, { "name": "Var21", "index": "2", "type": "numeric", "distinct": "321", "missing": "0", "min": "4", "max": "11880", "mean": "240", "stdev": "416" }, { "name": "Var22", "index": "3", "type": "numeric", "distinct": "321", "missing": "0", "min": "5", "max": "14850", "mean": "300", "stdev": "520" }, { "name": "Var24", "index": "4", "type": "numeric", "distinct": "46", "missing": "0", "min": "0", "max": "188", "mean": "5", "stdev": "9" }, { "name": "Var25", "index": "5", "type": "numeric", "distinct": "130", "missing": "0", "min": "0", "max": "3560", "mean": "106", "stdev": "180" }, { "name": "Var28", "index": "6", "type": "numeric", "distinct": "944", "missing": "0", "min": "0", "max": "2034", "mean": "213", "stdev": "81" }, { "name": "Var35", "index": "7", "type": "numeric", "distinct": "9", "missing": "0", "min": "0", "max": "45", "mean": "1", "stdev": "3" }, { "name": "Var38", "index": "8", "type": "numeric", "distinct": "3878", "missing": "0", "min": "0", "max": "17209020", "mean": "2487407", "stdev": "3006874" }, { "name": "Var57", "index": "9", "type": "numeric", "distinct": "4658", "missing": "0", "min": "0", "max": "7", "mean": "4", "stdev": "2" }, { "name": "Var65", "index": "10", "type": "numeric", "distinct": "11", "missing": "0", "min": "9", "max": "108", "mean": "15", "stdev": "10" }, { "name": "Var73", "index": "11", "type": "numeric", "distinct": "113", "missing": "0", "min": "12", "max": "264", "mean": "71", "stdev": "50" }, { "name": "Var74", "index": "12", "type": "numeric", "distinct": "187", "missing": "0", "min": "0", "max": "9919", "mean": "100", "stdev": "301" }, { "name": "Var76", "index": "13", "type": "numeric", "distinct": "3903", "missing": "0", "min": "0", "max": "19353600", "mean": "1506668", "stdev": "1885648" }, { "name": "Var78", "index": "14", "type": "numeric", "distinct": "8", "missing": "0", "min": "0", "max": "21", "mean": "0", "stdev": "2" }, { "name": "Var81", "index": "15", "type": "numeric", "distinct": "5019", "missing": "0", "min": "609", "max": "1326300", "mean": "98038", "stdev": "102780" }, { "name": "Var83", "index": "16", "type": "numeric", "distinct": "62", "missing": "0", "min": "0", "max": "2585", "mean": "18", "stdev": "63" }, { "name": "Var85", "index": "17", "type": "numeric", "distinct": "64", "missing": "0", "min": "0", "max": "414", "mean": "9", "stdev": "17" }, { "name": "Var109", "index": "18", "type": "numeric", "distinct": "87", "missing": "0", "min": "0", "max": "3112", "mean": "61", "stdev": "109" }, { "name": "Var112", "index": "19", "type": "numeric", "distinct": "100", "missing": "0", "min": "0", "max": "5704", "mean": "72", "stdev": "143" }, { "name": "Var113", "index": "20", "type": "numeric", "distinct": "5019", "missing": "0", "min": "-2574396", "max": "9537640", "mean": "-2564", "stdev": "487313" }, { "name": "Var119", "index": "21", "type": "numeric", "distinct": "647", "missing": "0", "min": "5", "max": "50685", "mean": "910", "stdev": "1620" }, { "name": "Var123", "index": "22", "type": "numeric", "distinct": "102", "missing": "0", "min": "0", "max": "6954", "mean": "57", "stdev": "174" }, { "name": "Var125", "index": "23", "type": "numeric", "distinct": "2673", "missing": "0", "min": "0", "max": "3583998", "mean": "26356", "stdev": "83745" }, { "name": "Var126", "index": "24", "type": "numeric", "distinct": "51", "missing": "0", "min": "-32", "max": "68", "mean": "-5", "stdev": "25" }, { "name": "Var132", "index": "25", "type": "numeric", "distinct": "13", "missing": "0", "min": "0", "max": "112", "mean": "3", "stdev": "9" }, { "name": "Var133", "index": "26", "type": "numeric", "distinct": "4648", "missing": "0", "min": "0", "max": "14207100", "mean": "2238552", "stdev": "2432625" }, { "name": "Var134", "index": "27", "type": "numeric", "distinct": "4238", "missing": "0", "min": "0", "max": "5735340", "mean": "426911", "stdev": "588444" }, { "name": "Var140", "index": "28", "type": "numeric", "distinct": "1065", "missing": "0", "min": "0", "max": "124445", "mean": "1341", "stdev": "3397" }, { "name": "Var144", "index": "29", "type": "numeric", "distinct": "8", "missing": "0", "min": "0", "max": "63", "mean": "11", "stdev": "11" }, { "name": "Var149", "index": "30", "type": "numeric", "distinct": "2703", "missing": "0", "min": "0", "max": "16934400", "mean": "297071", "stdev": "703593" }, { "name": "Var153", "index": "31", "type": "numeric", "distinct": "4917", "missing": "0", "min": "468", "max": "13167720", "mean": "6052288", "stdev": "4267883" }, { "name": "Var160", "index": "32", "type": "numeric", "distinct": "166", "missing": "0", "min": "0", "max": "1276", "mean": "40", "stdev": "73" }, { "name": "Var163", "index": "33", "type": "numeric", "distinct": "3179", "missing": "0", "min": "0", "max": "10886400", "mean": "482087", "stdev": "846726" }, { "name": "Var196", "index": "34", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "0", "3" ], [ [ "2506", "2499" ], [ "10", "17" ] ] ] }, { "name": "Var203", "index": "35", "type": "nominal", "distinct": "4", "missing": "0", "distr": [ [ "0", "1", "2", "5" ], [ [ "2330", "2302" ], [ "65", "51" ], [ "115", "160" ], [ "6", "3" ] ] ] }, { "name": "Var205", "index": "36", "type": "nominal", "distinct": "4", "missing": "0", "distr": [ [ "0", "1", "2", "3" ], [ [ "618", "635" ], [ "1580", "1606" ], [ "236", "177" ], [ "82", "98" ] ] ] }, { "name": "Var207", "index": "37", "type": "nominal", "distinct": "11", "missing": "0", "distr": [ [ "1", "10", "13", "2", "3", "4", "5", "6", "7", "8", "9" ], [ [ "1", "2" ], [ "1774", "1644" ], [ "0", "1" ], [ "6", "5" ], [ "3", "5" ], [ "324", "401" ], [ "175", "205" ], [ "3", "4" ], [ "53", "66" ], [ "108", "113" ], [ "69", "70" ] ] ] }, { "name": "Var208", "index": "38", "type": "nominal", "distinct": "3", "missing": "0", "distr": [ [ "0", "1", "2" ], [ [ "2370", "2333" ], [ "140", "180" ], [ "6", "3" ] ] ] }, { "name": "Var210", "index": "39", "type": "nominal", "distinct": "5", "missing": "0", "distr": [ [ "0", "1", "2", "3", "5" ], [ [ "4", "1" ], [ "25", "0" ], [ "9", "7" ], [ "59", "6" ], [ "2419", "2502" ] ] ] }, { "name": "Var211", "index": "40", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "0", "1" ], [ [ "1915", "2496" ], [ "601", "20" ] ] ] }, { "name": "Var218", "index": "41", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "UYBR", "cJvF" ], [ [ "1303", "930" ], [ "1213", "1586" ] ] ] }, { "name": "Var221", "index": "42", "type": "nominal", "distinct": "7", "missing": "0", "distr": [ [ "0", "1", "2", "3", "4", "5", "6" ], [ [ "89", "105" ], [ "7", "8" ], [ "88", "108" ], [ "160", "180" ], [ "1856", "1748" ], [ "7", "8" ], [ "309", "359" ] ] ] }, { "name": "Var223", "index": "43", "type": "nominal", "distinct": "5", "missing": "0", "distr": [ [ "0", "1", "2", "3", "4" ], [ [ "1932", "1997" ], [ "76", "92" ], [ "8", "4" ], [ "272", "269" ], [ "228", "154" ] ] ] }, { "name": "Var227", "index": "44", "type": "nominal", "distinct": "7", "missing": "0", "distr": [ [ "0", "1", "2", "3", "4", "5", "6" ], [ [ "102", "121" ], [ "188", "193" ], [ "1779", "1650" ], [ "290", "349" ], [ "130", "156" ], [ "0", "2" ], [ "27", "45" ] ] ] } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }