{ "data_id": "1552", "name": "autoUniv-au7-1100", "exact_name": "autoUniv-au7-1100", "version": 1, "version_label": null, "description": "**Author**: Ray. J. Hickey \n**Source**: UCI \n**Please cite**: \n\n* Dataset Title: \n\nAutoUniv Dataset \ndata problem: autoUniv-au7-300-drift-au7-cpd1-800 \n\n* Abstract: \n\nAutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity of real data. Data can be generated in .csv, ARFF or C4.5 formats.\n\n* Source: \n\nAutoUniv was developed by Ray. J. Hickey. Email: ray.j.hickey '@' gmail.com \nAutoUniv web-site: http:\/\/sites.google.com\/site\/autouniv\/.\n\n\n* Data Set Information:\n\nThe user first creates a classification model and then generates classified examples from it. To create a model, the following are specified: the number of attributes (up to 1000) and their type (discrete or continuous), the number of classes (up to 10), the complexity of the underlying rules and the noise level. AutoUniv then produces a model through a process of constrained randomised search to satisfy the user's requirements. A model can have up to 3000 rules. Rare class models can be designed. A sequence of models can be designed to reflect concept and\/or population drift. \n\nAutoUniv creates three text files for a model: a Prolog specification of the model used to generate examples (.aupl); a user-friendly statement of the classification rules in an 'if ... then' format (.aurules); a statistical summary of the main properties of the model, including its Bayes rate (.auprops).\n\n\n* Attribute Information: \n\nAttributes may be discrete with up to 10 values or continuous. A discrete attribute can be nominal with values v1, v2, v3, ... or integer with values 0, 1, 2, ... .\n\n\n* Relevant Papers:\n\nMarrs, G, Hickey, RJ and Black, MM (2010) Modeling the example life-cycle in an online classification learner. 
In Proceedings of HaCDAIS 2010: International Workshop on Handling Concept Drift in Adaptive Information Systems. \n\nMarrs, G, Hickey, RJ and Black, MM (2010) The Impact of Latency on Online Classification Learning with Concept Drift. In Y. Bi and M.A. Williams (Eds.): KSEM 2010, LNAI 6291, Springer-Verlag, Berlin, pp. 459\u2013469. \n\nHickey, RJ (2007) Structure and Majority Classes in Decision Tree Learning. Journal of Machine Learning Research, 8, pp. 1747-1768.", "format": "ARFF", "uploader": "Rafael Gomes Mantovani", "uploader_id": 64, "visibility": "public", "creator": null, "contributor": null, "date": "2015-06-01 19:52:43", "update_comment": null, "last_update": "2015-10-08 17:31:38", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/1593748\/phpmRPvKy", "kaggle_url": null, "default_target_attribute": "Class", "row_id_attribute": null, "ignore_attribute": null, "runs": 7130, "suggest": { "input": [ "autoUniv-au7-1100", "* Dataset Title: AutoUniv Dataset data problem: autoUniv-au7-300-drift-au7-cpd1-800 * Abstract: AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity of real data. Data can be generated in .csv, ARFF or C4.5 formats. * Source: AutoUniv was developed by Ray. J. Hickey. Email: ray.j.hickey '@' gmail.com AutoUniv web-site: http:\/\/sites.google.com\/site\/autouniv\/. 
* Data Set Information: The user first creates a classification model and " ], "weight": 5 }, "qualities": { "NumberOfInstances": 1100, "NumberOfFeatures": 13, "NumberOfClasses": 5, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 8, "NumberOfSymbolicFeatures": 5, "REPTreeDepth1Kappa": 0.243108866121908, "CfsSubsetEval_kNN1NAUC": 0.5971464324573859, "StdvNominalAttDistinctValues": 1.224744871391589, "J48.001.ErrRate": 0.6627272727272727, "MeanNominalAttDistinctValues": 3, "Quartile1SkewnessOfNumericAtts": 0.022409155794571107, "REPTreeDepth2AUC": 0.6764108455897102, "CfsSubsetEval_kNN1NErrRate": 0.6918181818181818, "kNN1NAUC": 0.5574875901922082, "J48.001.Kappa": 0.1626875443115692, "MeanSkewnessOfNumericAtts": 0.06283940906721638, "Quartile1StdDevOfNumericAtts": 0.5467012450243667, "REPTreeDepth2ErrRate": 0.5918181818181818, "CfsSubsetEval_kNN1NKappa": 0.09720155129051321, "kNN1NErrRate": 0.6963636363636364, "MajorityClassPercentage": 27.727272727272727, "MeanStdDevOfNumericAtts": 210.2407196518699, "Quartile2AttributeEntropy": 1.0705394909785793, "REPTreeDepth2Kappa": 0.243108866121908, "ClassEntropy": 2.2806595148216537, "kNN1NKappa": 0.1158864481740107, "MajorityClassSize": 305, "MinAttributeEntropy": 0.9630169713822223, "Quartile2KurtosisOfNumericAtts": -1.4151462860394408, "REPTreeDepth3AUC": 0.6764108455897102, "DecisionStumpAUC": 0.5850641919489404, "MaxAttributeEntropy": 1.5350206910714235, "MinKurtosisOfNumericAtts": -1.5743783110977574, "Quartile2MeansOfNumericAtts": 1.1008954545454546, "REPTreeDepth3ErrRate": 0.5918181818181818, "DecisionStumpErrRate": 0.6981818181818182, "MaxKurtosisOfNumericAtts": -0.9965901676449538, "MinMeansOfNumericAtts": 0.38274545454545456, "Quartile2MutualInformation": 0.059751521844434996, "Quartile2SkewnessOfNumericAtts": 0.07342156907982962, "REPTreeDepth3Kappa": 0.243108866121908, "DecisionStumpKappa": 0.06374159800069822, "MaxMeansOfNumericAtts": 5201.511818181818, 
"MinMutualInformation": 0.00801173999805, "Quartile2StdDevOfNumericAtts": 0.8040745368247751, "RandomTreeDepth1AUC": 0.5654642403309165, "Dimensionality": 0.011818181818181818, "MaxMutualInformation": 0.07801867345953, "MinNominalAttDistinctValues": 2, "PercentageOfBinaryFeatures": 15.384615384615385, "Quartile3AttributeEntropy": 1.4401745275086542, "RandomTreeDepth1ErrRate": 0.6845454545454546, "EquivalentNumberOfAtts": 44.38517303188457, "MaxNominalAttDistinctValues": 5, "MinSkewnessOfNumericAtts": -0.34797451477343355, "PercentageOfInstancesWithMissingValues": 0, "Quartile3KurtosisOfNumericAtts": -1.2745580046680725, "AutoCorrelation": 0.272975432211101, "RandomTreeDepth1Kappa": 0.1320503228483587, "J48.00001.AUC": 0.6021145524009214, "MaxSkewnessOfNumericAtts": 0.3465158278199235, "MinStdDevOfNumericAtts": 0.16796135850308763, "PercentageOfMissingValues": 0, "Quartile3MeansOfNumericAtts": 2462.5751727272727, "CfsSubsetEval_DecisionStumpAUC": 0.5971464324573859, "RandomTreeDepth2AUC": 0.5654642403309165, "J48.00001.ErrRate": 0.6627272727272727, "MaxStdDevOfNumericAtts": 1489.7474301651541, "MinorityClassPercentage": 13.90909090909091, "PercentageOfNumericFeatures": 61.53846153846154, "Quartile3MutualInformation": 0.0770084694897225, "CfsSubsetEval_DecisionStumpErrRate": 0.6918181818181818, "RandomTreeDepth2ErrRate": 0.6845454545454546, "J48.00001.Kappa": 0.1626875443115692, "MeanAttributeEntropy": 1.159779161102701, "MinorityClassSize": 153, "PercentageOfSymbolicFeatures": 38.46153846153847, "Quartile3SkewnessOfNumericAtts": 0.15922212477953182, "CfsSubsetEval_DecisionStumpKappa": 0.09720155129051321, "RandomTreeDepth2Kappa": 0.1320503228483587, "J48.0001.AUC": 0.6021145524009214, "MeanKurtosisOfNumericAtts": -1.380730650924698, "NaiveBayesAUC": 0.6543544034115975, "Quartile1AttributeEntropy": 0.9686234648208698, "Quartile3StdDevOfNumericAtts": 140.7060187004606, "CfsSubsetEval_NaiveBayesAUC": 0.5971464324573859, "RandomTreeDepth3AUC": 0.5654642403309165, 
"J48.0001.ErrRate": 0.6627272727272727, "MeanMeansOfNumericAtts": 1061.5329261363636, "NaiveBayesErrRate": 0.6627272727272727, "Quartile1KurtosisOfNumericAtts": -1.5407605075197877, "REPTreeDepth1AUC": 0.6764108455897102, "CfsSubsetEval_NaiveBayesErrRate": 0.6918181818181818, "RandomTreeDepth3ErrRate": 0.6845454545454546, "J48.0001.Kappa": 0.1626875443115692, "MeanMutualInformation": 0.0513833642866125, "NaiveBayesKappa": 0.14722787912946844, "Quartile1MeansOfNumericAtts": 0.9206818181818186, "REPTreeDepth1ErrRate": 0.5918181818181818, "CfsSubsetEval_NaiveBayesKappa": 0.09720155129051321, "RandomTreeDepth3Kappa": 0.1320503228483587, "J48.001.AUC": 0.6021145524009214, "MeanNoiseToSignalRatio": 21.57110209120487, "NumberOfBinaryFeatures": 2, "Quartile1MutualInformation": 0.01739010152568 }, "tags": [ { "uploader": "2", "tag": "artificial" }, { "uploader": "38960", "tag": "Data Science" }, { "uploader": "38960", "tag": "Information Technology" }, { "uploader": "38960", "tag": "Machine Learning" }, { "uploader": "5824", "tag": "study_144" }, { "uploader": "1", "tag": "study_34" }, { "uploader": "64", "tag": "study_50" }, { "uploader": "64", "tag": "study_52" }, { "uploader": "64", "tag": "study_7" } ], "features": [ { "name": "Class", "index": "12", "type": "nominal", "distinct": "5", "missing": "0", "target": "1", "distr": [ [ "class1", "class2", "class3", "class4", "class5" ], [ [ "153", "0", "0", "0", "0" ], [ "0", "246", "0", "0", "0" ], [ "0", "0", "216", "0", "0" ], [ "0", "0", "0", "180", "0" ], [ "0", "0", "0", "0", "305" ] ] ] }, { "name": "V1", "index": "0", "type": "numeric", "distinct": "423", "missing": "0", "min": "1", "max": "8", "mean": "4", "stdev": "2" }, { "name": "V2", "index": "1", "type": "numeric", "distinct": "360", "missing": "0", "min": "3069", "max": "3625", "mean": "3282", "stdev": "187" }, { "name": "V3", "index": "2", "type": "numeric", "distinct": "55", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "V4", 
"index": "3", "type": "numeric", "distinct": "3", "missing": "0", "min": "0", "max": "2", "mean": "1", "stdev": "1" }, { "name": "V5", "index": "4", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "v1", "v2" ], [ [ "74", "136", "117", "118", "183" ], [ "79", "110", "99", "62", "122" ] ] ] }, { "name": "V6", "index": "5", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "v1", "v2" ], [ [ "110", "157", "141", "61", "205" ], [ "43", "89", "75", "119", "100" ] ] ] }, { "name": "V7", "index": "6", "type": "numeric", "distinct": "843", "missing": "0", "min": "2996", "max": "7362", "mean": "5202", "stdev": "1490" }, { "name": "V8", "index": "7", "type": "nominal", "distinct": "3", "missing": "0", "distr": [ [ "v1", "v2", "v3" ], [ [ "76", "158", "131", "157", "260" ], [ "39", "45", "43", "12", "31" ], [ "38", "43", "42", "11", "14" ] ] ] }, { "name": "V9", "index": "8", "type": "numeric", "distinct": "3", "missing": "0", "min": "0", "max": "2", "mean": "1", "stdev": "1" }, { "name": "V10", "index": "9", "type": "nominal", "distinct": "3", "missing": "0", "distr": [ [ "v1", "v2", "v3" ], [ [ "89", "132", "117", "47", "121" ], [ "44", "41", "28", "94", "81" ], [ "20", "73", "71", "39", "103" ] ] ] }, { "name": "V11", "index": "10", "type": "numeric", "distinct": "153", "missing": "0", "min": "1", "max": "2", "mean": "1", "stdev": "0" }, { "name": "V12", "index": "11", "type": "numeric", "distinct": "3", "missing": "0", "min": "0", "max": "2", "mean": "1", "stdev": "1" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }