{ "data_id": "1459", "name": "artificial-characters", "exact_name": "artificial-characters", "version": 1, "version_label": null, "description": "**Author**: H. Altay Guvenir, Burak Acar, Haldun Muderrisoglu \r\n**Source**: [UCI](https:\/\/archive.ics.uci.edu\/ml\/datasets\/Artificial+Characters) - 1992 \r\n**Please cite**: [UCI](https:\/\/archive.ics.uci.edu\/ml\/citation_policy.html) \r\n\r\nThis database has been artificially generated. It describes the structure of the capital letters A, C, D, E, F, G, H, L, P, R, indicated by a number 1-10, in that order (A=1,C=2,...). Each letter's structure is described by a set of segments (lines) which resemble the way an automatic program would segment an image. The dataset consists of 600 such descriptions per letter. \r\n\r\nOriginally, each 'instance' (letter) was stored in a separate file, each consisting of between 1 and 7 segments, numbered 0,1,2,3,... Here the files are merged, so the first 5 instances describe the first 5 segments of the first segmentation of the first letter (A), the next 7 instances describe another segmentation (also of the letter A), and so on. The training set (100 examples) and the test set (the rest) are merged as well.\r\n\r\n### Attribute Information \r\n\r\n* V1: object number, the number of the segment (0,1,2,...,7) \r\n* V2-V5: the initial and final coordinates of a segment in a Cartesian plane (XX1,YY1,XX2,YY2). \r\n* V6: size, the length of a segment, computed as the geometric distance between its two end points A(X1,Y1) and B(X2,Y2).\r\n* V7: diagonal, the length of the diagonal of the smallest rectangle which includes the picture of the character. The value of this attribute is the same for every object (segment) of a character.\r\n\r\n### Relevant Papers \r\n\r\nM. Botta, A. Giordana, L. Saitta: \"Learning Fuzzy Concept Definitions\", IEEE-Fuzzy Conference, 1993. \r\nM. Botta, A. Giordana: \"Learning Quantitative Feature in a Symbolic Environment\", LNAI 542, 1991, pp. 296-305. 
", "format": "ARFF", "uploader": "Rafael Gomes Mantovani", "uploader_id": 64, "visibility": "public", "creator": null, "contributor": null, "date": "2015-05-21 20:58:53", "update_comment": null, "last_update": "2015-05-21 20:58:53", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/1586212\/phpPQrHPH", "default_target_attribute": "Class", "row_id_attribute": null, "ignore_attribute": null, "runs": 24762, "suggest": { "input": [ "artificial-characters", "This database has been artificially generated. It describes the structure of the capital letters A, C, D, E, F, G, H, L, P, R, indicated by a number 1-10, in that order (A=1,C=2,...). Each letter's structure is described by a set of segments (lines) which resemble the way an automatic program would segment an image. The dataset consists of 600 such descriptions per letter. Originally, each 'instance' (letter) was stored in a separate file, each consisting of between 1 and 7 segments, numbered 0, " ], "weight": 5 }, "qualities": { "NumberOfInstances": 10218, "NumberOfFeatures": 8, "NumberOfClasses": 10, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 7, "NumberOfSymbolicFeatures": 1, "REPTreeDepth3Kappa": 0.604785222133906, "DecisionStumpKappa": 0.07636718614547375, "MaxMeansOfNumericAtts": 40.46352319436287, "MinMutualInformation": null, "Quartile2SkewnessOfNumericAtts": 0.7106609244936721, "RandomTreeDepth1AUC": 0.8822326118159725, "Dimensionality": 0.0007829320806420043, "MaxMutualInformation": null, "MinNominalAttDistinctValues": 10, "PercentageOfBinaryFeatures": 0, "Quartile2StdDevOfNumericAtts": 9.730990564339642, "RandomTreeDepth1ErrRate": 0.21403405754550792, "EquivalentNumberOfAtts": null, "MaxNominalAttDistinctValues": 10, "MinSkewnessOfNumericAtts": 0.19900180105948925, "PercentageOfInstancesWithMissingValues": 0, "Quartile3AttributeEntropy": null, "RandomTreeDepth1Kappa": 0.7608496850207439, 
"J48.00001.AUC": 0.9265818933387849, "MaxSkewnessOfNumericAtts": 1.4157476325672549, "MinStdDevOfNumericAtts": 1.7196318285176377, "PercentageOfMissingValues": 0, "Quartile3KurtosisOfNumericAtts": 0.25628960456021055, "AutoCorrelation": 0.9982382304003132, "RandomTreeDepth2AUC": 0.8822326118159725, "J48.00001.ErrRate": 0.30407124681933845, "MaxStdDevOfNumericAtts": 14.178974105774868, "MinorityClassPercentage": 5.871990604815033, "PercentageOfNumericFeatures": 87.5, "Quartile3MeansOfNumericAtts": 21.029359953024056, "CfsSubsetEval_DecisionStumpAUC": 0.9059441112542227, "RandomTreeDepth2ErrRate": 0.21403405754550792, "J48.00001.Kappa": 0.660558659944214, "MeanAttributeEntropy": null, "MinorityClassSize": 600, "PercentageOfSymbolicFeatures": 12.5, "Quartile3MutualInformation": null, "CfsSubsetEval_DecisionStumpErrRate": 0.41612840086122527, "RandomTreeDepth2Kappa": 0.7608496850207439, "J48.0001.AUC": 0.9265818933387849, "MeanKurtosisOfNumericAtts": -0.03922173626081931, "NaiveBayesAUC": 0.7677179327332979, "Quartile1AttributeEntropy": null, "Quartile3SkewnessOfNumericAtts": 0.7949329609978085, "CfsSubsetEval_DecisionStumpKappa": 0.5345331746656256, "RandomTreeDepth3AUC": 0.8822326118159725, "J48.0001.ErrRate": 0.30407124681933845, "MeanMeansOfNumericAtts": 15.677500209713987, "NaiveBayesErrRate": 0.7031708749266001, "Quartile1KurtosisOfNumericAtts": -0.5335821934074398, "Quartile3StdDevOfNumericAtts": 13.10011887786818, "CfsSubsetEval_NaiveBayesAUC": 0.9059441112542227, "RandomTreeDepth3ErrRate": 0.21403405754550792, "J48.0001.Kappa": 0.660558659944214, "MeanMutualInformation": null, "NaiveBayesKappa": 0.22343509112823853, "Quartile1MeansOfNumericAtts": 6.061460168330399, "REPTreeDepth1AUC": 0.9396345065655836, "CfsSubsetEval_NaiveBayesErrRate": 0.41612840086122527, "RandomTreeDepth3Kappa": 0.7608496850207439, "J48.001.AUC": 0.9265818933387849, "MeanNoiseToSignalRatio": null, "NumberOfBinaryFeatures": 0, "Quartile1MutualInformation": null, "REPTreeDepth1ErrRate": 
0.3539831669602662, "CfsSubsetEval_NaiveBayesKappa": 0.5345331746656256, "CfsSubsetEval_kNN1NAUC": 0.9059441112542227, "StdvNominalAttDistinctValues": 0, "J48.001.ErrRate": 0.30407124681933845, "MeanNominalAttDistinctValues": 10, "Quartile1SkewnessOfNumericAtts": 0.348123775480626, "REPTreeDepth1Kappa": 0.604785222133906, "CfsSubsetEval_kNN1NErrRate": 0.41612840086122527, "kNN1NAUC": 0.870539643653166, "J48.001.Kappa": 0.660558659944214, "MeanSkewnessOfNumericAtts": 0.6738720824506701, "Quartile1StdDevOfNumericAtts": 7.795727793517785, "REPTreeDepth2AUC": 0.9396345065655836, "CfsSubsetEval_kNN1NKappa": 0.5345331746656256, "kNN1NErrRate": 0.21462125660598944, "MajorityClassPercentage": 13.857897827363477, "MeanStdDevOfNumericAtts": 9.504522285963239, "Quartile2AttributeEntropy": null, "REPTreeDepth2ErrRate": 0.3539831669602662, "ClassEntropy": 3.284826011062924, "kNN1NKappa": 0.7603066926069706, "MajorityClassSize": 1416, "MinAttributeEntropy": null, "Quartile2KurtosisOfNumericAtts": -0.22758096976949904, "REPTreeDepth2Kappa": 0.604785222133906, "REPTreeDepth3AUC": 0.9396345065655836, "DecisionStumpAUC": 0.595130735282096, "MaxAttributeEntropy": null, "MinKurtosisOfNumericAtts": -0.5477487171323054, "Quartile2MeansOfNumericAtts": 15.246036406342014, "REPTreeDepth3ErrRate": 0.3539831669602662, "DecisionStumpErrRate": 0.826580544137796, "MaxKurtosisOfNumericAtts": 1.2503558717877254, "MinMeansOfNumericAtts": 2.2227441769426517, "Quartile2MutualInformation": null }, "tags": [ { "uploader": "2", "tag": "artificial" }, { "uploader": "38960", "tag": "Chemistry" }, { "uploader": "38960", "tag": "Life Science" }, { "uploader": "348", "tag": "OpenML100" }, { "uploader": "3886", "tag": "study_123" }, { "uploader": "5824", "tag": "study_135" }, { "uploader": "64", "tag": "study_14" }, { "uploader": "64", "tag": "study_50" }, { "uploader": "64", "tag": "study_52" }, { "uploader": "2", "tag": "uci" } ], "features": [ { "name": "Class", "index": "7", "type": "nominal", 
"distinct": "10", "missing": "0", "target": "1", "distr": [ [ "1", "2", "3", "4", "5", "6", "7", "8", "9", "10" ], [ [ "1196", "0", "0", "0", "0", "0", "0", "0", "0", "0" ], [ "0", "1192", "0", "0", "0", "0", "0", "0", "0", "0" ], [ "0", "0", "1416", "0", "0", "0", "0", "0", "0", "0" ], [ "0", "0", "0", "808", "0", "0", "0", "0", "0", "0" ], [ "0", "0", "0", "0", "1008", "0", "0", "0", "0", "0" ], [ "0", "0", "0", "0", "0", "1000", "0", "0", "0", "0" ], [ "0", "0", "0", "0", "0", "0", "800", "0", "0", "0" ], [ "0", "0", "0", "0", "0", "0", "0", "1198", "0", "0" ], [ "0", "0", "0", "0", "0", "0", "0", "0", "1000", "0" ], [ "0", "0", "0", "0", "0", "0", "0", "0", "0", "600" ] ] ] }, { "name": "V1", "index": "0", "type": "numeric", "distinct": "8", "missing": "0", "min": "0", "max": "7", "mean": "2", "stdev": "2" }, { "name": "V2", "index": "1", "type": "numeric", "distinct": "45", "missing": "0", "min": "0", "max": "48", "mean": "6", "stdev": "9" }, { "name": "V3", "index": "2", "type": "numeric", "distinct": "63", "missing": "0", "min": "0", "max": "63", "mean": "15", "stdev": "14" }, { "name": "V4", "index": "3", "type": "numeric", "distinct": "48", "missing": "0", "min": "0", "max": "50", "mean": "9", "stdev": "10" }, { "name": "V5", "index": "4", "type": "numeric", "distinct": "66", "missing": "0", "min": "-2", "max": "63", "mean": "21", "stdev": "13" }, { "name": "V6", "index": "5", "type": "numeric", "distinct": "333", "missing": "0", "min": "1", "max": "51", "mean": "15", "stdev": "8" }, { "name": "V7", "index": "6", "type": "numeric", "distinct": "511", "missing": "0", "min": "14", "max": "76", "mean": "40", "stdev": "11" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }