{ "data_id": "1510", "name": "wdbc", "exact_name": "wdbc", "version": 1, "version_label": null, "description": "**Author**: William H. Wolberg, W. Nick Street, Olvi L. Mangasarian \r\n**Source**: [UCI](https:\/\/archive.ics.uci.edu\/ml\/datasets\/breast+cancer+wisconsin+(original)), [University of Wisconsin](http:\/\/pages.cs.wisc.edu\/~olvi\/uwmp\/cancer.html) - 1995 \r\n**Please cite**: [UCI](https:\/\/archive.ics.uci.edu\/ml\/citation_policy.html) \r\n\r\n**Breast Cancer Wisconsin (Diagnostic) Data Set (WDBC).** Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. The target feature records the diagnosis (benign (1) or malignant (2)). [Original data available here](ftp:\/\/ftp.cs.wisc.edu\/math-prog\/cpo-dataset\/machine-learn\/cancer\/) \r\n\r\nCurrent dataset was adapted to ARFF format from the UCI version. Sample code IDs were removed. \r\n\r\n! Note that there is also a related Breast Cancer Wisconsin (Original) Data Set with a different set of features, better known as [breast-w](https:\/\/www.openml.org\/d\/15).\r\n\r\n\r\n### Feature description \r\n\r\nTen real-valued features are computed for each cell nucleus. The mean, standard error, and \"worst\" value (mean of the three largest values) of each feature are then computed per image, yielding a total of 30 descriptive features. See the papers below for more details on how they were computed. The 10 features (in order) are: \r\n\r\na) radius (mean of distances from center to points on the perimeter) \r\nb) texture (standard deviation of gray-scale values) \r\nc) perimeter \r\nd) area \r\ne) smoothness (local variation in radius lengths) \r\nf) compactness (perimeter^2 \/ area - 1.0) \r\ng) concavity (severity of concave portions of the contour) \r\nh) concave points (number of concave portions of the contour) \r\ni) symmetry \r\nj) fractal dimension (\"coastline approximation\" - 1) \r\n\r\n### Relevant Papers \r\n\r\nW.N. Street, W.H. Wolberg and O.L. Mangasarian. 
Nuclear feature extraction for breast tumor diagnosis. IS&T\/SPIE 1993 International Symposium on Electronic Imaging: Science and Technology, volume 1905, pages 861-870, San Jose, CA, 1993. \r\n\r\nO.L. Mangasarian, W.N. Street and W.H. Wolberg. Breast cancer diagnosis and prognosis via linear programming. Operations Research, 43(4), pages 570-577, July-August 1995.", "format": "ARFF", "uploader": "Rafael Gomes Mantovani", "uploader_id": 64, "visibility": "public", "creator": null, "contributor": null, "date": "2015-05-26 16:24:07", "update_comment": null, "last_update": "2015-11-09 20:15:56", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/1592318\/phpAmSP4g", "kaggle_url": null, "default_target_attribute": "Class", "row_id_attribute": null, "ignore_attribute": null, "runs": 226893, "suggest": { "input": [ "wdbc", "Current dataset was adapted to ARFF format from the UCI version. Sample code ID's were removed. ! Note that there is also a related Breast Cancer Wisconsin (Original) Data Set with a different set of features, better known as [breast-w](https:\/\/www.openml.org\/d\/15). ### Feature description Ten real-valued features are computed for each of 3 cell nuclei, yielding a total of 30 descriptive features. See the papers below for more details on how they were computed. 
The 10 features (in order) are: a) " ], "weight": 5 }, "qualities": { "NumberOfInstances": 569, "NumberOfFeatures": 31, "NumberOfClasses": 2, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 30, "NumberOfSymbolicFeatures": 1, "REPTreeDepth1Kappa": 0.7901774129665797, "CfsSubsetEval_kNN1NAUC": 0.9492032662121453, "StdvNominalAttDistinctValues": 0, "J48.001.ErrRate": 0.06854130052724078, "MeanNominalAttDistinctValues": 2, "Quartile1SkewnessOfNumericAtts": 0.9785827119630317, "REPTreeDepth2AUC": 0.9630437080492574, "CfsSubsetEval_kNN1NErrRate": 0.0492091388400703, "kNN1NAUC": 0.9512248295544633, "J48.001.Kappa": 0.8518357781442584, "MeanSkewnessOfNumericAtts": 1.740406628520751, "Quartile1StdDevOfNumericAtts": 0.018022995343089827, "REPTreeDepth2ErrRate": 0.070298769771529, "CfsSubsetEval_kNN1NKappa": 0.8945445399065384, "kNN1NErrRate": 0.043936731107205626, "MajorityClassPercentage": 62.741652021089635, "MeanStdDevOfNumericAtts": 34.904718603211656, "Quartile2AttributeEntropy": null, "REPTreeDepth2Kappa": 0.8519077611784914, "ClassEntropy": 0.9526351224018599, "kNN1NKappa": 0.9052064799450897, "MajorityClassSize": 357, "MinAttributeEntropy": null, "Quartile2KurtosisOfNumericAtts": 3.022590146044739, "REPTreeDepth3AUC": 0.9651709740499974, "DecisionStumpAUC": 0.8721592410549125, "MaxAttributeEntropy": null, "MinKurtosisOfNumericAtts": -0.5355351225188612, "Quartile2MeansOfNumericAtts": 0.21771345342706502, "REPTreeDepth3ErrRate": 0.05272407732864675, "DecisionStumpErrRate": 0.08963093145869948, "MaxKurtosisOfNumericAtts": 49.20907650724138, "MinMeansOfNumericAtts": 0.0037949033391915642, "Quartile2MutualInformation": null, "Quartile2SkewnessOfNumericAtts": 1.4175537695584106, "REPTreeDepth3Kappa": 0.8870120070427198, "DecisionStumpKappa": 0.8012438100586974, "MaxMeansOfNumericAtts": 880.5831282952547, "MinMutualInformation": null, "Quartile2StdDevOfNumericAtts": 0.072726074660982, "RandomTreeDepth1AUC": 
0.8897785529306063, "Dimensionality": 0.054481546572934976, "MaxMutualInformation": null, "MinNominalAttDistinctValues": 2, "PercentageOfBinaryFeatures": 3.225806451612903, "Quartile3AttributeEntropy": null, "RandomTreeDepth1ErrRate": 0.0984182776801406, "EquivalentNumberOfAtts": null, "MaxNominalAttDistinctValues": 2, "MinSkewnessOfNumericAtts": 0.4154259962824675, "PercentageOfInstancesWithMissingValues": 0, "Quartile3KurtosisOfNumericAtts": 5.985908976234704, "AutoCorrelation": 0.625, "RandomTreeDepth1Kappa": 0.7815409507877525, "J48.00001.AUC": 0.9415596427250146, "MaxSkewnessOfNumericAtts": 5.447186284898407, "MinStdDevOfNumericAtts": 0.002646071523977847, "PercentageOfMissingValues": 0, "Quartile3MeansOfNumericAtts": 17.024304481546572, "CfsSubsetEval_DecisionStumpAUC": 0.8721592410549125, "RandomTreeDepth2AUC": 0.9583333333333335, "J48.00001.ErrRate": 0.07205623901581722, "MaxStdDevOfNumericAtts": 569.3569926699494, "MinorityClassPercentage": 37.258347978910365, "PercentageOfNumericFeatures": 96.7741935483871, "Quartile3MutualInformation": null, "CfsSubsetEval_DecisionStumpErrRate": 0.08963093145869948, "RandomTreeDepth2ErrRate": 0.10193321616871705, "J48.00001.Kappa": 0.8436320738908661, "MeanAttributeEntropy": null, "MinorityClassSize": 212, "PercentageOfSymbolicFeatures": 3.225806451612903, "Quartile3SkewnessOfNumericAtts": 1.9754487571153525, "CfsSubsetEval_DecisionStumpKappa": 0.8012438100586974, "RandomTreeDepth2Kappa": 0.7802913293566254, "J48.0001.AUC": 0.9415596427250146, "MeanKurtosisOfNumericAtts": 7.8147348251102615, "NaiveBayesAUC": 0.9797278811381966, "Quartile1AttributeEntropy": null, "Quartile3StdDevOfNumericAtts": 4.434087221242544, "CfsSubsetEval_NaiveBayesAUC": 0.989788686638896, "RandomTreeDepth3AUC": 0.956305163574864, "J48.0001.ErrRate": 0.07205623901581722, "MeanMeansOfNumericAtts": 61.89071233954305, "NaiveBayesErrRate": 0.07205623901581722, "Quartile1KurtosisOfNumericAtts": 0.9651825547526209, "REPTreeDepth1AUC": 0.8700584007187779, 
"CfsSubsetEval_NaiveBayesErrRate": 0.0492091388400703, "RandomTreeDepth3ErrRate": 0.06502636203866433, "J48.0001.Kappa": 0.8436320738908661, "MeanMutualInformation": null, "NaiveBayesKappa": 0.8451371786276162, "Quartile1MeansOfNumericAtts": 0.05932799384885764, "REPTreeDepth1ErrRate": 0.09490333919156414, "CfsSubsetEval_NaiveBayesKappa": 0.8941381280814363, "RandomTreeDepth3Kappa": 0.8588874813161476, "J48.001.AUC": 0.9362745098039216, "MeanNoiseToSignalRatio": null, "NumberOfBinaryFeatures": 1, "Quartile1MutualInformation": null }, "tags": [ { "uploader": "38960", "tag": "Biology" }, { "uploader": "2", "tag": "cancer" }, { "uploader": "38960", "tag": "Health" }, { "uploader": "2", "tag": "medical" }, { "uploader": "38960", "tag": "Medicine" }, { "uploader": "1", "tag": "OpenML-CC18" }, { "uploader": "348", "tag": "OpenML100" }, { "uploader": "38960", "tag": "Research" }, { "uploader": "3886", "tag": "study_123" }, { "uploader": "5824", "tag": "study_135" }, { "uploader": "64", "tag": "study_14" }, { "uploader": "64", "tag": "study_52" }, { "uploader": "64", "tag": "study_7" }, { "uploader": "1935", "tag": "study_98" }, { "uploader": "1", "tag": "study_99" }, { "uploader": "2", "tag": "uci" } ], "features": [ { "name": "Class", "index": "30", "type": "nominal", "distinct": "2", "missing": "0", "target": "1", "distr": [ [ "1", "2" ], [ [ "357", "0" ], [ "0", "212" ] ] ] }, { "name": "V1", "index": "0", "type": "numeric", "distinct": "456", "missing": "0", "min": "7", "max": "28", "mean": "14", "stdev": "4" }, { "name": "V2", "index": "1", "type": "numeric", "distinct": "479", "missing": "0", "min": "10", "max": "39", "mean": "19", "stdev": "4" }, { "name": "V3", "index": "2", "type": "numeric", "distinct": "522", "missing": "0", "min": "44", "max": "189", "mean": "92", "stdev": "24" }, { "name": "V4", "index": "3", "type": "numeric", "distinct": "539", "missing": "0", "min": "144", "max": "2501", "mean": "655", "stdev": "352" }, { "name": "V5", "index": "4", 
"type": "numeric", "distinct": "474", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V6", "index": "5", "type": "numeric", "distinct": "537", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V7", "index": "6", "type": "numeric", "distinct": "537", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V8", "index": "7", "type": "numeric", "distinct": "542", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V9", "index": "8", "type": "numeric", "distinct": "432", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V10", "index": "9", "type": "numeric", "distinct": "499", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V11", "index": "10", "type": "numeric", "distinct": "540", "missing": "0", "min": "0", "max": "3", "mean": "0", "stdev": "0" }, { "name": "V12", "index": "11", "type": "numeric", "distinct": "519", "missing": "0", "min": "0", "max": "5", "mean": "1", "stdev": "1" }, { "name": "V13", "index": "12", "type": "numeric", "distinct": "533", "missing": "0", "min": "1", "max": "22", "mean": "3", "stdev": "2" }, { "name": "V14", "index": "13", "type": "numeric", "distinct": "528", "missing": "0", "min": "7", "max": "542", "mean": "40", "stdev": "45" }, { "name": "V15", "index": "14", "type": "numeric", "distinct": "547", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V16", "index": "15", "type": "numeric", "distinct": "541", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V17", "index": "16", "type": "numeric", "distinct": "533", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V18", "index": "17", "type": "numeric", "distinct": "507", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V19", "index": "18", "type": "numeric", "distinct": "498", "missing": "0", 
"min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V20", "index": "19", "type": "numeric", "distinct": "545", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V21", "index": "20", "type": "numeric", "distinct": "457", "missing": "0", "min": "8", "max": "36", "mean": "16", "stdev": "5" }, { "name": "V22", "index": "21", "type": "numeric", "distinct": "511", "missing": "0", "min": "12", "max": "50", "mean": "26", "stdev": "6" }, { "name": "V23", "index": "22", "type": "numeric", "distinct": "514", "missing": "0", "min": "50", "max": "251", "mean": "107", "stdev": "34" }, { "name": "V24", "index": "23", "type": "numeric", "distinct": "544", "missing": "0", "min": "185", "max": "4254", "mean": "881", "stdev": "569" }, { "name": "V25", "index": "24", "type": "numeric", "distinct": "411", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V26", "index": "25", "type": "numeric", "distinct": "529", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "V27", "index": "26", "type": "numeric", "distinct": "539", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "V28", "index": "27", "type": "numeric", "distinct": "492", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V29", "index": "28", "type": "numeric", "distinct": "500", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "V30", "index": "29", "type": "numeric", "distinct": "535", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }