{ "data_id": "1512", "name": "heart-long-beach", "exact_name": "heart-long-beach", "version": 1, "version_label": null, "description": "**Author**: V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D. \n**Source**: UCI \n**Please cite**: V.A. Medical Center, Long Beach and Cleveland Clinic Foundation:Robert Detrano, M.D., Ph.D. \n\n* Donor: \n\nDavid W. Aha (aha '@' ics.uci.edu) (714) 856-8779\n\n\n* Data Set Information:\n\nThis database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to \nthis date. The \"goal\" field refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1,2,3,4) from absence (value 0). \n\nThe names and social security numbers of the patients were recently removed from the database, replaced with dummy values. \n\nOne file has been \"processed\", that one containing the Cleveland database. All four unprocessed files also exist in this directory. \n\nTo see Test Costs (donated by Peter Turney), please see the folder \"Costs\"\n\n\n* Attribute Information:\n\nOnly 14 attributes used: \n1. #3 (age) \n2. #4 (sex) \n3. #9 (cp) \n4. #10 (trestbps) \n5. #12 (chol) \n6. #16 (fbs) \n7. #19 (restecg) \n8. #32 (thalach) \n9. #38 (exang) \n10. #40 (oldpeak) \n11. #41 (slope) \n12. #44 (ca) \n13. #51 (thal) \n14. 
#58 (num) (the predicted attribute) \n\nComplete attribute documentation: \n1 id: patient identification number \n2 ccf: social security number (I replaced this with a dummy value of 0) \n3 age: age in years \n4 sex: sex (1 = male; 0 = female) \n5 painloc: chest pain location (1 = substernal; 0 = otherwise) \n6 painexer (1 = provoked by exertion; 0 = otherwise) \n7 relrest (1 = relieved after rest; 0 = otherwise) \n8 pncaden (sum of 5, 6, and 7) \n9 cp: chest pain type \n-- Value 1: typical angina \n-- Value 2: atypical angina \n-- Value 3: non-anginal pain \n-- Value 4: asymptomatic \n10 trestbps: resting blood pressure (in mm Hg on admission to the hospital) \n11 htn \n12 chol: serum cholesterol in mg\/dl \n13 smoke: I believe this is 1 = yes; 0 = no (is or is not a smoker) \n14 cigs (cigarettes per day) \n15 years (number of years as a smoker) \n16 fbs: (fasting blood sugar > 120 mg\/dl) (1 = true; 0 = false) \n17 dm (1 = history of diabetes; 0 = no such history) \n18 famhist: family history of coronary artery disease (1 = yes; 0 = no) \n19 restecg: resting electrocardiographic results \n-- Value 0: normal \n-- Value 1: having ST-T wave abnormality (T wave inversions and\/or ST elevation or depression of > 0.05 mV) \n-- Value 2: showing probable or definite left ventricular hypertrophy by Estes' criteria \n20 ekgmo (month of exercise ECG reading) \n21 ekgday (day of exercise ECG reading) \n22 ekgyr (year of exercise ECG reading) \n23 dig (digitalis used during exercise ECG: 1 = yes; 0 = no) \n24 prop (Beta blocker used during exercise ECG: 1 = yes; 0 = no) \n25 nitr (nitrates used during exercise ECG: 1 = yes; 0 = no) \n26 pro (calcium channel blocker used during exercise ECG: 1 = yes; 0 = no) \n27 diuretic (diuretic used during exercise ECG: 1 = yes; 0 = no) \n28 proto: exercise protocol \n1 = Bruce \n2 = Kottus \n3 = McHenry \n4 = fast Balke \n5 = Balke \n6 = Noughton \n7 = bike 150 kpa min\/min (Not sure if \"kpa min\/min\" is what was written!) 
\n8 = bike 125 kpa min\/min \n9 = bike 100 kpa min\/min \n10 = bike 75 kpa min\/min \n11 = bike 50 kpa min\/min \n12 = arm ergometer \n29 thaldur: duration of exercise test in minutes \n30 thaltime: time when ST measure depression was noted \n31 met: mets achieved \n32 thalach: maximum heart rate achieved \n33 thalrest: resting heart rate \n34 tpeakbps: peak exercise blood pressure (first of 2 parts) \n35 tpeakbpd: peak exercise blood pressure (second of 2 parts) \n36 dummy \n37 trestbpd: resting blood pressure \n38 exang: exercise induced angina (1 = yes; 0 = no) \n39 xhypo: (1 = yes; 0 = no) \n40 oldpeak = ST depression induced by exercise relative to rest \n41 slope: the slope of the peak exercise ST segment \n-- Value 1: upsloping \n-- Value 2: flat \n-- Value 3: downsloping \n42 rldv5: height at rest \n43 rldv5e: height at peak exercise \n44 ca: number of major vessels (0-3) colored by fluoroscopy \n45 restckm: irrelevant \n46 exerckm: irrelevant \n47 restef: rest raidonuclid (sp?) ejection fraction \n48 restwm: rest wall (sp?) motion abnormality \n0 = none \n1 = mild or moderate \n2 = moderate or severe \n3 = akinesis or dyskmem (sp?) \n49 exeref: exercise radinalid (sp?) ejection fraction \n50 exerwm: exercise wall (sp?) motion \n51 thal: 3 = normal; 6 = fixed defect; 7 = reversible defect \n52 thalsev: not used \n53 thalpul: not used \n54 earlobe: not used \n55 cmo: month of cardiac cath (sp?) (perhaps \"call\") \n56 cday: day of cardiac cath (sp?) \n57 cyr: year of cardiac cath (sp?) 
\n58 num: diagnosis of heart disease (angiographic disease status) \n-- Value 0: < 50% diameter narrowing \n-- Value 1: > 50% diameter narrowing \n(in any major vessel: attributes 59 through 68 are vessels) \n59 lmt \n60 ladprox \n61 laddist \n62 diag \n63 cxmain \n64 ramus \n65 om1 \n66 om2 \n67 rcaprox \n68 rcadist \n69 lvx1: not used \n70 lvx2: not used \n71 lvx3: not used \n72 lvx4: not used \n73 lvf: not used \n74 cathef: not used \n75 junk: not used \n76 name: last name of patient (I replaced this with the dummy string \"name\") \n\n\n* Relevant Papers:\n\nDetrano, R., Janosi, A., Steinbrunn, W., Pfisterer, M., Schmid, J., Sandhu, S., Guppy, K., Lee, S., & Froelicher, V. (1989). International application of a new probability algorithm for the diagnosis of coronary artery disease. American Journal of Cardiology, 64,304--310. \n\nDavid W. Aha & Dennis Kibler. \"Instance-based prediction of heart-disease presence with the Cleveland database.\" \n\nGennari, J.H., Langley, P, & Fisher, D. (1989). Models of incremental concept formation. Artificial Intelligence, 40, 11--61. ", "format": "ARFF", "uploader": "Rafael Gomes Mantovani", "uploader_id": 64, "visibility": "public", "creator": null, "contributor": null, "date": "2015-06-01 16:03:43", "update_comment": null, "last_update": "2015-11-09 20:10:17", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/1593700\/phpy0HwUD", "default_target_attribute": "Class", "row_id_attribute": null, "ignore_attribute": null, "runs": 159, "suggest": { "input": [ "heart-long-beach", "* Donor: David W. Aha (aha '@' ics.uci.edu) (714) 856-8779 * Data Set Information: This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to this date. The \"goal\" field refers to the presence of heart disease in the patient. 
It is integer valued from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to di " ], "weight": 5 }, "qualities": { "NumberOfInstances": 200, "NumberOfFeatures": 14, "NumberOfClasses": 5, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 13, "NumberOfSymbolicFeatures": 1, "REPTreeDepth3Kappa": 0.049849889479067044, "DecisionStumpKappa": 0.03313526034847417, "MaxMeansOfNumericAtts": 59.35, "MinMutualInformation": null, "Quartile2SkewnessOfNumericAtts": 0.39146284207158727, "RandomTreeDepth1AUC": 0.5473991572017521, "Dimensionality": 0.07, "MaxMutualInformation": null, "MinNominalAttDistinctValues": 5, "PercentageOfBinaryFeatures": 0, "Quartile2StdDevOfNumericAtts": 0.995012688992598, "RandomTreeDepth1ErrRate": 0.695, "EquivalentNumberOfAtts": null, "MaxNominalAttDistinctValues": 5, "MinSkewnessOfNumericAtts": -5.552105435821193, "PercentageOfInstancesWithMissingValues": 0, "Quartile3AttributeEntropy": null, "RandomTreeDepth1Kappa": 0.09463948413990751, "J48.00001.AUC": 0.5249842678818851, "MaxSkewnessOfNumericAtts": 9.92395586524636, "MinStdDevOfNumericAtts": 0.09974842727441162, "PercentageOfMissingValues": 0, "Quartile3KurtosisOfNumericAtts": 2.2035492262043483, "AutoCorrelation": 0.22110552763819097, "RandomTreeDepth2AUC": 0.5473991572017521, "J48.00001.ErrRate": 0.76, "MaxStdDevOfNumericAtts": 29.841699602668065, "MinorityClassPercentage": 5, "PercentageOfNumericFeatures": 92.85714285714286, "Quartile3MeansOfNumericAtts": 19.16, "CfsSubsetEval_DecisionStumpAUC": 0.5496241235002084, "RandomTreeDepth2ErrRate": 0.695, "J48.00001.Kappa": 0.011671380734094065, "MeanAttributeEntropy": null, "MinorityClassSize": 10, "PercentageOfSymbolicFeatures": 7.142857142857142, "Quartile3MutualInformation": null, "CfsSubsetEval_DecisionStumpErrRate": 0.7, "RandomTreeDepth2Kappa": 0.09463948413990751, "J48.0001.AUC": 0.5249842678818851, "MeanKurtosisOfNumericAtts": 9.40489230056545, 
"NaiveBayesAUC": 0.6243615762795679, "Quartile1AttributeEntropy": null, "Quartile3SkewnessOfNumericAtts": 0.5387168353780951, "CfsSubsetEval_DecisionStumpKappa": 0.048363525133399, "RandomTreeDepth3AUC": 0.5473991572017521, "J48.0001.ErrRate": 0.76, "MeanMeansOfNumericAtts": 11.885000000000002, "NaiveBayesErrRate": 0.685, "Quartile1KurtosisOfNumericAtts": -1.3414750904406352, "Quartile3StdDevOfNumericAtts": 10.760802698599807, "CfsSubsetEval_NaiveBayesAUC": 0.5496241235002084, "RandomTreeDepth3ErrRate": 0.695, "J48.0001.Kappa": 0.011671380734094065, "MeanMutualInformation": null, "NaiveBayesKappa": 0.12179487179487179, "Quartile1MeansOfNumericAtts": 1.2200000000000002, "REPTreeDepth1AUC": 0.5459790690305675, "CfsSubsetEval_NaiveBayesErrRate": 0.7, "RandomTreeDepth3Kappa": 0.09463948413990751, "J48.001.AUC": 0.5249842678818851, "MeanNoiseToSignalRatio": null, "NumberOfBinaryFeatures": 0, "Quartile1MutualInformation": null, "REPTreeDepth1ErrRate": 0.72, "CfsSubsetEval_NaiveBayesKappa": 0.048363525133399, "CfsSubsetEval_kNN1NAUC": 0.5496241235002084, "StdvNominalAttDistinctValues": 0, "J48.001.ErrRate": 0.76, "MeanNominalAttDistinctValues": 5, "Quartile1SkewnessOfNumericAtts": -0.4083053208643634, "REPTreeDepth1Kappa": 0.049849889479067044, "CfsSubsetEval_kNN1NErrRate": 0.7, "kNN1NAUC": 0.540206017247981, "J48.001.Kappa": 0.011671380734094065, "MeanSkewnessOfNumericAtts": 0.5057745905329623, "Quartile1StdDevOfNumericAtts": 0.6079002898740116, "REPTreeDepth2AUC": 0.5459790690305675, "CfsSubsetEval_kNN1NKappa": 0.048363525133399, "kNN1NErrRate": 0.71, "MajorityClassPercentage": 28.000000000000004, "MeanStdDevOfNumericAtts": 6.346905125010748, "Quartile2AttributeEntropy": null, "REPTreeDepth2ErrRate": 0.72, "ClassEntropy": 2.1745471249212494, "kNN1NKappa": 0.08256880733944952, "MajorityClassSize": 56, "MinAttributeEntropy": null, "Quartile2KurtosisOfNumericAtts": -0.8396344153910591, "REPTreeDepth2Kappa": 0.049849889479067044, "REPTreeDepth3AUC": 0.5459790690305675, 
"DecisionStumpAUC": 0.5502196584213135, "MaxAttributeEntropy": null, "MinKurtosisOfNumericAtts": -1.4785607140899353, "Quartile2MeansOfNumericAtts": 2.305, "REPTreeDepth3ErrRate": 0.72, "DecisionStumpErrRate": 0.72, "MaxKurtosisOfNumericAtts": 97.45944291398763, "MinMeansOfNumericAtts": 0.7350000000000003, "Quartile2MutualInformation": null }, "tags": [ { "uploader": "38960", "tag": "Chemistry" }, { "uploader": "38960", "tag": "Life Science" }, { "uploader": "3886", "tag": "mf_less_than_80" }, { "uploader": "3886", "tag": "study_123" }, { "uploader": "4209", "tag": "study_127" }, { "uploader": "64", "tag": "study_50" }, { "uploader": "64", "tag": "study_52" }, { "uploader": "64", "tag": "study_7" }, { "uploader": "4209", "tag": "study_88" } ], "topics": [ { "topic": "Health", "uploader": "8111" } ], "features": [ { "name": "Class", "index": "13", "type": "nominal", "distinct": "5", "missing": "0", "target": "1", "distr": [ [ "1", "2", "3", "4", "5" ], [ [ "51", "0", "0", "0", "0" ], [ "0", "56", "0", "0", "0" ], [ "0", "0", "41", "0", "0" ], [ "0", "0", "0", "42", "0" ], [ "0", "0", "0", "0", "10" ] ] ] }, { "name": "V1", "index": "0", "type": "numeric", "distinct": "39", "missing": "0", "min": "35", "max": "77", "mean": "59", "stdev": "8" }, { "name": "V2", "index": "1", "type": "numeric", "distinct": "2", "missing": "0", "min": "0", "max": "1", "mean": "1", "stdev": "0" }, { "name": "V3", "index": "2", "type": "numeric", "distinct": "4", "missing": "0", "min": "1", "max": "4", "mean": "4", "stdev": "1" }, { "name": "V4", "index": "3", "type": "numeric", "distinct": "41", "missing": "0", "min": "1", "max": "41", "mean": "15", "stdev": "14" }, { "name": "V5", "index": "4", "type": "numeric", "distinct": "100", "missing": "0", "min": "1", "max": "100", "mean": "36", "stdev": "30" }, { "name": "V6", "index": "5", "type": "numeric", "distinct": "3", "missing": "0", "min": "1", "max": "3", "mean": "2", "stdev": "1" }, { "name": "V7", "index": "6", "type": "numeric", 
"distinct": "3", "missing": "0", "min": "0", "max": "2", "mean": "1", "stdev": "1" }, { "name": "V8", "index": "7", "type": "numeric", "distinct": "60", "missing": "0", "min": "1", "max": "60", "mean": "23", "stdev": "21" }, { "name": "V9", "index": "8", "type": "numeric", "distinct": "3", "missing": "0", "min": "1", "max": "3", "mean": "2", "stdev": "1" }, { "name": "V10", "index": "9", "type": "numeric", "distinct": "15", "missing": "0", "min": "1", "max": "15", "mean": "7", "stdev": "5" }, { "name": "V11", "index": "10", "type": "numeric", "distinct": "4", "missing": "0", "min": "1", "max": "4", "mean": "2", "stdev": "1" }, { "name": "V12", "index": "11", "type": "numeric", "distinct": "2", "missing": "0", "min": "1", "max": "2", "mean": "1", "stdev": "0" }, { "name": "V13", "index": "12", "type": "numeric", "distinct": "4", "missing": "0", "min": "1", "max": "4", "mean": "1", "stdev": "1" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }