{ "data_id": "44092", "name": "Higgs", "exact_name": "Higgs", "version": 6, "version_label": null, "description": "Dataset used in the tabular data benchmark https:\/\/github.com\/LeoGrin\/tabular-benchmark, transformed in the same way. This dataset belongs to the \"regression on numerical features\" benchmark. Original description: \n \nThis is a smaller version of the original dataset, containing 1M rows. \n**Author**: Daniel Whiteson, University of California Irvine \n**Source**: [UCI](https:\/\/archive.ics.uci.edu\/ml\/datasets\/HIGGS) \n**Please cite**: Baldi, P., P. Sadowski, and D. Whiteson. Searching for Exotic Particles in High-energy Physics with Deep Learning. Nature Communications 5 (July 2, 2014). \n\n**Higgs Boson detection data**. The data has been produced using Monte Carlo simulations. The first 21 features (columns 2-22) are kinematic properties measured by the particle detectors in the accelerator. The last seven features are functions of the first 21 features; these are high-level features derived by physicists to help discriminate between the two classes. There is an interest in using deep learning methods to obviate the need for physicists to manually develop such features. The last 500,000 examples are used as a test set.\n\n**Note: This is the UCI Higgs dataset, same as version 1, but it fixes the definition of the class attribute, which is categorical, not numeric.**\n\n\n### Attribute Information\n* The first column is the class label (1 for signal, 0 for background)\n* 21 low-level features (kinematic properties): lepton pT, lepton eta, lepton phi, missing energy magnitude, missing energy phi, jet 1 pt, jet 1 eta, jet 1 phi, jet 1 b-tag, jet 2 pt, jet 2 eta, jet 2 phi, jet 2 b-tag, jet 3 pt, jet 3 eta, jet 3 phi, jet 3 b-tag, jet 4 pt, jet 4 eta, jet 4 phi, jet 4 b-tag\n* 7 high-level features derived by physicists: m_jj, m_jjj, m_lv, m_jlv, m_bb, m_wbb, m_wwbb. \n\nFor more detailed information about each feature see the original paper.\n\nRelevant Papers:\n\nBaldi, P., P. Sadowski, and D. Whiteson. Searching for Exotic Particles in High-energy Physics with Deep Learning. Nature Communications 5 (July 2, 2014).", "format": "arff", "uploader": "Leo Grin", "uploader_id": 26324, "visibility": "public", "creator": null, "contributor": "\"Leo Grin\"", "date": "2022-06-21 12:00:24", "update_comment": null, "last_update": "2022-06-21 12:00:24", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/old.openml.org\/data\/download\/22103188\/dataset", "default_target_attribute": "target", "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "Higgs", "Dataset used in the tabular data benchmark https:\/\/github.com\/LeoGrin\/tabular-benchmark, transformed in the same way. This dataset belongs to the \"regression on numerical features\" benchmark. Original description: This is a smaller version of the original dataset, containing 1M rows. ### Attribute Information * The first column is the class label (1 for signal, 0 for background) * 21 low-level features (kinematic properties): lepton pT, lepton eta, lepton phi, missing energy magnitude, missing e " ], "weight": 5 }, "qualities": { "NumberOfInstances": 940160, "NumberOfFeatures": 25, "NumberOfClasses": 2, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 24, "NumberOfSymbolicFeatures": 1, "PercentageOfBinaryFeatures": 4, "PercentageOfInstancesWithMissingValues": 0, "AutoCorrelation": 0.999998936350128, "PercentageOfMissingValues": 0, "Dimensionality": 2.6591218515997275e-5, "PercentageOfNumericFeatures": 96, "MajorityClassPercentage": 50, "PercentageOfSymbolicFeatures": 4, "MajorityClassSize": 470080, "MinorityClassPercentage": 50, "MinorityClassSize": 470080, "NumberOfBinaryFeatures": 1 }, "tags": [ { "uploader": "38960", "tag": "Computer Systems" }, { "uploader": "38960", "tag": "Mathematics" } ], "features": [ { "name": "target", "index": "24", "type": "nominal", "distinct": "2", "missing": "0", "target": "1", "distr": [ [ "0", "1" ], [ [ "470080", "0" ], [ "0", "470080" ] ] ] }, { "name": "lepton_pT", "index": "0", "type": "numeric", "distinct": "19749", "missing": "0", "min": "0", "max": "10", "mean": "1", "stdev": "1" }, { "name": "lepton_eta", "index": "1", "type": "numeric", "distinct": "5001", "missing": "0", "min": "-2", "max": "2", "mean": "0", "stdev": "1" }, { "name": "lepton_phi", "index": "2", "type": "numeric", "distinct": "6284", "missing": "0", "min": "-2", "max": "2", "mean": "0", "stdev": "1" }, { "name": "missing_energy_magnitude", "index": "3", "type": "numeric", "distinct": "594293", "missing": "0", "min": "0", "max": "10", "mean": "1", "stdev": "1" }, { "name": "missing_energy_phi", "index": "4", "type": "numeric", "distinct": "611732", "missing": "0", "min": "-2", "max": "2", "mean": "0", "stdev": "1" }, { "name": "jet_1_pt", "index": "5", "type": "numeric", "distinct": "33765", "missing": "0", "min": "0", "max": "8", "mean": "1", "stdev": "0" }, { "name": "jet_1_eta", "index": "6", "type": "numeric", "distinct": "5999", "missing": "0", "min": "-3", "max": "3", "mean": "0", "stdev": "1" }, { "name": "jet_1_phi", "index": "7", "type": "numeric", "distinct": "6284", "missing": "0", "min": "-2", "max": "2", "mean": "0", "stdev": "1" }, { "name": "jet_2_pt", "index": "8", "type": "numeric", "distinct": "26700", "missing": "0", "min": "0", "max": "12", "mean": "1", "stdev": "1" }, { "name": "jet_2_eta", "index": "9", "type": "numeric", "distinct": "5999", "missing": "0", "min": "-3", "max": "3", "mean": "0", "stdev": "1" }, { "name": "jet_2_phi", "index": "10", "type": "numeric", "distinct": "6284", "missing": "0", "min": "-2", "max": "2", "mean": "0", "stdev": "1" }, { "name": "jet_3_pt", "index": "11", "type": "numeric", "distinct": "18530", "missing": "0", "min": "0", "max": "15", "mean": "1", "stdev": "0" }, { "name": "jet_3_eta", "index": "12", "type": "numeric", "distinct": "5999", "missing": "0", "min": "-3", "max": "3", "mean": "0", "stdev": "1" }, { "name": "jet_3_phi", "index": "13", "type": "numeric", "distinct": "6284", "missing": "0", "min": "-2", "max": "2", "mean": "0", "stdev": "1" }, { "name": "jet_4_pt", "index": "14", "type": "numeric", "distinct": "13881", "missing": "0", "min": "0", "max": "10", "mean": "1", "stdev": "1" }, { "name": "jet_4_eta", "index": "15", "type": "numeric", "distinct": "5999", "missing": "0", "min": "-2", "max": "2", "mean": "0", "stdev": "1" }, { "name": "jet_4_phi", "index": "16", "type": "numeric", "distinct": "6284", "missing": "0", "min": "-2", "max": "2", "mean": "0", "stdev": "1" }, { "name": "m_jj", "index": "17", "type": "numeric", "distinct": "486364", "missing": "0", "min": "0", "max": "40", "mean": "1", "stdev": "1" }, { "name": "m_jjj", "index": "18", "type": "numeric", "distinct": "207711", "missing": "0", "min": "0", "max": "20", "mean": "1", "stdev": "0" }, { "name": "m_lv", "index": "19", "type": "numeric", "distinct": "179647", "missing": "0", "min": "0", "max": "8", "mean": "1", "stdev": "0" }, { "name": "m_jlv", "index": "20", "type": "numeric", "distinct": "255489", "missing": "0", "min": "0", "max": "12", "mean": "1", "stdev": "0" }, { "name": "m_bb", "index": "21", "type": "numeric", "distinct": "440990", "missing": "0", "min": "0", "max": "13", "mean": "1", "stdev": "1" }, { "name": "m_wbb", "index": "22", "type": "numeric", "distinct": "345644", "missing": "0", "min": "0", "max": "11", "mean": "1", "stdev": "0" }, { "name": "m_wwbb", "index": "23", "type": "numeric", "distinct": "397485", "missing": "0", "min": "0", "max": "8", "mean": "1", "stdev": "0" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }