{ "data_id": "45579", "name": "Microsoft", "exact_name": "Microsoft", "version": 2, "version_label": null, "description": "Microsoft Learning to Rank Datasets\n\n## Dataset Descriptions\n\nThe datasets are machine learning data, in which queries and urls are represented by IDs. The datasets consist of feature vectors extracted from query-url pairs along with relevance judgment labels:\n\n(1) The relevance judgments are obtained from a retired labeling set of a commercial web search engine (Microsoft Bing), which take 5 values from 0 (irrelevant) to 4 (perfectly relevant).\n\n(2) The features are basically extracted by us, and are those widely used in the research community.\n\nIn the data files, each row corresponds to a query-url pair. The first column is the relevance label of the pair, the second column is the query id, and the following columns are features. The larger the relevance label, the more relevant the query-url pair is. A query-url pair is represented by a 136-dimensional feature vector.\n\nBelow are two rows from the MSLR-WEB10K dataset:\n\n==============================================\n\n0 qid:1 1:3 2:0 3:2 4:2 ... 135:0 136:0\n\n2 qid:1 1:3 2:3 3:0 4:0 ... 135:0 136:0\n\n==============================================\n\n## Dataset Partition\n\nWe have partitioned each dataset into five parts with about the same number of queries, denoted as S1, S2, S3, S4, and S5, for five-fold cross validation. In each fold, we propose using three parts for training, one part for validation, and the remaining part for test (see the following table). The training set is used to learn ranking models. The validation set is used to tune the hyperparameters of the learning algorithms, such as the number of iterations in RankBoost and the combination coefficient in the objective function of Ranking SVM. 
The test set is used to evaluate the performance of the learned ranking models.\n\nFolds\t Training Set\tValidation Set\tTest Set\nFold1\t {S1,S2,S3}\t S4\t S5\nFold2\t {S2,S3,S4}\t S5\t S1\nFold3\t {S3,S4,S5}\t S1\t S2\nFold4\t {S4,S5,S1}\t S2\t S3\nFold5\t {S5,S1,S2}\t S3\t S4\n\n## Reference\n\nYou can cite this dataset as below.\n\n```\n@article{DBLP:journals\/corr\/QinL13,\n author = {Tao Qin and\n Tie{-}Yan Liu},\n title = {Introducing {LETOR} 4.0 Datasets},\n journal = {CoRR},\n volume = {abs\/1306.2597},\n year = {2013},\n url = {http:\/\/arxiv.org\/abs\/1306.2597},\n timestamp = {Mon, 01 Jul 2013 20:31:25 +0200},\n biburl = {http:\/\/dblp.uni-trier.de\/rec\/bib\/journals\/corr\/QinL13},\n bibsource = {dblp computer science bibliography, http:\/\/dblp.org}\n}\n```\n\n## Note:\n\n* This is a learning-to-rank dataset and it should not be used for standard classification tasks. It is only coded this way to enable reproducing the work \"Tabular data: Deep learning is not all you need\" by Ravid Shwartz-Ziv and Amitai Armon.\n* This dataset concatenates the training, validation, and test sets from Fold1.\n* This is the 10k Version (Web10k)\n* The uploader shortened the word \"variance\" in the feature names to \"var\" to comply with OpenML's maximum feature name length.", "format": "arff", "uploader": "Matthias Feurer", "uploader_id": 86, "visibility": "public", "creator": null, "contributor": null, "date": "2023-07-05 08:39:01", "update_comment": null, "last_update": "2023-07-05 08:39:01", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/api.openml.org\/data\/download\/22116700\/dataset", "kaggle_url": null, "default_target_attribute": "relevance", "row_id_attribute": "query_id", "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "Microsoft", "Microsoft Learning to Rank Datasets ## Dataset Descriptions The datasets are machine learning data, in which queries and urls are represented by IDs. 
The datasets consist of feature vectors extracted from query-url pairs along with relevance judgment labels: (1) The relevance judgments are obtained from a retired labeling set of a commercial web search engine (Microsoft Bing), which take 5 values from 0 (irrelevant) to 4 (perfectly relevant). (2) The features are basically extracted by us, and a " ], "weight": 5 }, "qualities": { "NumberOfInstances": 1200192, "NumberOfFeatures": 137, "NumberOfClasses": 5, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 136, "NumberOfSymbolicFeatures": 1, "PercentageOfBinaryFeatures": 0, "PercentageOfInstancesWithMissingValues": 0, "PercentageOfMissingValues": 0, "AutoCorrelation": 0.46223642736864384, "PercentageOfNumericFeatures": 99.27007299270073, "Dimensionality": 0.00011414840292219911, "PercentageOfSymbolicFeatures": 0.7299270072992701, "MajorityClassPercentage": 52.01359449154802, "MajorityClassSize": 624263, "MinorityClassPercentage": 0.7399649389431024, "MinorityClassSize": 8881, "NumberOfBinaryFeatures": 0 }, "tags": [], "features": [ { "name": "relevance", "index": "0", "type": "nominal", "distinct": "5", "missing": "0", "target": "1", "distr": [ [ "0", "1", "2", "3", "4" ], [ [ "624263", "0", "0", "0", "0" ], [ "0", "386280", "0", "0", "0" ], [ "0", "0", "159451", "0", "0" ], [ "0", "0", "0", "21317", "0" ], [ "0", "0", "0", "0", "8881" ] ] ] }, { "name": "query_id", "index": "1", "type": "numeric", "distinct": "10000", "missing": "0", "identifier": "1", "min": "1", "max": "29998", "mean": "14829", "stdev": "8254" }, { "name": "covered_query_term_number-body", "index": "2", "type": "numeric", "distinct": "22", "missing": "0", "min": "0", "max": "75", "mean": "2", "stdev": "1" }, { "name": "covered_query_term_number-anchor", "index": "3", "type": "numeric", "distinct": "10", "missing": "0", "min": "0", "max": "18", "mean": "0", "stdev": "1" }, { "name": "covered_query_term_number-title", "index": "4", "type": "numeric", 
"distinct": "19", "missing": "0", "min": "0", "max": "27", "mean": "1", "stdev": "1" }, { "name": "covered_query_term_number-url", "index": "5", "type": "numeric", "distinct": "13", "missing": "0", "min": "0", "max": "15", "mean": "1", "stdev": "1" }, { "name": "covered_query_term_number-whole_document", "index": "6", "type": "numeric", "distinct": "22", "missing": "0", "min": "0", "max": "75", "mean": "2", "stdev": "1" }, { "name": "covered_query_term_ratio-body", "index": "7", "type": "numeric", "distinct": "68", "missing": "0", "min": "0", "max": "1", "mean": "1", "stdev": "0" }, { "name": "covered_query_term_ratio-anchor", "index": "8", "type": "numeric", "distinct": "38", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "covered_query_term_ratio-title", "index": "9", "type": "numeric", "distinct": "54", "missing": "0", "min": "0", "max": "1", "mean": "1", "stdev": "0" }, { "name": "covered_query_term_ratio-url", "index": "10", "type": "numeric", "distinct": "43", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "covered_query_term_ratio-whole_document", "index": "11", "type": "numeric", "distinct": "68", "missing": "0", "min": "0", "max": "1", "mean": "1", "stdev": "0" }, { "name": "stream_length-body", "index": "12", "type": "numeric", "distinct": "5840", "missing": "0", "min": "0", "max": "13540", "mean": "663", "stdev": "780" }, { "name": "stream_length-anchor", "index": "13", "type": "numeric", "distinct": "226", "missing": "0", "min": "0", "max": "3233", "mean": "2", "stdev": "16" }, { "name": "stream_length-title", "index": "14", "type": "numeric", "distinct": "627", "missing": "0", "min": "0", "max": "6669", "mean": "8", "stdev": "55" }, { "name": "stream_length-url", "index": "15", "type": "numeric", "distinct": "107", "missing": "0", "min": "2", "max": "393", "mean": "8", "stdev": "3" }, { "name": "stream_length-whole_document", "index": "16", "type": "numeric", "distinct": "5961", "missing": 
"0", "min": "2", "max": "13554", "mean": "681", "stdev": "786" }, { "name": "IDF(Inverse_document_frequency)-body", "index": "17", "type": "numeric", "distinct": "9208", "missing": "0", "min": "-4", "max": "225", "mean": "10", "stdev": "6" }, { "name": "IDF(Inverse_document_frequency)-anchor", "index": "18", "type": "numeric", "distinct": "8511", "missing": "0", "min": "4", "max": "514", "mean": "21", "stdev": "10" }, { "name": "IDF(Inverse_document_frequency)-title", "index": "19", "type": "numeric", "distinct": "8691", "missing": "0", "min": "2", "max": "497", "mean": "18", "stdev": "9" }, { "name": "IDF(Inverse_document_frequency)-url", "index": "20", "type": "numeric", "distinct": "8677", "missing": "0", "min": "2", "max": "493", "mean": "20", "stdev": "10" }, { "name": "IDF(Inverse_document_frequency)-whole_document", "index": "21", "type": "numeric", "distinct": "9213", "missing": "0", "min": "-4", "max": "225", "mean": "10", "stdev": "6" }, { "name": "sum_of_term_frequency-body", "index": "22", "type": "numeric", "distinct": "797", "missing": "0", "min": "0", "max": "338325", "mean": "24", "stdev": "682" }, { "name": "sum_of_term_frequency-anchor", "index": "23", "type": "numeric", "distinct": "89", "missing": "0", "min": "0", "max": "277", "mean": "0", "stdev": "1" }, { "name": "sum_of_term_frequency-title", "index": "24", "type": "numeric", "distinct": "160", "missing": "0", "min": "0", "max": "1259", "mean": "1", "stdev": "3" }, { "name": "sum_of_term_frequency-url", "index": "25", "type": "numeric", "distinct": "20", "missing": "0", "min": "0", "max": "52", "mean": "1", "stdev": "1" }, { "name": "sum_of_term_frequency-whole_document", "index": "26", "type": "numeric", "distinct": "818", "missing": "0", "min": "0", "max": "338325", "mean": "26", "stdev": "682" }, { "name": "min_of_term_frequency-body", "index": "27", "type": "numeric", "distinct": "298", "missing": "0", "min": "0", "max": "4511", "mean": "5", "stdev": "14" }, { "name": 
"min_of_term_frequency-anchor", "index": "28", "type": "numeric", "distinct": "50", "missing": "0", "min": "0", "max": "86", "mean": "0", "stdev": "1" }, { "name": "min_of_term_frequency-title", "index": "29", "type": "numeric", "distinct": "46", "missing": "0", "min": "0", "max": "315", "mean": "0", "stdev": "1" }, { "name": "min_of_term_frequency-url", "index": "30", "type": "numeric", "distinct": "12", "missing": "0", "min": "0", "max": "42", "mean": "0", "stdev": "0" }, { "name": "min_of_term_frequency-whole_document", "index": "31", "type": "numeric", "distinct": "307", "missing": "0", "min": "0", "max": "4511", "mean": "6", "stdev": "15" }, { "name": "max_of_term_frequency-body", "index": "32", "type": "numeric", "distinct": "587", "missing": "0", "min": "0", "max": "4511", "mean": "15", "stdev": "25" }, { "name": "max_of_term_frequency-anchor", "index": "33", "type": "numeric", "distinct": "54", "missing": "0", "min": "0", "max": "94", "mean": "0", "stdev": "1" }, { "name": "max_of_term_frequency-title", "index": "34", "type": "numeric", "distinct": "129", "missing": "0", "min": "0", "max": "993", "mean": "1", "stdev": "2" }, { "name": "max_of_term_frequency-url", "index": "35", "type": "numeric", "distinct": "17", "missing": "0", "min": "0", "max": "42", "mean": "0", "stdev": "1" }, { "name": "max_of_term_frequency-whole_document", "index": "36", "type": "numeric", "distinct": "615", "missing": "0", "min": "0", "max": "4511", "mean": "16", "stdev": "26" }, { "name": "mean_of_term_frequency-body", "index": "37", "type": "numeric", "distinct": "2561", "missing": "0", "min": "0", "max": "4511", "mean": "10", "stdev": "17" }, { "name": "mean_of_term_frequency-anchor", "index": "38", "type": "numeric", "distinct": "223", "missing": "0", "min": "0", "max": "86", "mean": "0", "stdev": "1" }, { "name": "mean_of_term_frequency-title", "index": "39", "type": "numeric", "distinct": "324", "missing": "0", "min": "0", "max": "315", "mean": "1", "stdev": "1" }, { "name": 
"mean_of_term_frequency-url", "index": "40", "type": "numeric", "distinct": "80", "missing": "0", "min": "0", "max": "42", "mean": "0", "stdev": "0" }, { "name": "mean_of_term_frequency-whole_document", "index": "41", "type": "numeric", "distinct": "2597", "missing": "0", "min": "0", "max": "4511", "mean": "11", "stdev": "17" }, { "name": "var_of_term_frequency-body", "index": "42", "type": "numeric", "distinct": "21008", "missing": "0", "min": "0", "max": "2763906", "mean": "103", "stdev": "3107" }, { "name": "var_of_term_frequency-anchor", "index": "43", "type": "numeric", "distinct": "470", "missing": "0", "min": "0", "max": "2209", "mean": "0", "stdev": "4" }, { "name": "var_of_term_frequency-title", "index": "44", "type": "numeric", "distinct": "666", "missing": "0", "min": "0", "max": "161697", "mean": "1", "stdev": "232" }, { "name": "var_of_term_frequency-url", "index": "45", "type": "numeric", "distinct": "123", "missing": "0", "min": "0", "max": "121", "mean": "0", "stdev": "0" }, { "name": "var_of_term_frequency-whole_document", "index": "46", "type": "numeric", "distinct": "21951", "missing": "0", "min": "0", "max": "2783892", "mean": "112", "stdev": "3261" }, { "name": "sum_of_stream_length_normalized_term_frequency-body", "index": "47", "type": "numeric", "distinct": "89847", "missing": "0", "min": "0", "max": "60", "mean": "0", "stdev": "0" }, { "name": "sum_of_stream_length_normalized_term_frequency-anchor", "index": "48", "type": "numeric", "distinct": "1153", "missing": "0", "min": "0", "max": "3", "mean": "0", "stdev": "0" }, { "name": "sum_of_stream_length_normalized_term_frequency-title", "index": "49", "type": "numeric", "distinct": "1674", "missing": "0", "min": "0", "max": "5", "mean": "0", "stdev": "0" }, { "name": "sum_of_stream_length_normalized_term_frequency-url", "index": "50", "type": "numeric", "distinct": "300", "missing": "0", "min": "0", "max": "2", "mean": "0", "stdev": "0" }, { "name": 
"sum_of_stream_length_normalized_term_frequency-whole_document", "index": "51", "type": "numeric", "distinct": "94720", "missing": "0", "min": "0", "max": "60", "mean": "0", "stdev": "0" }, { "name": "min_of_stream_length_normalized_term_frequency-body", "index": "52", "type": "numeric", "distinct": "41428", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "min_of_stream_length_normalized_term_frequency-anchor", "index": "53", "type": "numeric", "distinct": "639", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "min_of_stream_length_normalized_term_frequency-title", "index": "54", "type": "numeric", "distinct": "852", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "min_of_stream_length_normalized_term_frequency-url", "index": "55", "type": "numeric", "distinct": "139", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "min_of_stream_length_normalized_term_frequency-whole_document", "index": "56", "type": "numeric", "distinct": "44050", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "max_of_stream_length_normalized_term_frequency-body", "index": "57", "type": "numeric", "distinct": "66961", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "max_of_stream_length_normalized_term_frequency-anchor", "index": "58", "type": "numeric", "distinct": "854", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "max_of_stream_length_normalized_term_frequency-title", "index": "59", "type": "numeric", "distinct": "1373", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "max_of_stream_length_normalized_term_frequency-url", "index": "60", "type": "numeric", "distinct": "196", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "max_of_stream_length_normalized_term_frequency-whole_document", "index": "61", "type": 
"numeric", "distinct": "69811", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "mean_of_stream_length_normalized_term_frequency-body", "index": "62", "type": "numeric", "distinct": "63890", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "mean_of_stream_length_normalized_term_frequency-anchor", "index": "63", "type": "numeric", "distinct": "1479", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "mean_of_stream_length_normalized_term_frequency-title", "index": "64", "type": "numeric", "distinct": "2125", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "mean_of_stream_length_normalized_term_frequency-url", "index": "65", "type": "numeric", "distinct": "540", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "mean_of_stream_length_normalized_term_frequency-whole_document", "index": "66", "type": "numeric", "distinct": "67228", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "var_of_stream_length_normalized_term_frequency-body", "index": "67", "type": "numeric", "distinct": "8774", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "var_of_stream_length_normalized_term_frequency-anchor", "index": "68", "type": "numeric", "distinct": "2319", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "var_of_stream_length_normalized_term_frequency-title", "index": "69", "type": "numeric", "distinct": "2946", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "var_of_stream_length_normalized_term_frequency-url", "index": "70", "type": "numeric", "distinct": "864", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "var_of_stream_length_normalized_term_frequency-whole_document", "index": "71", "type": "numeric", "distinct": "8887", "missing": "0", "min": "0", "max": "0", "mean": "0", 
"stdev": "0" }, { "name": "sum_of_tf*idf-body", "index": "72", "type": "numeric", "distinct": "619392", "missing": "0", "min": "-6809", "max": "368557", "mean": "82", "stdev": "774" }, { "name": "sum_of_tf*idf-anchor", "index": "73", "type": "numeric", "distinct": "34996", "missing": "0", "min": "0", "max": "2288", "mean": "3", "stdev": "11" }, { "name": "sum_of_tf*idf-title", "index": "74", "type": "numeric", "distinct": "67401", "missing": "0", "min": "0", "max": "6928", "mean": "11", "stdev": "23" }, { "name": "sum_of_tf*idf-url", "index": "75", "type": "numeric", "distinct": "30162", "missing": "0", "min": "-1", "max": "302", "mean": "5", "stdev": "7" }, { "name": "sum_of_tf*idf-whole_document", "index": "76", "type": "numeric", "distinct": "655452", "missing": "0", "min": "-6967", "max": "358987", "mean": "91", "stdev": "757" }, { "name": "min_of_tf*idf-body", "index": "77", "type": "numeric", "distinct": "108789", "missing": "0", "min": "-3405", "max": "8667", "mean": "22", "stdev": "57" }, { "name": "min_of_tf*idf-anchor", "index": "78", "type": "numeric", "distinct": "7878", "missing": "0", "min": "0", "max": "717", "mean": "1", "stdev": "5" }, { "name": "min_of_tf*idf-title", "index": "79", "type": "numeric", "distinct": "9462", "missing": "0", "min": "0", "max": "2059", "mean": "3", "stdev": "6" }, { "name": "min_of_tf*idf-url", "index": "80", "type": "numeric", "distinct": "4812", "missing": "0", "min": "-1", "max": "169", "mean": "2", "stdev": "4" }, { "name": "min_of_tf*idf-whole_document", "index": "81", "type": "numeric", "distinct": "116800", "missing": "0", "min": "-3483", "max": "8675", "mean": "25", "stdev": "60" }, { "name": "max_of_tf*idf-body", "index": "82", "type": "numeric", "distinct": "215802", "missing": "0", "min": "0", "max": "25963", "mean": "60", "stdev": "106" }, { "name": "max_of_tf*idf-anchor", "index": "83", "type": "numeric", "distinct": "15520", "missing": "0", "min": "0", "max": "771", "mean": "2", "stdev": "7" }, { "name": 
"max_of_tf*idf-title", "index": "84", "type": "numeric", "distinct": "19132", "missing": "0", "min": "0", "max": "6857", "mean": "8", "stdev": "18" }, { "name": "max_of_tf*idf-url", "index": "85", "type": "numeric", "distinct": "9531", "missing": "0", "min": "0", "max": "218", "mean": "4", "stdev": "5" }, { "name": "max_of_tf*idf-whole_document", "index": "86", "type": "numeric", "distinct": "228653", "missing": "0", "min": "0", "max": "26018", "mean": "67", "stdev": "110" }, { "name": "mean_of_tf*idf-body", "index": "87", "type": "numeric", "distinct": "625056", "missing": "0", "min": "-3405", "max": "13048", "mean": "39", "stdev": "68" }, { "name": "mean_of_tf*idf-anchor", "index": "88", "type": "numeric", "distinct": "40443", "missing": "0", "min": "0", "max": "717", "mean": "1", "stdev": "5" }, { "name": "mean_of_tf*idf-title", "index": "89", "type": "numeric", "distinct": "74892", "missing": "0", "min": "0", "max": "2580", "mean": "5", "stdev": "9" }, { "name": "mean_of_tf*idf-url", "index": "90", "type": "numeric", "distinct": "36212", "missing": "0", "min": "0", "max": "169", "mean": "3", "stdev": "4" }, { "name": "mean_of_tf*idf-whole_document", "index": "91", "type": "numeric", "distinct": "661356", "missing": "0", "min": "-3483", "max": "13075", "mean": "44", "stdev": "71" }, { "name": "var_of_tf*idf-body", "index": "92", "type": "numeric", "distinct": "583371", "missing": "0", "min": "0", "max": "166804770", "mean": "2039", "stdev": "158735" }, { "name": "var_of_tf*idf-anchor", "index": "93", "type": "numeric", "distinct": "38005", "missing": "0", "min": "0", "max": "148653", "mean": "6", "stdev": "231" }, { "name": "var_of_tf*idf-title", "index": "94", "type": "numeric", "distinct": "72885", "missing": "0", "min": "0", "max": "10343023", "mean": "62", "stdev": "11894" }, { "name": "var_of_tf*idf-url", "index": "95", "type": "numeric", "distinct": "34786", "missing": "0", "min": "0", "max": "4515", "mean": "4", "stdev": "14" }, { "name": 
"var_of_tf*idf-whole_document", "index": "96", "type": "numeric", "distinct": "616183", "missing": "0", "min": "0", "max": "167505292", "mean": "2241", "stdev": "160618" }, { "name": "boolean_model-body", "index": "97", "type": "numeric", "distinct": "2", "missing": "0", "min": "0", "max": "1", "mean": "1", "stdev": "0" }, { "name": "boolean_model-anchor", "index": "98", "type": "numeric", "distinct": "2", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "boolean_model-title", "index": "99", "type": "numeric", "distinct": "2", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "boolean_model-url", "index": "100", "type": "numeric", "distinct": "2", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "boolean_model-whole_document", "index": "101", "type": "numeric", "distinct": "2", "missing": "0", "min": "0", "max": "1", "mean": "1", "stdev": "0" }, { "name": "vector_space_model-body", "index": "102", "type": "numeric", "distinct": "276184", "missing": "0", "min": "0", "max": "1", "mean": "1", "stdev": "0" }, { "name": "vector_space_model-anchor", "index": "103", "type": "numeric", "distinct": "27447", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "vector_space_model-title", "index": "104", "type": "numeric", "distinct": "55303", "missing": "0", "min": "0", "max": "1", "mean": "1", "stdev": "0" }, { "name": "vector_space_model-url", "index": "105", "type": "numeric", "distinct": "31999", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "vector_space_model-whole_document", "index": "106", "type": "numeric", "distinct": "280515", "missing": "0", "min": "0", "max": "1", "mean": "1", "stdev": "0" }, { "name": "BM25-body", "index": "107", "type": "numeric", "distinct": "952619", "missing": "0", "min": "-15", "max": "684", "mean": "18", "stdev": "13" }, { "name": "BM25-anchor", "index": "108", "type": "numeric", "distinct": 
"80924", "missing": "0", "min": "0", "max": "147", "mean": "2", "stdev": "5" }, { "name": "BM25-title", "index": "109", "type": "numeric", "distinct": "249263", "missing": "0", "min": "0", "max": "285", "mean": "10", "stdev": "9" }, { "name": "BM25-url", "index": "110", "type": "numeric", "distinct": "115107", "missing": "0", "min": "-1", "max": "72", "mean": "5", "stdev": "7" }, { "name": "BM25-whole_document", "index": "111", "type": "numeric", "distinct": "1022469", "missing": "0", "min": "-15", "max": "685", "mean": "19", "stdev": "13" }, { "name": "LMIR.ABS-body", "index": "112", "type": "numeric", "distinct": "941872", "missing": "0", "min": "-193", "max": "0", "mean": "-12", "stdev": "10" }, { "name": "LMIR.ABS-anchor", "index": "113", "type": "numeric", "distinct": "99016", "missing": "0", "min": "-338", "max": "0", "mean": "-14", "stdev": "12" }, { "name": "LMIR.ABS-title", "index": "114", "type": "numeric", "distinct": "327100", "missing": "0", "min": "-345", "max": "0", "mean": "-12", "stdev": "11" }, { "name": "LMIR.ABS-url", "index": "115", "type": "numeric", "distinct": "163631", "missing": "0", "min": "-163", "max": "0", "mean": "-15", "stdev": "13" }, { "name": "LMIR.ABS-whole_document", "index": "116", "type": "numeric", "distinct": "1005749", "missing": "0", "min": "-192", "max": "0", "mean": "-12", "stdev": "10" }, { "name": "LMIR.DIR-body", "index": "117", "type": "numeric", "distinct": "945167", "missing": "0", "min": "-186", "max": "0", "mean": "-15", "stdev": "10" }, { "name": "LMIR.DIR-anchor", "index": "118", "type": "numeric", "distinct": "100884", "missing": "0", "min": "-360", "max": "0", "mean": "-18", "stdev": "11" }, { "name": "LMIR.DIR-title", "index": "119", "type": "numeric", "distinct": "305059", "missing": "0", "min": "-353", "max": "0", "mean": "-17", "stdev": "11" }, { "name": "LMIR.DIR-url", "index": "120", "type": "numeric", "distinct": "159639", "missing": "0", "min": "-161", "max": "0", "mean": "-19", "stdev": "12" }, { 
"name": "LMIR.DIR-whole_document", "index": "121", "type": "numeric", "distinct": "1011588", "missing": "0", "min": "-186", "max": "0", "mean": "-14", "stdev": "9" }, { "name": "LMIR.JM-body", "index": "122", "type": "numeric", "distinct": "899692", "missing": "0", "min": "-175", "max": "0", "mean": "-12", "stdev": "10" }, { "name": "LMIR.JM-anchor", "index": "123", "type": "numeric", "distinct": "78232", "missing": "0", "min": "-379", "max": "1", "mean": "-15", "stdev": "13" }, { "name": "LMIR.JM-title", "index": "124", "type": "numeric", "distinct": "242188", "missing": "0", "min": "-379", "max": "1", "mean": "-12", "stdev": "13" }, { "name": "LMIR.JM-url", "index": "125", "type": "numeric", "distinct": "130083", "missing": "0", "min": "-191", "max": "0", "mean": "-16", "stdev": "15" }, { "name": "LMIR.JM-whole_document", "index": "126", "type": "numeric", "distinct": "953189", "missing": "0", "min": "-176", "max": "0", "mean": "-12", "stdev": "10" }, { "name": "Number_of_slash_in_URL", "index": "127", "type": "numeric", "distinct": "27", "missing": "0", "min": "1", "max": "97", "mean": "3", "stdev": "1" }, { "name": "Length_of_URL", "index": "128", "type": "numeric", "distinct": "424", "missing": "0", "min": "4", "max": "1549", "mean": "43", "stdev": "22" }, { "name": "Inlink_number", "index": "129", "type": "numeric", "distinct": "29030", "missing": "0", "min": "-2083777989", "max": "314131554", "mean": "99948", "stdev": "6184944" }, { "name": "Outlink_number", "index": "130", "type": "numeric", "distinct": "139", "missing": "0", "min": "0", "max": "178", "mean": "4", "stdev": "9" }, { "name": "PageRank", "index": "131", "type": "numeric", "distinct": "65102", "missing": "0", "min": "100", "max": "65535", "mean": "19691", "stdev": "22412" }, { "name": "SiteRank", "index": "132", "type": "numeric", "distinct": "60686", "missing": "0", "min": "1", "max": "65535", "mean": "36024", "stdev": "21282" }, { "name": "QualityScore", "index": "133", "type": "numeric", 
"distinct": "254", "missing": "0", "min": "1", "max": "254", "mean": "17", "stdev": "30" }, { "name": "QualityScore2", "index": "134", "type": "numeric", "distinct": "255", "missing": "0", "min": "0", "max": "254", "mean": "24", "stdev": "41" }, { "name": "Query-url_click_count", "index": "135", "type": "numeric", "distinct": "5326", "missing": "0", "min": "0", "max": "13544625", "mean": "216", "stdev": "29617" }, { "name": "url_click_count", "index": "136", "type": "numeric", "distinct": "3745", "missing": "0", "min": "0", "max": "2789632", "mean": "454", "stdev": "18884" }, { "name": "url_dwell_time", "index": "137", "type": "numeric", "distinct": "91823", "missing": "0", "min": "0", "max": "980000001", "mean": "18149", "stdev": "3499186" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }