{ "data_id": "44157", "name": "eye_movements", "exact_name": "eye_movements", "version": 8, "version_label": null, "description": "Dataset used in the tabular data benchmark https:\/\/github.com\/LeoGrin\/tabular-benchmark, \n transformed in the same way. This dataset belongs to the \"classification on categorical and\n numerical features\" benchmark. Original description: \n \n**Author**: \n**Source**: Unknown - Date unknown \n**Please cite**: \n\nJarkko Salojarvi, Kai Puolamaki, Jaana Simola, Lauri Kovanen, Ilpo Kojo, Samuel Kaski. Inferring Relevance from Eye Movements: Feature Extraction. Helsinki University of Technology, Publications in Computer and Information Science, Report A82. 3 March 2005. Data set at http:\/\/www.cis.hut.fi\/eyechallenge2005\/\n\nCompetition 1 (preprocessed data)\nA straightforward classification task. We provide pre-computed feature vectors for each word in the eye movement trajectory, with class labels.\n\nThe dataset consists of several assignments. Each assignment consists of a question followed by ten sentences (titles of news articles). One of the sentences is the correct answer to the question (C) and five of the sentences are irrelevant to the question (I). Four of the sentences are relevant to the question (R), but they do not answer it.\n\n\n* Features are in columns, feature vectors in rows.\n* Each assignment is a time sequence of 22-dimensional feature vectors.\n* The first column is the line number, second the assignment number and the next 22 columns (3 to 24) are the different features. Columns 25 to 27 contain extra information about the example. The training data set contains the classification label in the 28th column: \"0\" for irrelevant, \"1\" for relevant and \"2\" for the correct answer.\n* Each example (row) represents a single word. You are asked to return the classification of each read sentence.\n* The 22 features provided are commonly used in psychological studies on eye movement. 
Not all of them are necessarily relevant in this context.\n\nThe objective of the Challenge is to predict the classification labels (I, R, C).\n\n\n\nPlease see the technical report for information on eye movements, experimental setup, baseline methods and references:\n\nJarkko Salojarvi, Kai Puolamaki, Jaana Simola, Lauri Kovanen, Ilpo Kojo, Samuel Kaski. Inferring Relevance from Eye Movements: Feature Extraction. Helsinki University of Technology, Publications in Computer and Information Science, Report A82. 3 March 2005. [PDF]\n\n\n\nModified by TunedIT (converted to ARFF format)\n\n\nFEATURES\n\nThe values in columns marked with an asterisk (*) are the same for all occurrences of the word.\n\nCOL\tNAME\t\tDESCRIPTION\n1\t#line\t\tLine number\n2\t#assg\t\tAssignment Number\n3\tfixcount\tNumber of fixations to the word\n4*\tfirstPassCnt\tNumber of fixations to the word when it is first encountered\n5*\tP1stFixation\t'1' if fixation occurred when the sentence the word was in was encountered the first time\n6*\tP2stFixation\t'1' if fixation occurred when the sentence the word was in was encountered the second time\n7*\tprevFixDur\tDuration of previous fixation\n8*\tfirstfixDur\tDuration of the first fixation when the word is first encountered\n9*\tfirstPassFixDur\tSum of durations of fixations when the word is first encountered\n10*\tnextFixDur\tDuration of the next fixation when gaze initially moves from the word\n11\tfirstSaccLen\tLength of the first saccade\n12\tlastSaccLen\tDistance between fixation on the word and the next fixation\n13\tprevFixPos\tDistance between the first fixation preceding the word and the beginning of the word\n14\tlandingPos\tDistance between the first fixation on the word and the beginning of the word\n15\tleavingPos\tDistance between the last fixation on the word and the beginning of the word\n16\ttotalFixDur\tSum of all durations of fixations to the word\n17\tmeanFixDur\tMean duration of the fixations to the word\n18*\tnRegressFrom\tNumber 
of regressions leaving from the word\n19*\tregressLen\tSum of durations of regressions initiating from this word\n20*\tnextWordRegress\t'1' if a regression initiated from the following word\n21*\tregressDur\tSum of durations of the fixations on the word during regression\n22\tpupilDiamMax\tMaximum pupil diameter\n23\tpupilDiamLag\tMaximum pupil diameter 0.5 - 1.5 seconds after the beginning of fixation\n24\ttimePrtctg\tFirst fixation duration divided by the total number of fixations\n25\tnWordsInTitle\tNumber of words in the sentence (title) this word is in\n26\ttitleNo\t\tTitle number\n27\twordNo\t\tWord number (ordinal) in this title\n28\tlabel\t\tClassification for training data ('0'=irrelevant, '1'=relevant, '2'=correct)", "format": "arff", "uploader": "Leo Grin", "uploader_id": 26324, "visibility": "public", "creator": "\"Jarkko Salojarvi\",\"Kai Puolamaki\",\"Jaana Simola\",\"Lauri Kovanen\",\"Ilpo Kojo\",\"Samuel Kaski\"", "contributor": "\"Leo Grin\"", "date": "2022-07-10 10:35:06", "update_comment": null, "last_update": "2022-07-10 10:35:06", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/api.openml.org\/data\/download\/22103282\/dataset", "default_target_attribute": "label", "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "eye_movements", "Dataset used in the tabular data benchmark https:\/\/github.com\/LeoGrin\/tabular-benchmark, transformed in the same way. This dataset belongs to the \"classification on categorical and numerical features\" benchmark. Original description: Jarkko Salojarvi, Kai Puolamaki, Jaana Simola, Lauri Kovanen, Ilpo Kojo, Samuel Kaski. Inferring Relevance from Eye Movements: Feature Extraction. Helsinki University of Technology, Publications in Computer and Information Science, Report A82. 3 March 2005. 
Data set " ], "weight": 5 }, "qualities": { "NumberOfInstances": 7608, "NumberOfFeatures": 24, "NumberOfClasses": 2, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 20, "NumberOfSymbolicFeatures": 4, "MajorityClassSize": 3804, "MinorityClassPercentage": 50, "MinorityClassSize": 3804, "NumberOfBinaryFeatures": 4, "PercentageOfBinaryFeatures": 16.666666666666664, "PercentageOfInstancesWithMissingValues": 0, "AutoCorrelation": 0.9998685421322466, "PercentageOfMissingValues": 0, "Dimensionality": 0.0031545741324921135, "PercentageOfNumericFeatures": 83.33333333333334, "MajorityClassPercentage": 50, "PercentageOfSymbolicFeatures": 16.666666666666664 }, "tags": [ { "uploader": "38960", "tag": "Computer Systems" }, { "uploader": "38960", "tag": "Machine Learning" } ], "features": [ { "name": "label", "index": "23", "type": "nominal", "distinct": "2", "missing": "0", "target": "1", "distr": [ [ "0", "1" ], [ [ "3804", "0" ], [ "0", "3804" ] ] ] }, { "name": "lineNo", "index": "0", "type": "numeric", "distinct": "7608", "missing": "0", "min": "1", "max": "10927", "mean": "5424", "stdev": "3163" }, { "name": "assgNo", "index": "1", "type": "numeric", "distinct": "331", "missing": "0", "min": "1", "max": "336", "mean": "167", "stdev": "97" }, { "name": "P1stFixation", "index": "2", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "0", "1" ], [ [ "752", "712" ], [ "3052", "3092" ] ] ] }, { "name": "P2stFixation", "index": "3", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "0", "1" ], [ [ "2374", "2534" ], [ "1430", "1270" ] ] ] }, { "name": "prevFixDur", "index": "4", "type": "numeric", "distinct": "58", "missing": "0", "min": "0", "max": "1036", "mean": "157", "stdev": "82" }, { "name": "firstfixDur", "index": "5", "type": "numeric", "distinct": "59", "missing": "0", "min": "20", "max": "777", "mean": "166", "stdev": "74" }, { "name": "firstPassFixDur", "index": "6", "type": "numeric", 
"distinct": "94", "missing": "0", "min": "20", "max": "1392", "mean": "191", "stdev": "101" }, { "name": "nextFixDur", "index": "7", "type": "numeric", "distinct": "62", "missing": "0", "min": "0", "max": "1133", "mean": "167", "stdev": "75" }, { "name": "firstSaccLen", "index": "8", "type": "numeric", "distinct": "6792", "missing": "0", "min": "0", "max": "1686", "mean": "230", "stdev": "198" }, { "name": "lastSaccLen", "index": "9", "type": "numeric", "distinct": "6977", "missing": "0", "min": "0", "max": "1924", "mean": "241", "stdev": "202" }, { "name": "prevFixPos", "index": "10", "type": "numeric", "distinct": "5911", "missing": "0", "min": "0", "max": "1070", "mean": "214", "stdev": "191" }, { "name": "landingPos", "index": "11", "type": "numeric", "distinct": "5390", "missing": "0", "min": "1", "max": "1349", "mean": "75", "stdev": "93" }, { "name": "leavingPos", "index": "12", "type": "numeric", "distinct": "5458", "missing": "0", "min": "1", "max": "1344", "mean": "77", "stdev": "93" }, { "name": "totalFixDur", "index": "13", "type": "numeric", "distinct": "105", "missing": "0", "min": "20", "max": "1392", "mean": "193", "stdev": "103" }, { "name": "meanFixDur", "index": "14", "type": "numeric", "distinct": "166", "missing": "0", "min": "20", "max": "757", "mean": "166", "stdev": "73" }, { "name": "regressLen", "index": "15", "type": "numeric", "distinct": "431", "missing": "0", "min": "0", "max": "24987", "mean": "416", "stdev": "1623" }, { "name": "nextWordRegress", "index": "16", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "0", "1" ], [ [ "3044", "3329" ], [ "760", "475" ] ] ] }, { "name": "regressDur", "index": "17", "type": "numeric", "distinct": "249", "missing": "0", "min": "0", "max": "11140", "mean": "198", "stdev": "595" }, { "name": "pupilDiamMax", "index": "18", "type": "numeric", "distinct": "3058", "missing": "0", "min": "-4", "max": "4", "mean": "0", "stdev": "0" }, { "name": "pupilDiamLag", "index": "19", "type": 
"numeric", "distinct": "2158", "missing": "0", "min": "-1", "max": "4", "mean": "0", "stdev": "0" }, { "name": "timePrtctg", "index": "20", "type": "numeric", "distinct": "843", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "titleNo", "index": "21", "type": "numeric", "distinct": "10", "missing": "0", "min": "1", "max": "10", "mean": "5", "stdev": "3" }, { "name": "wordNo", "index": "22", "type": "numeric", "distinct": "10", "missing": "0", "min": "1", "max": "10", "mean": "3", "stdev": "2" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }