eye_movements

active ARFF Publicly available Visibility: public Uploaded 10-07-2022 by Leo Grin
  • Computer Systems Machine Learning
Dataset used in the tabular data benchmark https://github.com/LeoGrin/tabular-benchmark, transformed in the same way. This dataset belongs to the "classification on categorical and numerical features" benchmark.

Original description:

Author:
Source: Unknown - Date unknown
Please cite: Jarkko Salojarvi, Kai Puolamaki, Jaana Simola, Lauri Kovanen, Ilpo Kojo, Samuel Kaski. Inferring Relevance from Eye Movements: Feature Extraction. Helsinki University of Technology, Publications in Computer and Information Science, Report A82. 3 March 2005. Data set at http://www.cis.hut.fi/eyechallenge2005/

Competition 1 (preprocessed data)

A straightforward classification task. We provide pre-computed feature vectors for each word in the eye movement trajectory, with class labels.

The dataset consists of several assignments. Each assignment consists of a question followed by ten sentences (titles of news articles). One of the sentences is the correct answer to the question (C), five are irrelevant to the question (I), and four are relevant to the question but do not answer it (R).

* Features are in columns, feature vectors in rows.
* Each assignment is a time sequence of 22-dimensional feature vectors.
* The first column is the line number, the second is the assignment number, and the next 22 columns (3 to 24) are the different features. Columns 25 to 27 contain extra information about the example. The training data set contains the classification label in the 28th column: "0" for irrelevant, "1" for relevant and "2" for the correct answer.
* Each example (row) represents a single word. You are asked to return the classification of each read sentence.
* The 22 features provided are commonly used in psychological studies on eye movement. Not all of them are necessarily relevant in this context.

The objective of the Challenge is to predict the classification labels (I, R, C).
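The description above says that rows are words while the requested output is one class per read sentence. One simple way to bridge the two is a majority vote over the word-level predictions of each (assignment, title) pair. This is only a sketch: the voting rule, the tie-breaking rule, the function name `sentence_predictions` and the sample rows are assumptions made for illustration, not the method used in the challenge or the technical report.

```python
from collections import Counter, defaultdict


def sentence_predictions(rows):
    """Collapse word-level predictions into one label per sentence.

    rows: iterable of (assignment_no, title_no, word_level_prediction).
    Returns a dict mapping (assignment_no, title_no) to the majority label.
    Ties are broken in favour of the smallest label (an arbitrary choice).
    """
    votes = defaultdict(Counter)
    for assignment_no, title_no, prediction in rows:
        votes[(assignment_no, title_no)][prediction] += 1
    return {
        key: min(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for key, counts in votes.items()
    }


# Made-up word-level predictions: 0 = irrelevant, 1 = relevant, 2 = correct.
word_predictions = [
    (1, 1, 0), (1, 1, 0), (1, 1, 1),  # sentence (1, 1): two votes for 0
    (1, 2, 2), (1, 2, 2),             # sentence (1, 2): unanimous 2
    (2, 1, 1), (2, 1, 0),             # sentence (2, 1): tie, resolved to 0
]
result = sentence_predictions(word_predictions)
```

Other aggregation rules (for example averaging class probabilities per sentence) would fit the same grouping key.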
Please see the technical report for information on eye movements, the experimental setup, baseline methods and references: Jarkko Salojarvi, Kai Puolamaki, Jaana Simola, Lauri Kovanen, Ilpo Kojo, Samuel Kaski. Inferring Relevance from Eye Movements: Feature Extraction. Helsinki University of Technology, Publications in Computer and Information Science, Report A82. 3 March 2005.

Modified by TunedIT (converted to ARFF format)

FEATURES

The values in columns marked with an asterisk (*) are the same for all occurrences of the word.

COL  NAME             DESCRIPTION
1    #line            Line number
2    #assg            Assignment number
3    fixcount         Number of fixations to the word
4*   firstPassCnt     Number of fixations to the word when it is first encountered
5*   P1stFixation     '1' if fixation occurred when the sentence the word was in was encountered the first time
6*   P2stFixation     '1' if fixation occurred when the sentence the word was in was encountered the second time
7*   prevFixDur       Duration of previous fixation
8*   firstfixDur      Duration of the first fixation when the word is first encountered
9*   firstPassFixDur  Sum of durations of fixations when the word is first encountered
10*  nextFixDur       Duration of the next fixation when gaze initially moves from the word
11   firstSaccLen     Length of the first saccade
12   lastSaccLen      Distance between fixation on the word and the next fixation
13   prevFixPos       Distance between the first fixation preceding the word and the beginning of the word
14   landingPos       Distance between the first fixation on the word and the beginning of the word
15   leavingPos       Distance between the last fixation on the word and the beginning of the word
16   totalFixDur      Sum of all durations of fixations to the word
17   meanFixDur       Mean duration of the fixations to the word
18*  nRegressFrom     Number of regressions leaving from the word
19*  regressLen       Sum of durations of regressions initiating from this word
20*  nextWordRegress  '1' if a regression initiated from the following word
21*  regressDur       Sum of durations of the fixations on the word during regression
22   pupilDiamMax     Maximum pupil diameter
23   pupilDiamLag     Maximum pupil diameter 0.5 - 1.5 seconds after the beginning of fixation
24   timePrtctg       First fixation duration divided by the total number of fixations
25   nWordsInTitle    Number of words in the sentence (title) this word is in
26   titleNo          Title number
27   wordNo           Word number (ordinal) in this title
28   label            Classification for training data ('0'=irrelevant, '1'=relevant, '2'=correct)
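As a rough illustration of the original 28-column layout described above, the sketch below splits one row into identifiers (columns 1 to 2), the 22 features (columns 3 to 24), extra information (columns 25 to 27) and the label (column 28). The column names follow the feature list, with the leading "#" dropped from the first two. The helper `split_row` and the placeholder row are assumptions for illustration only. The version of the dataset on this page has 24 attributes, so its layout differs from the original one.

```python
# Column names of the original challenge file, in order (28 columns).
COLUMNS = [
    "line", "assg", "fixcount", "firstPassCnt", "P1stFixation",
    "P2stFixation", "prevFixDur", "firstfixDur", "firstPassFixDur",
    "nextFixDur", "firstSaccLen", "lastSaccLen", "prevFixPos", "landingPos",
    "leavingPos", "totalFixDur", "meanFixDur", "nRegressFrom", "regressLen",
    "nextWordRegress", "regressDur", "pupilDiamMax", "pupilDiamLag",
    "timePrtctg", "nWordsInTitle", "titleNo", "wordNo", "label",
]


def split_row(values):
    """Split one 28-value row into ids, features, extra info and label."""
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    row = dict(zip(COLUMNS, values))
    return {
        "ids": {name: row[name] for name in COLUMNS[:2]},         # columns 1-2
        "features": {name: row[name] for name in COLUMNS[2:24]},  # columns 3-24
        "extra": {name: row[name] for name in COLUMNS[24:27]},    # columns 25-27
        "label": row["label"],                                    # column 28
    }


# Placeholder row: each value is simply its own 1-based column number.
example = split_row(list(range(1, 29)))
```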

24 features

NAME             TYPE     UNIQUE VALUES  MISSING
label (target)   nominal  2              0
lineNo           numeric  7608           0
assgNo           numeric  331            0
P1stFixation     nominal  2              0
P2stFixation     nominal  2              0
prevFixDur       numeric  58             0
firstfixDur      numeric  59             0
firstPassFixDur  numeric  94             0
nextFixDur       numeric  62             0
firstSaccLen     numeric  6792           0
lastSaccLen      numeric  6977           0
prevFixPos       numeric  5911           0
landingPos       numeric  5390           0
leavingPos       numeric  5458           0
totalFixDur      numeric  105            0
meanFixDur       numeric  166            0
regressLen       numeric  431            0
nextWordRegress  nominal  2              0
regressDur       numeric  249            0
pupilDiamMax     numeric  3058           0
pupilDiamLag     numeric  2158           0
timePrtctg       numeric  843            0
titleNo          numeric  10             0
wordNo           numeric  10             0

19 properties

VALUE  PROPERTY
7608   Number of instances (rows) of the dataset.
24     Number of attributes (columns) of the dataset.
2      Number of distinct values of the target attribute (if it is nominal).
0      Number of missing values in the dataset.
0      Number of instances with at least one value missing.
20     Number of numeric attributes.
4      Number of nominal attributes.
3804   Number of instances belonging to the most frequent class.
50     Percentage of instances belonging to the least frequent class.
3804   Number of instances belonging to the least frequent class.
4      Number of binary attributes.
16.67  Percentage of binary attributes.
0      Percentage of instances having missing values.
1      Average class difference between consecutive instances.
0      Percentage of missing values.
0      Number of attributes divided by the number of instances.
83.33  Percentage of numeric attributes.
50     Percentage of instances belonging to the most frequent class.
16.67  Percentage of nominal attributes.
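Several of the properties above follow directly from the feature list and the class counts. The sketch below recomputes them from the numbers given on this page. The variable names are illustrative, and the rounding to two decimals is an assumption about how the page displays its values.

```python
# Attribute names grouped by type, as listed in the feature table on this page.
NOMINAL = ["label", "P1stFixation", "P2stFixation", "nextWordRegress"]
NUMERIC = [
    "lineNo", "assgNo", "prevFixDur", "firstfixDur", "firstPassFixDur",
    "nextFixDur", "firstSaccLen", "lastSaccLen", "prevFixPos", "landingPos",
    "leavingPos", "totalFixDur", "meanFixDur", "regressLen", "regressDur",
    "pupilDiamMax", "pupilDiamLag", "timePrtctg", "titleNo", "wordNo",
]

N_INSTANCES = 7608      # rows, from the properties list
MAJORITY_COUNT = 3804   # instances in the most frequent class

n_attributes = len(NOMINAL) + len(NUMERIC)
pct_nominal = round(100 * len(NOMINAL) / n_attributes, 2)
pct_numeric = round(100 * len(NUMERIC) / n_attributes, 2)
pct_majority = 100 * MAJORITY_COUNT / N_INSTANCES
# Very small ratio, which is why it appears as 0 in the properties list.
attributes_per_instance = n_attributes / N_INSTANCES

print(n_attributes, pct_nominal, pct_numeric, pct_majority)
```

Because both classes contain 3804 instances, the most frequent and least frequent class percentages are both 50.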

2 tasks

1 run - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: label
0 runs - estimation_procedure: 4-fold Crossvalidation - target_feature: label
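To make the task settings above concrete, here is a minimal standard-library sketch of k-fold cross-validation scored by predictive accuracy (the fraction of correctly predicted labels). The majority-class baseline, the shuffling scheme, the helper names and the synthetic labels are all assumptions for illustration. The splits actually used by a task on this site are defined by the task itself and are not reproduced here.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(labels, k=10, seed=0):
    """Mean accuracy over k folds of a majority-class baseline."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Training labels are all labels outside the held-out fold.
        train_labels = [
            labels[j] for f, fold in enumerate(folds) if f != i for j in fold
        ]
        majority = Counter(train_labels).most_common(1)[0][0]
        correct = sum(labels[j] == majority for j in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Synthetic labels: 60 of class 0 and 40 of class 1 (not the real data).
labels = [0] * 60 + [1] * 40
folds = k_fold_indices(len(labels), 10)
accuracy = cross_val_accuracy(labels, k=10)
```

On the balanced dataset described on this page, a majority-class baseline would be expected to score close to 50% accuracy, which makes it a useful floor when reading reported results.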