OpenML
Human-Memory-and-Cognition

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 24-03-2022 by Dustin Carrion.
  • Computer Systems Machine Learning
Context

Models of human cognition hold that information processing occurs in a series of stages. Cognitive psychology, in particular, is concerned with the internal mental processes that begin with the appearance of an external stimulus and result in a behavioural response.

Content

Explore human cognitive processes around the generation of narratives, with a focus on the language employed in stories about events that have been experienced versus imagined. Investigate and characterize cognitive processes involved in storytelling, contrasting imagination and recollection of events with the help of Data Science. Build a machine learning model that would help you to categorize cognitive processes involved in storytelling: Imagined, Recalled, or Retold.

These are the columns in the data:

AssignmentId: Unique ID of this story
WorkTimeInSeconds: Time in seconds that it took the worker to do the entire HIT (reading instructions, story writing, questions)
WorkerId: Unique ID of the worker (random string, not MTurk worker ID)
annotatorAge: Lower limit of the age bucket of the worker. Buckets are: 18-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55+
annotatorGender: Gender of the worker
annotatorRace: Race/ethnicity of the worker
distracted: How distracted were you while writing your story? (5-point Likert)
draining: How taxing/draining was writing for you emotionally? (5-point Likert)
frequency: How often do you think about or talk about this event? (5-point Likert)
importance: How impactful, important, or personal is this story/event to you? (5-point Likert)
logTimeSinceEvent: Log of time (days) since the recalled event happened
mainEvent: Short phrase describing the main event described
memType: Type of story (recalled, imagined, retold). This is the target variable.
mostSurprising: Short phrase describing what the most surprising aspect of the story was
openness: Continuous variable representing the openness to experience of the worker
recAgnPairId: ID of the recalled story that corresponds to this retold story (null for imagined stories). Group on this variable to get the recalled-retold pairs.
recImgPairId: ID of the recalled story that corresponds to this imagined story (null for retold stories). Group on this variable to get the recalled-imagined pairs.
similarity: How similar to your life does this event/story feel to you? (5-point Likert)
similarityReason: Free text annotation of similarity
story: Story about the imagined or recalled event (15-25 sentences)
stressful: How stressful was this writing task? (5-point Likert)
summary: Summary of the events in the story (1-3 sentences)
timeSinceEvent: Time (number of days) since the recalled event happened

Likert scaling is a bipolar scaling method, measuring either positive or negative response to a statement.

Acknowledgements

Maarten Sap, Eric Horvitz, Yejin Choi, Noah A. Smith, and James Pennebaker (2020). Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models. ACL.

Inspiration

Explore the human cognitive process using machine learning.
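
The description suggests grouping on recImgPairId (or recAgnPairId) to recover story pairs. A minimal sketch of that grouping in plain Python follows. The rows are hypothetical and only the column names come from the description; it also assumes that a recalled story carries its own ID in recImgPairId, which the description does not state explicitly.

```python
from collections import defaultdict

# Hypothetical rows for illustration; real values must come from the dataset.
rows = [
    {"AssignmentId": "a1", "memType": "recalled", "recImgPairId": "a1"},
    {"AssignmentId": "a2", "memType": "imagined", "recImgPairId": "a1"},
    {"AssignmentId": "a3", "memType": "retold", "recImgPairId": None},
    {"AssignmentId": "a4", "memType": "recalled", "recImgPairId": "a4"},
]


def group_pairs(rows, pair_column):
    """Group rows by pair_column, skipping rows where the value is null."""
    groups = defaultdict(list)
    for row in rows:
        key = row.get(pair_column)
        if key is not None:
            groups[key].append(row)
    return dict(groups)


pairs = group_pairs(rows, "recImgPairId")

# Keep only groups that contain both a recalled and an imagined story.
complete = {
    pair_id: members
    for pair_id, members in pairs.items()
    if {m["memType"] for m in members} == {"recalled", "imagined"}
}

for pair_id, members in complete.items():
    print(pair_id, [m["AssignmentId"] for m in members])
```

Passing "recAgnPairId" and checking for the set {"recalled", "retold"} would give the recalled-retold pairs in the same way.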

23 features

AssignmentId (string): 6854 unique values, 0 missing
WorkTimeInSeconds (numeric): 3384 unique values, 0 missing
WorkerId (string): 3640 unique values, 0 missing
annotatorAge (numeric): 8 unique values, 23 missing
annotatorGender (string): 7 unique values, 0 missing
annotatorRace (string): 10 unique values, 0 missing
distracted (numeric): 5 unique values, 0 missing
draining (numeric): 5 unique values, 0 missing
frequency (numeric): 5 unique values, 2756 missing
importance (numeric): 5 unique values, 144 missing
logTimeSinceEvent (numeric): 76 unique values, 0 missing
mainEvent (string): 6322 unique values, 0 missing
memType (string): 3 unique values, 0 missing
mostSurprising (string): 6584 unique values, 0 missing
openness (numeric): 17 unique values, 0 missing
recAgnPairId (string): 1309 unique values, 4235 missing
recImgPairId (string): 2572 unique values, 1526 missing
similarity (numeric): 5 unique values, 4098 missing
similarityReason (string): 2512 unique values, 4141 missing
story (string): 6854 unique values, 0 missing
stressful (numeric): 5 unique values, 0 missing
summary (string): 2787 unique values, 1 missing
timeSinceEvent (numeric): 76 unique values, 0 missing
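
As a quick consistency check on the feature list above, the per-feature missing counts can be summed and expressed as a share of rows. This is only a sketch: the counts are copied from the list, the row count of 6854 is taken from the number of unique AssignmentId values, and the variable names are illustrative.

```python
# Row count, taken from the 6854 unique AssignmentId values.
ROWS = 6854

# Features with at least one missing value, copied from the feature list.
missing = {
    "annotatorAge": 23,
    "frequency": 2756,
    "importance": 144,
    "recAgnPairId": 4235,
    "recImgPairId": 1526,
    "similarity": 4098,
    "similarityReason": 4141,
    "summary": 1,
}

total_missing = sum(missing.values())
print("total missing cells:", total_missing)

# Share of rows with a missing value, per feature, largest first.
share = {name: 100 * count / ROWS for name, count in missing.items()}
for name in sorted(share, key=share.get, reverse=True):
    print(f"{name}: {share[name]:.2f}% of rows missing")
```

The features with the most gaps are the pair IDs and the similarity fields, so any model using them would need an explicit strategy for nulls.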

19 properties

Number of instances (rows) of the dataset: 6854
Number of attributes (columns) of the dataset: 23
Number of distinct values of the target attribute (if it is nominal): not listed
Number of missing values in the dataset: 16924
Number of instances with at least one value missing: 6854
Number of numeric attributes: 11
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 47.83
Percentage of instances belonging to the most frequent class: not listed
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: not listed
Percentage of instances belonging to the least frequent class: not listed
Number of instances belonging to the least frequent class: not listed
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 100
Average class difference between consecutive instances: not listed
Percentage of missing values: 10.74
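
The percentage properties above can be reproduced from the raw counts. The sketch below assumes that the percentage of missing values is computed over all cells (rows times columns), which matches the listed figure; the variable names are illustrative.

```python
# Counts as listed in the properties.
instances = 6854
attributes = 23
numeric_attributes = 11
missing_values = 16924

# Percentage of numeric attributes: numeric columns over all columns.
pct_numeric = round(100 * numeric_attributes / attributes, 2)

# Percentage of missing values: missing cells over all cells (assumed definition).
pct_missing = round(100 * missing_values / (instances * attributes), 2)

# Attributes divided by instances; this is small enough to display as 0.
dimensionality = attributes / instances

print(pct_numeric)   # 47.83
print(pct_missing)   # 10.74
print(round(dimensionality, 2))  # 0.0
```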

0 tasks
