Risk-of-being-drawn-into-online-sex-work-(cleaned)


Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 24-03-2022 by Dustin Carrion.
  • Computer Systems
  • Machine Learning

This dataset is the cleaned version of Panos Kostakos's "Risk of being drawn into online sex work" dataset.

Context
This database was used in the paper "Covert online ethnography and machine learning for detecting individuals at risk of being drawn into online sex work". https://www.flinders.edu.au/centre-crime-policy-research/illicit-networks-workshop

Content
The database includes data scraped from a European online adult forum. Using covert online ethnography, we interviewed a small number of participants and determined their risk of either supplying or demanding sex services through that forum. Because only those few participants carry a risk label, the dataset is well suited to semi-supervised learning.

Acknowledgements
The dataset was initially publicized by Panos Kostakos.

Inspiration
How can we identify individuals at risk of being drawn into online sex work? The spread of online social media enables a greater number of people to become involved in the online sex trade; however, detecting deviant behaviors online is limited by the low availability of data. To overcome this challenge, we combine covert online ethnography with semi-supervised learning using data from a popular European adult forum.

30 features

Feature                              Type     Unique values  Missing
User_ID                              numeric  28763          0
Female                               nominal  2              0
Age                                  numeric  526            0
Location                             string   18             0
Verification                         nominal  2              0
Heterosexual                         numeric  2              0
Homosexual                           numeric  2              0
bicurious                            numeric  2              0
bisexual                             numeric  2              0
Dominant                             numeric  2              0
Submisive                            numeric  2              0
Switch                               numeric  2              0
Men                                  numeric  2              0
Men_and_Women                        numeric  2              0
Nobody                               numeric  2              0
Nobody_but_maybe                     numeric  2              0
Women                                numeric  2              0
Points_Rank                          numeric  375            0
Last_login                           numeric  2360           0
Member_since_year                    numeric  9              0
Member_since_month                   numeric  12             0
Member_since_day                     numeric  31             0
Number_of_Comments_in_public_forum   numeric  217            0
Time_spent_chating_H:M               numeric  1199           0
Number_of_advertisments_posted       numeric  18             0
Number_of_offline_meetings_attended  numeric  16             0
Number_of_Friends                    numeric  49             0
Profile_pictures                     numeric  63             0
Friends_ID_list                      string   2742           25518
Risk                                 numeric  2              28741
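
The feature list above reports Risk as missing for 28741 of the 28831 rows, which leaves 90 labeled rows for a semi-supervised setup. A minimal sketch of separating the two groups follows. It assumes each row has already been loaded into a dictionary with a missing Risk stored as None; the helper name split_by_label and the toy rows are hypothetical placeholders, not records from the dataset.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def split_by_label(rows: List[Row], target: str = "Risk") -> Tuple[List[Row], List[Row]]:
    """Return (labeled, unlabeled); a row counts as unlabeled when target is None or absent."""
    labeled = [r for r in rows if r.get(target) is not None]
    unlabeled = [r for r in rows if r.get(target) is None]
    return labeled, unlabeled


# Counts taken from this page.
N_ROWS = 28831
RISK_MISSING = 28741
N_LABELED = N_ROWS - RISK_MISSING

# Made-up rows for illustration only; the real file has 30 columns per row.
toy_rows = [
    {"User_ID": 1, "Number_of_Friends": 3, "Risk": 0},
    {"User_ID": 2, "Number_of_Friends": 0, "Risk": None},
    {"User_ID": 3, "Number_of_Friends": 7, "Risk": 1},
    {"User_ID": 4, "Number_of_Friends": 1, "Risk": None},
]

labeled, unlabeled = split_by_label(toy_rows)
print(len(labeled), len(unlabeled))
print(f"labeled rows on this page: {N_LABELED} ({N_LABELED / N_ROWS:.2%})")
```

The labeled rows would train the initial model, and the unlabeled rows are what a semi-supervised method then tries to exploit.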

19 properties

Number of instances (rows) of the dataset: 28831
Number of attributes (columns) of the dataset: 30
Number of distinct values of the target attribute (if it is nominal): (not reported)
Number of missing values in the dataset: 54259
Number of instances with at least one value missing: 28779
Number of numeric attributes: 26
Number of nominal attributes: 2
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 86.67
Percentage of instances belonging to the most frequent class: (not reported)
Percentage of nominal attributes: 6.67
Number of instances belonging to the most frequent class: (not reported)
Percentage of instances belonging to the least frequent class: (not reported)
Number of instances belonging to the least frequent class: (not reported)
Number of binary attributes: 2
Percentage of binary attributes: 6.67
Percentage of instances having missing values: 99.82
Average class difference between consecutive instances: (not reported)
Percentage of missing values: 6.27
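
The percentage properties above follow from the raw counts on this page. As a quick consistency check, the sketch below recomputes them. The variable names are illustrative, and the denominator used for the missing-value percentage (rows times columns) is an assumption that happens to reproduce the reported figure.

```python
# Counts copied from this page.
n_rows = 28831
n_cols = 30
missing_per_feature = {"Friends_ID_list": 25518, "Risk": 28741}  # all other features report 0
rows_with_missing = 28779
n_numeric = 26
n_nominal = 2

total_missing = sum(missing_per_feature.values())

# Assumed definition: missing cells divided by all cells (rows * columns).
pct_missing = round(100 * total_missing / (n_rows * n_cols), 2)
pct_rows_missing = round(100 * rows_with_missing / n_rows, 2)
pct_numeric = round(100 * n_numeric / n_cols, 2)
pct_nominal = round(100 * n_nominal / n_cols, 2)

# The two remaining attributes are the string columns Location and Friends_ID_list.
n_string = n_cols - n_numeric - n_nominal

print(total_missing, pct_missing, pct_rows_missing, pct_numeric, pct_nominal, n_string)
```

The attribute-to-instance ratio is 30 / 28831, roughly 0.001, which matches the rounded 0 shown in the properties.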

0 tasks
