OpenML

Articles-From-Buzzfeed-2020

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 24-03-2022 by Elif Ceren Gok.

Context

This dataset was created by our in-house teams at PromptCloud (https://www.promptcloud.com/) and DataStock (https://datastock.shop/). We have about 5K samples in this dataset. You can download the full dataset at https://app.datastock.shop/?site_name=Articles_From_BuzzFeed_2020. We have a 30% discount on all datasets in our data repository. Feel free to head over to DataStock (https://datastock.shop/) and use the discount.

Content

This dataset contains the following:

Total Records Count: 14831
Domain Name: buzzfeed.com
Date Range: 1st Jan 2020 - 30th Apr 2020
File Extension: csv
Available Fields: Uniq Id, Crawl Timestamp, Title Headline, Short Description Sub Headline, Content Body, Author, Date And Time Of Posting, Image Urls

Acknowledgements

We wouldn't be here without the help of our web scraping and data mining experts at PromptCloud and DataStock.

Inspiration

The inspiration for this dataset came from Buzzfeed itself. We thought long and hard about the informative articles available on Buzzfeed and put together a dataset of them.

8 features

Uniq_Id	string	741 unique values 0 missing
Crawl_Timestamp	string	741 unique values 0 missing
Title__Headline	string	687 unique values 48 missing
Short_Description__Sub_Headline	string	689 unique values 45 missing
Content__Body	string	741 unique values 0 missing
Author	numeric	0 unique values 741 missing
Date_And_Time_Of_Posting	string	728 unique values 0 missing
Image_Urls	numeric	0 unique values 741 missing
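
The feature summary above can be turned into a quick completeness check before any modelling. The following is a minimal sketch, not part of the OpenML page itself: the counts are copied from the table, while the row count of 741 is an assumption inferred from Uniq_Id having 741 unique values and 0 missing. The helper names (missing_fraction, empty_columns, usable_columns) are hypothetical and chosen only for illustration.

```python
# Feature summary copied from the table above:
# (name, declared type, unique values, missing values).
FEATURES = [
    ("Uniq_Id", "string", 741, 0),
    ("Crawl_Timestamp", "string", 741, 0),
    ("Title__Headline", "string", 687, 48),
    ("Short_Description__Sub_Headline", "string", 689, 45),
    ("Content__Body", "string", 741, 0),
    ("Author", "numeric", 0, 741),
    ("Date_And_Time_Of_Posting", "string", 728, 0),
    ("Image_Urls", "numeric", 0, 741),
]

# Assumption: the uploaded sample has 741 rows, inferred from Uniq_Id
# (741 unique values, 0 missing). The page does not state this directly.
N_ROWS = 741


def missing_fraction(missing, n_rows=N_ROWS):
    """Share of rows in which a feature has no value."""
    return missing / n_rows


def empty_columns(features, n_rows=N_ROWS):
    """Names of features that are missing in every row."""
    return [name for name, _type, _unique, missing in features if missing >= n_rows]


def usable_columns(features, n_rows=N_ROWS):
    """Names of features that have at least one value."""
    return [name for name, _type, _unique, missing in features if missing < n_rows]


if __name__ == "__main__":
    for name, _type, _unique, missing in FEATURES:
        print(f"{name}: {missing_fraction(missing):.1%} missing")
    print("Entirely empty:", empty_columns(FEATURES))
```

Under that assumption, Author and Image_Urls carry no values at all in this sample, so an analysis would likely drop them and work with the remaining six string features.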