Dutch News Articles
This dataset contains all the articles published by the NOS since the 1st of January 2010. The data was obtained by scraping the NOS website. The NOS is one of the biggest (online) news organizations in the Netherlands.
Features:
datetime: date and time of publication of the article.
title: the title of the news article.
content: the content of the news article.
category: the category under which the NOS filed the article.
url: link to the original article.
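As a rough illustration of how these five features fit together, here is a minimal sketch that reads rows into a small record type. It assumes the data is stored as a CSV file with a header row using the feature names above, and that `datetime` is in an ISO 8601-like format; neither is confirmed here. The `Article` class, the `read_articles` helper and the sample row are hypothetical and only serve the example.

```python
import csv
import datetime as dt
import io
from dataclasses import dataclass


@dataclass
class Article:
    datetime: dt.datetime  # date and time of publication
    title: str             # title of the news article
    content: str           # body text of the news article
    category: str          # category under which the NOS filed the article
    url: str               # link to the original article


def read_articles(handle):
    """Yield one Article per CSV row (assumes a header row with the feature names)."""
    for row in csv.DictReader(handle):
        yield Article(
            # Assumption: timestamps are ISO 8601-like, e.g. "2010-01-01 12:00:00".
            datetime=dt.datetime.fromisoformat(row["datetime"]),
            title=row["title"],
            content=row["content"],
            category=row["category"],
            url=row["url"],
        )


# Placeholder row for illustration only; it is not taken from the dataset.
SAMPLE = (
    "datetime,title,content,category,url\n"
    "2010-01-01 12:00:00,Example title,Example content,"
    "Example category,https://example.org/article\n"
)

articles = list(read_articles(io.StringIO(SAMPLE)))
print(articles[0].title)
```

To read a real file, the same helper could be given an open file handle instead of the in-memory sample.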
About the data
The title and content features are somewhat cleaned: extra whitespace and newlines have been removed. Furthermore, these features are Unicode-normalized (NFKD). The NOS also publishes live blogs. The posts in these live blogs are not part of this dataset.
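The cleaning described above can be approximated with the Python standard library. This is only a sketch of that kind of cleaning, not the exact code used to build the dataset; the function name `clean_text` and the order of the two steps are assumptions.

```python
import re
import unicodedata


def clean_text(text: str) -> str:
    """Apply NFKD normalization, then collapse whitespace runs into single spaces."""
    # NFKD decomposes characters, e.g. a non-breaking space becomes a plain space
    # and an accented letter becomes a base letter plus a combining mark.
    text = unicodedata.normalize("NFKD", text)
    # Replace any run of whitespace (including newlines) with one space and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("  Hallo\n\n  wereld \u00a0 "))
```

Because NFKD splits accented characters into a base letter and a combining mark, text you compare against the title or content features should be normalized the same way first.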
Example
I used this dataset in a recent blog post.