OpenML
AirBNB-analysis-Lisbon

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 24-03-2022 by Dustin Carrion.
  • Computer Systems Machine Learning
Dataset is from http://tomslee.net/airbnb-data-collection-get-the-data

  • room_id: A unique number identifying an Airbnb listing. The listing has a URL on the Airbnb web site of http://airbnb.com/rooms/room_id
  • host_id: A unique number identifying an Airbnb host. The host's page has a URL on the Airbnb web site of http://airbnb.com/users/show/host_id
  • room_type: One of "Entire home/apt", "Private room", or "Shared room".
  • borough: A subregion of the city or search area for which the survey is carried out. The borough is taken from a shapefile of the city that is obtained independently of the Airbnb web site. For some cities, there is no borough information; for others the borough may be a number. If you have better shapefiles for a city of interest, please send them to me.
  • neighborhood: As with borough, a subregion of the city or search area for which the survey is carried out. For cities that have both, a neighbourhood is smaller than a borough. For some cities there is no neighbourhood information.
  • reviews: The number of reviews that a listing has received. Airbnb has said that 70% of visits end up with a review, so the number of reviews can be used to estimate the number of visits. Such an estimate will not be reliable for an individual listing (especially as reviews occasionally vanish from the site), but over a city as a whole it should be a useful metric of traffic.
  • overall_satisfaction: The average rating (out of five) that the listing has received from those visitors who left a review.
  • accommodates: The number of guests a listing can accommodate.
  • bedrooms: The number of bedrooms a listing offers.
  • price: The price (in US dollars) for a night's stay. In early surveys, some values may have been recorded by month.
  • minstay: The minimum stay for a visit, as posted by the host.
  • latitude and longitude: The latitude and longitude of the listing as posted on the Airbnb site; this may be off by a few hundred metres. I do not have a way to track individual listing locations with
  • last_modified: The date and time that the values were read from the Airbnb web site.

The first line of the CSV file holds the column headings.

Here are the cities, the survey dates, and a link to download each zip file.

Aarhus survey dates: 2016-10-28 (2258 listings), 2016-11-26 (1900 listings), 2017-01-21 (2167 listings), 2017-02-21 (2295 listings), 2017-03-30 (2323 listings), 2017-04-18 (2398 listings), 2017-04-28 (2360 listings), 2017-05-15 (2437 listings), 2017-06-19 (2802 listings), 2017-07-28 (3142 listings)
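The URL patterns and the review-based traffic estimate described above can be sketched in a few lines of Python. This is only an illustration: the function names and the constant REVIEW_RATE are hypothetical, the "70" in the description is read here as 70%, and the sample review counts are made up rather than taken from the dataset.

```python
# Assumption: the "70" quoted in the description means 70% of visits
# end with a review. All names below are illustrative, not part of the dataset.
REVIEW_RATE = 0.70


def listing_url(room_id):
    """Build a listing URL from room_id, following the pattern in the description."""
    return f"http://airbnb.com/rooms/{room_id}"


def host_url(host_id):
    """Build a host page URL from host_id, following the pattern in the description."""
    return f"http://airbnb.com/users/show/{host_id}"


def estimate_visits(review_counts, review_rate=REVIEW_RATE):
    """Estimate total visits over many listings from their review counts.

    The description warns that this is not reliable for a single listing,
    so the function sums over a collection before dividing by the rate.
    """
    if not 0 < review_rate <= 1:
        raise ValueError("review_rate must be in (0, 1]")
    return sum(review_counts) / review_rate


if __name__ == "__main__":
    made_up_counts = [7, 0, 14]  # placeholder values, not real data
    print(listing_url(12345))
    print(estimate_visits(made_up_counts))
```

Dividing the summed reviews by the rate, rather than estimating per listing, follows the description's caveat that the figure is only meaningful over a city as a whole.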

20 features

  • room_id (numeric): 13578 unique values, 0 missing
  • survey_id (numeric): 1 unique value, 0 missing
  • host_id (numeric): 6457 unique values, 0 missing
  • room_type (string): 3 unique values, 0 missing
  • country (numeric): 0 unique values, 13578 missing
  • city (string): 1 unique value, 0 missing
  • borough (numeric): 0 unique values, 13578 missing
  • neighborhood (string): 24 unique values, 0 missing
  • reviews (numeric): 276 unique values, 0 missing
  • overall_satisfaction (numeric): 9 unique values, 0 missing
  • accommodates (numeric): 16 unique values, 0 missing
  • bedrooms (numeric): 11 unique values, 0 missing
  • bathrooms (numeric): 0 unique values, 13578 missing
  • price (numeric): 293 unique values, 0 missing
  • minstay (numeric): 0 unique values, 13578 missing
  • name (string): 13343 unique values, 35 missing
  • last_modified (string): 13578 unique values, 0 missing
  • latitude (numeric): 11079 unique values, 0 missing
  • longitude (numeric): 11745 unique values, 0 missing
  • location (string): 13578 unique values, 0 missing
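Four of the features listed above (country, borough, bathrooms, minstay) are missing in every row, so a common first step is to drop them. The following is a minimal sketch using only the standard library. The helper names and the two sample rows are hypothetical placeholders, and treating "?" as a missing marker assumes the ARFF convention for missing values.

```python
# Tokens treated as "missing"; "?" is the ARFF marker for a missing value.
MISSING_TOKENS = {None, "", "?"}


def fully_missing_columns(rows):
    """Return the columns whose value is missing in every row (order preserved)."""
    if not rows:
        return []
    columns = list(rows[0].keys())
    return [
        col for col in columns
        if all(row.get(col) in MISSING_TOKENS for row in rows)
    ]


def drop_columns(rows, columns):
    """Return copies of the rows without the given columns."""
    to_drop = set(columns)
    return [
        {key: value for key, value in row.items() if key not in to_drop}
        for row in rows
    ]


if __name__ == "__main__":
    # Made-up rows with a subset of the feature names, for illustration only.
    sample_rows = [
        {"room_id": 1, "country": None, "bathrooms": "?", "name": "Flat A"},
        {"room_id": 2, "country": None, "bathrooms": "?", "name": None},
    ]
    empty = fully_missing_columns(sample_rows)
    print(empty)
    print(drop_columns(sample_rows, empty))
```

The name column is kept in this sketch because it is only partly missing (35 of 13578 rows in the list above), which usually calls for per-row handling instead of dropping the feature.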

19 properties

  • Number of instances (rows) of the dataset: 13578
  • Number of attributes (columns) of the dataset: 20
  • Number of distinct values of the target attribute (if it is nominal): (no value)
  • Number of missing values in the dataset: 54347
  • Number of instances with at least one value missing: 13578
  • Number of numeric attributes: 14
  • Number of nominal attributes: 0
  • Number of attributes divided by the number of instances: 0
  • Percentage of numeric attributes: 70
  • Percentage of instances belonging to the most frequent class: (no value)
  • Percentage of nominal attributes: 0
  • Number of instances belonging to the most frequent class: (no value)
  • Percentage of instances belonging to the least frequent class: (no value)
  • Number of instances belonging to the least frequent class: (no value)
  • Number of binary attributes: 0
  • Percentage of binary attributes: 0
  • Percentage of instances having missing values: 100
  • Average class difference between consecutive instances: (no value)
  • Percentage of missing values: 20.01
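The summary figures above can be cross-checked against the per-feature missing counts with simple arithmetic. The sketch below uses only numbers quoted on this page; the variable names are illustrative.

```python
# Figures quoted on this page.
n_rows = 13578
n_cols = 20
n_numeric = 14

# Per-feature missing counts for the features that have any missing values.
missing_per_feature = {
    "country": 13578,
    "borough": 13578,
    "bathrooms": 13578,
    "minstay": 13578,
    "name": 35,
}

# Total missing cells across the whole table.
total_missing = sum(missing_per_feature.values())

# Share of all cells (rows x columns) that are missing, in percent.
pct_missing = 100 * total_missing / (n_rows * n_cols)

# Share of attributes that are numeric, in percent.
pct_numeric = 100 * n_numeric / n_cols

# Attributes per instance; very small here, consistent with the page showing 0.
attrs_per_instance = n_cols / n_rows

print(total_missing)          # 54347
print(round(pct_missing, 2))  # 20.01
print(pct_numeric)            # 70.0
```

Because four features are empty in every row, every instance has at least one missing value, which matches the reported 100% of instances having missing values.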

0 tasks
