OpenML
Boston-Airbnb-Listings

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 24-03-2022 by Elif Ceren Gok.

Context
Since 2008, guests and hosts have used Airbnb to travel in a more unique, personalized way. As part of the Airbnb Inside initiative, this dataset describes the listing activity of homestays in Boston, MA.

Content
This data file includes all the information needed about the listing details, the host, and geographical availability, along with the metrics necessary to make predictions and draw conclusions. Basic data cleaning has been done, such as dropping redundant features (e.g., city) and converting amenities into a dictionary. The data includes both numerical and categorical features, as well as natural-language descriptions.

Acknowledgements
This dataset is part of Airbnb Inside, and the original source can be found here.

Inspiration
Listing visualization
What features drive the price of a listing up?
What can we learn about different hosts and areas?
What can we learn from predictions (e.g., locations, prices, reviews)?
Which hosts are the busiest, and why?
Is there any noticeable difference in traffic among different areas, and what could be the reason for it?
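The description says amenities were converted into a dictionary, and the feature list stores `amenities_dict` as a string. The sketch below shows one way such a cell might be turned back into a Python dict. It assumes the cell holds either a JSON object or a Python-style dict literal; the page does not show the actual format, and the example value and the `parse_amenities` helper are hypothetical.

```python
import ast
import json


def parse_amenities(cell):
    """Parse one amenities_dict cell into a dict; return {} if parsing fails.

    Assumption: the cell is a JSON object or a Python dict literal.
    """
    if cell is None:
        return {}
    text = str(cell).strip()
    if not text:
        return {}
    # Try strict JSON first, then a Python literal (single quotes, True/False).
    for parser in (json.loads, ast.literal_eval):
        try:
            value = parser(text)
        except (ValueError, SyntaxError):
            continue
        if isinstance(value, dict):
            return value
    return {}


# Hypothetical cell contents, invented for illustration only.
example = "{'Wifi': True, 'Kitchen': True, 'TV': False}"
amenities = parse_amenities(example)
available = sorted(name for name, present in amenities.items() if present)
print(available)
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for text read from a data file.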

51 features

id: numeric, 3845 unique values, 0 missing
name: string, 3679 unique values, 3 missing
summary: string, 2721 unique values, 80 missing
access: string, 1492 unique values, 1637 missing
interaction: string, 1404 unique values, 1257 missing
house_rules: string, 1530 unique values, 990 missing
host_id: numeric, 1331 unique values, 0 missing
host_since: string, 1080 unique values, 0 missing
host_location: string, 123 unique values, 3 missing
host_response_time: string, 4 unique values, 562 missing
host_response_rate: string, 35 unique values, 562 missing
host_acceptance_rate: string, 73 unique values, 260 missing
host_is_superhost: string, 2 unique values, 0 missing
host_neighbourhood: string, 65 unique values, 221 missing
host_total_listings_count: numeric, 51 unique values, 0 missing
host_verifications: string, 152 unique values, 0 missing
host_identity_verified: string, 2 unique values, 0 missing
neighbourhood: string, 31 unique values, 0 missing
neighbourhood_cleansed: string, 25 unique values, 0 missing
zipcode: string, 54 unique values, 14 missing
latitude: numeric, 2838 unique values, 0 missing
longitude: numeric, 2968 unique values, 0 missing
is_location_exact: string, 2 unique values, 0 missing
property_type: string, 21 unique values, 0 missing
room_type: string, 4 unique values, 0 missing
accommodates: numeric, 19 unique values, 0 missing
bathrooms: numeric, 12 unique values, 3 missing
bedrooms: numeric, 10 unique values, 4 missing
beds: numeric, 16 unique values, 20 missing
bed_type: string, 5 unique values, 1 missing
amenities_dict: string, 2908 unique values, 0 missing
price: string, 345 unique values, 0 missing
cleaning_fee: string, 119 unique values, 399 missing
availability_30: numeric, 31 unique values, 0 missing
availability_60: numeric, 61 unique values, 0 missing
availability_90: numeric, 91 unique values, 0 missing
availability_365: numeric, 364 unique values, 0 missing
number_of_reviews: numeric, 289 unique values, 0 missing
review_scores_rating: numeric, 46 unique values, 839 missing
review_scores_accuracy: numeric, 9 unique values, 841 missing
review_scores_cleanliness: numeric, 9 unique values, 840 missing
review_scores_checkin: numeric, 8 unique values, 842 missing
review_scores_communication: numeric, 8 unique values, 839 missing
review_scores_location: numeric, 8 unique values, 841 missing
review_scores_value: numeric, 9 unique values, 841 missing
requires_license: string, 2 unique values, 0 missing
license: string, 895 unique values, 1543 missing
instant_bookable: string, 2 unique values, 0 missing
is_business_travel_ready: string, 1 unique value, 0 missing
cancellation_policy: string, 6 unique values, 0 missing
reviews_per_month: numeric, 617 unique values, 825 missing
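Several quantities in the feature list (`price`, `cleaning_fee`, `host_response_rate`, `host_acceptance_rate`) are typed as strings, so they likely need converting before any price modelling. The following is a minimal sketch of that conversion. It assumes currency cells look like "$1,250.00" and rate cells look like "95%"; the page does not show the actual text format, and the helper names are hypothetical.

```python
import math


def money_to_float(text):
    """Convert a currency string such as '$1,250.00' to 1250.0.

    Assumption: values use a '$' prefix and ',' thousands separators.
    Missing or unparseable values become NaN.
    """
    if text is None:
        return math.nan
    cleaned = str(text).replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return math.nan


def percent_to_fraction(text):
    """Convert a rate string such as '95%' to 0.95; NaN if it cannot be parsed."""
    if text is None:
        return math.nan
    cleaned = str(text).replace("%", "").strip()
    try:
        return float(cleaned) / 100.0
    except ValueError:
        return math.nan


# Hypothetical raw values, invented for illustration only.
print(money_to_float("$1,250.00"))
print(percent_to_fraction("95%"))
```

With pandas, the same functions could be applied column-wise, for example `df["price"].map(money_to_float)`, assuming `df` holds the loaded dataset.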

19 properties

Number of instances (rows) of the dataset: 3845
Number of attributes (columns) of the dataset: 51
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 14267
Number of instances with at least one value missing: 2984
Number of numeric attributes: 22
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0.01
Percentage of numeric attributes: 43.14
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 77.61
Average class difference between consecutive instances: (no value listed)
Percentage of missing values: 7.28
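The percentage and ratio properties can be reproduced from the raw counts reported alongside them. The short check below uses only numbers from this page; the variable names are illustrative.

```python
# Counts taken from the properties listed on this page.
n_rows = 3845                # instances
n_cols = 51                  # attributes
n_missing_cells = 14267      # missing values in the dataset
n_rows_with_missing = 2984   # instances with at least one missing value
n_numeric = 22               # numeric attributes

# Derived properties, rounded to two decimals as on the page.
pct_rows_with_missing = round(100 * n_rows_with_missing / n_rows, 2)
pct_missing_values = round(100 * n_missing_cells / (n_rows * n_cols), 2)
pct_numeric = round(100 * n_numeric / n_cols, 2)
attrs_per_instance = round(n_cols / n_rows, 2)

print(pct_rows_with_missing)  # 77.61
print(pct_missing_values)     # 7.28
print(pct_numeric)            # 43.14
print(attrs_per_instance)     # 0.01
```

Note that the missing-value percentage is taken over all cells (rows times columns), not over rows.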
