OpenML
Bike-Sharing-Washington-DC

active ARFF CC BY-NC-SA 4.0 Visibility: public Uploaded 23-03-2022 by Onur Yildirim
Context

Climate change is forcing cities to reimagine their transportation infrastructure. Shared mobility concepts, such as car sharing, bike sharing or scooter sharing, are becoming more and more popular. If they are implemented well, they can actually contribute to mitigating climate change. Bike sharing in particular is interesting because no electricity or gasoline is necessary (unless e-bikes are used) for this mode of transportation. However, there are inherent problems with this type of shared mobility:

- varying demand at bike sharing stations needs to be balanced to avoid oversupply or shortages
- heavily used bikes break down more often

Forecasting future demand can help address those issues. Moreover, demand forecasts can help operators decide whether to expand the business, determine adequate prices and generate additional income through advertisements at particularly busy stations. Another challenge is redistributing bikes between stations and determining the optimal routes. Determining the location of new stations is also an area of interest for operators.

Content

This dataset can be used to forecast demand to avoid oversupply and shortages. It spans from January 1, 2011, until December 31, 2018. Determining new station locations, analyzing movement patterns or planning routes will only be possible with additional data.
date - date with the format yyyy-mm-dd
temp_avg - average daily temperature in degrees Celsius
temp_min - minimum daily temperature in degrees Celsius
temp_max - maximum daily temperature in degrees Celsius
temp_observ - temperature at the time of observation in degrees Celsius
precip - amount of precipitation in mm
wind - wind speed in meters per second
wt_fog - weather type fog, ice fog, or freezing fog (may include heavy fog)
wt_heavy_fog - weather type heavy fog or heavy freezing fog (not always distinguished from fog)
wt_thunder - weather type thunder
wt_sleet - weather type ice pellets, sleet, snow pellets, or small hail
wt_hail - weather type hail (may include small hail)
wt_glaze - weather type glaze or rime
wt_haze - weather type smoke or haze
wt_drift_snow - weather type blowing or drifting snow
wt_high_wind - weather type high or damaging winds
wt_mist - weather type mist
wt_drizzle - weather type drizzle
wt_rain - weather type rain (may include freezing rain, drizzle, and freezing drizzle)
wt_freeze_rain - weather type freezing rain
wt_snow - weather type snow, snow pellets, snow grains, or ice crystals
wt_ground_fog - weather type ground fog
wt_ice_fog - weather type ice fog or freezing fog
wt_freeze_drizzle - weather type freezing drizzle
wt_unknown - weather type unknown source of precipitation
casual - number of unregistered customers
registered - number of registered customers
total_cust - sum of registered and casual customers
holiday - indicates whether the day is a holiday or not

Acknowledgements

The data I used to create this dataset was taken from:

- Capital Bikeshare for the bike sharing demand
- NOAA's National Climatic Data Center for weather data
- DC Department of Human Resources for data on public holidays

Inspiration

Think about the following questions/topics and add more data to this dataset to improve your results:

- What will tomorrow's, next week's or next month's bike demand be? Use time series analysis to determine this.
- Use anomaly detection to identify seasonality and trend in daily customer data.
- Which features are particularly important for the forecast of the bike demand?
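As a starting point for the forecasting question above, the following is a minimal sketch of a seasonal-naive baseline: it predicts each day's total_cust as the value observed seven days earlier and scores the result with the mean absolute error. The numbers are hypothetical toy values, not taken from the dataset, and the weekly season length is an assumption about daily bike demand rather than something the description states.

```python
def seasonal_naive(history, season=7):
    """Forecast the next value as the value observed `season` steps earlier."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    return history[-season]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily totals for two weeks (stand-in for the total_cust column).
total_cust = [100, 120, 130, 125, 140, 200, 210,
              110, 118, 135, 120, 150, 190, 220]

# Walk forward through the second week, forecasting one day at a time
# from the history available up to that day.
preds = [seasonal_naive(total_cust[:t]) for t in range(7, len(total_cust))]
actuals = total_cust[7:]
mae = mean_absolute_error(actuals, preds)
print(f"seasonal-naive MAE: {mae:.2f}")
```

Any more elaborate model (for example one that uses the temperature or weather-type columns) can be compared against this baseline to see whether the extra features actually help.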

29 features

date - string, 2922 unique values, 0 missing
temp_avg - numeric, 1121 unique values, 821 missing
temp_min - numeric, 2778 unique values, 0 missing
temp_max - numeric, 2772 unique values, 0 missing
temp_observ - numeric, 2590 unique values, 0 missing
precip - numeric, 2093 unique values, 0 missing
wind - numeric, 492 unique values, 0 missing
wt_fog - numeric, 1 unique value, 1419 missing
wt_heavy_fog - numeric, 1 unique value, 2714 missing
wt_thunder - numeric, 1 unique value, 2228 missing
wt_sleet - numeric, 1 unique value, 2793 missing
wt_hail - numeric, 1 unique value, 2872 missing
wt_glaze - numeric, 1 unique value, 2769 missing
wt_haze - numeric, 1 unique value, 2217 missing
wt_drift_snow - numeric, 1 unique value, 2915 missing
wt_high_wind - numeric, 1 unique value, 2664 missing
wt_mist - numeric, 1 unique value, 2551 missing
wt_drizzle - numeric, 1 unique value, 2794 missing
wt_rain - numeric, 1 unique value, 2516 missing
wt_freeze_rain - numeric, 1 unique value, 2917 missing
wt_snow - numeric, 1 unique value, 2838 missing
wt_ground_fog - numeric, 1 unique value, 2886 missing
wt_ice_fog - numeric, 1 unique value, 2912 missing
wt_freeze_drizzle - numeric, 1 unique value, 2918 missing
wt_unknown - numeric, 1 unique value, 2921 missing
casual - numeric, 2018 unique values, 4 missing
registered - numeric, 2542 unique values, 4 missing
total_cust - numeric, 2605 unique values, 4 missing
holiday - numeric, 1 unique value, 2833 missing
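The wt_* columns and holiday each hold a single distinct value and are missing on most days. One plausible reading, which is an assumption and not something this page states, is that a missing entry means the weather type (or holiday) did not apply on that day. Under that assumption, a minimal sketch of converting these columns to 0/1 indicators looks like this; the rows are hypothetical toy data and None stands in for a missing value.

```python
from typing import Optional

# Hypothetical toy rows mimicking the dataset layout; None marks a missing value.
rows = [
    {"date": "2011-01-01", "wt_fog": 1.0, "wt_rain": None, "holiday": None},
    {"date": "2011-01-17", "wt_fog": None, "wt_rain": 1.0, "holiday": 1.0},
]


def to_flag(value: Optional[float]) -> int:
    """Map a missing entry to 0 and any present entry to 1 (assumed semantics)."""
    return 0 if value is None else 1


def encode_flags(row: dict) -> dict:
    """Return a copy of the row with wt_* and holiday columns as 0/1 integers."""
    out = dict(row)
    for name, value in row.items():
        if name.startswith("wt_") or name == "holiday":
            out[name] = to_flag(value)
    return out


encoded = [encode_flags(r) for r in rows]
for r in encoded:
    print(r)
```

The same fill should not be applied to temp_avg, casual, registered or total_cust, where a missing entry is a gap in a measurement rather than an absent flag.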

19 properties

Number of instances (rows) of the dataset: 2922
Number of attributes (columns) of the dataset: 29
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 51510
Number of instances with at least one value missing: 2922
Number of numeric attributes: 28
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0.01
Percentage of numeric attributes: 96.55
Percentage of instances belonging to the most frequent class: not reported
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 100
Average class difference between consecutive instances: not reported
Percentage of missing values: 60.79
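The derived figures in this list follow directly from the counts reported on the page. The short check below recomputes them, and also confirms that 2922 rows is exactly one row per calendar day between the stated start and end dates. The variable names are illustrative; the input numbers are copied from the page.

```python
from datetime import date

# Counts as reported on the page.
n_rows = 2922
n_cols = 29
n_missing = 51510
n_numeric = 28

# Share of empty cells among all cells in the table.
pct_missing = round(100 * n_missing / (n_rows * n_cols), 2)
# Share of columns that are numeric (the date column is a string).
pct_numeric = round(100 * n_numeric / n_cols, 2)
# Attributes divided by instances.
dimensionality = round(n_cols / n_rows, 2)
# Inclusive number of days from January 1, 2011 to December 31, 2018.
n_days = (date(2018, 12, 31) - date(2011, 1, 1)).days + 1

print(pct_missing, pct_numeric, dimensionality, n_days)
```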

0 tasks
