German-House-Prices

Status: active. Format: ARFF. License: CC BY-NC-SA 4.0. Visibility: public. Uploaded 23-03-2022 by Dustin Carrion.
Tags: Computer Systems, Machine Learning

Context

Projects are a great way to learn data science. So I started my own. The numerous housing data sets on Kaggle were the inspiration for this data set. Predicting housing prices is a simple yet insightful regression problem. Understanding data takes time, and the more people analyze it, the faster the secrets can be uncovered. I acquired the data by scraping Immo Scout 24, a marketplace for German real estate.

26 features

Unnamed:_0 (numeric): 10552 unique values, 0 missing
Price (numeric): 1411 unique values, 0 missing
Type (string): 11 unique values, 402 missing
Living_space (numeric): 1867 unique values, 0 missing
Lot (numeric): 2526 unique values, 0 missing
Usable_area (numeric): 1012 unique values, 4984 missing
Free_of_Relation (string): 704 unique values, 3569 missing
Rooms (numeric): 72 unique values, 0 missing
Bedrooms (numeric): 32 unique values, 3674 missing
Bathrooms (numeric): 28 unique values, 1801 missing
Floors (numeric): 10 unique values, 2664 missing
Year_built (numeric): 292 unique values, 694 missing
Furnishing_quality (string): 4 unique values, 2726 missing
Year_renovated (numeric): 67 unique values, 5203 missing
Condition (string): 10 unique values, 323 missing
Heating (string): 13 unique values, 584 missing
Energy_source (string): 104 unique values, 1227 missing
Energy_certificate (string): 3 unique values, 755 missing
Energy_certificate_type (string): 2 unique values, 3526 missing
Energy_consumption (numeric): 1423 unique values, 8119 missing
Energy_efficiency_class (string): 9 unique values, 4819 missing
State (string): 16 unique values, 1 missing
City (string): 534 unique values, 1 missing
Place (string): 4760 unique values, 290 missing
Garages (numeric): 37 unique values, 1960 missing
Garagetype (string): 7 unique values, 1960 missing
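
The missing counts above are easier to compare as a share of the 10552 rows. The following is a minimal sketch that ranks the features by percentage of missing values. The counts are copied from the list above; the names `MISSING`, `N_ROWS`, and `missing_percentages` are illustrative and not part of the dataset or of any library.

```python
# Row count and per-feature missing counts copied from this page.
N_ROWS = 10552

MISSING = {
    "Unnamed:_0": 0, "Price": 0, "Type": 402, "Living_space": 0,
    "Lot": 0, "Usable_area": 4984, "Free_of_Relation": 3569, "Rooms": 0,
    "Bedrooms": 3674, "Bathrooms": 1801, "Floors": 2664, "Year_built": 694,
    "Furnishing_quality": 2726, "Year_renovated": 5203, "Condition": 323,
    "Heating": 584, "Energy_source": 1227, "Energy_certificate": 755,
    "Energy_certificate_type": 3526, "Energy_consumption": 8119,
    "Energy_efficiency_class": 4819, "State": 1, "City": 1, "Place": 290,
    "Garages": 1960, "Garagetype": 1960,
}


def missing_percentages(missing, n_rows):
    """Return (feature, percent missing) pairs, most incomplete first."""
    pairs = [(name, 100.0 * count / n_rows) for name, count in missing.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


ranked = missing_percentages(MISSING, N_ROWS)

# Show the five features with the largest share of missing values.
for name, pct in ranked[:5]:
    print(f"{name:<26}{pct:6.2f}%")

# Total number of missing cells across all 26 features.
total_missing = sum(MISSING.values())
print("total missing cells:", total_missing)
```

Energy_consumption comes out on top, with roughly 77% of its values missing, so any regression on Price would likely need to drop or impute it. The per-feature counts sum to 49282, which agrees with the total number of missing values reported for the dataset.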

19 properties

Number of instances (rows) of the dataset: 10552
Number of attributes (columns) of the dataset: 26
Number of distinct values of the target attribute (if it is nominal): (no value given)
Number of missing values in the dataset: 49282
Number of instances with at least one value missing: 10406
Number of numeric attributes: 13
Number of nominal attributes: 0
Percentage of instances belonging to the most frequent class: (no value given)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value given)
Percentage of instances belonging to the least frequent class: (no value given)
Number of instances belonging to the least frequent class: (no value given)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 98.62
Average class difference between consecutive instances: (no value given)
Percentage of missing values: 17.96
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 50
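
The percentage properties follow from the raw counts listed here. As a rough check, the sketch below recomputes them, assuming the percentage of missing values is taken over all cells (rows times columns). The variable names are illustrative only.

```python
# Raw counts as listed in the properties on this page.
n_instances = 10552
n_attributes = 26
n_missing_values = 49282
n_instances_with_missing = 10406
n_numeric_attributes = 13

# Share of all cells (rows x columns) that are missing.
pct_missing_values = 100.0 * n_missing_values / (n_instances * n_attributes)

# Share of rows that contain at least one missing value.
pct_instances_with_missing = 100.0 * n_instances_with_missing / n_instances

# Share of columns that are numeric.
pct_numeric_attributes = 100.0 * n_numeric_attributes / n_attributes

# Attributes per instance; small enough that it rounds to 0 at two decimals.
dimensionality = n_attributes / n_instances

print(round(pct_missing_values, 2))          # 17.96
print(round(pct_instances_with_missing, 2))  # 98.62
print(round(pct_numeric_attributes, 2))      # 50.0
print(round(dimensionality, 4))              # 0.0025
```

The recomputed values agree with the listed 17.96, 98.62, and 50. The listed 0 for attributes divided by instances appears to be the ratio 26 / 10552 after rounding.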

0 tasks
