OpenML

Glass-Classification

Status: active. Format: ARFF. License: Open Database (database), Database Contents (contents). Visibility: public. Uploaded 24-03-2022 by Dustin Carrion.

Context

This is the Glass Identification Data Set from UCI. It contains 10 attributes, including an Id. The response is the glass type (7 discrete values).

Content

Attribute information:

Id number: 1 to 214 (removed from the CSV file)
RI: refractive index
Na: Sodium (unit of measurement: weight percent in the corresponding oxide, as are attributes 4-10)
Mg: Magnesium
Al: Aluminum
Si: Silicon
K: Potassium
Ca: Calcium
Ba: Barium
Fe: Iron
Type of glass (class attribute):
-- 1 building_windows_float_processed
-- 2 building_windows_non_float_processed
-- 3 vehicle_windows_float_processed
-- 4 vehicle_windows_non_float_processed (none in this database)
-- 5 containers
-- 6 tableware
-- 7 headlamps

Acknowledgements

https://archive.ics.uci.edu/ml/datasets/Glass+Identification

Source:
Creator: B. German, Central Research Establishment, Home Office Forensic Science Service, Aldermaston, Reading, Berkshire RG7 4PN
Donor: Vina Spiehler, Ph.D., DABFT, Diagnostic Products Corporation, (213) 776-0180 (ext 3014)

Inspiration

Data exploration of this dataset reveals two important characteristics:

1) The variables are highly correlated with each other, including with the response variable. Which kind of ML algorithm is most suitable for this dataset: Random Forest, KNN, or something else? Also, since the dataset is small, is PCA worth applying, or should it be avoided entirely?
2) The data are highly skewed. Is scaling sufficient, or should other techniques be applied to normalize the data, such as a Box-Cox power transformation?
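
The skewness question can be made concrete with a small sketch. The code below is an illustration, not part of the dataset page: it computes a simple moment-based skewness and applies a Box-Cox transform for a fixed lambda, using only the Python standard library. The function names `skewness` and `box_cox` and the toy values are assumptions made for this example; the toy values are not taken from the glass data. Box-Cox is only defined for strictly positive inputs, so a column containing zeros would need a shift or a different transform first.

```python
import math


def skewness(xs):
    """Moment-based skewness m3 / m2**1.5 (biased estimator)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def box_cox(xs, lam):
    """Box-Cox transform for a fixed lambda; every x must be > 0."""
    if any(x <= 0 for x in xs):
        raise ValueError("Box-Cox needs strictly positive values")
    if lam == 0:
        # The lambda -> 0 limit of (x**lam - 1) / lam is log(x).
        return [math.log(x) for x in xs]
    return [(x ** lam - 1) / lam for x in xs]


# Made-up, right-skewed toy values (NOT taken from the glass data).
toy = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.8, 1.5, 3.0, 6.0]

before = skewness(toy)
after = skewness(box_cox(toy, 0))  # lambda = 0 is the log transform
print(f"skewness before: {before:.3f}, after log transform: {after:.3f}")
```

Here lambda is fixed by hand. In practice it is usually estimated from the data, for example with `scipy.stats.boxcox`, and scikit-learn's `PowerTransformer` offers the Yeo-Johnson variant, which also accepts zero and negative values.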

10 features

RI	numeric	178 unique values 0 missing
Na	numeric	142 unique values 0 missing
Mg	numeric	94 unique values 0 missing
Al	numeric	118 unique values 0 missing
Si	numeric	133 unique values 0 missing
K	numeric	65 unique values 0 missing
Ca	numeric	143 unique values 0 missing
Ba	numeric	34 unique values 0 missing
Fe	numeric	32 unique values 0 missing
Type	numeric	6 unique values 0 missing
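
As a hedged illustration of how the ten columns above might be consumed, the sketch below parses one comma-separated record into named float features and maps the numeric Type code to the class names given in the description. The names `COLUMNS`, `GLASS_TYPES`, and `parse_row` are hypothetical, the comma-separated layout is an assumption about how the file is exported, and the sample row uses placeholder values rather than a real record.

```python
# Column order as listed in the feature table (Id was removed from the file).
COLUMNS = ["RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "Type"]

# Class codes from the dataset description; code 4 has no samples.
GLASS_TYPES = {
    1: "building_windows_float_processed",
    2: "building_windows_non_float_processed",
    3: "vehicle_windows_float_processed",
    4: "vehicle_windows_non_float_processed",
    5: "containers",
    6: "tableware",
    7: "headlamps",
}


def parse_row(line):
    """Parse one comma-separated record into a dict of features plus a label."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    record = {name: float(value) for name, value in zip(COLUMNS[:-1], fields[:-1])}
    # Type is stored as numeric, so accept both "1" and "1.0".
    code = int(float(fields[-1]))
    record["Type"] = code
    record["label"] = GLASS_TYPES[code]
    return record


# Placeholder values for illustration only, not a real record.
sample = "1.52,13.6,4.5,1.1,71.8,0.06,8.75,0.0,0.0,1"
print(parse_row(sample))
```

Keeping the code-to-name mapping in one dictionary makes it easy to spot that only six of the seven codes can appear in the data, which matches the 6 unique values reported for Type.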