Brilliant-Diamonds

Status: active
Format: ARFF
License: Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Visibility: public
Uploaded: 23-03-2022 by Elif Ceren Gok
Context

Buying a diamond can be frustrating and expensive. It inspired me to create this dataset of 119K natural and lab-created diamonds from brilliantearth.com to demystify the value of the 4 Cs: cut, color, clarity, and carat. This data was scraped using DiamondScraper.

Content

Attribute | Description | Data Type
id | Diamond identification number provided by Brilliant Earth | int
url | URL for the diamond details page | string
shape | External geometric appearance of a diamond | string/categorical
price | Price in U.S. dollars | int
carat | Unit of measurement used to describe the weight of a diamond | float
cut | Facets, symmetry, and reflective qualities of a diamond | string/categorical
color | Natural color or lack of color visible within a diamond, based on the GIA grade scale | string/categorical
clarity | Visibility of natural microscopic inclusions and imperfections within a diamond | string/categorical
report | Diamond certificate or grading report provided by an independent gemology lab | string
type | Natural or lab created diamonds | string
date_fetched | Date the data was fetched | date

Acknowledgements

Thanks to Brilliant Earth for committing to ethically sourced jewelry and for having a great shopping experience. Check out their buying guide to learn more about diamonds.
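As a rough illustration of how the attribute list above maps onto typed records, here is a minimal sketch using only the Python standard library. The `Diamond` class, the `parse_diamonds` helper, and the two sample rows are assumptions made up for this example; the rows are not taken from the dataset, and the real file is distributed as ARFF rather than CSV.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class Diamond:
    """One row, typed according to the attribute list in the description."""
    id: int
    url: str
    shape: str
    price: int      # U.S. dollars
    carat: float    # weight
    cut: str
    color: str
    clarity: str
    report: str
    type: str
    date_fetched: str  # kept as text: the date format is not stated


def parse_diamonds(text: str) -> List[Diamond]:
    """Parse CSV text with a header row into typed Diamond records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Diamond(
            id=int(row["id"]),
            url=row["url"],
            shape=row["shape"],
            price=int(row["price"]),
            carat=float(row["carat"]),
            cut=row["cut"],
            color=row["color"],
            clarity=row["clarity"],
            report=row["report"],
            type=row["type"],
            date_fetched=row["date_fetched"],
        )
        for row in reader
    ]


# Hypothetical rows for illustration only (NOT real dataset values).
SAMPLE = """id,url,shape,price,carat,cut,color,clarity,report,type,date_fetched
1,https://example.com/diamond/1,Round,1000,0.50,Ideal,G,VS1,GIA,natural,2022-01-01
2,https://example.com/diamond/2,Oval,3000,1.50,Ideal,F,VVS2,GIA,lab,2022-01-01
"""

diamonds = parse_diamonds(SAMPLE)

# Price per carat is one simple way to compare stones of different weight.
price_per_carat = {d.id: d.price / d.carat for d in diamonds}
```

Keeping `date_fetched` as a string avoids guessing a date format that the description does not specify.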

11 features

id (numeric): 119307 unique values, 0 missing
url (string): 119307 unique values, 0 missing
shape (string): 10 unique values, 0 missing
price (numeric): 3144 unique values, 0 missing
carat (numeric): 522 unique values, 0 missing
cut (string): 5 unique values, 0 missing
color (string): 7 unique values, 0 missing
clarity (string): 8 unique values, 0 missing
report (string): 4 unique values, 0 missing
type (string): 2 unique values, 0 missing
date_fetched (string): 1 unique value, 0 missing

19 properties

Number of instances (rows) of the dataset: 119307
Number of attributes (columns) of the dataset: 11
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 3
Number of nominal attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Percentage of missing values: 0
Average class difference between consecutive instances: not reported
Percentage of numeric attributes: 27.27
Number of attributes divided by the number of instances: 0
Percentage of nominal attributes: 0
Percentage of instances belonging to the most frequent class: not reported
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 0
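The derived properties can be checked against the feature list with a little arithmetic. The sketch below is an illustration only: the `feature_types` dictionary is transcribed from the feature list on this page, and the variable names are assumptions, not part of any OpenML API.

```python
# Feature types as shown in the feature list on this page.
feature_types = {
    "id": "numeric",
    "url": "string",
    "shape": "string",
    "price": "numeric",
    "carat": "numeric",
    "cut": "string",
    "color": "string",
    "clarity": "string",
    "report": "string",
    "type": "string",
    "date_fetched": "string",
}

n_instances = 119307  # number of rows reported on this page
n_attributes = len(feature_types)
n_numeric = sum(1 for t in feature_types.values() if t == "numeric")

# Percentage of numeric attributes: 3 of 11 columns.
pct_numeric = round(100 * n_numeric / n_attributes, 2)

# Attributes divided by instances is tiny, so it shows as 0 when rounded.
dimensionality = n_attributes / n_instances

print(n_attributes, n_numeric, pct_numeric, round(dimensionality, 2))
```

The columns stored as `string` are not counted as nominal, which is consistent with the page reporting 0 nominal attributes.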

0 tasks