Wisconsin-breast-cancer-cytology-features

Status: active; Format: ARFF; License: CC0: Public Domain; Visibility: public; Uploaded 24-03-2022 by Dustin Carrion
  • Computer Systems Machine Learning

Context
Cytology features of breast cancer biopsies. The dataset can be used to predict breast cancer from cytology features. The data was obtained from https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Original) and the data description can be found at https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.names

Content
The data contains cytology features of breast cancer biopsies: clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, and mitosis. The class variable denotes whether the biopsy was cancerous: cancer = 1, not cancer = 0.

Attribute Information:
  • Sample code number: id number
  • Clump Thickness: 1 - 10
  • Uniformity of Cell Size: 1 - 10
  • Uniformity of Cell Shape: 1 - 10
  • Marginal Adhesion: 1 - 10
  • Single Epithelial Cell Size: 1 - 10
  • Bare Nuclei: 1 - 10
  • Bland Chromatin: 1 - 10
  • Normal Nucleoli: 1 - 10
  • Mitoses: 1 - 10
  • Class: 0 for benign, 1 for malignant

Acknowledgements
Data obtained from the UCI Machine Learning Repository:
Dua, D. and Karra Taniskidou, E. (2017). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
Picture courtesy: Photo by Pablo Heimplatz on Unsplash
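
The documented attribute ranges can be checked mechanically before any modelling. The following is a minimal sketch, not part of the dataset itself. It assumes rows arrive as comma-separated text in the column order of the attribute list, with `?` marking a missing value (the ARFF convention). The column short names are taken from the feature listing on this page; the sample rows and the helper names `parse_rows` and `range_errors` are made up for illustration.

```python
import csv
import io

# Short column names as used in the dataset's feature listing.
COLUMNS = ["id", "thickness", "size", "shape", "adhesion", "single",
           "nuclei", "chromatin", "nucleoli", "mitosis", "class"]
CYTOLOGY = COLUMNS[1:-1]  # the nine features scored from 1 to 10

# Hypothetical sample rows, made up for illustration only.
SAMPLE = """\
1000001,5,1,1,1,2,1,3,1,1,0
1000002,8,10,10,8,7,10,9,7,1,1
1000003,6,8,8,1,3,?,3,7,1,0
"""

def parse_rows(text):
    """Parse comma-separated rows into dicts; '?' becomes None."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        values = [None if v == "?" else int(v) for v in record]
        rows.append(dict(zip(COLUMNS, values)))
    return rows

def range_errors(row):
    """Return the names of fields whose value is outside the documented range."""
    bad = [name for name in CYTOLOGY
           if row[name] is not None and not 1 <= row[name] <= 10]
    if row["class"] not in (0, 1):
        bad.append("class")
    return bad

rows = parse_rows(SAMPLE)
errors = {row["id"]: range_errors(row) for row in rows}
```

Missing values are skipped by the range check on purpose, so that they can be handled in a separate step.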

11 features

id: numeric, 645 unique values, 0 missing
thickness: numeric, 10 unique values, 0 missing
size: numeric, 10 unique values, 0 missing
shape: numeric, 10 unique values, 0 missing
adhesion: numeric, 10 unique values, 0 missing
single: numeric, 10 unique values, 0 missing
nuclei: numeric, 10 unique values, 16 missing
chromatin: numeric, 10 unique values, 0 missing
nucleoli: numeric, 10 unique values, 0 missing
mitosis: numeric, 9 unique values, 0 missing
class: numeric, 2 unique values, 0 missing
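
Only the nuclei feature has missing entries (16 of them), so a model that cannot handle gaps needs them filled or the affected rows dropped. The following is a minimal sketch of median imputation, which is one common choice and not something this page prescribes. The values and the helper name `impute_median` are made up for illustration.

```python
from statistics import median

# Hypothetical nuclei column; None marks a missing entry.
nuclei = [1, 10, None, 2, 4, None, 1, 10, 3]

def impute_median(values):
    """Replace None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

filled = impute_median(nuclei)
print(filled)  # [1, 10, 3, 2, 4, 3, 1, 10, 3]
```

When the data is later split for evaluation, computing the median on the training portion only avoids leaking information from the held-out rows.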

19 properties

Number of instances (rows) of the dataset: 699
Number of attributes (columns) of the dataset: 11
Number of distinct values of the target attribute (if it is nominal): not listed
Number of missing values in the dataset: 16
Number of instances with at least one value missing: 16
Number of numeric attributes: 11
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0.02
Percentage of numeric attributes: 100
Percentage of instances belonging to the most frequent class: not listed
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: not listed
Percentage of instances belonging to the least frequent class: not listed
Number of instances belonging to the least frequent class: not listed
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 2.29
Average class difference between consecutive instances: not listed
Percentage of missing values: 0.21
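
The ratio and percentage properties follow directly from the counts on this page. A short check, using only those counts (the variable names are illustrative):

```python
instances = 699
attributes = 11
missing_values = 16
instances_with_missing = 16

# Attributes divided by instances.
dimensionality = attributes / instances
# Share of rows that contain at least one missing value.
pct_instances_missing = 100 * instances_with_missing / instances
# Share of all cells (rows x columns) that are missing.
pct_values_missing = 100 * missing_values / (instances * attributes)

print(round(dimensionality, 2))         # 0.02
print(round(pct_instances_missing, 2))  # 2.29
print(round(pct_values_missing, 2))     # 0.21
```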

1 task

0 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: class
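
To make the task definition concrete, here is a minimal sketch of 10-fold cross-validation scored by predictive accuracy. It makes several assumptions that are not stated on this page: the folds are plain shuffled splits (no stratification), the classifier is a simple nearest-centroid rule chosen only to keep the example short, and the data is randomly generated rather than drawn from the dataset. All function names are illustrative.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]

def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_nearest_centroid(X, y):
    """Return one centroid per class label."""
    return {label: centroid([x for x, t in zip(X, y) if t == label])
            for label in set(y)}

def predict(model, x):
    """Pick the label whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda label: sum((a - b) ** 2
                                            for a, b in zip(x, model[label])))

def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean predictive accuracy over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit_nearest_centroid([X[i] for i in train],
                                     [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k

# Hypothetical data: 100 rows of nine scores in 1..10,
# labelled 1 when the mean score is above 5.5.
rng = random.Random(42)
X = [[rng.randint(1, 10) for _ in range(9)] for _ in range(100)]
y = [1 if sum(x) / 9 > 5.5 else 0 for x in X]

accuracy = cross_val_accuracy(X, y)
```

Each row is used for testing exactly once, and the model is refit on the remaining nine folds every time, so the reported accuracy never includes rows the model was trained on.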