Fashion-MNIST_seed_1_nrows_2000_nclasses_10_ncols_100_stratify_True


Status: active | Format: ARFF | Publicly available | Visibility: public | Uploaded 17-11-2022 by Eddie Bergman


Subsampling of the dataset Fashion-MNIST (40996) with seed=1, args.nrows=2000, args.ncols=100, args.nclasses=10, args.no_stratify=True.

Generated with the following source code:

```python
def subsample(
    self,
    seed: int,
    nrows_max: int = 2_000,
    ncols_max: int = 100,
    nclasses_max: int = 10,
    stratified: bool = True,
) -> Dataset:
    rng = np.random.default_rng(seed)
    x = self.x
    y = self.y

    # Uniformly sample
    classes = y.unique()
    if len(classes) > nclasses_max:
        vcs = y.value_counts()
        selected_classes = rng.choice(
            classes,
            size=nclasses_max,
            replace=False,
            p=vcs / sum(vcs),
        )

        # Select the indices where one of these classes is present
        idxs = y.index[y.isin(classes)]
        x = x.iloc[idxs]
        y = y.iloc[idxs]

    # Uniformly sample columns if required
    if len(x.columns) > ncols_max:
        columns_idxs = rng.choice(
            list(range(len(x.columns))), size=ncols_max, replace=False
        )
        sorted_column_idxs = sorted(columns_idxs)
        selected_columns = list(x.columns[sorted_column_idxs])
        x = x[selected_columns]
    else:
        sorted_column_idxs = list(range(len(x.columns)))

    if len(x) > nrows_max:
        # Stratify accordingly
        target_name = y.name
        data = pd.concat((x, y), axis="columns")

        _, subset = train_test_split(
            data,
            test_size=nrows_max,
            stratify=data[target_name],
            shuffle=True,
            random_state=seed,
        )

        x = subset.drop(target_name, axis="columns")
        y = subset[target_name]

    # We need to convert categorical columns to string for openml
    categorical_mask = [self.categorical_mask[i] for i in sorted_column_idxs]
    columns = list(x.columns)

    return Dataset(
        # Technically this is not the same but it's where it was derived from
        dataset=self.dataset,
        x=x,
        y=y,
        categorical_mask=categorical_mask,
        columns=columns,
    )
```
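The row-subsampling step can be illustrated in isolation. The sketch below is not the code used to build this dataset: it swaps `train_test_split` for a plain proportional allocation using only the standard library, and the helper name `stratified_subsample` and the toy labels are hypothetical. It assumes ten balanced classes, in which case a 2,000-row stratified draw keeps 200 rows per class.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, n_samples, seed):
    """Return sorted row indices forming a class-proportional subsample.

    Hypothetical helper for illustration; not part of the original source.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    total = len(labels)
    chosen = []
    for label in sorted(by_class):
        members = by_class[label]
        # Allocate rows in proportion to the class frequency. Rounding can
        # make the total differ from n_samples when the proportions do not
        # divide evenly.
        k = round(n_samples * len(members) / total)
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)


# Toy labels (an assumption): 10 balanced classes, 7,000 rows each.
labels = [i % 10 for i in range(70_000)]
subset = stratified_subsample(labels, n_samples=2_000, seed=1)

counts = defaultdict(int)
for idx in subset:
    counts[labels[idx]] += 1

print(len(subset))              # 2000
print(sorted(counts.values()))  # ten entries of 200
```

With balanced input every class contributes the same number of rows, so the most and least frequent classes in the subsample have equal counts.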

101 features

| Feature | Type | Unique values | Missing |
|---|---|---|---|
| class (target) | nominal | 10 | 0 |
| pixel15 | numeric | 229 | 0 |
| pixel20 | numeric | 148 | 0 |
| pixel25 | numeric | 16 | 0 |
| pixel31 | numeric | 6 | 0 |
| pixel43 | numeric | 234 | 0 |
| pixel46 | numeric | 247 | 0 |
| pixel48 | numeric | 233 | 0 |
| pixel61 | numeric | 47 | 0 |
| pixel68 | numeric | 249 | 0 |
| pixel86 | numeric | 9 | 0 |
| pixel87 | numeric | 13 | 0 |
| pixel89 | numeric | 99 | 0 |
| pixel97 | numeric | 248 | 0 |
| pixel100 | numeric | 248 | 0 |
| pixel116 | numeric | 54 | 0 |
| pixel120 | numeric | 241 | 0 |
| pixel147 | numeric | 229 | 0 |
| pixel166 | numeric | 149 | 0 |
| pixel173 | numeric | 195 | 0 |
| pixel180 | numeric | 249 | 0 |
| pixel190 | numeric | 245 | 0 |
| pixel191 | numeric | 242 | 0 |
| pixel201 | numeric | 208 | 0 |
| pixel205 | numeric | 244 | 0 |
| pixel217 | numeric | 253 | 0 |
| pixel219 | numeric | 245 | 0 |
| pixel235 | numeric | 247 | 0 |
| pixel240 | numeric | 252 | 0 |
| pixel267 | numeric | 253 | 0 |
| pixel274 | numeric | 249 | 0 |
| pixel276 | numeric | 247 | 0 |
| pixel281 | numeric | 38 | 0 |
| pixel287 | numeric | 234 | 0 |
| pixel291 | numeric | 248 | 0 |
| pixel295 | numeric | 252 | 0 |
| pixel314 | numeric | 237 | 0 |
| pixel319 | numeric | 252 | 0 |
| pixel322 | numeric | 252 | 0 |
| pixel324 | numeric | 252 | 0 |
| pixel325 | numeric | 247 | 0 |
| pixel343 | numeric | 243 | 0 |
| pixel350 | numeric | 249 | 0 |
| pixel351 | numeric | 248 | 0 |
| pixel352 | numeric | 252 | 0 |
| pixel355 | numeric | 255 | 0 |
| pixel364 | numeric | 130 | 0 |
| pixel382 | numeric | 245 | 0 |
| pixel385 | numeric | 249 | 0 |
| pixel386 | numeric | 247 | 0 |
| pixel396 | numeric | 203 | 0 |
| pixel398 | numeric | 251 | 0 |
| pixel400 | numeric | 250 | 0 |
| pixel402 | numeric | 252 | 0 |
| pixel427 | numeric | 245 | 0 |
| pixel451 | numeric | 218 | 0 |
| pixel452 | numeric | 226 | 0 |
| pixel457 | numeric | 254 | 0 |
| pixel463 | numeric | 247 | 0 |
| pixel468 | numeric | 253 | 0 |
| pixel492 | numeric | 250 | 0 |
| pixel519 | numeric | 252 | 0 |
| pixel524 | numeric | 256 | 0 |
| pixel532 | numeric | 156 | 0 |
| pixel534 | numeric | 188 | 0 |
| pixel538 | numeric | 252 | 0 |
| pixel545 | numeric | 252 | 0 |
| pixel546 | numeric | 250 | 0 |
| pixel550 | numeric | 252 | 0 |
| pixel562 | numeric | 189 | 0 |
| pixel569 | numeric | 256 | 0 |
| pixel578 | numeric | 253 | 0 |
| pixel580 | numeric | 256 | 0 |
| pixel584 | numeric | 249 | 0 |
| pixel585 | numeric | 247 | 0 |
| pixel588 | numeric | 129 | 0 |
| pixel591 | numeric | 207 | 0 |
| pixel593 | numeric | 243 | 0 |
| pixel604 | numeric | 249 | 0 |
| pixel611 | numeric | 242 | 0 |
| pixel621 | numeric | 243 | 0 |
| pixel643 | numeric | 182 | 0 |
| pixel651 | numeric | 243 | 0 |
| pixel652 | numeric | 250 | 0 |
| pixel654 | numeric | 243 | 0 |
| pixel655 | numeric | 247 | 0 |
| pixel657 | numeric | 247 | 0 |
| pixel664 | numeric | 249 | 0 |
| pixel684 | numeric | 244 | 0 |
| pixel694 | numeric | 245 | 0 |
| pixel701 | numeric | 26 | 0 |
| pixel706 | numeric | 225 | 0 |
| pixel714 | numeric | 244 | 0 |
| pixel715 | numeric | 240 | 0 |
| pixel716 | numeric | 252 | 0 |
| pixel718 | numeric | 249 | 0 |
| pixel722 | numeric | 243 | 0 |
| pixel729 | numeric | 11 | 0 |
| pixel737 | numeric | 242 | 0 |
| pixel740 | numeric | 235 | 0 |
| pixel765 | numeric | 201 | 0 |

19 properties

| Property | Value |
|---|---|
| Number of instances (rows) of the dataset. | 2000 |
| Number of attributes (columns) of the dataset. | 101 |
| Number of distinct values of the target attribute (if it is nominal). | 10 |
| Number of missing values in the dataset. | 0 |
| Number of instances with at least one value missing. | 0 |
| Number of numeric attributes. | 100 |
| Number of nominal attributes. | 1 |
| Percentage of binary attributes. | 0 |
| Percentage of instances having missing values. | 0 |
| Average class difference between consecutive instances. | 0.1 |
| Percentage of missing values. | 0 |
| Number of attributes divided by the number of instances. | 0.05 |
| Percentage of numeric attributes. | 99.01 |
| Percentage of instances belonging to the most frequent class. | 10 |
| Percentage of nominal attributes. | 0.99 |
| Number of instances belonging to the most frequent class. | 200 |
| Percentage of instances belonging to the least frequent class. | 10 |
| Number of instances belonging to the least frequent class. | 200 |
| Number of binary attributes. | 0 |
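Several of the properties above are ratios of the base counts, so they can be cross-checked by hand. The following is a small sketch of that arithmetic; the variable names are illustrative, and rounding to two decimals is an assumption made to match the figures as displayed.

```python
# Base counts taken from the property list above.
n_rows = 2000
n_cols = 101
n_numeric = 100
n_nominal = 1
majority_size = 200
minority_size = 200

# Derived properties, rounded to two decimals (assumed display precision).
dimensionality = round(n_cols / n_rows, 2)          # attributes per instance
pct_numeric = round(100 * n_numeric / n_cols, 2)    # share of numeric columns
pct_nominal = round(100 * n_nominal / n_cols, 2)    # share of nominal columns
pct_majority = round(100 * majority_size / n_rows, 2)
pct_minority = round(100 * minority_size / n_rows, 2)

print(dimensionality)  # 0.05
print(pct_numeric)     # 99.01
print(pct_nominal)     # 0.99
print(pct_majority)    # 10.0
print(pct_minority)    # 10.0
```

The equal majority and minority percentages reflect a perfectly balanced target: 10 classes of 200 rows each account for all 2,000 instances.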

0 tasks
