Fashion-MNIST_seed_3_nrows_2000_nclasses_10_ncols_100_stratify_True

Status: active | Format: ARFF | Publicly available | Visibility: public | Uploaded 17-11-2022 by Eddie Bergman

Subsampling of the dataset Fashion-MNIST (40996) with seed=3, args.nrows=2000, args.ncols=100, args.nclasses=10, args.no_stratify=True.

Generated with the following source code:

```python
def subsample(
    self,
    seed: int,
    nrows_max: int = 2_000,
    ncols_max: int = 100,
    nclasses_max: int = 10,
    stratified: bool = True,
) -> Dataset:
    rng = np.random.default_rng(seed)

    x = self.x
    y = self.y

    # Uniformly sample
    classes = y.unique()
    if len(classes) > nclasses_max:
        vcs = y.value_counts()
        selected_classes = rng.choice(
            classes,
            size=nclasses_max,
            replace=False,
            p=vcs / sum(vcs),
        )

        # Select the indices where one of these classes is present
        idxs = y.index[y.isin(classes)]
        x = x.iloc[idxs]
        y = y.iloc[idxs]

    # Uniformly sample columns if required
    if len(x.columns) > ncols_max:
        columns_idxs = rng.choice(
            list(range(len(x.columns))), size=ncols_max, replace=False
        )
        sorted_column_idxs = sorted(columns_idxs)
        selected_columns = list(x.columns[sorted_column_idxs])
        x = x[selected_columns]
    else:
        sorted_column_idxs = list(range(len(x.columns)))

    if len(x) > nrows_max:
        # Stratify accordingly
        target_name = y.name
        data = pd.concat((x, y), axis="columns")
        _, subset = train_test_split(
            data,
            test_size=nrows_max,
            stratify=data[target_name],
            shuffle=True,
            random_state=seed,
        )
        x = subset.drop(target_name, axis="columns")
        y = subset[target_name]

    # We need to convert categorical columns to string for openml
    categorical_mask = [self.categorical_mask[i] for i in sorted_column_idxs]
    columns = list(x.columns)

    return Dataset(
        # Technically this is not the same but it's where it was derived from
        dataset=self.dataset,
        x=x,
        y=y,
        categorical_mask=categorical_mask,
        columns=columns,
    )
```
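The column and row steps described above can be tried on synthetic data. The following is a minimal sketch, not the code that produced this dataset: `subsample_sketch` is a hypothetical helper, the input arrays are random stand-ins for Fashion-MNIST, and the row step uses a hand-rolled per-class draw with NumPy in place of scikit-learn's `train_test_split`. Because the per-class quota is rounded, the total can differ slightly from `nrows_max` when classes are unbalanced.

```python
import numpy as np


def subsample_sketch(x, y, seed, nrows_max=2_000, ncols_max=100):
    """Hypothetical helper: sample columns, then sample rows class by class."""
    rng = np.random.default_rng(seed)

    # Column step: draw column positions without replacement, then sort them
    # so the surviving columns keep their original left-to-right order.
    if x.shape[1] > ncols_max:
        col_idxs = np.sort(rng.choice(x.shape[1], size=ncols_max, replace=False))
        x = x[:, col_idxs]
    else:
        col_idxs = np.arange(x.shape[1])

    # Row step: take each class in proportion to its share of the data
    # (a simplified stand-in for a stratified train_test_split).
    if len(x) > nrows_max:
        keep = []
        for label in np.unique(y):
            members = np.flatnonzero(y == label)
            n_take = round(nrows_max * len(members) / len(y))
            keep.append(rng.choice(members, size=n_take, replace=False))
        keep = np.sort(np.concatenate(keep))
        x, y = x[keep], y[keep]

    return x, y, col_idxs


# Synthetic stand-in: 7,000 rows, 784 pixel columns, 10 balanced classes.
data_rng = np.random.default_rng(0)
x_full = data_rng.integers(0, 256, size=(7_000, 784), dtype=np.uint8)
y_full = np.repeat(np.arange(10), 700)

x_sub, y_sub, cols = subsample_sketch(x_full, y_full, seed=3)
print(x_sub.shape)         # (2000, 100)
print(np.bincount(y_sub))  # 200 rows for each of the 10 classes
```

With balanced classes the draw yields 200 rows per class, which matches the class counts reported in the properties of this dataset.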

101 features

class (target): nominal, 10 unique values, 0 missing
pixel2: numeric, 4 unique values, 0 missing
pixel4: numeric, 7 unique values, 0 missing
pixel23: numeric, 31 unique values, 0 missing
pixel28: numeric, 4 unique values, 0 missing
pixel32: numeric, 14 unique values, 0 missing
pixel56: numeric, 6 unique values, 0 missing
pixel59: numeric, 13 unique values, 0 missing
pixel66: numeric, 248 unique values, 0 missing
pixel68: numeric, 241 unique values, 0 missing
pixel80: numeric, 117 unique values, 0 missing
pixel81: numeric, 66 unique values, 0 missing
pixel102: numeric, 246 unique values, 0 missing
pixel112: numeric, 19 unique values, 0 missing
pixel124: numeric, 250 unique values, 0 missing
pixel125: numeric, 246 unique values, 0 missing
pixel141: numeric, 16 unique values, 0 missing
pixel157: numeric, 241 unique values, 0 missing
pixel163: numeric, 246 unique values, 0 missing
pixel168: numeric, 52 unique values, 0 missing
pixel174: numeric, 231 unique values, 0 missing
pixel181: numeric, 242 unique values, 0 missing
pixel186: numeric, 251 unique values, 0 missing
pixel188: numeric, 248 unique values, 0 missing
pixel190: numeric, 246 unique values, 0 missing
pixel205: numeric, 248 unique values, 0 missing
pixel212: numeric, 249 unique values, 0 missing
pixel218: numeric, 245 unique values, 0 missing
pixel223: numeric, 160 unique values, 0 missing
pixel227: numeric, 91 unique values, 0 missing
pixel230: numeric, 235 unique values, 0 missing
pixel231: numeric, 236 unique values, 0 missing
pixel253: numeric, 33 unique values, 0 missing
pixel277: numeric, 240 unique values, 0 missing
pixel280: numeric, 89 unique values, 0 missing
pixel297: numeric, 248 unique values, 0 missing
pixel298: numeric, 247 unique values, 0 missing
pixel300: numeric, 252 unique values, 0 missing
pixel302: numeric, 244 unique values, 0 missing
pixel306: numeric, 236 unique values, 0 missing
pixel309: numeric, 47 unique values, 0 missing
pixel319: numeric, 252 unique values, 0 missing
pixel332: numeric, 253 unique values, 0 missing
pixel335: numeric, 224 unique values, 0 missing
pixel348: numeric, 245 unique values, 0 missing
pixel352: numeric, 250 unique values, 0 missing
pixel366: numeric, 100 unique values, 0 missing
pixel370: numeric, 243 unique values, 0 missing
pixel388: numeric, 248 unique values, 0 missing
pixel391: numeric, 227 unique values, 0 missing
pixel403: numeric, 251 unique values, 0 missing
pixel418: numeric, 227 unique values, 0 missing
pixel431: numeric, 251 unique values, 0 missing
pixel433: numeric, 250 unique values, 0 missing
pixel456: numeric, 252 unique values, 0 missing
pixel460: numeric, 248 unique values, 0 missing
pixel464: numeric, 248 unique values, 0 missing
pixel467: numeric, 250 unique values, 0 missing
pixel469: numeric, 252 unique values, 0 missing
pixel474: numeric, 225 unique values, 0 missing
pixel477: numeric, 143 unique values, 0 missing
pixel485: numeric, 256 unique values, 0 missing
pixel488: numeric, 247 unique values, 0 missing
pixel496: numeric, 252 unique values, 0 missing
pixel499: numeric, 248 unique values, 0 missing
pixel503: numeric, 236 unique values, 0 missing
pixel506: numeric, 212 unique values, 0 missing
pixel514: numeric, 252 unique values, 0 missing
pixel516: numeric, 254 unique values, 0 missing
pixel526: numeric, 250 unique values, 0 missing
pixel527: numeric, 246 unique values, 0 missing
pixel542: numeric, 253 unique values, 0 missing
pixel551: numeric, 254 unique values, 0 missing
pixel553: numeric, 253 unique values, 0 missing
pixel556: numeric, 243 unique values, 0 missing
pixel564: numeric, 234 unique values, 0 missing
pixel573: numeric, 252 unique values, 0 missing
pixel589: numeric, 108 unique values, 0 missing
pixel601: numeric, 243 unique values, 0 missing
pixel628: numeric, 240 unique values, 0 missing
pixel629: numeric, 251 unique values, 0 missing
pixel634: numeric, 250 unique values, 0 missing
pixel636: numeric, 253 unique values, 0 missing
pixel647: numeric, 166 unique values, 0 missing
pixel655: numeric, 248 unique values, 0 missing
pixel665: numeric, 248 unique values, 0 missing
pixel681: numeric, 249 unique values, 0 missing
pixel684: numeric, 245 unique values, 0 missing
pixel685: numeric, 250 unique values, 0 missing
pixel689: numeric, 245 unique values, 0 missing
pixel699: numeric, 134 unique values, 0 missing
pixel701: numeric, 31 unique values, 0 missing
pixel707: numeric, 229 unique values, 0 missing
pixel709: numeric, 243 unique values, 0 missing
pixel713: numeric, 243 unique values, 0 missing
pixel719: numeric, 250 unique values, 0 missing
pixel742: numeric, 245 unique values, 0 missing
pixel747: numeric, 240 unique values, 0 missing
pixel760: numeric, 83 unique values, 0 missing
pixel770: numeric, 230 unique values, 0 missing
pixel775: numeric, 221 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 2000
Number of attributes (columns) of the dataset: 101
Number of distinct values of the target attribute (if it is nominal): 10
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 100
Number of nominal attributes: 1
Percentage of instances belonging to the least frequent class: 10
Number of instances belonging to the least frequent class: 200
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Average class difference between consecutive instances: 0.1
Percentage of missing values: 0
Number of attributes divided by the number of instances: 0.05
Percentage of numeric attributes: 99.01
Percentage of instances belonging to the most frequent class: 10
Percentage of nominal attributes: 0.99
Number of instances belonging to the most frequent class: 200
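Several of the properties above are simple ratios of the raw counts. The short check below recomputes them from the counts listed on this page; the variable names are illustrative and not part of any OpenML API.

```python
# Raw counts as listed in the properties of this dataset.
n_instances = 2000
n_numeric = 100
n_nominal = 1
n_features = n_numeric + n_nominal  # 101, including the class target
minority_class_size = 200
majority_class_size = 200

# Derived ratios, rounded to two decimals as displayed on the page.
dimensionality = round(n_features / n_instances, 2)
pct_numeric = round(100 * n_numeric / n_features, 2)
pct_nominal = round(100 * n_nominal / n_features, 2)
pct_minority = 100 * minority_class_size / n_instances
pct_majority = 100 * majority_class_size / n_instances

print(dimensionality)  # 0.05
print(pct_numeric)     # 99.01
print(pct_nominal)     # 0.99
print(pct_minority)    # 10.0
print(pct_majority)    # 10.0
```

The equal minority and majority percentages confirm that the ten classes are perfectly balanced at 200 rows each.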

0 tasks