active ARFF CC BY-NC 4.0 Visibility: public Uploaded 08-11-2022 by Meta Album

## Meta-Album DIBaS Dataset (Extended)

The Digital Images of Bacteria Species dataset (DIBaS) is a dataset of 33 bacterial species with around 20 images for each species. For Meta-Album, since the images were large (2048x1532) with very few samples in each class, we decided to split each image into several smaller images before resizing them to 128x128. We then obtained a preprocessed dataset of 4060 images with at least 108 images for each class. This dataset was also preprocessed with blob normalization techniques, which is quite unusual for this type of image. The goal of this transformation was to reduce the importance of color in decision-making for a bias-aware challenge.

### Dataset Details

- Meta Album ID: MCR.BCT
- Meta Album URL:
- Domain ID: MCR
- Domain Name: Microscopic
- Dataset ID: BCT
- Dataset Name: DIBaS
- Short Description: Digital Image of Bacterial Species (DIBaS)
- \# Classes: 33
- \# Images: 4060
- Keywords: microscopic, bacteria
- Data Format: images
- Image size: 128x128
- License (original data release): Public for researchers
- License URL (original data release):
- License (Meta-Album data release): CC BY-NC 4.0
- License URL (Meta-Album data release):
- Source: Digital Image of Bacterial Species (DIBaS)
- Source URL:
- Original Author: Bartosz Zielinski, Anna Plichta, Krzysztof Misztal, Przemyslaw Spurek, Monika Brzychczy-Wloch, Dorota Ochonska
- Original contact:
- Meta Album author: Romain Mussard
- Created Date: 01 March 2022
- Contact Name: Ihsan Ullah
- Contact Email:
- Contact URL:

### Cite this dataset

```
@article{10.1371/journal.pone.0184554,
    doi = {10.1371/journal.pone.0184554},
    author = {Zielinski, Bartosz AND Plichta, Anna AND Misztal, Krzysztof AND Spurek, Przemyslaw AND Brzychczy-Wloch, Monika AND Ochonska, Dorota},
    journal = {PLOS ONE},
    publisher = {Public Library of Science},
    title = {Deep learning approach to bacterial colony classification},
    year = {2017},
    month = {09},
    volume = {12},
    url = {},
    pages = {1-14},
    number = {9}
}
```

### Cite Meta-Album

```
@inproceedings{meta-album-2022,
    title={Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification},
    author={Ullah, Ihsan and Carrion, Dustin and Escalera, Sergio and Guyon, Isabelle M and Huisman, Mike and Mohr, Felix and van Rijn, Jan N and Sun, Haozhe and Vanschoren, Joaquin and Vu, Phan Anh},
    booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
    url = {},
    year = {2022}
}
```

### More

For more information on the Meta-Album dataset, please see the NeurIPS 2022 paper. For details on the dataset preprocessing, please see the supplementary materials. Supporting code can be found on our GitHub repo. Meta-Album is also listed on Papers with Code.

### Other versions of this dataset

- Micro
- Mini
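The preprocessing mentioned in the description (splitting each large 2048x1532 image into several smaller images, then resizing them to 128x128) can be sketched as follows. This is only an illustration under stated assumptions: the description does not say how the images were split, so the 3x2 grid, the function names `tile_boxes` and `split_and_resize`, and the use of Pillow are all hypothetical choices, not the Meta-Album authors' actual pipeline.

```python
from typing import List, Tuple

# (left, upper, right, lower), the box layout that Pillow's Image.crop expects
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, cols: int, rows: int) -> List[Box]:
    """Split a width x height image into a cols x rows grid of crop boxes."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            upper, lower = r * height // rows, (r + 1) * height // rows
            boxes.append((left, upper, right, lower))
    return boxes


def split_and_resize(path: str, cols: int = 3, rows: int = 2, out_size: int = 128):
    """Crop one image into grid tiles and resize each tile to out_size x out_size.

    The grid shape is an assumption; the source only says "several smaller images".
    """
    from PIL import Image  # imported lazily so tile_boxes works without Pillow

    with Image.open(path) as img:
        return [
            img.crop(box).resize((out_size, out_size))
            for box in tile_boxes(img.width, img.height, cols, rows)
        ]


# Crop boxes for one original-size image under the assumed 3x2 grid.
boxes = tile_boxes(2048, 1532, cols=3, rows=2)
print(len(boxes), boxes[0], boxes[-1])
```

Under this assumed grid the tiles are not square, so resizing to 128x128 changes their aspect ratio slightly; a real pipeline might crop square patches instead.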

3 features

- CATEGORY (target): string, 33 unique values, 0 missing
- FILE_NAME: string, 4060 unique values, 0 missing
- SUPER_CATEGORY: numeric, 0 unique values, 4060 missing
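The per-feature figures above (unique values and missing counts) are simple column statistics. A minimal sketch of how such a summary could be computed from label rows is shown below; the helper `summarize_columns` and the toy rows, including their file names and class labels, are invented for illustration and are not taken from the dataset.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def summarize_columns(rows: List[Row]) -> Dict[str, Dict[str, int]]:
    """Count distinct non-missing values and missing values for each column."""
    summary: Dict[str, Dict[str, int]] = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        present = [v for v in values if v is not None]
        summary[column] = {
            "unique": len(set(present)),
            "missing": len(values) - len(present),
        }
    return summary


# Toy rows with made-up values; None stands for a missing entry.
rows: List[Row] = [
    {"FILE_NAME": "img_0.jpg", "CATEGORY": "species_a", "SUPER_CATEGORY": None},
    {"FILE_NAME": "img_1.jpg", "CATEGORY": "species_a", "SUPER_CATEGORY": None},
    {"FILE_NAME": "img_2.jpg", "CATEGORY": "species_b", "SUPER_CATEGORY": None},
]

summary = summarize_columns(rows)
for name, stats in summary.items():
    print(name, stats)
```

On the toy rows, SUPER_CATEGORY comes out with zero unique values and every entry missing, the same pattern the feature list reports for the full dataset.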

19 properties

Number of instances (rows) of the dataset.
Number of attributes (columns) of the dataset.
Number of distinct values of the target attribute (if it is nominal).
Number of missing values in the dataset.
Number of instances with at least one value missing.
Number of numeric attributes.
Number of nominal attributes.
Number of instances belonging to the most frequent class.
Percentage of instances belonging to the least frequent class.
Number of instances belonging to the least frequent class.
Number of binary attributes.
Percentage of binary attributes.
Percentage of instances having missing values.
Percentage of missing values.
Average class difference between consecutive instances.
Percentage of numeric attributes.
Number of attributes divided by the number of instances.
Percentage of nominal attributes.
Percentage of instances belonging to the most frequent class.
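The property values themselves are not preserved on this page, but several of them follow from the feature list (4060 instances, 3 attributes, one numeric attribute, and SUPER_CATEGORY missing in every row). The sketch below works through that arithmetic. It assumes that the percentage of missing values is taken over all cells (instances times attributes) and that the target column counts as an attribute; both are assumptions about how the properties are defined, and the function name `derived_properties` is hypothetical.

```python
from typing import Dict


def derived_properties(
    n_instances: int,
    n_attributes: int,
    n_numeric: int,
    n_missing_values: int,
    n_instances_with_missing: int,
) -> Dict[str, float]:
    """Derive ratio-style properties from basic dataset counts."""
    n_cells = n_instances * n_attributes
    return {
        # Number of attributes divided by the number of instances.
        "dimensionality": n_attributes / n_instances,
        # Assumed to be computed over every cell of the table.
        "percentage_missing_values": 100 * n_missing_values / n_cells,
        "percentage_instances_with_missing": 100 * n_instances_with_missing / n_instances,
        "percentage_numeric_attributes": 100 * n_numeric / n_attributes,
    }


# Counts read off the feature list: every row lacks SUPER_CATEGORY,
# so all 4060 missing values sit in that single numeric column.
props = derived_properties(
    n_instances=4060,
    n_attributes=3,
    n_numeric=1,
    n_missing_values=4060,
    n_instances_with_missing=4060,
)
for name, value in props.items():
    print(f"{name}: {value:.6f}")
```

Because one of the three columns is entirely empty, every instance has a missing value while only a third of all cells are missing.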

1 tasks

0 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: CATEGORY
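The task pairs 10-fold cross-validation with predictive accuracy on the CATEGORY target. A minimal, dependency-free sketch of that evaluation setup follows. It assumes stratified folds (each class spread evenly across folds), which the page does not state explicitly; the functions `stratified_folds` and `predictive_accuracy` and the toy labels are hypothetical and do not reproduce the platform's own split files.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[str], k: int = 10, seed: int = 0) -> List[List[int]]:
    """Assign instance indices to k folds, dealing each class out round-robin."""
    rng = random.Random(seed)
    by_class: Dict[str, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def predictive_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels: 3 classes with 20 instances each (illustrative only).
labels = ["a"] * 20 + ["b"] * 20 + ["c"] * 20
folds = stratified_folds(labels, k=10)
print([len(fold) for fold in folds])
print(predictive_accuracy(["a", "b"], ["a", "c"]))
```

In a full run, each fold serves once as the test set while the other nine form the training set, and the ten accuracy scores are averaged.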