{ "data_id": "44276", "name": "Meta_Album_INS_Micro", "exact_name": "Meta_Album_INS_Micro", "version": 1, "version_label": null, "description": "## **Meta-Album Insects Dataset (Micro)**\n***\nThe original Insects dataset was created by the National Museum of Natural History, Paris (https:\/\/www.mnhn.fr\/fr). It has more than 290 000 images in different sizes and orientations. The dataset has hierarchical classes which are listed from top to bottom as Order, Super-Family, Family, and Taxa. Each image contains an insect in its natural environment or habitat, i.e., either on a flower or near vegetation. The images were collected by researchers and hundreds of volunteers from the SPIPOLL Science project (https:\/\/www.spipoll.org\/). The images are uploaded to a centralized server via the SPIPOLL website, the Android application, or the iOS application. The preprocessed insect dataset is prepared from the original Insects dataset by carefully preprocessing the images, i.e., cropping the images from either side to make square images. These cropped images are then resized to 128x128 using OpenCV with an anti-aliasing filter. 
\n\n\n\n### **Dataset Details**\n![](https:\/\/meta-album.github.io\/assets\/img\/samples\/INS.png)\n\n**Meta Album ID**: SM_AM.INS \n**Meta Album URL**: [https:\/\/meta-album.github.io\/datasets\/INS.html](https:\/\/meta-album.github.io\/datasets\/INS.html) \n**Domain ID**: SM_AM \n**Domain Name**: Small Animals \n**Dataset ID**: INS \n**Dataset Name**: Insects \n**Short Description**: Insects dataset from Science Project SPIPOLL \n**\\# Classes**: 20 \n**\\# Images**: 800 \n**Keywords**: insects, ecology \n**Data Format**: images \n**Image size**: 128x128 \n\n**License (original data release)**: CC BY-NC 2.0 \n**License URL (original data release)**: https:\/\/www.spipoll.org\/mentions-legales\n \n**License (Meta-Album data release)**: CC BY-NC 2.0 \n**License URL (Meta-Album data release)**: [https:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/](https:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/) \n\n**Source**: SPIPOLL; National Museum of Natural History, Paris \n**Source URL**: https:\/\/www.spipoll.org\/ \n \n**Original Author**: Gregoire Lois, Colin Fontaine, Jean-Francois Julien \n**Original contact**: contact@spipoll.org \n\n**Meta Album author**: Ihsan Ullah \n**Created Date**: 01 March 2022 \n**Contact Name**: Ihsan Ullah \n**Contact Email**: meta-album@chalearn.org \n**Contact URL**: [https:\/\/meta-album.github.io\/](https:\/\/meta-album.github.io\/) \n\n\n\n### **Cite this dataset**\n```\n@article{insects, \n title={Data quality and participant engagement in citizen science: comparing two approaches for monitoring pollinators in France and South Korea}, \n author={Serret, Hortense and Deguines, Nicolas and Jang, Yikweon and Lois, Gregoire and Julliard, Romain}, \n journal={Citizen Science: Theory and Practice}, \n volume={4}, \n number={1}, \n pages={22}, \n year={2019} \n}\n```\n\n\n### **Cite Meta-Album**\n```\n@inproceedings{meta-album-2022,\n title={Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification},\n author={Ullah, 
Ihsan and Carrion, Dustin and Escalera, Sergio and Guyon, Isabelle M and Huisman, Mike and Mohr, Felix and van Rijn, Jan N and Sun, Haozhe and Vanschoren, Joaquin and Vu, Phan Anh},\n booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},\n url = {https:\/\/meta-album.github.io\/},\n year = {2022}\n }\n```\n\n\n### **More**\nFor more information on the Meta-Album dataset, please see the [[NeurIPS 2022 paper]](https:\/\/meta-album.github.io\/paper\/Meta-Album.pdf) \nFor details on the dataset preprocessing, please see the [[supplementary materials]](https:\/\/openreview.net\/attachment?id=70_Wx-dON3q&name=supplementary_material) \nSupporting code can be found on our [[GitHub repo]](https:\/\/github.com\/ihsaan-ullah\/meta-album) \nMeta-Album on Papers with Code [[Meta-Album]](https:\/\/paperswithcode.com\/dataset\/meta-album) \n\n\n\n### **Other versions of this dataset**\n[[Mini]](https:\/\/www.openml.org\/d\/44306) [[Extended]](https:\/\/www.openml.org\/d\/44340) ", "format": "arff", "uploader": "Meta Album", "uploader_id": 30980, "visibility": "public", "creator": "\"Ihsan Ullah\"", "contributor": null, "date": "2022-10-28 11:29:46", "update_comment": null, "last_update": "2022-10-28 11:29:46", "licence": "CC BY-NC 4.0", "status": "active", "error_message": null, "url": "https:\/\/api.openml.org\/data\/download\/22110976\/dataset", "default_target_attribute": "CATEGORY", "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "Meta_Album_INS_Micro", "## **Meta-Album Insects Dataset (Micro)** The original Insects dataset was created by the National Museum of Natural History, Paris (https:\/\/www.mnhn.fr\/fr). It has more than 290 000 images in different sizes and orientations. The dataset has hierarchical classes which are listed from top to bottom as Order, Super-Family, Family, and Taxa. 
Each image contains an insect in its natural environment or habitat, i.e., either on a flower or near vegetation. The images were collected by researchers " ], "weight": 5 }, "qualities": { "NumberOfInstances": 800, "NumberOfFeatures": 3, "NumberOfClasses": 20, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 0, "NumberOfSymbolicFeatures": 0, "PercentageOfBinaryFeatures": 0, "PercentageOfInstancesWithMissingValues": 0, "AutoCorrelation": 1, "PercentageOfMissingValues": 0, "Dimensionality": 0.00375, "PercentageOfNumericFeatures": 0, "MajorityClassPercentage": 5, "PercentageOfSymbolicFeatures": 0, "MajorityClassSize": 40, "MinorityClassPercentage": 5, "MinorityClassSize": 40, "NumberOfBinaryFeatures": 0 }, "tags": [ { "uploader": "38960", "tag": "Life Science" }, { "uploader": "38960", "tag": "Machine Learning" } ], "features": [ { "name": "CATEGORY", "index": "1", "type": "string", "distinct": "20", "missing": "0", "target": "1" }, { "name": "FILE_NAME", "index": "0", "type": "string", "distinct": "800", "missing": "0" }, { "name": "SUPER_CATEGORY", "index": "2", "type": "string", "distinct": "14", "missing": "0" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 1, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 1 }