{ "data_id": "44308", "name": "Meta_Album_PRT_Mini", "exact_name": "Meta_Album_PRT_Mini", "version": 1, "version_label": null, "description": "## **Meta-Album Subcellular Human Protein Dataset (Mini)**\n***\nThis dataset is a subset of the Subcellular dataset in the Protein Atlas project (https:\/\/www.proteinatlas.org\/). The original dataset, which stems from the Human Protein Atlas Image Classification Kaggle competition (https:\/\/www.kaggle.com\/competitions\/human-protein-atlas-image-classification), comprises 31 072 RGBY images of size 512x512 px, each of which belongs to one or more of 28 classes. The labels correspond to protein organelle localizations. For Meta-Album, we performed two modifications: (1) to turn the dataset into a multi-class dataset, we dropped all images belonging to more than one class, as well as images belonging to classes with fewer than 40 members; (2) we converted the remaining images to RGB by dropping the yellow channel, which was also common practice in the competition. Finally, as for all datasets in Meta-Album, the images from the original dataset were resized to 128x128 px. 
\n\n\n\n### **Dataset Details**\n![](https:\/\/meta-album.github.io\/assets\/img\/samples\/PRT.png)\n\n**Meta Album ID**: MCR.PRT \n**Meta Album URL**: [https:\/\/meta-album.github.io\/datasets\/PRT.html](https:\/\/meta-album.github.io\/datasets\/PRT.html) \n**Domain ID**: MCR \n**Domain Name**: Microscopy \n**Dataset ID**: PRT \n**Dataset Name**: Subcellular Human Protein \n**Short Description**: Subcellular protein patterns in human cells \n**\\# Classes**: 21 \n**\\# Images**: 840 \n**Keywords**: human protein, subcellular \n**Data Format**: images \n**Image size**: 128x128 \n\n**License (original data release)**: CC BY-SA 3.0 \n**License URL(original data release)**: https:\/\/www.proteinatlas.org\/about\/licence\n \n**License (Meta-Album data release)**: CC BY-SA 3.0 \n**License URL (Meta-Album data release)**: [https:\/\/www.proteinatlas.org\/about\/licence](https:\/\/www.proteinatlas.org\/about\/licence) \n\n**Source**: The Human Protein Atlas \n**Source URL**: https:\/\/proteinatlas.org \nhttps:\/\/www.kaggle.com\/c\/human-protein-atlas-image-classification \n \n**Original Author**: Peter J Thul, Lovisa Akesson, Mikaela Wiking, Diana Mahdessian, Aikaterini Geladaki, Hammou Ait Blal, Tove Alm, Anna Asplund, Lars Bjork, Lisa Breckels, and others \n**Original contact**: contact@proteinatlas.org \n\n**Meta Album author**: Felix Mohr \n**Created Date**: 01 June 2022 \n**Contact Name**: Felix Mohr \n**Contact Email**: meta-album@chalearn.org \n**Contact URL**: [https:\/\/meta-album.github.io\/](https:\/\/meta-album.github.io\/) \n\n\n\n### **Cite this dataset**\n```\n@article{thul2017subcellular, \n title={A subcellular map of the human proteome},\n author={Thul, Peter J and Akesson, Lovisa and Wiking, Mikaela and Mahdessian, Diana and Geladaki, Aikaterini and Ait Blal, Hammou and Alm, Tove and Asplund, Anna and Bjork, Lars and Breckels, Lisa M},\n journal={Science},\n volume={356},\n number={6340},\n year={2017},\n publisher={American Association for the 
Advancement of Science}\n}\n\n```\n\n\n### **Cite Meta-Album**\n```\n@inproceedings{meta-album-2022,\n title={Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification},\n author={Ullah, Ihsan and Carrion, Dustin and Escalera, Sergio and Guyon, Isabelle M and Huisman, Mike and Mohr, Felix and van Rijn, Jan N and Sun, Haozhe and Vanschoren, Joaquin and Vu, Phan Anh},\n booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},\n url = {https:\/\/meta-album.github.io\/},\n year = {2022}\n }\n```\n\n\n### **More**\nFor more information on the Meta-Album dataset, please see the [[NeurIPS 2022 paper]](https:\/\/meta-album.github.io\/paper\/Meta-Album.pdf) \nFor details on the dataset preprocessing, please see the [[supplementary materials]](https:\/\/openreview.net\/attachment?id=70_Wx-dON3q&name=supplementary_material) \nSupporting code can be found on our [[GitHub repo]](https:\/\/github.com\/ihsaan-ullah\/meta-album) \nMeta-Album on Papers with Code [[Meta-Album]](https:\/\/paperswithcode.com\/dataset\/meta-album) \n\n\n\n### **Other versions of this dataset**\n[[Micro]](https:\/\/www.openml.org\/d\/44278) [[Extended]](https:\/\/www.openml.org\/d\/44342) ", "format": "arff", "uploader": "Meta Album", "uploader_id": 30980, "visibility": "public", "creator": "\"Ihsan Ullah\"", "contributor": null, "date": "2022-10-28 16:28:42", "update_comment": null, "last_update": "2022-10-28 16:28:42", "licence": "CC BY-NC 4.0", "status": "active", "error_message": null, "url": "https:\/\/api.openml.org\/data\/download\/22111008\/dataset", "default_target_attribute": "CATEGORY", "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "Meta_Album_PRT_Mini", "## **Meta-Album Subcellular Human Protein Dataset (Mini)** This dataset is a subset of the Subcellular dataset in the Protein Atlas project(https:\/\/www.proteinatlas.org\/). 
The original dataset, which stems from the Human Protein Atlas Image Classification Kaggle competition(https:\/\/www.kaggle.com\/competitions\/human-protein-atlas-image-classification), comprises 31 072 RGBY images of size 512x512 px, each of which belongs to one or more out of 28 classes. The labels correspond to protein organell " ], "weight": 5 }, "qualities": { "NumberOfInstances": 840, "NumberOfFeatures": 3, "NumberOfClasses": 21, "NumberOfMissingValues": 840, "NumberOfInstancesWithMissingValues": 840, "NumberOfNumericFeatures": 1, "NumberOfSymbolicFeatures": 0, "PercentageOfBinaryFeatures": 0, "PercentageOfInstancesWithMissingValues": 100, "AutoCorrelation": 1, "PercentageOfMissingValues": 33.33333333333333, "Dimensionality": 0.0035714285714285713, "PercentageOfNumericFeatures": 33.33333333333333, "MajorityClassPercentage": 4.761904761904762, "PercentageOfSymbolicFeatures": 0, "MajorityClassSize": 40, "MinorityClassPercentage": 4.761904761904762, "MinorityClassSize": 40, "NumberOfBinaryFeatures": 0 }, "tags": [ { "uploader": "38960", "tag": "Chemistry" } ], "features": [ { "name": "CATEGORY", "index": "1", "type": "string", "distinct": "21", "missing": "0", "target": "1" }, { "name": "FILE_NAME", "index": "0", "type": "string", "distinct": "840", "missing": "0" }, { "name": "SUPER_CATEGORY", "index": "2", "type": "numeric", "distinct": "0", "missing": "840", "min": "2147483647", "max": "0", "mean": "0", "stdev": "0" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 1, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 1 }