{ "data_id": "44312", "name": "Meta_Album_PNU_Micro", "exact_name": "Meta_Album_PNU_Micro", "version": 2, "version_label": null, "description": "## **Meta-Album PanNuke Dataset (Micro)**\n***\nThe PanNuke dataset (https:\/\/jgamper.github.io\/PanNukeDataset\/) is a semi-automatically generated segmentation and classification task of nuclei. The dataset contains 7 753 images of 19 different tissue types. For the Meta-Album meta-dataset, even though this dataset was designed as a segmentation task, we were able to transform it into a tissue classification task since we had the tissue type for each sample in the dataset. We also resized the images to 128x128 pixels and applied stain normalization to avoid bias and remove some spurious features. \n\n\n\n### **Dataset Details**\n![](https:\/\/meta-album.github.io\/assets\/img\/samples\/PNU.png)\n\n**Meta Album ID**: MCR.PNU \n**Meta Album URL**: [https:\/\/meta-album.github.io\/datasets\/PNU.html](https:\/\/meta-album.github.io\/datasets\/PNU.html) \n**Domain ID**: MCR \n**Domain Name**: Microscopic \n**Dataset ID**: PNU \n**Dataset Name**: PanNuke \n**Short Description**: 19 Human Tissues Dataset \n**\\# Classes**: 19 \n**\\# Images**: 760 \n**Keywords**: microscopic, human tissues \n**Data Format**: images \n**Image size**: 128x128 \n\n**License (original data release)**: Attribution-NonCommercial-ShareAlike 4.0 International \n**License URL (original data release)**: https:\/\/warwick.ac.uk\/fac\/cross_fac\/tia\/data\/pannuke\nhttps:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\n \n**License (Meta-Album data release)**: Attribution-NonCommercial-ShareAlike 4.0 International \n**License URL (Meta-Album data release)**: [https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/](https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/) \n\n**Source**: PanNuke: An Open Pan-Cancer Histology Dataset for Nuclei Instance Segmentation and Classification \n**Source URL**: https:\/\/jgamper.github.io\/PanNukeDataset\/ \n 
\n**Original Author**: Gamper, Jevgenij and Koohbanani, Navid Alemi and Benet, Ksenija and Khuram, Ali and Rajpoot, Nasir \n**Original contact**: j.gamper@warwick.ac.uk \n\n**Meta Album author**: Romain Mussard \n**Created Date**: 01 March 2022 \n**Contact Name**: Ihsan Ullah \n**Contact Email**: meta-album@chalearn.org \n**Contact URL**: [https:\/\/meta-album.github.io\/](https:\/\/meta-album.github.io\/) \n\n\n\n### **Cite this dataset**\n```\n@inproceedings{gamper2019pannuke,\n title={PanNuke: an open pan-cancer histology dataset for nuclei instance segmentation and classification},\n author={Gamper, Jevgenij and Koohbanani, Navid Alemi and Benet, Ksenija and Khuram, Ali and Rajpoot, Nasir},\n booktitle={European Congress on Digital Pathology},\n pages={11--19},\n year={2019},\n organization={Springer}\n}\n\n@article{gamper2020pannuke,\n title={PanNuke Dataset Extension, Insights and Baselines},\n author={Gamper, Jevgenij and Koohbanani, Navid Alemi and Graham, Simon and Jahanifar, Mostafa and Khurram, Syed Ali and Azam, Ayesha and Hewitt, Katherine and Rajpoot, Nasir},\n journal={arXiv preprint arXiv:2003.10778},\n year={2020}\n}\n```\n\n\n### **Cite Meta-Album**\n```\n@inproceedings{meta-album-2022,\n title={Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification},\n author={Ullah, Ihsan and Carrion, Dustin and Escalera, Sergio and Guyon, Isabelle M and Huisman, Mike and Mohr, Felix and van Rijn, Jan N and Sun, Haozhe and Vanschoren, Joaquin and Vu, Phan Anh},\n booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},\n url = {https:\/\/meta-album.github.io\/},\n year = {2022}\n }\n```\n\n\n### **More**\nFor more information on the Meta-Album dataset, please see the [[NeurIPS 2022 paper]](https:\/\/meta-album.github.io\/paper\/Meta-Album.pdf) \nFor details on the dataset preprocessing, please see the [[supplementary 
materials]](https:\/\/openreview.net\/attachment?id=70_Wx-dON3q&name=supplementary_material) \nSupporting code can be found on our [[GitHub repo]](https:\/\/github.com\/ihsaan-ullah\/meta-album) \nMeta-Album on Papers with Code [[Meta-Album]](https:\/\/paperswithcode.com\/dataset\/meta-album) \n\n\n\n### **Other versions of this dataset**\n[[Mini]](https:\/\/www.openml.org\/d\/44297) [[Extended]](https:\/\/www.openml.org\/d\/44330) ", "format": "arff", "uploader": "Meta Album", "uploader_id": 30980, "visibility": "public", "creator": "\"Ihsan Ullah\"", "contributor": null, "date": "2022-11-04 12:13:35", "update_comment": null, "last_update": "2022-11-04 12:13:35", "licence": "CC BY-NC 4.0", "status": "active", "error_message": null, "url": "https:\/\/api.openml.org\/data\/download\/22111024\/dataset", "default_target_attribute": "CATEGORY", "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "Meta_Album_PNU_Micro", "## **Meta-Album PanNuke Dataset (Micro)** The PanNuke dataset(https:\/\/jgamper.github.io\/PanNukeDataset\/) is a semi-automatically generated segmentation and classification task of nuclei. The dataset contains 7 753 images of 19 different tissue types. For the Meta-Album meta-dataset, even though this dataset was designed as a segmentation task, we were able to transform it into a tissue classification task since we had the tissue type for each sample in the dataset. 
We also resized the images to " ], "weight": 5 }, "qualities": { "NumberOfInstances": 760, "NumberOfFeatures": 3, "NumberOfClasses": 19, "NumberOfMissingValues": 760, "NumberOfInstancesWithMissingValues": 760, "NumberOfNumericFeatures": 1, "NumberOfSymbolicFeatures": 0, "PercentageOfBinaryFeatures": 0, "PercentageOfInstancesWithMissingValues": 100, "PercentageOfMissingValues": 33.33333333333333, "AutoCorrelation": 1, "PercentageOfNumericFeatures": 33.33333333333333, "Dimensionality": 0.003947368421052632, "PercentageOfSymbolicFeatures": 0, "MajorityClassPercentage": 5.263157894736842, "MajorityClassSize": 40, "MinorityClassPercentage": 5.263157894736842, "MinorityClassSize": 40, "NumberOfBinaryFeatures": 0 }, "tags": [ { "uploader": "38960", "tag": "Chemistry" }, { "uploader": "38960", "tag": "Computer Systems" } ], "features": [ { "name": "CATEGORY", "index": "1", "type": "string", "distinct": "19", "missing": "0", "target": "1" }, { "name": "FILE_NAME", "index": "0", "type": "string", "distinct": "760", "missing": "0" }, { "name": "SUPER_CATEGORY", "index": "2", "type": "numeric", "distinct": "0", "missing": "760", "min": "2147483647", "max": "0", "mean": "0", "stdev": "0" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 1, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 1 }