{ "data_id": "44296", "name": "Meta_Album_MD_5_BIS_Mini", "exact_name": "Meta_Album_MD_5_BIS_Mini", "version": 1, "version_label": null, "description": "## **Meta-Album OmniPrint-MD-5-bis Dataset (Mini)**\n***\nOmniPrint-MD-5-bis dataset consists of 28 240 images (128x128, RGB) from 706 categories. The images are synthesized with OmniPrint, and no further processing was done. The OmniPrint synthesis parameters are stated as follows: font size is 192, image size is 128, the strength of random perspective transformation is 0.04, left\/right\/top\/bottom margins are all 20% of the image size, the strength of pre-rasterization elastic transformation is 0.035, random translation is activated both horizontally and vertically, image blending method is Poisson Image Editing, rotation is within -60 and 60 degrees, horizontal shear is within -0.5 and 0.5, the foreground is filled with a random color, the background consists of images downloaded from Pexels(https:\/\/www.pexels.com\/). \n\n\n\n### **Dataset Details**\n![](https:\/\/meta-album.github.io\/assets\/img\/samples\/MD_5_BIS.png)\n\n**Meta Album ID**: OCR.MD_5_BIS \n**Meta Album URL**: [https:\/\/meta-album.github.io\/datasets\/MD_5_BIS.html](https:\/\/meta-album.github.io\/datasets\/MD_5_BIS.html) \n**Domain ID**: OCR \n**Domain Name**: Optical Character Recognition \n**Dataset ID**: MD_5_BIS \n**Dataset Name**: OmniPrint-MD-5-bis \n**Short Description**: Character images with a specific set of nuisance parameters \n**\\# Classes**: 706 \n**\\# Images**: 28240 \n**Keywords**: ocr \n**Data Format**: images \n**Image size**: 128x128 \n\n**License (original data release)**: CC BY 4.0 \n**License URL(original data release)**: https:\/\/creativecommons.org\/licenses\/by\/4.0\/\n \n**License (Meta-Album data release)**: CC BY 4.0 \n**License URL (Meta-Album data release)**: [https:\/\/creativecommons.org\/licenses\/by\/4.0\/](https:\/\/creativecommons.org\/licenses\/by\/4.0\/) \n\n**Source**: OmniPrint \n**Source URL**: https:\/\/github.com\/SunHaozhe\/OmniPrint \n \n**Original Author**: Haozhe Sun \n**Original contact**: sunhaozhe275940200@gmail.com \n\n**Meta Album author**: Haozhe Sun \n**Created Date**: 25 June 2021 \n**Contact Name**: Haozhe Sun \n**Contact Email**: meta-album@chalearn.org \n**Contact URL**: [https:\/\/meta-album.github.io\/](https:\/\/meta-album.github.io\/) \n\n\n\n### **Cite this dataset**\n```\n@inproceedings{sun2021omniprint,\n title={OmniPrint: A Configurable Printed Character Synthesizer},\n author={Haozhe Sun and Wei-Wei Tu and Isabelle M Guyon},\n booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},\n year={2021},\n url={https:\/\/openreview.net\/forum?id=R07XwJPmgpl}\n}\n```\n\n\n### **Cite Meta-Album**\n```\n@inproceedings{meta-album-2022,\n title={Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification},\n author={Ullah, Ihsan and Carrion, Dustin and Escalera, Sergio and Guyon, Isabelle M and Huisman, Mike and Mohr, Felix and van Rijn, Jan N and Sun, Haozhe and Vanschoren, Joaquin and Vu, Phan Anh},\n booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},\n url = {https:\/\/meta-album.github.io\/},\n year = {2022}\n }\n```\n\n\n### **More**\nFor more information on the Meta-Album dataset, please see the [[NeurIPS 2022 paper]](https:\/\/meta-album.github.io\/paper\/Meta-Album.pdf) \nFor details on the dataset preprocessing, please see the [[supplementary materials]](https:\/\/openreview.net\/attachment?id=70_Wx-dON3q&name=supplementary_material) \nSupporting code can be found on our [[GitHub repo]](https:\/\/github.com\/ihsaan-ullah\/meta-album) \nMeta-Album on Papers with Code [[Meta-Album]](https:\/\/paperswithcode.com\/dataset\/meta-album) \n\n\n\n### **Other versions of this dataset**\n[[Micro]](https:\/\/www.openml.org\/d\/44252) ", "format": "arff", "uploader": "Meta Album", "uploader_id": 30980, "visibility": "public", "creator": "\"Ihsan Ullah\"", "contributor": null, "date": "2022-10-28 16:20:00", "update_comment": null, "last_update": "2022-10-28 16:20:00", "licence": "CC BY-NC 4.0", "status": "active", "error_message": null, "url": "https:\/\/api.openml.org\/data\/download\/22110996\/dataset", "kaggle_url": null, "default_target_attribute": "CATEGORY", "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "Meta_Album_MD_5_BIS_Mini", "## **Meta-Album OmniPrint-MD-5-bis Dataset (Mini)** OmniPrint-MD-5-bis dataset consists of 28 240 images (128x128, RGB) from 706 categories. The images are synthesized with OmniPrint, and no further processing was done. The OmniPrint synthesis parameters are stated as follows: font size is 192, image size is 128, the strength of random perspective transformation is 0.04, left\/right\/top\/bottom margins are all 20% of the image size, the strength of pre-rasterization elastic transformation is 0.035 " ], "weight": 5 }, "qualities": { "NumberOfInstances": 28240, "NumberOfFeatures": 39, "NumberOfClasses": 0, "NumberOfMissingValues": 2, "NumberOfInstancesWithMissingValues": 2, "NumberOfNumericFeatures": 24, "NumberOfSymbolicFeatures": 1, "PercentageOfBinaryFeatures": 2.564102564102564, "PercentageOfInstancesWithMissingValues": 0.0070821529745042485, "PercentageOfMissingValues": 0.00018159366601292948, "AutoCorrelation": -3431.3033747653953, "PercentageOfNumericFeatures": 61.53846153846154, "Dimensionality": 0.0013810198300283287, "PercentageOfSymbolicFeatures": 2.564102564102564, "MajorityClassPercentage": null, "MajorityClassSize": null, "MinorityClassPercentage": null, "MinorityClassSize": null, "NumberOfBinaryFeatures": 1 }, "tags": [], "features": [ { "name": "CATEGORY", "index": "2", "type": "numeric", "distinct": "706", "missing": "0", "target": "1", "min": "48", "max": "12436", "mean": "4269", "stdev": "3327" }, { "name": "FILE_NAME", "index": "0", "type": "string", "distinct": "28240", "missing": "0" }, { "name": "text", "index": "1", "type": "string", "distinct": "706", "missing": "0" }, { "name": "font_file", "index": "3", "type": "string", "distinct": "853", "missing": "0" }, { "name": "background", "index": "4", "type": "string", "distinct": "1", "missing": "0" }, { "name": "background_image_crop_x", "index": "5", "type": "numeric", "distinct": "3239", "missing": "0", "min": "0", "max": "7544", "mean": "1978", "stdev": "1353" }, { "name": "background_image_crop_x_plus_width", "index": "6", "type": "numeric", "distinct": "3239", "missing": "0", "min": "128", "max": "7672", "mean": "2106", "stdev": "1353" }, { "name": "background_image_crop_y", "index": "7", "type": "numeric", "distinct": "3177", "missing": "0", "min": "0", "max": "7647", "mean": "1938", "stdev": "1341" }, { "name": "background_image_crop_y_plus_height", "index": "8", "type": "numeric", "distinct": "3177", "missing": "0", "min": "128", "max": "7775", "mean": "2066", "stdev": "1341" }, { "name": "background_image_name", "index": "9", "type": "string", "distinct": "20", "missing": "0" }, { "name": "background_image_original_height", "index": "10", "type": "numeric", "distinct": "13", "missing": "0", "min": "2304", "max": "7952", "mean": "4082", "stdev": "1300" }, { "name": "background_image_original_width", "index": "11", "type": "numeric", "distinct": "14", "missing": "0", "min": "2250", "max": "7680", "mean": "4059", "stdev": "1306" }, { "name": "background_image_resized_height", "index": "12", "type": "numeric", "distinct": "13", "missing": "0", "min": "2304", "max": "7952", "mean": "4082", "stdev": "1300" }, { "name": "background_image_resized_width", "index": "13", "type": "numeric", "distinct": "14", "missing": "0", "min": "2250", "max": "7680", "mean": "4059", "stdev": "1306" }, { "name": "font_size", "index": "14", "type": "numeric", "distinct": "1", "missing": "0", "min": "192", "max": "192", "mean": "192", "stdev": "0" }, { "name": "font_weight", "index": "15", "type": "numeric", "distinct": "1", "missing": "0", "min": "400", "max": "400", "mean": "400", "stdev": "0" }, { "name": "foreground", "index": "16", "type": "string", "distinct": "1", "missing": "0" }, { "name": "image_blending_method", "index": "17", "type": "string", "distinct": "1", "missing": "0" }, { "name": "image_height_resolution", "index": "18", "type": "numeric", "distinct": "1", "missing": "0", "min": "128", "max": "128", "mean": "128", "stdev": "0" }, { "name": "image_mode", "index": "19", "type": "string", "distinct": "1", "missing": "0" }, { "name": "image_width_resolution", "index": "20", "type": "numeric", "distinct": "1", "missing": "0", "min": "128", "max": "128", "mean": "128", "stdev": "0" }, { "name": "margin_bottom", "index": "21", "type": "numeric", "distinct": "1", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "margin_left", "index": "22", "type": "numeric", "distinct": "1", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "margin_right", "index": "23", "type": "numeric", "distinct": "1", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "margin_top", "index": "24", "type": "numeric", "distinct": "1", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "offset_horizontal", "index": "25", "type": "numeric", "distinct": "272", "missing": "0", "min": "0", "max": "327", "mean": "68", "stdev": "46" }, { "name": "offset_vertical", "index": "26", "type": "numeric", "distinct": "313", "missing": "0", "min": "0", "max": "450", "mean": "68", "stdev": "48" }, { "name": "original_image_height_resolution", "index": "27", "type": "numeric", "distinct": "297", "missing": "0", "min": "105", "max": "675", "mean": "286", "stdev": "67" }, { "name": "original_image_width_resolution", "index": "28", "type": "numeric", "distinct": "297", "missing": "0", "min": "105", "max": "675", "mean": "286", "stdev": "67" }, { "name": "perspective_params", "index": "29", "type": "string", "distinct": "27107", "missing": "0" }, { "name": "pre_elastic", "index": "30", "type": "numeric", "distinct": "1", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "rotation", "index": "31", "type": "numeric", "distinct": "3440", "missing": "0", "min": "-60", "max": "60", "mean": "-1", "stdev": "35" }, { "name": "shear_x", "index": "32", "type": "numeric", "distinct": "3440", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "stroke_fill", "index": "33", "type": "string", "distinct": "5536", "missing": "0" }, { "name": "SUPER_CATEGORY", "index": "34", "type": "string", "distinct": "23", "missing": "0" }, { "name": "family_name", "index": "35", "type": "string", "distinct": "528", "missing": "0" }, { "name": "style_name", "index": "36", "type": "string", "distinct": "48", "missing": "0" }, { "name": "postscript_name", "index": "37", "type": "string", "distinct": "812", "missing": "2" }, { "name": "variable_font_weight", "index": "38", "type": "nominal", "distinct": "2", "missing": "0", "distr": [] } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }