{ "data_id": "232", "name": "fishcatch", "exact_name": "fishcatch", "version": 1, "version_label": "1", "description": "**Author**: \n**Source**: Unknown - \n**Please cite**: \n\n!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n\n Weight treated as the class attribute. Identifier deleted.\n\n As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction\n using instance-based learning with encoding length selection. In Progress\n in Connectionist-Based Information Systems. Singapore: Springer-Verlag.\n\n !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n\n NAME: fishcatch\n TYPE: Sample \n SIZE: 159 observations, 8 variables\n \n DESCRIPTIVE ABSTRACT:\n \n 159 fishes of 7 species are caught and measured. Altogether there are\n 8 variables. All the fishes are caught from the same lake\n (Laengelmaevesi) near Tampere in Finland.\n \n SOURCES:\n Brofeldt, Pekka: Bidrag till kaennedom on fiskbestondet i vaera\n sjoear. Laengelmaevesi. T.H.Jaervi: Finlands Fiskeriet Band 4,\n Meddelanden utgivna av fiskerifoereningen i Finland.\n Helsingfors 1917\n \n VARIABLE DESCRIPTIONS:\n \n 1 Obs Observation number ranges from 1 to 159\n 2 Species (Numeric)\n Code Finnish Swedish English Latin \n 1 Lahna Braxen Bream Abramis brama\n 2 Siika Iiden Whitefish Leuciscus idus\n 3 Saerki Moerten Roach Leuciscus rutilus\n 4 Parkki Bjoerknan ? 
Abramis bjoerkna\n 5 Norssi Norssen Smelt Osmerus eperlanus\n 6 Hauki Jaedda Pike Esox lucius\n 7 Ahven Abborre Perch Perca fluviatilis\n \n 3 Weight Weight of the fish (in grams)\n 4 Length1 Length from the nose to the beginning of the tail (in cm)\n 5 Length2 Length from the nose to the notch of the tail (in cm)\n 6 Length3 Length from the nose to the end of the tail (in cm)\n 7 Height% Maximal height as % of Length3\n 8 Width% Maximal width as % of Length3\n 9 Sex 1 = male 0 = female\n \n \n \n ___\/\/\/\/\/___ _\n \/ ___ |\n \/ _ \/ \/ H\n < ) __) |\n \/__________\/ __ _\n \n |------- L1 -------|\n |------- L2 ----------|\n |------- L3 ------------|\n \n \n Values are aligned and delimited by blanks.\n Missing values are denoted with NA.\n There is one data line for each case.\n \n SPECIAL NOTES:\n I have usually calculated\n Height = Height%*Length3\/100\n Width = Width%*Length3\/100\n \n \n PEDAGOGICAL NOTES:\n I have mainly used only Species=7 (Perch) and here are some of the\n models and tests we have used\n \n Weight=a+b*(Length3*Height*Width)+epsilon\n Ho: a=0;\n Heteroscedastic case. Question: What is the proper weighting\n if you use Length3 as a weighting variable?\n \n Log(Weight)=a+b1*Length3+epsilon\n \n Weight^(1\/3)=a+b1*Length3+epsilon\n (Given by Box-Cox-transformation)\n Ho: a=0;\n \n Log(Weight)=a+b1*Length3+b2*Height+b3*Width+epsilon\n Ho: b1+b2+b3=3; \n i.e. dimension of the fish = 3\n \n Weight^(1\/3)=a+b1*Length3+b2*Height+b3*Width+epsilon\n (Given by Box-Cox-transformation)\n Ho: a=0;\n \n Weight=a*Length3^b1*Height^b2*Width^b3+epsilon\n Nonlinear, heteroscedastic case.\n What is the proper weighting?\n \n Is obs 143\n \n 143 7 840.0 32.5 35.0 37.3 30.8 20.9 0\n \n an outlier? It had 6 roach in its stomach.\n \n \n \n REFERENCES:\n Brofeldt, Pekka: Bidrag till kaennedom on fiskbestondet i vaara\n sjoear. Laengelmaevesi. 
T.H.Jaervi: Finlands Fiskeriet Band 4,\n Meddelanden utgivna av fiskerifoereningen i Finland.\n Helsingfors 1917\n \n \n SUBMITTED BY:\n Juha Puranen\n Department of Statistics\n PL33 (Aleksanterinkatu 7)\n 000014 University of Helsinki\n Finland\n e-mail: jpuranen@noppa.helsinki.fi", "format": "ARFF", "uploader": "Jan van Rijn", "uploader_id": 1, "visibility": "public", "creator": "Brofeldt, Pekka", "contributor": "J. Puranen", "date": "2014-04-23 13:20:41", "update_comment": null, "last_update": "2014-04-23 13:20:41", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/3669\/dataset_2218_fishcatch.arff", "default_target_attribute": "class", "row_id_attribute": null, "ignore_attribute": null, "runs": 10, "suggest": { "input": [ "fishcatch", "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Weight treated as the class attribute. Identifier deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems. Singapore: Springer-Verlag. !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
NAME: fishcatch TYPE: Sample SIZE: 159 observations, 8 variables DESCRIP " ], "weight": 5 }, "qualities": { "NumberOfInstances": 158, "NumberOfFeatures": 8, "NumberOfClasses": 0, "NumberOfMissingValues": 87, "NumberOfInstancesWithMissingValues": 87, "NumberOfNumericFeatures": 6, "NumberOfSymbolicFeatures": 2, "Quartile2SkewnessOfNumericAtts": 0.47270927217725245, "REPTreeDepth3Kappa": null, "DecisionStumpKappa": null, "MaxMeansOfNumericAtts": 398.69556962025325, "MinMutualInformation": null, "PercentageOfBinaryFeatures": 12.5, "Quartile2StdDevOfNumericAtts": 10.385708088005124, "RandomTreeDepth1AUC": null, "Dimensionality": 0.05063291139240506, "MaxMutualInformation": null, "MinNominalAttDistinctValues": 2, "PercentageOfInstancesWithMissingValues": 55.06329113924051, "Quartile3AttributeEntropy": null, "RandomTreeDepth1ErrRate": null, "EquivalentNumberOfAtts": null, "MaxNominalAttDistinctValues": 7, "MinSkewnessOfNumericAtts": -0.47418381274078825, "PercentageOfMissingValues": 6.882911392405064, "Quartile3KurtosisOfNumericAtts": 0.5348105017952185, "AutoCorrelation": -78.25350318471338, "RandomTreeDepth1Kappa": null, "J48.00001.AUC": null, "MaxSkewnessOfNumericAtts": 1.098221819130179, "MinStdDevOfNumericAtts": 2.2812293830708743, "PercentageOfNumericFeatures": 75, "Quartile3MeansOfNumericAtts": 123.06534810126585, "CfsSubsetEval_DecisionStumpAUC": null, "RandomTreeDepth2AUC": null, "J48.00001.ErrRate": null, "MaxStdDevOfNumericAtts": 359.08620371714636, "MinorityClassPercentage": null, "PercentageOfSymbolicFeatures": 25, "Quartile3MutualInformation": null, "CfsSubsetEval_DecisionStumpErrRate": null, "RandomTreeDepth2ErrRate": null, "J48.00001.Kappa": null, "MeanAttributeEntropy": null, "MinorityClassSize": null, "Quartile1AttributeEntropy": null, "Quartile3SkewnessOfNumericAtts": 0.7195138937327614, "CfsSubsetEval_DecisionStumpKappa": null, "RandomTreeDepth2Kappa": null, "J48.0001.AUC": null, "MeanKurtosisOfNumericAtts": 0.17180698749198609, "NaiveBayesAUC": null, 
"Quartile1KurtosisOfNumericAtts": -0.19358192622594628, "Quartile3StdDevOfNumericAtts": 98.49930826928653, "CfsSubsetEval_NaiveBayesAUC": null, "RandomTreeDepth3AUC": null, "J48.0001.ErrRate": null, "MeanMeansOfNumericAtts": 87.81329113924053, "NaiveBayesErrRate": null, "Quartile1MeansOfNumericAtts": 23.1998417721519, "REPTreeDepth1AUC": null, "CfsSubsetEval_NaiveBayesErrRate": null, "RandomTreeDepth3ErrRate": null, "J48.0001.Kappa": null, "MeanMutualInformation": null, "NaiveBayesKappa": null, "Quartile1MutualInformation": null, "REPTreeDepth1ErrRate": null, "CfsSubsetEval_NaiveBayesKappa": null, "RandomTreeDepth3Kappa": null, "J48.001.AUC": null, "MeanNoiseToSignalRatio": null, "NumberOfBinaryFeatures": 1, "Quartile1SkewnessOfNumericAtts": -0.01788238424158961, "REPTreeDepth1Kappa": null, "CfsSubsetEval_kNN1NAUC": null, "StdvNominalAttDistinctValues": 3.5355339059327378, "J48.001.ErrRate": null, "MeanNominalAttDistinctValues": 4.5, "Quartile1StdDevOfNumericAtts": 6.792589771212284, "REPTreeDepth2AUC": null, "CfsSubsetEval_kNN1NErrRate": null, "kNN1NAUC": null, "J48.001.Kappa": null, "MeanSkewnessOfNumericAtts": 0.38282542687816573, "Quartile2AttributeEntropy": null, "REPTreeDepth2ErrRate": null, "CfsSubsetEval_kNN1NKappa": null, "kNN1NErrRate": null, "MajorityClassPercentage": null, "MeanStdDevOfNumericAtts": 67.01203927169226, "Quartile2KurtosisOfNumericAtts": 0.33238418816866644, "REPTreeDepth2Kappa": null, "ClassEntropy": null, "kNN1NKappa": null, "MajorityClassSize": null, "MinAttributeEntropy": null, "Quartile2MeansOfNumericAtts": 28.324683544303795, "REPTreeDepth3AUC": null, "DecisionStumpAUC": null, "MaxAttributeEntropy": null, "MinKurtosisOfNumericAtts": -0.9894962095686894, "Quartile2MutualInformation": null, "REPTreeDepth3ErrRate": null, "DecisionStumpErrRate": null, "MaxKurtosisOfNumericAtts": 0.8561493813520205, "MinMeansOfNumericAtts": 14.119620253164557 }, "tags": [ { "uploader": "38960", "tag": "Life Science" }, { "uploader": "38960", "tag": 
"Statistics" } ], "features": [ { "name": "class", "index": "7", "type": "numeric", "distinct": "101", "missing": "0", "target": "1", "min": "0", "max": "1650", "mean": "399", "stdev": "359" }, { "name": "Species", "index": "0", "type": "nominal", "distinct": "7", "missing": "0", "distr": [] }, { "name": "Length1", "index": "1", "type": "numeric", "distinct": "116", "missing": "0", "min": "8", "max": "59", "mean": "26", "stdev": "10" }, { "name": "Length2", "index": "2", "type": "numeric", "distinct": "93", "missing": "0", "min": "8", "max": "63", "mean": "28", "stdev": "11" }, { "name": "Length3", "index": "3", "type": "numeric", "distinct": "124", "missing": "0", "min": "9", "max": "68", "mean": "31", "stdev": "12" }, { "name": "Height", "index": "4", "type": "numeric", "distinct": "108", "missing": "0", "min": "15", "max": "45", "mean": "28", "stdev": "8" }, { "name": "Width", "index": "5", "type": "numeric", "distinct": "66", "missing": "0", "min": "9", "max": "21", "mean": "14", "stdev": "2" }, { "name": "Sex", "index": "6", "type": "nominal", "distinct": "2", "missing": "87", "distr": [] } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }