{ "data_id": "488", "name": "colleges_aaup", "exact_name": "colleges_aaup", "version": 1, "version_label": null, "description": "**Author**: \n**Source**: Unknown - Date unknown \n**Please cite**: \n\nThe AAUP dataset for the ASA Statistical Graphics Section's 1995\nData Analysis Exposition contains information on faculty salaries\nfor 1161 American colleges and universities. The data may be\nobtained in either of two formats.\n\nAAUP.DATA contains the raw data in comma delimited fields with a\nsingle data line for each school. The order of variables is the\nsame as given below for the fixed column version, although the\nspacing varies for each school.\n\nAAUP2.DATA has the data arranged in fixed columns, with two data\nlines for each school and a maximum line length of 80 characters.\n\nThis dataset is taken from the March-April 1994 issue of Academe.\nThanks to Maryse Eymonerie, Consultant to AAUP, for assistance in\nsupplying the data. Faculty salary data are for the 1993-94\nschool year. 
You may wish to consult a copy of the special issue\nof Academe for more detailed descriptions of the variables.\n\nData Revised: Wed Jan 18 1995\n\nVARIABLE DESCRIPTIONS (AAUP2.DAT)\nFixed column format with two data lines per school\n\nLine #1\n1 - 5 FICE (Federal ID number)\n7 - 37 College name\n38 - 39 State (postal code)\n40 - 43 Type (I, IIA, or IIB)\n44 - 48 Average salary - full professors\n49 - 52 Average salary - associate professors\n53 - 56 Average salary - assistant professors\n57 - 60 Average salary - all ranks\n61 - 65 Average compensation - full professors\n66 - 69 Average compensation - associate professors\n70 - 73 Average compensation - assistant professors\n74 - 78 Average compensation - all ranks\n\nLine #2\n1 - 4 Number of full professors\n5 - 8 Number of associate professors\n9 - 12 Number of assistant professors\n13 - 16 Number of instructors\n17 - 21 Number of faculty - all ranks\n\nMissing values are denoted with *\nAll salary and compensation figures are yearly in $100's\n\n**************************************************************\nTo obtain the dataset from Statlib, send one of the single line\nmessages below to the address statlib@lib.stat.cmu.edu\n\nsend aaup.data from colleges\nor\nsend aaup2.data from colleges\n\n\nFor more information on the ASA Statistical Graphics Section's\n1995 Data Analysis Exposition send the message\n\nsend readme from colleges\n\n%%%%%%%%%%%%%%\nINFORMATION %\n%%%%%%%%%%%%%%\n\nWHAT'S WHAT AMONG AMERICAN COLLEGES AND UNIVERSITIES?\n\nThis is the subject of the 1995 Data Analysis Exposition\nsponsored by the Statistical Graphics Section of the American\nStatistical Association. The purpose of the Exposition is to\nencourage statisticians to demonstrate techniques, especially\ngraphical, for analyzing data and displaying the results of an\nanalysis. 
Individuals and groups will work with the same set of\ndata and present their analyses at a special session as part of\nthe annual Joint Statistical Meetings in Orlando, Florida on\nAugust 13th-17th, 1995. The datasets for 1995 are drawn from two\nsources, U.S. News & World Report's Guide to America's Best\nColleges and the AAUP (American Association of University\nProfessors) 1994 Salary Survey which appeared in the March-April\n1994 issue of Academe.\n\nThe U.S. News data contains information on tuition, room & board\ncosts, SAT or ACT scores, application\/acceptance rates,\ngraduation rate, student\/faculty ratio, spending per student, and\na number of other variables for 1300+ schools. The AAUP data\nincludes average salary, overall compensation, and number of\nfaculty broken down by full, associate, and assistant professor\nranks.\n\nThe raw data and documentation are contained in the files\ndescribed below. To obtain any of these files send a message to\nstatlib@lib.stat.cmu.edu of the following form (substituting the\nfile you want for XXXXX)\n\nsend XXXXX from colleges\n\nAvailable files\n\nusnews.doc Documentation for the U.S. News data\nusnews.data U.S. News data in comma delimited format\nusnews3.data U.S. News data in fixed column format\n\naaup.doc Documentation for the AAUP salary data\naaup.data AAUP salary data in comma delimited format\naaup2.data AAUP salary data in fixed column format\n\nTwo versions of each dataset are provided to accommodate users\nwith different software constraints. The comma delimited\nversions (USNEWS.DATA and AAUP.DATA) contain information for each\ncollege on a separate line with values delimited by commas. The\nfixed column versions (USNEWS3.DATA and AAUP2.DATA) use 2 or 3\ndata lines per school and a maximum line length of 80 characters.\n\nTo participate in the 1995 Data Analysis Exposition you must send\nan abstract form to the American Statistical Association by\nFebruary 1st, 1995. 
Information is available from the ASA\nMeetings Department by e-mail (meetings@asa.mhs.compuserve.com),\nphone (703-684-1221), fax (703-684-2037), or surface mail (ASA,\n1429 Duke St., Alexandria, VA 22314). Your initial abstract may\nbe fairly general since you may do the bulk of your analysis\nafter the February 1 deadline.\n\nYou may choose your own path to proceed in analyzing the data or\nuse some of the suggested questions below to get started.\n\n... How well can we model tuition using the other variables?\n... How might we cluster colleges into similar comparison groups?\n... How can we best display faculty salary structure?\n... Can we find a reasonable way to rank the schools?\n\nYou may work on your own or put together a team. Show off the\ncapabilities of your favorite software package or use the data\nfor a class project and display your students' results. You may\nchoose to consider just a subset of schools or examine regional\npatterns. The main point is to find innovative ways to display\nthe interesting features of the data.\n\nFurther questions about the 1995 Exposition can be directed to\nRobin Lock, Mathematics Department, St. Lawrence University,\nCanton, NY 13617 e-mail rlock@vm.stlawu.edu\n\nIf you would like to be informed about any subsequent adjustments\nor error fixes to the 1995 Exposition data, please send an e-mail\nmessage to register your interest to rlock@vm.stlawu.edu.\n\nSpecial thanks for providing data for the 1995 Exposition to:\nRobert Morse, Director of Research for America's Best Colleges at\nU.S. 
News & World Report\nMaryse Eymonerie, Consultant to AAUP.\n\n\nInformation about the dataset\nCLASSTYPE: numeric\nCLASSINDEX: none specific", "format": "ARFF", "uploader": "Joaquin Vanschoren", "uploader_id": 2, "visibility": "public", "creator": null, "contributor": "Statlib", "date": "2014-09-29 00:05:26", "update_comment": "set target feature", "last_update": "2014-10-06 17:12:59", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/52600\/colleges_aaup.arff", "kaggle_url": null, "default_target_attribute": "Type", "row_id_attribute": null, "ignore_attribute": "\"FICE\",\"College_name\"", "runs": 32, "suggest": { "input": [ "colleges_aaup", "The AAUP dataset for the ASA Statistical Graphics Section's 1995 Data Analysis Exposition contains information on faculty salaries for 1161 American colleges and universities. The data may be obtained in either of two formats. AAUP.DATA contains the raw data in comma delimited fields with a single data line for each school. The order of variables is the same as given below for the fixed column version, although the spacing varies for each school. 
AAUP2.DATA has the data arranged in fixed columns " ], "weight": 5 }, "qualities": { "NumberOfInstances": 1161, "NumberOfFeatures": 15, "NumberOfClasses": 4, "NumberOfMissingValues": 256, "NumberOfInstancesWithMissingValues": 87, "NumberOfNumericFeatures": 13, "NumberOfSymbolicFeatures": 2, "Quartile2AttributeEntropy": 5.248050549098432, "REPTreeDepth2ErrRate": 0.34194659776055125, "CfsSubsetEval_kNN1NKappa": 0.6725598466145939, "kNN1NErrRate": 0.35400516795865633, "MajorityClassPercentage": 53.14384151593454, "MeanStdDevOfNumericAtts": 109.25711049667814, "Quartile2KurtosisOfNumericAtts": 0.5367715420788586, "REPTreeDepth2Kappa": 0.41246137608223454, "ClassEntropy": 1.434835752759632, "kNN1NKappa": 0.4025295122281044, "MajorityClassSize": 617, "MinAttributeEntropy": 5.248050549098432, "Quartile2MeansOfNumericAtts": 416.39555555555546, "REPTreeDepth3AUC": 0.7875662822405399, "DecisionStumpAUC": 0.8201136168530669, "MaxAttributeEntropy": 5.248050549098432, "MinKurtosisOfNumericAtts": -0.029910078625991154, "Quartile2MutualInformation": 0.11276889813864, "REPTreeDepth3ErrRate": 0.34194659776055125, "DecisionStumpErrRate": 0.29112833763996554, "MaxKurtosisOfNumericAtts": 15.581339620315852, "MinMeansOfNumericAtts": 12.735572782084397, "Quartile2SkewnessOfNumericAtts": 0.6846258112240072, "REPTreeDepth3Kappa": 0.41246137608223454, "DecisionStumpKappa": 0.4747829764680373, "MaxMeansOfNumericAtts": 653.4876486733763, "MinMutualInformation": 0.11276889813864, "PercentageOfBinaryFeatures": 0, "Quartile2StdDevOfNumericAtts": 92.28671935660121, "RandomTreeDepth1AUC": 0.7645737102289444, "Dimensionality": 0.012919896640826873, "MaxMutualInformation": 0.11276889813864, "MinNominalAttDistinctValues": 4, "PercentageOfInstancesWithMissingValues": 7.493540051679587, "Quartile3AttributeEntropy": 5.248050549098432, "RandomTreeDepth1ErrRate": 0.29285099052540914, "EquivalentNumberOfAtts": 12.723683359889007, "MaxNominalAttDistinctValues": 52, "MinSkewnessOfNumericAtts": 
0.34358420480988844, "PercentageOfMissingValues": 1.4699971289118576, "Quartile3KurtosisOfNumericAtts": 8.49221453033001, "AutoCorrelation": 0.5086206896551724, "RandomTreeDepth1Kappa": 0.5106937354815839, "J48.00001.AUC": 0.8921348419556767, "MaxSkewnessOfNumericAtts": 3.374891643702985, "MinStdDevOfNumericAtts": 19.514093510979706, "PercentageOfNumericFeatures": 86.66666666666667, "Quartile3MeansOfNumericAtts": 523.9687225780217, "CfsSubsetEval_DecisionStumpAUC": 0.8619167027359397, "RandomTreeDepth2AUC": 0.7645737102289444, "J48.00001.ErrRate": 0.22566752799310938, "MaxStdDevOfNumericAtts": 314.09056309371505, "MinorityClassPercentage": 0.08613264427217916, "PercentageOfSymbolicFeatures": 13.333333333333334, "Quartile3MutualInformation": 0.11276889813864, "CfsSubsetEval_DecisionStumpErrRate": 0.1920757967269595, "RandomTreeDepth2ErrRate": 0.29285099052540914, "J48.00001.Kappa": 0.6139249101706725, "MeanAttributeEntropy": 5.248050549098432, "MinorityClassSize": 1, "Quartile1AttributeEntropy": 5.248050549098432, "Quartile3SkewnessOfNumericAtts": 2.5994193523935607, "CfsSubsetEval_DecisionStumpKappa": 0.6725598466145939, "RandomTreeDepth2Kappa": 0.5106937354815839, "J48.0001.AUC": 0.8921348419556767, "MeanKurtosisOfNumericAtts": 4.003674970591461, "NaiveBayesAUC": 0.9023612452539781, "Quartile1KurtosisOfNumericAtts": 0.21189934058201376, "Quartile3StdDevOfNumericAtts": 131.64151497610425, "CfsSubsetEval_NaiveBayesAUC": 0.8619167027359397, "RandomTreeDepth3AUC": 0.7645737102289444, "J48.0001.ErrRate": 0.22566752799310938, "MeanMeansOfNumericAtts": 335.77908459577486, "NaiveBayesErrRate": 0.2360034453057709, "Quartile1MeansOfNumericAtts": 83.74074074074076, "REPTreeDepth1AUC": 0.7875662822405399, "CfsSubsetEval_NaiveBayesErrRate": 0.1920757967269595, "RandomTreeDepth3ErrRate": 0.29285099052540914, "J48.0001.Kappa": 0.6139249101706725, "MeanMutualInformation": 0.11276889813864, "NaiveBayesKappa": 0.6051314515528413, "Quartile1MutualInformation": 0.11276889813864, 
"REPTreeDepth1ErrRate": 0.34194659776055125, "CfsSubsetEval_NaiveBayesKappa": 0.6725598466145939, "RandomTreeDepth3Kappa": 0.5106937354815839, "J48.001.AUC": 0.8921348419556767, "MeanNoiseToSignalRatio": 45.538102577240664, "NumberOfBinaryFeatures": 0, "Quartile1SkewnessOfNumericAtts": 0.4326393685013813, "REPTreeDepth1Kappa": 0.41246137608223454, "CfsSubsetEval_kNN1NAUC": 0.8619167027359397, "StdvNominalAttDistinctValues": 33.94112549695428, "J48.001.ErrRate": 0.22566752799310938, "MeanNominalAttDistinctValues": 28, "Quartile1StdDevOfNumericAtts": 72.17282678834677, "REPTreeDepth2AUC": 0.7875662822405399, "CfsSubsetEval_kNN1NErrRate": 0.1920757967269595, "kNN1NAUC": 0.7026431852352764, "J48.001.Kappa": 0.6139249101706725, "MeanSkewnessOfNumericAtts": 1.3871231769119512 }, "tags": [ { "tag": "study_1", "uploader": "2" } ], "features": [ { "name": "Type", "index": "3", "type": "nominal", "distinct": "4", "missing": "0", "target": "1", "distr": [ [ "I", "IIA", "IIB", "VIIB" ], [ [ "180", "0", "0", "0" ], [ "0", "363", "0", "0" ], [ "0", "0", "617", "0" ], [ "0", "0", "0", "1" ] ] ] }, { "name": "FICE", "index": "0", "type": "numeric", "distinct": "1160", "missing": "0", "ignore": "1", "min": "1002", "max": "29269", "mean": "3052", "stdev": "2412" }, { "name": "College_name", "index": "1", "type": "nominal", "distinct": "1140", "missing": "0", "ignore": "1", "distr": [] }, { "name": "State", "index": "2", "type": "nominal", "distinct": "52", "missing": "0", "distr": [ [ "AK", "AL", "AR", "AZ", "CA", "CO", "CT", "DC", "DE", "FL", "GA", "HI", "IA", "ID", "IL", "IN", "KS", "KY", "LA", "MA", "MD", "ME", "MI", "MN", "MO", "MS", "MT", "NC", "ND", "NE", "NH", "NJ", "NM", "NV", "NY", "OH", "OK", "OR", "PA", "RI", "SC", "SD", "TN", "TX", "UT", "VA", "VT", "WA", "WI", "WV", "WW", "WY" ], [ [ "1", "2", "1", "0" ], [ "3", "9", "9", "0" ], [ "1", "4", "11", "0" ], [ "3", "0", "1", "0" ], [ "12", "29", "13", "0" ], [ "4", "4", "7", "0" ], [ "1", "11", "4", "0" ], [ "5", "2", "2", 
"0" ], [ "1", "1", "2", "0" ], [ "4", "10", "5", "0" ], [ "4", "8", "9", "0" ], [ "1", "0", "2", "0" ], [ "2", "3", "22", "0" ], [ "1", "2", "3", "0" ], [ "9", "15", "26", "0" ], [ "5", "11", "25", "0" ], [ "2", "5", "11", "0" ], [ "2", "6", "14", "0" ], [ "3", "11", "6", "0" ], [ "8", "16", "18", "0" ], [ "3", "9", "11", "0" ], [ "1", "2", "11", "0" ], [ "4", "9", "19", "0" ], [ "1", "8", "16", "0" ], [ "6", "9", "19", "0" ], [ "3", "3", "7", "0" ], [ "2", "4", "2", "0" ], [ "4", "10", "28", "0" ], [ "1", "2", "3", "0" ], [ "1", "5", "11", "0" ], [ "2", "2", "7", "0" ], [ "2", "11", "11", "0" ], [ "3", "2", "1", "0" ], [ "1", "1", "0", "0" ], [ "14", "25", "42", "0" ], [ "9", "8", "36", "0" ], [ "2", "4", "11", "0" ], [ "3", "5", "10", "0" ], [ "7", "26", "52", "0" ], [ "2", "3", "3", "0" ], [ "2", "3", "20", "0" ], [ "1", "3", "5", "0" ], [ "4", "8", "15", "0" ], [ "13", "23", "18", "0" ], [ "3", "0", "3", "0" ], [ "6", "9", "24", "0" ], [ "1", "1", "8", "0" ], [ "2", "7", "7", "0" ], [ "3", "11", "13", "0" ], [ "1", "1", "13", "0" ], [ "0", "0", "0", "1" ], [ "1", "0", "0", "0" ] ] ] }, { "name": "Average_salary-full_professors", "index": "4", "type": "numeric", "distinct": "427", "missing": "68", "min": "270", "max": "1009", "mean": "524", "stdev": "118" }, { "name": "Average_salary-associate_professors", "index": "5", "type": "numeric", "distinct": "303", "missing": "36", "min": "234", "max": "733", "mean": "416", "stdev": "72" }, { "name": "Average_salary-assistant_professors", "index": "6", "type": "numeric", "distinct": "235", "missing": "24", "min": "199", "max": "576", "mean": "352", "stdev": "55" }, { "name": "Average_salary-all_ranks", "index": "7", "type": "numeric", "distinct": "345", "missing": "0", "min": "232", "max": "866", "mean": "420", "stdev": "92" }, { "name": "Average_compensation-full_professors", "index": "8", "type": "numeric", "distinct": "485", "missing": "68", "min": "319", "max": "1236", "mean": "653", "stdev": "152" }, { "name": 
"Average_compensation-associate_professors", "index": "9", "type": "numeric", "distinct": "373", "missing": "36", "min": "292", "max": "909", "mean": "524", "stdev": "97" }, { "name": "Average_compensation-assistant_professors", "index": "10", "type": "numeric", "distinct": "307", "missing": "24", "min": "246", "max": "717", "mean": "442", "stdev": "75" }, { "name": "Average_compensation-all_ranks", "index": "11", "type": "numeric", "distinct": "431", "missing": "0", "min": "265", "max": "1075", "mean": "527", "stdev": "121" }, { "name": "Number_of_full_professors", "index": "12", "type": "numeric", "distinct": "298", "missing": "0", "min": "0", "max": "997", "mean": "95", "stdev": "143" }, { "name": "Number_of_associate_professors", "index": "13", "type": "numeric", "distinct": "255", "missing": "0", "min": "0", "max": "721", "mean": "72", "stdev": "89" }, { "name": "Number_of_assistant_professors", "index": "14", "type": "numeric", "distinct": "241", "missing": "0", "min": "0", "max": "510", "mean": "69", "stdev": "73" }, { "name": "Number_of_instructors", "index": "15", "type": "numeric", "distinct": "83", "missing": "0", "min": "0", "max": "178", "mean": "13", "stdev": "20" }, { "name": "Number_of_faculty-all_ranks", "index": "16", "type": "numeric", "distinct": "495", "missing": "0", "min": "7", "max": "2261", "mean": "257", "stdev": "314" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }