OpenML

cars

active ARFF Publicly available Visibility: public Uploaded 28-09-2014 by Joaquin Vanschoren

Author:
Source: Unknown - Date unknown
Please cite:

The Committee on Statistical Graphics of the American Statistical Association (ASA) invites you to participate in its Second (1983) Exposition of Statistical Graphics Technology. The purposes of the Exposition are (1) to provide a forum in which users and providers of statistical graphics technology can exchange information and ideas and (2) to expose those members of the ASA community who are less familiar with statistical graphics to its capabilities and potential benefits to them. The Exposition will be held in conjunction with the Annual Meetings in Toronto, August 15-18, 1983 and is tentatively scheduled for the afternoon of Wednesday, August 17.

Seven providers of statistical graphics technology participated in the 1982 Exposition. By all accounts, the Exposition was well received by the ASA community and was a worthwhile experience for the participants. We hope to have those seven involved again this year, along with as many new participants as we can muster. The 1982 Exposition was summarized in a paper to appear in the Proceedings of the Statistical Computing Section. A copy of that paper is enclosed for your information.

The basic format of the 1983 Exposition will be similar to that of 1982. However, based upon comments received and experience gained, there are some changes. The basic structure, intended to be simpler and more flexible than last year, is as follows:

A fixed data set is to be analyzed. This data set is a version of the CRCARS data set of Donoho, David and Ramos, Ernesto (1982), "PRIMDATA: Data Sets for Use With PRIM-H" (DRAFT). Because of the Committee's limited (zero) budget for the Exposition, we are forced to provide the data in hardcopy form only (enclosed). (Sorry!)

There are 406 observations on the following 8 variables: MPG (miles per gallon), # cylinders, engine displacement (cu. inches), horsepower, vehicle weight (lbs.), time to accelerate from 0 to 60 mph (sec.), model year (modulo 100), and origin of car (1. American, 2. European, 3. Japanese). These data appear on seven pages. Also provided are the car labels (types) in the same order as the 8 variables on seven separate pages. Missing data values are marked by series of question marks.

You are asked to analyze these data using your statistical graphics software. Your objective should be to achieve graphical displays which will be meaningful to the viewers and highlight relevant aspects of the data. If you can best achieve this using simple graphical formats, fine. If you choose to illustrate some of the more sophisticated capabilities of your software and can do so without losing relevancy to the data, that is fine, too. This year, there will be no Committee commentary on the individual presentations, so you are not competing with other presenters. The role of each presenter is to do his/her best job of presenting their statistical graphics technology to the viewers.

Each participant will be provided with a 6' (long) by 4' (tall) posterboard on which to display the results of their analyses. This is the same format as last year. You are encouraged to remain by your presentation during the Exposition to answer viewers' questions. Three copies of your presentation must be submitted to me by July 1. Movie or slide show presentations cannot be accommodated (sorry). The Committee will prepare its own poster presentation which will orient the viewers to the data and the purposes of the Exposition.

The ASA has asked us to remind all participants that the Exposition is intended for educational and scientific purposes and is not a marketing activity. Even though last year's participants did an excellent job of maintaining that distinction, a cautionary note at this point is appropriate.

Those of us who were involved with the 1982 Exposition found it worthwhile and fun to do. We would very much like to have you participate this year. For planning purposes, please RSVP (to me, in writing please) by April 15 as to whether you plan to accept the Committee's invitation. If you have any questions about the Exposition, please call me on (301/763-5350). If you have specific questions about the data, or the analysis, please call Karen Kafadar on (301/921-3651). If you cannot participate but know of another person or group in your organization who can, please pass this invitation along to them.

Sincerely,
LAWRENCE H. COX
Statistical Research Division
Bureau of the Census
Room 3524-3
Washington, DC 20233

Information about the dataset
CLASSTYPE: nominal
CLASSINDEX: last

8 features

origin (target)	nominal	3 unique values 0 missing
name (ignore)	nominal	312 unique values 0 missing
mpg	numeric	129 unique values 8 missing
cylinders	nominal	5 unique values 0 missing
displacement	numeric	83 unique values 0 missing
horsepower	numeric	93 unique values 6 missing
weight	numeric	356 unique values 0 missing
acceleration	numeric	96 unique values 0 missing
model.year	numeric	13 unique values 0 missing
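
The description says missing values are marked by series of question marks, and the feature list reports missing entries for mpg and horsepower. A minimal sketch of how such markers might be turned into explicit missing values follows. The CSV layout, the column subset, and the sample rows are assumptions made up for illustration; they are not taken from the dataset file.

```python
import csv
import io

# Hypothetical rows (mpg, horsepower, origin); the values are invented for
# illustration and the CSV layout is an assumption, not the ARFF file itself.
SAMPLE = """mpg,horsepower,origin
18.0,130,1
???,165,1
25.0,???,2
"""

def parse_number(cell):
    """Return a float, or None when the cell is empty or only question marks."""
    text = cell.strip()
    if text == "" or set(text) == {"?"}:
        return None
    return float(text)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
mpg = [parse_number(r["mpg"]) for r in rows]
horsepower = [parse_number(r["horsepower"]) for r in rows]

# Count missing entries per column, analogous to the "missing" counts above.
missing_per_column = {
    "mpg": sum(v is None for v in mpg),
    "horsepower": sum(v is None for v in horsepower),
}
print(missing_per_column)
```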

107 properties

NumberOfInstances

406

Number of instances (rows) of the dataset.

NumberOfFeatures

Number of attributes (columns) of the dataset.

NumberOfClasses

Number of distinct values of the target attribute (if it is nominal).

NumberOfMissingValues

Number of missing values in the dataset.

NumberOfInstancesWithMissingValues

Number of instances with at least one value missing.

NumberOfNumericFeatures

Number of numeric attributes.

NumberOfSymbolicFeatures

Number of nominal attributes.

MeanSkewnessOfNumericAtts

0.49

Mean skewness among attributes of the numeric type.

Quartile1StdDevOfNumericAtts

3.51

First quartile of standard deviation of attributes of the numeric type.

REPTreeDepth2AUC

0.87

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2

CfsSubsetEval_kNN1NErrRate

0.24

Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

kNN1NAUC

0.76

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk

J48.001.Kappa

0.62

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001
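
Several properties on this page report a Kappa coefficient. As a rough illustration, Cohen's kappa can be computed from a confusion matrix as (observed agreement - chance agreement) / (1 - chance agreement). The confusion matrix below is hypothetical and is not a result from any landmarker listed here.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: actual, columns: predicted)."""
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement: what the marginals alone would produce.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (observed - expected) / (1 - expected)

# Hypothetical two-class confusion matrix, for illustration only.
example = [[20, 5], [10, 15]]
kappa = cohen_kappa(example)
print(round(kappa, 2))
```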

MeanStdDevOfNumericAtts

167.51

Mean standard deviation of attributes of the numeric type.

Quartile2AttributeEntropy

1.59

Second quartile (Median) of entropy among attributes.

REPTreeDepth2ErrRate

0.28

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2

CfsSubsetEval_kNN1NKappa

0.55

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

kNN1NErrRate

0.28

Error rate achieved by the landmarker weka.classifiers.lazy.IBk

MajorityClassPercentage

62.56

Percentage of instances belonging to the most frequent class.
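
This value can be checked against two other numbers reported on this page, NumberOfInstances (406) and MajorityClassSize (254). A short sketch of the arithmetic:

```python
number_of_instances = 406   # NumberOfInstances, as reported on this page
majority_class_size = 254   # MajorityClassSize, as reported on this page

majority_class_percentage = 100 * majority_class_size / number_of_instances
print(round(majority_class_percentage, 2))  # 62.56
```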

MinAttributeEntropy

1.59

Minimal entropy among attributes.

Quartile2KurtosisOfNumericAtts

-0.66

Second quartile (Median) of kurtosis among attributes of the numeric type.

REPTreeDepth2Kappa

0.51

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2

ClassEntropy

1.33

Entropy of the target attribute values.
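
A minimal sketch of the entropy calculation, in bits. Only the majority class size (254 of 406) is stated on this page; the other two class counts used below (79 and 73) are an assumption inferred from the reported percentages and should be checked against the data.

```python
import math

def entropy_bits(counts):
    """Shannon entropy (base 2) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# 254 is the reported MajorityClassSize; 79 and 73 are inferred, not reported.
class_counts = [254, 79, 73]
class_entropy = entropy_bits(class_counts)
print(round(class_entropy, 2))
```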

kNN1NKappa

0.48

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk

MajorityClassSize

254

Number of instances belonging to the most frequent class.

MinKurtosisOfNumericAtts

-1.2

Minimum kurtosis among attributes of the numeric type.

Quartile2MeansOfNumericAtts

90.5

Second quartile (Median) of means among attributes of the numeric type.

REPTreeDepth3AUC

0.87

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3

DecisionStumpAUC

0.85

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump

MaxAttributeEntropy

1.59

Maximum entropy among attributes.

MinMeansOfNumericAtts

15.52

Minimum of means among attributes of the numeric type.

Quartile2MutualInformation

0.39

Second quartile (Median) of mutual information between the nominal attributes and the target attribute.

REPTreeDepth3ErrRate

0.28

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3

DecisionStumpErrRate

0.34

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump

MaxKurtosisOfNumericAtts

0.54

Maximum kurtosis among attributes of the numeric type.

MinMutualInformation

0.39

Minimal mutual information between the nominal attributes and the target attribute.

Quartile2SkewnessOfNumericAtts

0.48

Second quartile (Median) of skewness among attributes of the numeric type.

REPTreeDepth3Kappa

0.51

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3

DecisionStumpKappa

0.43

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump

MaxMeansOfNumericAtts

2979.41

Maximum of means among attributes of the numeric type.

MinNominalAttDistinctValues

The minimal number of distinct values among attributes of the nominal type.

PercentageOfBinaryFeatures

Percentage of binary attributes.

Quartile2StdDevOfNumericAtts

23.29

Second quartile (Median) of standard deviation of attributes of the numeric type.

RandomTreeDepth1AUC

0.85

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

Dimensionality

0.02

Number of attributes divided by the number of instances.

MaxMutualInformation

0.39

Maximum mutual information between the nominal attributes and the target attribute.

MinSkewnessOfNumericAtts

0.02

Minimum skewness among attributes of the numeric type.

PercentageOfInstancesWithMissingValues

3.45

Percentage of instances having missing values.

Quartile3AttributeEntropy

1.59

Third quartile of entropy among attributes.

RandomTreeDepth1ErrRate

0.19

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

EquivalentNumberOfAtts

3.43

Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.
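
The ratio can be reproduced approximately from the rounded values shown on this page. Because ClassEntropy and MeanMutualInformation are displayed to two decimals, the result below differs slightly from the reported 3.43; that gap is assumed to come from rounding of the inputs.

```python
class_entropy = 1.33             # ClassEntropy, rounded as displayed
mean_mutual_information = 0.39   # MeanMutualInformation, rounded as displayed

equivalent_number_of_atts = class_entropy / mean_mutual_information
print(round(equivalent_number_of_atts, 2))
```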

MaxNominalAttDistinctValues

The maximum number of distinct values among attributes of the nominal type.

MinStdDevOfNumericAtts

2.8

Minimum standard deviation of attributes of the numeric type.

PercentageOfMissingValues

0.43

Percentage of missing values.

Quartile3KurtosisOfNumericAtts

0.42

Third quartile of kurtosis among attributes of the numeric type.

AutoCorrelation

0.62

Average class difference between consecutive instances.

RandomTreeDepth1Kappa

0.63

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

J48.00001.AUC

0.87

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001

MaxSkewnessOfNumericAtts

1.03

Maximum skewness among attributes of the numeric type.

MinorityClassPercentage

17.98

Percentage of instances belonging to the least frequent class.

PercentageOfNumericFeatures

Percentage of numeric attributes.

Quartile3MeansOfNumericAtts

890.94

Third quartile of means among attributes of the numeric type.

CfsSubsetEval_DecisionStumpAUC

0.86

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth2AUC

0.85

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

J48.00001.ErrRate

0.2

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001

MaxStdDevOfNumericAtts

847

Maximum standard deviation of attributes of the numeric type.

MinorityClassSize

Number of instances belonging to the least frequent class.

PercentageOfSymbolicFeatures

Percentage of nominal attributes.

Quartile3MutualInformation

0.39

Third quartile of mutual information between the nominal attributes and the target attribute.

CfsSubsetEval_DecisionStumpErrRate

0.24

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth2ErrRate

0.19

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

J48.00001.Kappa

0.62

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001

MeanAttributeEntropy

1.59

Average entropy of the attributes.

NaiveBayesAUC

0.86

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes

Quartile1AttributeEntropy

1.59

First quartile of entropy among attributes.

Quartile3SkewnessOfNumericAtts

0.78

Third quartile of skewness among attributes of the numeric type.

CfsSubsetEval_DecisionStumpKappa

0.55

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth2Kappa

0.63

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

J48.0001.AUC

0.87

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001

MeanKurtosisOfNumericAtts

-0.4

Mean kurtosis among attributes of the numeric type.

NaiveBayesErrRate

0.33

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes

Quartile1KurtosisOfNumericAtts

-0.92

First quartile of kurtosis among attributes of the numeric type.

Quartile3StdDevOfNumericAtts

290.44

Third quartile of standard deviation of attributes of the numeric type.

CfsSubsetEval_NaiveBayesAUC

0.86

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth3AUC

0.85

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

J48.0001.ErrRate

0.2

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001

MeanMeansOfNumericAtts

565.71

Mean of means among attributes of the numeric type.

MeanMutualInformation

0.39

Average mutual information between the nominal attributes and the target attribute.

NaiveBayesKappa

0.44

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes

Quartile1MeansOfNumericAtts

21.52

First quartile of means among attributes of the numeric type.

REPTreeDepth1AUC

0.87

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1

CfsSubsetEval_NaiveBayesErrRate

0.24

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth3ErrRate

0.19

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

J48.0001.Kappa

0.62

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001

MeanNoiseToSignalRatio

3.11

An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
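
The same kind of check works for this ratio. The inputs are the two-decimal values displayed on this page, so the result only approximates the reported 3.11; the difference is assumed to be a rounding effect.

```python
mean_attribute_entropy = 1.59    # MeanAttributeEntropy, rounded as displayed
mean_mutual_information = 0.39   # MeanMutualInformation, rounded as displayed

noise_to_signal = (
    mean_attribute_entropy - mean_mutual_information
) / mean_mutual_information
print(round(noise_to_signal, 2))
```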

NumberOfBinaryFeatures

Number of binary attributes.

Quartile1MutualInformation

0.39

First quartile of mutual information between the nominal attributes and the target attribute.

REPTreeDepth1ErrRate

0.28

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1

CfsSubsetEval_NaiveBayesKappa

0.55

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

RandomTreeDepth3Kappa

0.63

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

J48.001.AUC

0.87

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001

MeanNominalAttDistinctValues

Average number of distinct values among the attributes of the nominal type.

Quartile1SkewnessOfNumericAtts

0.18

First quartile of skewness among attributes of the numeric type.

REPTreeDepth1Kappa

0.51

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1

CfsSubsetEval_kNN1NAUC

0.86

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

StdvNominalAttDistinctValues

1.41

Standard deviation of the number of distinct values among attributes of the nominal type.

J48.001.ErrRate

0.2

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001

14 tasks

Supervised Classification on cars

164 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: origin
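
For orientation, a sketch of how 406 instances might be divided for 10-fold cross-validation. This is a plain shuffled split written for illustration; the function name and seed are hypothetical, and the splits OpenML actually serves for this task (which may be stratified) are not reproduced here.

```python
import random

def k_fold_indices(n_instances, k=10, seed=0):
    """Shuffle instance indices and deal them into k nearly equal folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

folds = k_fold_indices(406)

# Each fold serves once as the test set; the remaining folds form the training set.
splits = []
for test in folds:
    held_out = set(test)
    train = [i for i in range(406) if i not in held_out]
    splits.append((train, sorted(test)))

print([len(test) for _, test in splits])
```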

Supervised Classification on cars

0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: origin

Supervised Data Stream Classification on cars

0 runs - estimation_procedure: Interleaved Test then Train - target_feature: origin
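
In an interleaved test-then-train procedure, each arriving instance is first used to test the current model and only then used to update it. A minimal sketch with a majority-class predictor follows; the predictor and the toy label stream are stand-ins chosen for illustration, not anything specified by this task.

```python
from collections import Counter

def prequential_accuracy(labels):
    """Test on each label before training on it; return the fraction predicted correctly."""
    counts = Counter()
    correct = 0
    for label in labels:
        # Test: predict the most frequent class seen so far (None before any data).
        prediction = counts.most_common(1)[0][0] if counts else None
        if prediction == label:
            correct += 1
        # Train: update the model with the true label.
        counts[label] += 1
    return correct / len(labels)

# Hypothetical stream of origin labels, for illustration only.
stream = [1, 1, 2, 1]
print(prequential_accuracy(stream))  # 0.5
```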

Clustering on cars