pair0002

Status: active. Format: ARFF. Visibility: public. Uploaded 16-03-2022 by Oleksandr Zadorozhnyi.
Tags: Computer Systems, Graphical models, Machine Learning, MaRDI, TA3
Cause-effect is a growing database of two-variable cause-effect pairs created at the Max-Planck-Institute for Biological Cybernetics in Tuebingen, Germany.

==================================================================================================================================================

Some pairs are high-dimensional; for machine readability, the relevant information about this is coded in the Meta-data. The Meta-data contains the following information:

number of pair | 1st column of cause | last column of cause | 1st column of effect | last column of effect | dataset weight

The dataset weight should be used when calculating the average performance of causal inference methods, to avoid the bias introduced by having multiple copies of essentially the same data (for example, the pairs 56-63).

When you use this data set in a publication, please cite the following paper (which also contains much more detailed information regarding this data set in the supplement):

J. M. Mooij, J. Peters, D. Janzing, J. Zscheischler, B. Schoelkopf, "Distinguishing cause from effect using observational data: methods and benchmarks", Journal of Machine Learning Research 17(32):1-102, 2016

NOTE: pair0001 - pair0041 are taken from the UCI Machine Learning Repository:
Asuncion, A. & Newman, D.J. (2007). UCI Machine Learning Repository [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, School of Information and Computer Science.

==================================================================================================================================================

Overview over all data pairs.
          var 1      var 2           dataset  ground truth
pair0001  Altitude   Temperature     DWD      ->
pair0002  Altitude   Precipitation   DWD      ->
pair0003  Longitude  Temperature     DWD      ->
pair0004  Altitude   Sunshine hours  DWD      ->

Information for pair0002:

DWD (Deutscher Wetterdienst) data, recorded at 349 stations and taken from
http://www.dwd.de/bvbw/appmanager/bvbw/dwdwwwDesktop/?_nfpb=true&_pageLabel=_dwdwww_klima_umwelt_klimadaten_deutschland&T82002gsbDocumentPath=Navigation%2FOeffentlichkeit%2FKlima__Umwelt%2FKlimadaten%2Fkldaten__kostenfrei%2Fausgabe__mittelwerte__node.html__nnn%3Dtrue

more recent link (Jan 2010):
http://www.dwd.de/bvbw/appmanager/bvbw/dwdwwwDesktop/?_nfpb=true&_pageLabel=_dwdwww_klima_umwelt_klimadaten_deutschland&T82002gsbDocumentPath=Navigation%2FOeffentlichkeit%2FKlima__Umwelt%2FKlimadaten%2Fkldaten__kostenfrei%2Fausgabe__mittelwerte__node.html__nnn%3Dtrue

x: altitude
y: precipitation (yearly value averaged over 1961-1990)

ground truth:
x --> y
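The dataset weight described in the Meta-data can be applied when averaging a method's accuracy over many pairs. The following is a minimal sketch in Python, not the benchmark authors' evaluation code. It assumes each Meta-data row is a whitespace-separated line with the six fields in the order listed above. The names `PairMeta`, `parse_meta_line`, and `weighted_accuracy` are hypothetical, and the example rows and weights are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PairMeta:
    """One Meta-data row (field order as listed in the description)."""
    pair: int
    cause_first: int
    cause_last: int
    effect_first: int
    effect_last: int
    weight: float


def parse_meta_line(line: str) -> PairMeta:
    # Assumption: fields are separated by whitespace.
    pair, c_first, c_last, e_first, e_last, weight = line.split()
    return PairMeta(int(pair), int(c_first), int(c_last),
                    int(e_first), int(e_last), float(weight))


def weighted_accuracy(metas, correct):
    """Weighted fraction of pairs whose causal direction was inferred correctly.

    `correct` maps a pair number to True/False.
    """
    total = sum(m.weight for m in metas)
    if total == 0:
        raise ValueError("sum of dataset weights is zero")
    hits = sum(m.weight for m in metas if correct[m.pair])
    return hits / total


# Made-up rows: pairs 1 and 2 are near-duplicates and share a weight of 0.5 each.
example_lines = ["1 1 1 2 2 0.5", "2 1 1 2 2 0.5", "3 1 1 2 2 1.0"]
metas = [parse_meta_line(line) for line in example_lines]
decisions = {1: True, 2: False, 3: True}
score = weighted_accuracy(metas, decisions)
```

With these made-up weights, the two near-duplicate pairs together count as much as the single independent pair, which is the bias correction the description asks for.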

2 features

X205 (numeric): 263 unique values, 0 missing
X828.3 (numeric): 336 unique values, 0 missing

19 properties

Number of instances (rows) of the dataset: 348
Number of attributes (columns) of the dataset: 2
Number of distinct values of the target attribute (if it is nominal): (no value)
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 2
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0.01
Percentage of numeric attributes: 100
Percentage of instances belonging to the most frequent class: (no value)
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value)
Percentage of instances belonging to the least frequent class: (no value)
Number of instances belonging to the least frequent class: (no value)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Average class difference between consecutive instances: (no value)
Percentage of missing values: 0
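
The simpler properties listed above follow directly from the shape of the table. The sketch below, in Python, shows one way to compute a few of them. It is an illustration only and not the code the hosting platform uses; the function name `simple_properties` and the dictionary keys are hypothetical, and the rows are placeholder values rather than the real measurements.

```python
def simple_properties(rows):
    """Compute a few shape-based properties of a fully numeric table.

    `rows` is a list of equal-length tuples; None marks a missing value.
    """
    n_instances = len(rows)
    n_attributes = len(rows[0]) if rows else 0
    n_missing = sum(1 for row in rows for value in row if value is None)
    n_rows_missing = sum(1 for row in rows if any(v is None for v in row))
    return {
        "instances": n_instances,
        "attributes": n_attributes,
        "missing_values": n_missing,
        "instances_with_missing": n_rows_missing,
        # Attributes divided by instances, rounded to two decimals.
        "attributes_per_instance": round(n_attributes / n_instances, 2),
        # Percentage of rows that contain at least one missing value.
        "pct_instances_with_missing": 100.0 * n_rows_missing / n_instances,
    }


# Placeholder table with the same shape as this dataset: 348 rows, 2 columns.
placeholder_rows = [(float(i), float(2 * i)) for i in range(348)]
props = simple_properties(placeholder_rows)
```

For a table of this shape, 2 / 348 is roughly 0.0057, which rounds to the 0.01 shown in the list.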

0 tasks
