{ "data_id": "43561", "name": "Accidents-on-the-Rio-Niteri-bridge", "exact_name": "Accidents-on-the-Rio-Niteri-bridge", "version": 1, "version_label": "v1.0", "description": "Context\nAbout the Rio-Niterói Bridge, according to Wikipedia:\nThe Presidente Costa e Silva Bridge, popularly known as the Rio-Niterói Bridge, crosses Guanabara Bay in the state of Rio de Janeiro, Brazil. It connects the municipalities of Rio de Janeiro and Niterói.\nCurrently, it is the largest prestressed concrete bridge in the southern hemisphere and the largest bridge in Latin America. The structure receives more than 150 thousand passengers a day on days with normal traffic flow, according to the Ecoponte concessionaire. [1] It is also known for having the largest straight span in the world and the largest set of prestressed structures in America. [2]\nContent\nThe data for this research were obtained from the website of the Federal Highway Police, which provides an Open Data section whose contents, according to the website, have no restrictions from licenses, patents or control mechanisms, so that they are freely available for use and redistribution at will.
In the Accidents section, accident records are available in CSV format for each year.\nAcknowledgements\nIt is great that this data is public and can provide valuable insights to improve people's quality of life.\nInspiration\nSome guiding questions about this data:\n\nDo accidents occur on the bridge daily?\nHas the number of accidents on the bridge decreased?\nHas the installation of security cameras reduced the number of accidents?\n\nHow to import from source\nThe code below was used to import and organize the open data from the site:\n# Load tidyverse:\nlibrary(tidyverse)\n\n# Import available data (one CSV file per year):\ndataset <-\n  glue::glue(\"datatran{2007:2020}.csv\") %>%\n  map(~ .x %>%\n        data.table::fread(sep = \";\", dec = \",\", encoding = \"Latin-1\") %>%\n        as_tibble())\n\n# Perform data pre-processing in parallel using the foreach package:\nlibrary(foreach)\ncl <- parallel::makeCluster(4)\ndoParallel::registerDoParallel(cl)\ndataset <- foreach(x = dataset,\n                   info = 2007:2020,\n                   .packages = c(\"dplyr\", \"stringr\", \"lubridate\")) %dopar% {\n  x %>%\n    mutate(km = as.character(km),\n           br = as.character(br)) %>%\n    # Normalize text: strip accents, lowercase, unify separators\n    mutate_if(is.character,\n              ~ .x %>%\n                enc2native() %>%\n                stringi::stri_trans_general(id = \"Latin-ASCII\") %>%\n                tolower() %>%\n                str_replace_all(\"\/\", \"-\") %>%\n                str_replace_all(\",\", \".\")) %>%\n    # Dates appear in both dmy and ymd formats across years\n    mutate(data_inversa = if_else(is.na(dmy(data_inversa)),\n                                  ymd(data_inversa),\n                                  dmy(data_inversa)),\n           km = as.numeric(km),\n           br = as.numeric(br),\n           id = as.character(id),\n           dia_semana = str_remove_all(dia_semana, \"-feira\")) %>%\n    # Keep only the bridge stretch: BR-101, km 321 to 334\n    filter(br == 101 & km >= 321 & km <= 334) %>%\n    slice(str_which(municipio, \"(niteroi|rio de janeiro)\"))\n}\nparallel::stopCluster(cl)\n\n# Save the tidy dataset:\ndataset %>%\n  map2_df(2007:2020, ~ mutate(.x, ano = .y)) %>%\n  write_csv('accidents-rio-niteroi-bridge.csv', na = \"\")", "format": "arff", "uploader": "Onur Yildirim", "uploader_id": 30126, "visibility": "public", "creator": null, "contributor": null, "date": "2022-03-23 13:51:20", "update_comment": null, "last_update": "2022-03-23 
13:51:20", "licence": "CC0: Public Domain", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/22102386\/dataset", "default_target_attribute": null, "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "Accidents-on-the-Rio-Niteri-bridge", "Context About the Rio Niteri Bridge, according to Wikipedia: Presidente Costa e Silva Bridge, popularly known as Rio Niteri Bridge, is a bridge that crosses Guanabara Bay, in the state of Rio de Janeiro, in Brazil. It connects the municipalities of Rio de Janeiro and Niteri. Currently, the bridge is the largest in the southern hemisphere in prestressed concrete and the largest in Latin America. The structure receives more than 150 thousand passengers a day, according to information from the Ecop " ], "weight": 5 }, "qualities": { "NumberOfInstances": 9367, "NumberOfFeatures": 31, "NumberOfClasses": null, "NumberOfMissingValues": 44380, "NumberOfInstancesWithMissingValues": 8876, "NumberOfNumericFeatures": 14, "NumberOfSymbolicFeatures": 0, "Dimensionality": 0.0033094907654531865, "PercentageOfNumericFeatures": 45.16129032258064, "MajorityClassPercentage": null, "PercentageOfSymbolicFeatures": 0, "MajorityClassSize": null, "MinorityClassPercentage": null, "MinorityClassSize": null, "NumberOfBinaryFeatures": 0, "PercentageOfBinaryFeatures": 0, "PercentageOfInstancesWithMissingValues": 94.75819365858867, "AutoCorrelation": null, "PercentageOfMissingValues": 15.283579622353011 }, "tags": [ { "uploader": "38960", "tag": "Computer Systems" }, { "uploader": "38960", "tag": "Machine Learning" } ], "features": [ { "name": "id", "index": "0", "type": "numeric", "distinct": "9366", "missing": "0", "min": "1491", "max": "83528449", "mean": "18851882", "stdev": "34193514" }, { "name": "data_inversa", "index": "1", "type": "string", "distinct": "3402", "missing": "0" }, { "name": "dia_semana", "index": "2", "type": "string", "distinct": "7", "missing": "0" }, { 
"name": "horario", "index": "3", "type": "string", "distinct": "971", "missing": "0" }, { "name": "uf", "index": "4", "type": "string", "distinct": "1", "missing": "0" }, { "name": "br", "index": "5", "type": "numeric", "distinct": "1", "missing": "0", "min": "101", "max": "101", "mean": "101", "stdev": "0" }, { "name": "km", "index": "6", "type": "numeric", "distinct": "101", "missing": "0", "min": "321", "max": "334", "mean": "325", "stdev": "4" }, { "name": "municipio", "index": "7", "type": "string", "distinct": "2", "missing": "0" }, { "name": "causa_acidente", "index": "8", "type": "string", "distinct": "25", "missing": "0" }, { "name": "tipo_acidente", "index": "9", "type": "string", "distinct": "22", "missing": "0" }, { "name": "classificacao_acidente", "index": "10", "type": "string", "distinct": "4", "missing": "0" }, { "name": "fase_dia", "index": "11", "type": "string", "distinct": "4", "missing": "0" }, { "name": "sentido_via", "index": "12", "type": "string", "distinct": "2", "missing": "0" }, { "name": "condicao_metereologica", "index": "13", "type": "string", "distinct": "10", "missing": "0" }, { "name": "tipo_pista", "index": "14", "type": "string", "distinct": "3", "missing": "0" }, { "name": "tracado_via", "index": "15", "type": "string", "distinct": "8", "missing": "0" }, { "name": "uso_solo", "index": "16", "type": "string", "distinct": "4", "missing": "0" }, { "name": "ano", "index": "17", "type": "numeric", "distinct": "14", "missing": "0", "min": "2007", "max": "2020", "mean": "2011", "stdev": "3" }, { "name": "pessoas", "index": "18", "type": "numeric", "distinct": "19", "missing": "0", "min": "1", "max": "42", "mean": "2", "stdev": "1" }, { "name": "mortos", "index": "19", "type": "numeric", "distinct": "4", "missing": "0", "min": "0", "max": "3", "mean": "0", "stdev": "0" }, { "name": "feridos_leves", "index": "20", "type": "numeric", "distinct": "16", "missing": "0", "min": "0", "max": "36", "mean": "0", "stdev": "1" }, { "name": 
"feridos_graves", "index": "21", "type": "numeric", "distinct": "6", "missing": "0", "min": "0", "max": "6", "mean": "0", "stdev": "0" }, { "name": "ilesos", "index": "22", "type": "numeric", "distinct": "12", "missing": "0", "min": "0", "max": "11", "mean": "2", "stdev": "1" }, { "name": "ignorados", "index": "23", "type": "numeric", "distinct": "4", "missing": "0", "min": "0", "max": "3", "mean": "0", "stdev": "0" }, { "name": "feridos", "index": "24", "type": "numeric", "distinct": "15", "missing": "0", "min": "0", "max": "36", "mean": "0", "stdev": "1" }, { "name": "veiculos", "index": "25", "type": "numeric", "distinct": "9", "missing": "0", "min": "1", "max": "9", "mean": "2", "stdev": "1" }, { "name": "latitude", "index": "26", "type": "numeric", "distinct": "213", "missing": "8876", "min": "-23", "max": "0", "mean": "-23", "stdev": "0" }, { "name": "longitude", "index": "27", "type": "numeric", "distinct": "223", "missing": "8876", "min": "-43", "max": "0", "mean": "-43", "stdev": "0" }, { "name": "regional", "index": "28", "type": "string", "distinct": "1", "missing": "8876" }, { "name": "delegacia", "index": "29", "type": "string", "distinct": "2", "missing": "8876" }, { "name": "uop", "index": "30", "type": "string", "distinct": "2", "missing": "8876" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }