OpenML
Accidents-on-the-Rio-Niteri-bridge

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 23-03-2022 by Onur Yildirim.
  • Computer Systems Machine Learning
Context

About the Rio-Niterói Bridge, according to Wikipedia: Presidente Costa e Silva Bridge, popularly known as the Rio-Niterói Bridge, is a bridge that crosses Guanabara Bay, in the state of Rio de Janeiro, in Brazil. It connects the municipalities of Rio de Janeiro and Niterói. Currently, the bridge is the largest prestressed-concrete bridge in the southern hemisphere and the largest in Latin America. According to the Ecoponte concessionaire, the structure receives more than 150 thousand passengers a day on days with normal flows. [1] It is also known for having the largest straight span in the world and the largest set of prestressed structures in America. [2]

Content

The data for this research were obtained from the website of the Federal Highway Police, which provides an Open Data section that, according to the website, "has no restrictions on licenses, patents or control mechanisms, so that they are freely available for use and redistributed at will". In the Accidents section, accident records are available in CSV format, one file per year.

Acknowledgements

It is great that this data is public and can yield valuable insights to improve people's quality of life.

Inspiration

Some guiding questions about this data:
  • Do accidents occur on the bridge every day?
  • Has the number of accidents on the bridge decreased?
  • Has the installation of security cameras reduced the number of accidents?
How to import from source

The code below was used to import and organize the open data from the site.

Load tidyverse:

    library(tidyverse)

Import the available data (one CSV file per year, 2007 to 2020):

    dataset <- glue::glue("datatran{2007:2020}.csv") %>%
      map(~ .x %>%
            data.table::fread(sep = ";", dec = ",", encoding = "Latin-1") %>%
            as_tibble())

Perform data pre-processing in parallel using the foreach package:

    library(foreach)
    cl <- parallel::makeCluster(4)
    doParallel::registerDoParallel(cl)

    dataset <- foreach(x = dataset, info = 2007:2020,
                       .packages = c("dplyr", "stringr", "lubridate")) %dopar% {
      x %>%
        # Treat km and br as text so the character clean-up below applies to them
        mutate(km = as.character(km), br = as.character(br)) %>%
        # Normalize every character column: ASCII only, lower case, unified separators
        mutate_if(is.character,
                  ~ .x %>%
                    enc2native() %>%
                    stringi::stri_trans_general(id = "Latin-ASCII") %>%
                    tolower() %>%
                    str_replace_all("/", "-") %>%
                    str_replace_all(",", ".")) %>%
        # Dates appear as day-month-year in some files and year-month-day in others
        mutate(data_inversa = if_else(is.na(dmy(data_inversa)),
                                      ymd(data_inversa), dmy(data_inversa)),
               km = as.numeric(km),
               br = as.numeric(br),
               id = as.character(id),
               dia_semana = str_remove_all(dia_semana, "-feira")) %>%
        # Keep BR-101 records between km 321 and km 334
        filter(br == 101 & km >= 321 & km <= 334) %>%
        # Keep records located in Niteroi or Rio de Janeiro
        slice(str_which(municipio, "(niteroi|rio de janeiro)"))
    }

    parallel::stopCluster(cl)

Save the tidy dataset, adding the year of each source file as the column ano:

    dataset %>%
      map2_df(2007:2020, ~ mutate(.x, ano = .y)) %>%
      write_csv("accidents-rio-niteroi-bridge.csv", na = "")
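As a starting point for the guiding questions, the following is a minimal sketch in base R of how accidents per year and distinct accident days per year could be counted. The rows in `accidents` are hypothetical sample values invented for illustration, not records from the dataset; only the column names `id`, `data_inversa` and `ano` come from the feature list. With the real file, the data frame would instead be read from accidents-rio-niteroi-bridge.csv.

```r
# Hypothetical sample rows standing in for the real file (assumption:
# data_inversa holds ISO dates as text and ano holds the year).
accidents <- data.frame(
  id = c("1", "2", "3", "4", "5"),
  data_inversa = c("2007-01-03", "2007-01-03", "2007-06-10",
                   "2008-02-01", "2008-02-02"),
  ano = c(2007, 2007, 2007, 2008, 2008),
  stringsAsFactors = FALSE
)

# Number of accident records per year.
per_year <- table(accidents$ano)

# Number of distinct calendar days per year with at least one accident.
distinct_days <- unique(accidents[, c("ano", "data_inversa")])
days_per_year <- table(distinct_days$ano)

print(per_year)
print(days_per_year)
```

Comparing `days_per_year` with the number of days in each year indicates whether accidents happen daily, and the trend in `per_year` indicates whether the yearly total has gone down.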

31 features

id: numeric, 9366 unique values, 0 missing
data_inversa: string, 3402 unique values, 0 missing
dia_semana: string, 7 unique values, 0 missing
horario: string, 971 unique values, 0 missing
uf: string, 1 unique value, 0 missing
br: numeric, 1 unique value, 0 missing
km: numeric, 101 unique values, 0 missing
municipio: string, 2 unique values, 0 missing
causa_acidente: string, 25 unique values, 0 missing
tipo_acidente: string, 22 unique values, 0 missing
classificacao_acidente: string, 4 unique values, 0 missing
fase_dia: string, 4 unique values, 0 missing
sentido_via: string, 2 unique values, 0 missing
condicao_metereologica: string, 10 unique values, 0 missing
tipo_pista: string, 3 unique values, 0 missing
tracado_via: string, 8 unique values, 0 missing
uso_solo: string, 4 unique values, 0 missing
ano: numeric, 14 unique values, 0 missing
pessoas: numeric, 19 unique values, 0 missing
mortos: numeric, 4 unique values, 0 missing
feridos_leves: numeric, 16 unique values, 0 missing
feridos_graves: numeric, 6 unique values, 0 missing
ilesos: numeric, 12 unique values, 0 missing
ignorados: numeric, 4 unique values, 0 missing
feridos: numeric, 15 unique values, 0 missing
veiculos: numeric, 9 unique values, 0 missing
latitude: numeric, 213 unique values, 8876 missing
longitude: numeric, 223 unique values, 8876 missing
regional: string, 1 unique value, 8876 missing
delegacia: string, 2 unique values, 8876 missing
uop: string, 2 unique values, 8876 missing

19 properties

Number of instances (rows) of the dataset: 9367
Number of attributes (columns) of the dataset: 31
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 44380
Number of instances with at least one value missing: 8876
Number of numeric attributes: 14
Number of nominal attributes: 0
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 45.16
Percentage of instances belonging to the most frequent class: not reported
Percentage of nominal attributes: 0
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 94.76
Average class difference between consecutive instances: not reported
Percentage of missing values: 15.28
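The percentage properties above follow from the reported counts, which the short base R sketch below reproduces. The variable names are illustrative; the input numbers are the counts listed on this page.

```r
# Counts reported in the property list.
n_rows <- 9367               # instances
n_cols <- 31                 # attributes
n_numeric <- 14              # numeric attributes
n_missing_values <- 44380    # missing cells in total
n_rows_with_missing <- 8876  # instances with at least one missing value

# Percentage of numeric attributes.
pct_numeric <- round(100 * n_numeric / n_cols, 2)

# Percentage of instances having missing values.
pct_rows_missing <- round(100 * n_rows_with_missing / n_rows, 2)

# Percentage of missing values over all cells (rows times columns).
pct_missing <- round(100 * n_missing_values / (n_rows * n_cols), 2)

# Five features (latitude, longitude, regional, delegacia, uop) each have
# 8876 missing values, which accounts for every missing cell.
missing_from_five_columns <- 5 * n_rows_with_missing

print(c(pct_numeric, pct_rows_missing, pct_missing))
```

The results match the listed 45.16, 94.76 and 15.28, and the 44380 missing cells are exactly the five location-related columns that are empty in the same 8876 rows.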

0 tasks