Author: Nishtha Hooda, CSED, TIET, Patiala
Source: [UCI](https://archive.ics.uci.edu/ml/datasets/Audit+Data) - 2018
Please cite: [Hooda, Nishtha, Seema Bawa, and Prashant Singh Rana. 'Fraudulent Firm Classification: A Case Study of an External Audit.' Applied Artificial Intelligence 32.1 (2018): 48-64.](https://doi.org/10.1080/08839514.2018.1451032)
The goal of the research is to help auditors by building a classification model that can predict whether a firm is fraudulent on the basis of present and historical risk factors. The sectors covered, with the number of firms in each, are: Irrigation (114), Public Health (77), Buildings and Roads (82), Forest (70), Corporate (47), Animal Husbandry (95), Communication (1), Electrical (4), Land (5), Science and Technology (3), Tourism (1), Fisheries (41), Industries (37), Agriculture (200). The original data was split into a trial dataset and an audit dataset. Here the two are concatenated into a single dataset, and two features (trial and audit) have been added to indicate whether each row originally came from the trial or the audit dataset.
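
The concatenation described above can be sketched as follows. This is only a minimal illustration, not the procedure actually used to build the dataset: the 0/1 encoding of the `trial` and `audit` features, the example column names (`Sector_score`, `Risk`), the row values, and the helper `concat_with_origin` are all assumptions made for the example.

```python
# Hypothetical rows standing in for the two original datasets.
# The real data has many more columns; these names and values are placeholders.
trial_rows = [{"Sector_score": 3.89, "Risk": 1}]
audit_rows = [{"Sector_score": 2.72, "Risk": 0}]


def concat_with_origin(trial_rows, audit_rows):
    """Concatenate two lists of records and flag each row's origin.

    Assumes the `trial` and `audit` features are 0/1 indicators,
    which is a guess about the encoding, not a documented fact.
    """
    combined = []
    for row in trial_rows:
        combined.append({**row, "trial": 1, "audit": 0})
    for row in audit_rows:
        combined.append({**row, "trial": 0, "audit": 1})
    return combined


combined = concat_with_origin(trial_rows, audit_rows)
for row in combined:
    print(row)
```

Under this assumed encoding, every row carries exactly one set flag, so either indicator alone is enough to recover which source dataset a row came from.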