drug-directory
Status: active
Format: ARFF
Visibility: public
Uploaded 18-06-2021 by Marcos de Paula Bueno
Images
Machine Learning
Product listing data submitted to the U.S. FDA for all unfinished, unapproved drugs.
20 features
PRODUCTTYPENAME (target): nominal, 7 unique values, 0 missing
ROW_ID (row identifier): numeric, 120215 unique values, 0 missing
PRODUCTID: string, 120215 unique values, 0 missing
PRODUCTNDC: string, 117896 unique values, 0 missing
PROPRIETARYNAME: string, 45019 unique values, 7 missing
PROPRIETARYNAMESUFFIX: string, 4569 unique values, 108397 missing
NONPROPRIETARYNAME: string, 19307 unique values, 7 missing
DOSAGEFORMNAME: string, 139 unique values, 0 missing
ROUTENAME: string, 192 unique values, 2152 missing
STARTMARKETINGDATE: numeric, 7474 unique values, 0 missing
ENDMARKETINGDATE: numeric, 676 unique values, 115624 missing
MARKETINGCATEGORYNAME: string, 10 unique values, 0 missing
APPLICATIONNUMBER: string, 11256 unique values, 14921 missing
LABELERNAME: string, 13388 unique values, 0 missing
SUBSTANCENAME: string, 9729 unique values, 2616 missing
ACTIVE_NUMERATOR_STRENGTH: string, 10204 unique values, 2616 missing
ACTIVE_INGRED_UNIT: string, 2927 unique values, 2616 missing
PHARM_CLASSES: string, 1319 unique values, 74252 missing
DEASCHEDULE: string, 4 unique values, 115504 missing
NDC_EXCLUDE_FLAG: string, 1 unique value, 0 missing
LISTING_RECORD_CERTIFIED_THROUGH: numeric, 2 unique values, 4593 missing
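The per-feature missing counts listed above can be cross-checked against the dataset-level total and used to spot sparsely populated columns. The following is a minimal sketch, not part of the OpenML page: the counts are copied by hand from the listing, and the names `MISSING`, `N_ROWS`, `missing_fraction`, and `sparse_features`, as well as the 50% threshold, are illustrative assumptions.

```python
# Missing-value counts per feature, copied by hand from the feature listing.
MISSING = {
    "PRODUCTTYPENAME": 0,
    "ROW_ID": 0,
    "PRODUCTID": 0,
    "PRODUCTNDC": 0,
    "PROPRIETARYNAME": 7,
    "PROPRIETARYNAMESUFFIX": 108397,
    "NONPROPRIETARYNAME": 7,
    "DOSAGEFORMNAME": 0,
    "ROUTENAME": 2152,
    "STARTMARKETINGDATE": 0,
    "ENDMARKETINGDATE": 115624,
    "MARKETINGCATEGORYNAME": 0,
    "APPLICATIONNUMBER": 14921,
    "LABELERNAME": 0,
    "SUBSTANCENAME": 2616,
    "ACTIVE_NUMERATOR_STRENGTH": 2616,
    "ACTIVE_INGRED_UNIT": 2616,
    "PHARM_CLASSES": 74252,
    "DEASCHEDULE": 115504,
    "NDC_EXCLUDE_FLAG": 0,
    "LISTING_RECORD_CERTIFIED_THROUGH": 4593,
}

# Number of rows, as reported by the ROW_ID and PRODUCTID unique counts.
N_ROWS = 120215


def missing_fraction(feature: str) -> float:
    """Fraction of rows in which the given feature is missing."""
    return MISSING[feature] / N_ROWS


def sparse_features(threshold: float = 0.5) -> list[str]:
    """Features missing in more than `threshold` of rows, most sparse first."""
    names = [name for name in MISSING if missing_fraction(name) > threshold]
    return sorted(names, key=lambda name: MISSING[name], reverse=True)


# Sum of the per-feature counts, for comparison with the dataset-level total.
total_missing = sum(MISSING.values())

if __name__ == "__main__":
    print("total missing values:", total_missing)
    for name in sparse_features():
        print(f"{name}: {missing_fraction(name):.1%} missing")
```

Under this threshold, four columns (ENDMARKETINGDATE, DEASCHEDULE, PROPRIETARYNAMESUFFIX, PHARM_CLASSES) are missing in more than half of the rows, which is why every row contains at least one missing value.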
19 properties
NumberOfInstances: 120215. Number of instances (rows) of the dataset.
NumberOfFeatures: 20. Number of attributes (columns) of the dataset.
NumberOfClasses: 7. Number of distinct values of the target attribute (if it is nominal).
NumberOfMissingValues: 443305. Number of missing values in the dataset.
NumberOfInstancesWithMissingValues: 120215. Number of instances with at least one value missing.
NumberOfNumericFeatures: 3. Number of numeric attributes.
NumberOfSymbolicFeatures: 1. Number of nominal attributes.
Dimensionality: 0. Number of attributes divided by the number of instances.
PercentageOfNumericFeatures: 15. Percentage of numeric attributes.
MajorityClassPercentage: 57.4. Percentage of instances belonging to the most frequent class.
PercentageOfSymbolicFeatures: 5. Percentage of nominal attributes.
MajorityClassSize: 69001. Number of instances belonging to the most frequent class.
MinorityClassPercentage: 0.01. Percentage of instances belonging to the least frequent class.
MinorityClassSize: 7. Number of instances belonging to the least frequent class.
NumberOfBinaryFeatures: 0. Number of binary attributes.
PercentageOfBinaryFeatures: 0. Percentage of binary attributes.
PercentageOfInstancesWithMissingValues: 100. Percentage of instances having missing values.
AutoCorrelation: 0.95. Average class difference between consecutive instances.
PercentageOfMissingValues: 18.44. Percentage of missing values.
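Several of the properties above are simple ratios of the listed counts. The sketch below recomputes them in Python; the formulas are inferred from the property descriptions and happen to reproduce the listed values after rounding to two decimals, so they should be read as an assumption rather than as OpenML's actual implementation. All variable names are illustrative.

```python
# Raw counts copied from the property listing.
n_instances = 120215
n_features = 20
n_missing_values = 443305
n_numeric_features = 3
n_symbolic_features = 1
majority_class_size = 69001
minority_class_size = 7

# Attributes divided by instances; about 0.00017, which rounds to the listed 0.
dimensionality = n_features / n_instances

# Missing cells as a share of all cells (rows times columns).
pct_missing_values = 100 * n_missing_values / (n_instances * n_features)

# Class balance, as a share of all rows.
majority_class_pct = 100 * majority_class_size / n_instances
minority_class_pct = 100 * minority_class_size / n_instances

# Feature-type shares, as a share of all columns.
pct_numeric_features = 100 * n_numeric_features / n_features
pct_symbolic_features = 100 * n_symbolic_features / n_features

if __name__ == "__main__":
    print(f"Dimensionality:               {dimensionality:.5f}")
    print(f"PercentageOfMissingValues:    {pct_missing_values:.2f}")
    print(f"MajorityClassPercentage:      {majority_class_pct:.2f}")
    print(f"MinorityClassPercentage:      {minority_class_pct:.2f}")
    print(f"PercentageOfNumericFeatures:  {pct_numeric_features:.0f}")
    print(f"PercentageOfSymbolicFeatures: {pct_symbolic_features:.0f}")
```

The Dimensionality of 0 and MinorityClassPercentage of 0.01 are therefore rounding effects: the unrounded ratios are roughly 0.00017 and 0.0058 percent respectively.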
0 tasks