Binary-Dataset-of-Phishing-and-Legitimate-URLs


Status: active; Format: ARFF; License: Attribution 4.0 International (CC BY 4.0); Visibility: public; Uploaded 24-03-2022 by Dustin Carrion
  • Tags: Computer Systems, Machine Learning
Description: The data set is provided as a CSV file that can be used as input for model building. It contains a collection of URLs for 11001 websites. Each sample has 15 numeric attributes: 14 website parameters and a class label identifying the site as phishing or not (0 or 1). The label is 0 for a phishing URL and 1 for a legitimate one. The data set also serves as an input for project scoping and for specifying the project's functional and non-functional requirements.
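The label convention in the description can be sketched in a few lines of Python. This is a minimal illustration, not the uploader's code: the helper name `load_samples` and the three-column sample are made up, and it assumes the CSV has a header row with a `label` column.

```python
import csv
import io

# Convention stated in the description: 0 = phishing, 1 = legitimate.
LABEL_NAMES = {0: "phishing", 1: "legitimate"}

def load_samples(csv_file):
    """Read rows from an open CSV file object and return (features, labels)."""
    reader = csv.DictReader(csv_file)
    features, labels = [], []
    for row in reader:
        # Separate the class label from the remaining numeric columns.
        label = int(float(row.pop("label")))
        if label not in LABEL_NAMES:
            raise ValueError(f"unexpected label: {label}")
        features.append({name: float(value) for name, value in row.items()})
        labels.append(label)
    return features, labels

# Tiny made-up sample with only three of the columns, for illustration.
sample = io.StringIO("url_len,dot_count,label\n54,3,0\n22,1,1\n")
X, y = load_samples(sample)
print([LABEL_NAMES[value] for value in y])
```

With the real file, the same function could be called on `open(path, newline="")`, where `path` is wherever the CSV was saved.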

15 features

  • whois_regDate (numeric): 4038 unique values, 0 missing
  • whois_expDate (numeric): 1872 unique values, 0 missing
  • whois_updatedDate (numeric): 1146 unique values, 0 missing
  • dot_count (numeric): 19 unique values, 0 missing
  • url_len (numeric): 347 unique values, 0 missing
  • digit_count (numeric): 128 unique values, 0 missing
  • special_count (numeric): 35 unique values, 0 missing
  • hyphen_count (numeric): 32 unique values, 0 missing
  • double_slash (numeric): 6 unique values, 0 missing
  • single_slash (numeric): 25 unique values, 0 missing
  • at_the_rate (numeric): 4 unique values, 0 missing
  • protocol (numeric): 2 unique values, 0 missing
  • protocol_count (numeric): 6 unique values, 0 missing
  • web_traffic (numeric): 2 unique values, 0 missing
  • label (numeric): 2 unique values, 0 missing
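Several of the columns listed above look like simple character counts over the URL string. The sketch below shows how such counts could be computed. The definitions are guesses from the column names only; the page does not document how the uploader derived these values, and the function name `lexical_features` is hypothetical.

```python
def lexical_features(url):
    """Count-based URL features; definitions are assumed from the column names."""
    double_slash = url.count("//")
    return {
        "url_len": len(url),
        "dot_count": url.count("."),
        "digit_count": sum(ch.isdigit() for ch in url),
        "hyphen_count": url.count("-"),
        "double_slash": double_slash,
        # Assumption: slashes that are not part of a counted "//" pair.
        "single_slash": url.count("/") - 2 * double_slash,
        # Assumption: "at_the_rate" is the number of "@" characters.
        "at_the_rate": url.count("@"),
    }

features = lexical_features("http://example-site.com/login/index.html")
print(features)
```

The WHOIS dates, `web_traffic`, `special_count` and the protocol columns are left out because their definitions cannot be inferred from the names alone.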

19 properties

  • Number of instances (rows) of the dataset: 11000
  • Number of attributes (columns) of the dataset: 15
  • Number of distinct values of the target attribute (if it is nominal): (no value shown)
  • Number of missing values in the dataset: 0
  • Number of instances with at least one value missing: 0
  • Number of numeric attributes: 15
  • Number of nominal attributes: 0
  • Number of attributes divided by the number of instances: 0
  • Percentage of numeric attributes: 100
  • Percentage of instances belonging to the most frequent class: (no value shown)
  • Percentage of nominal attributes: 0
  • Number of instances belonging to the most frequent class: (no value shown)
  • Percentage of instances belonging to the least frequent class: (no value shown)
  • Number of instances belonging to the least frequent class: (no value shown)
  • Number of binary attributes: 0
  • Percentage of binary attributes: 0
  • Percentage of instances having missing values: 0
  • Average class difference between consecutive instances: (no value shown)
  • Percentage of missing values: 0
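Some of the properties above, such as the class-frequency figures, have no value on this page. A minimal sketch of how they could be recomputed from the rows follows. The function name `summarize` and the four sample rows are invented for illustration and are not taken from the data set.

```python
from collections import Counter

def summarize(rows, target="label"):
    """Compute a few dataset-level properties from a list of dict rows."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    # A value of None stands in for a missing cell in this sketch.
    missing = sum(value is None for row in rows for value in row.values())
    class_counts = Counter(row[target] for row in rows)
    majority = max(class_counts.values()) if class_counts else 0
    minority = min(class_counts.values()) if class_counts else 0
    return {
        "instances": n_rows,
        "attributes": n_cols,
        "missing_values": missing,
        "majority_class_size": majority,
        "majority_class_pct": 100.0 * majority / n_rows if n_rows else 0.0,
        "minority_class_size": minority,
        "minority_class_pct": 100.0 * minority / n_rows if n_rows else 0.0,
    }

# Made-up rows, only to exercise the function.
rows = [
    {"url_len": 54, "dot_count": 3, "label": 0},
    {"url_len": 22, "dot_count": 1, "label": 1},
    {"url_len": 31, "dot_count": 2, "label": 1},
    {"url_len": 80, "dot_count": None, "label": 1},
]
summary = summarize(rows)
print(summary)
```

Running the same function over the full file would give the real class balance, which this page does not report.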

0 tasks
