Malware-Analysis-Datasets-PE-Section-Headers

Status: active
Format: ARFF
License: Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Visibility: public
Uploaded: 23-03-2022 by Dustin Carrion
Introduction

This dataset is part of my PhD research on malware detection and classification using Deep Learning. It contains static analysis data (PE Section Headers of the .text, .code and CODE sections) extracted from the 'pe_sections' elements of Cuckoo Sandbox reports. PE malware examples were downloaded from virusshare.com. PE goodware examples were downloaded from portableapps.com and from Windows 7 x86 directories.

Features

Column name: hash
Description: MD5 hash of the example
Content: 32-character hexadecimal string

Column name: size_of_data
Description: The size of the section on disk
Content: Integer

Column name: virtual_address
Description: Memory address of the first byte of the section relative to the image base
Content: Integer

Column name: entropy
Description: Calculated entropy of the section
Content: Float

Column name: virtual_size
Description: The size of the section when loaded into memory
Content: Integer

Column name: malware
Description: Class
Content: 0 (Goodware) or 1 (Malware)

Acknowledgements

Thank you Cuckoo Sandbox for developing such an amazing dynamic analysis environment! Thank you VirusShare! Because sharing is caring!

Citations

Please refer to http://dx.doi.org/10.21227/2czh-es14
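The extraction described in the introduction can be sketched as follows. This is a minimal illustration, not the author's actual pipeline. It assumes the Cuckoo report has been loaded as a Python dict, that the section list sits under report["static"]["pe_sections"], and that numeric fields may arrive as hex strings such as "0x00001000". The helper names (to_int, code_section_rows) and the example report are hypothetical.

```python
# Section names the dataset description says were kept.
CODE_SECTION_NAMES = {".text", ".code", "CODE"}


def to_int(value):
    """Return an int from an int or a decimal/hex string such as '0x00001000'."""
    if isinstance(value, int):
        return value
    return int(str(value).strip(), 0)  # base 0 lets Python detect the '0x' prefix


def code_section_rows(report, md5, label):
    """Build one dataset-style row per code section in a Cuckoo-style report.

    ASSUMPTION: sections are listed under report["static"]["pe_sections"] and
    use the keys name, size_of_data, virtual_address, virtual_size, entropy.
    """
    sections = report.get("static", {}).get("pe_sections", [])
    rows = []
    for section in sections:
        name = str(section.get("name", "")).strip("\x00 ")
        if name not in CODE_SECTION_NAMES:
            continue  # skip .rsrc, .data, and every other non-code section
        rows.append({
            "hash": md5,
            "size_of_data": to_int(section["size_of_data"]),
            "virtual_address": to_int(section["virtual_address"]),
            "entropy": float(section["entropy"]),
            "virtual_size": to_int(section["virtual_size"]),
            "malware": label,
        })
    return rows


# Made-up report fragment, for illustration only.
example_report = {
    "static": {
        "pe_sections": [
            {"name": ".text", "virtual_address": "0x00001000",
             "virtual_size": "0x00004a2c", "size_of_data": "0x00004c00",
             "entropy": 6.41},
            {"name": ".rsrc", "virtual_address": "0x00008000",
             "virtual_size": "0x00000400", "size_of_data": "0x00000400",
             "entropy": 3.2},
        ]
    }
}

rows = code_section_rows(example_report, md5="0" * 32, label=0)
print(rows)
```

Only the .text entry survives the filter here, which mirrors the restriction to .text, .code and CODE sections stated above.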

5 features

Name             Type     Unique values  Missing
hash (ignore)    string   43144          0
size_of_data     numeric  2659           0
virtual_address  numeric  129            0
entropy          numeric  17613          0
virtual_size     numeric  11715          0
malware          numeric  2              0
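The entropy feature is described only as the "calculated entropy of the section". A common choice for PE sections is Shannon entropy over byte frequencies, measured in bits per byte (0 for constant data, 8 for uniformly distributed bytes). The sketch below assumes that definition; whether the reports used exactly this formula is not confirmed by the dataset description, and the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (range 0.0 to 8.0)."""
    if not data:
        return 0.0  # convention: empty input has zero entropy
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Illustrative inputs, not taken from the dataset.
constant = shannon_entropy(b"\x00" * 64)        # a single repeated byte
two_symbols = shannon_entropy(b"ab" * 10)       # two equally likely bytes
all_bytes = shannon_entropy(bytes(range(256)))  # every byte value exactly once
print(constant, two_symbols, all_bytes)
```

Packed or encrypted code sections tend to score near the upper end of this range, which is one reason entropy is a popular static feature.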

19 properties

Number of instances (rows) of the dataset: 43293
Number of attributes (columns) of the dataset: 5
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 5
Number of nominal attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 0
Percentage of missing values: 0
Average class difference between consecutive instances: (no value listed)
Percentage of numeric attributes: 100
Number of attributes divided by the number of instances: 0
Percentage of nominal attributes: 0
Percentage of instances belonging to the most frequent class: (no value listed)
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
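Several of the properties above, including the class-frequency ones, can be recomputed from the rows themselves. The following is a rough sketch under stated assumptions: rows are already loaded as dicts keyed by the column names, missing values are represented as None, and hash is excluded from the attribute count as the feature list suggests. The function name summarize and the toy rows are hypothetical, and the hosting site's exact definitions may differ.

```python
from collections import Counter


def summarize(rows, target="malware", ignored=("hash",)):
    """Recompute a few dataset-level properties from a list of row dicts."""
    if not rows:
        raise ValueError("rows must not be empty")
    columns = [name for name in rows[0] if name not in ignored]
    class_counts = Counter(row[target] for row in rows)
    rows_with_missing = sum(
        any(row[name] is None for name in columns) for row in rows
    )
    n = len(rows)
    majority = max(class_counts.values())
    minority = min(class_counts.values())
    return {
        "instances": n,
        "attributes": len(columns),  # includes the target, excludes ignored columns
        "classes": len(class_counts),
        "instances_with_missing": rows_with_missing,
        "majority_class_size": majority,
        "majority_class_pct": 100.0 * majority / n,
        "minority_class_size": minority,
        "minority_class_pct": 100.0 * minority / n,
    }


# Made-up rows for illustration; they are not real dataset records.
toy_rows = [
    {"hash": "a" * 32, "size_of_data": 512, "virtual_address": 4096,
     "entropy": 6.1, "virtual_size": 500, "malware": 1},
    {"hash": "b" * 32, "size_of_data": 1024, "virtual_address": 4096,
     "entropy": 7.8, "virtual_size": 1000, "malware": 1},
    {"hash": "c" * 32, "size_of_data": 2048, "virtual_address": 4096,
     "entropy": 5.9, "virtual_size": 2000, "malware": 0},
    {"hash": "d" * 32, "size_of_data": 4096, "virtual_address": 8192,
     "entropy": 7.2, "virtual_size": 4000, "malware": 1},
]

summary = summarize(toy_rows)
print(summary)
```

Checking the class balance this way is worthwhile before training, since an imbalanced target makes plain accuracy a misleading metric.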
