Historical-Financials-Data-for-3000-stocks

Status: active. Format: ARFF. License: CC0: Public Domain. Visibility: public. Uploaded 24-03-2022 by Elif Ceren Gok.
Context
Getting access to high-quality historical stock market data can be very expensive, complicated, or both. Parsing SEC 10-Q filings directly from SEC EDGAR is difficult because the structure of filings varies, and providers of SEC filing data such as Quandl charge hundreds or thousands of dollars in yearly fees for access. Here, I provide an easy-to-use, straight-from-the-source database of parsed financial information from SEC 10-Q filings for more than 3000 stocks.

Content
The quarterly financials are provided in a single .csv file, quarterly_financials.csv. About 50% of the data is NaN, either because the field wasn't detected by my XBRL parsing system or because the field wasn't addressed in the SEC filing.

Acknowledgements
All the data is scraped from the SEC's XBRL files.
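The share of NaN cells mentioned under Content can be checked with a few lines of Python. The following is only a sketch using the standard library: the helper name missing_share is hypothetical, and it assumes that missing values are stored as empty cells or as the text NaN, which the description does not state. The inline sample is made up and stands in for the real file.

```python
import csv
import io

# Assumption: missing values appear as empty cells or the text "NaN".
MISSING_TOKENS = {"", "nan"}


def missing_share(csv_file):
    """Return the fraction of data cells that are missing in an open CSV file."""
    reader = csv.reader(csv_file)
    header = next(reader, None)
    if header is None:
        return 0.0
    total = 0
    missing = 0
    for row in reader:
        # Pad short rows so every column is counted exactly once.
        cells = row + [""] * (len(header) - len(row))
        for value in cells[: len(header)]:
            total += 1
            if value.strip().lower() in MISSING_TOKENS:
                missing += 1
    return missing / total if total else 0.0


# Made-up sample standing in for quarterly_financials.csv.
sample = io.StringIO(
    "stock,filing_date,revenues,cash\n"
    "AAA,2020-03-31,100.0,\n"
    "BBB,2020-03-31,NaN,NaN\n"
)
print(missing_share(sample))  # 3 of 8 cells are missing -> 0.375
```

For the real data, the same function could be called on a file opened with open("quarterly_financials.csv", newline=""), assuming the file sits in the working directory.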

45 features

Unnamed:_0 (numeric): 101787 unique values, 0 missing
commonstocksharesissued (numeric): 45206 unique values, 8616 missing
assetscurrent (numeric): 51259 unique values, 24555 missing
accountspayablecurrent (numeric): 33624 unique values, 35815 missing
commonstockvalue (numeric): 16551 unique values, 11337 missing
liabilities (numeric): 52661 unique values, 19366 missing
liabilitiesandstockholdersequity (numeric): 71000 unique values, 419 missing
stockholdersequity (numeric): 65938 unique values, 4885 missing
earningspersharebasic (numeric): 2524 unique values, 10346 missing
netincomeloss (numeric): 56668 unique values, 4734 missing
profitloss (numeric): 28761 unique values, 43614 missing
costofgoodssold (numeric): 19345 unique values, 71667 missing
filing_date (string): 2635 unique values, 0 missing
costsandexpenses (numeric): 21104 unique values, 71771 missing
cash (numeric): 5509 unique values, 81223 missing
notespayable (numeric): 3247 unique values, 87412 missing
preferredstockvalue (numeric): 1853 unique values, 85167 missing
depreciation (numeric): 16499 unique values, 36740 missing
operatingexpenses (numeric): 33013 unique values, 51869 missing
revenues (numeric): 34700 unique values, 35561 missing
land (numeric): 5676 unique values, 79959 missing
accountsreceivablenet (numeric): 4847 unique values, 87788 missing
deferredrevenue (numeric): 4015 unique values, 82103 missing
grossprofit (numeric): 34077 unique values, 52927 missing
sharesissued (numeric): 9226 unique values, 71886 missing
accruedincometaxes (numeric): 1433 unique values, 94954 missing
sharesoutstanding (numeric): 11153 unique values, 65296 missing
borrowedfunds (numeric): 307 unique values, 100074 missing
inventorygross (numeric): 3639 unique values, 93715 missing
commercialpaper (numeric): 1195 unique values, 95653 missing
dividends (numeric): 3894 unique values, 87459 missing
commonstocknoparvalue (numeric): 33 unique values, 101080 missing
costofservices (numeric): 8025 unique values, 88745 missing
debtcurrent (numeric): 4408 unique values, 87706 missing
accruedinsurancecurrent (numeric): 1681 unique values, 96327 missing
officerscompensation (numeric): 344 unique values, 98246 missing
intangibleassetscurrent (numeric): 198 unique values, 100706 missing
salariesandwages (numeric): 934 unique values, 99034 missing
interestanddebtexpense (numeric): 2213 unique values, 96404 missing
convertibledebt (numeric): 1133 unique values, 95400 missing
assetmanagementcosts (numeric): 851 unique values, 100095 missing
accountsreceivablegross (numeric): 1480 unique values, 96279 missing
directoperatingcosts (numeric): 1283 unique values, 99631 missing
operatingcycle (string): 26 unique values, 101400 missing
stock (string): 3189 unique values, 0 missing
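
A per-feature summary like the one above (unique values and missing count for each column) could be recomputed from the CSV along the following lines. This is a sketch under assumptions, not the procedure the hosting site uses: the function name summarize_columns is hypothetical, missing cells are assumed to be empty or the text NaN, missing cells are left out of the unique count, and values are compared as raw strings (so "100" and "100.0" would count as two values).

```python
import csv
import io

# Assumption: missing values appear as empty cells or the text "NaN".
MISSING_TOKENS = {"", "nan"}


def summarize_columns(csv_file):
    """Map each column name to a (unique_value_count, missing_count) pair."""
    reader = csv.reader(csv_file)
    header = next(reader, [])
    seen = [set() for _ in header]
    missing = [0] * len(header)
    for row in reader:
        # Pad short rows so trailing columns are counted as missing.
        cells = row + [""] * (len(header) - len(row))
        for i, value in enumerate(cells[: len(header)]):
            if value.strip().lower() in MISSING_TOKENS:
                missing[i] += 1
            else:
                seen[i].add(value)
    return {name: (len(seen[i]), missing[i]) for i, name in enumerate(header)}


# Made-up sample; the column names are taken from the feature list.
sample = io.StringIO(
    "stock,filing_date,revenues\n"
    "AAA,2020-03-31,100.0\n"
    "AAA,2020-06-30,\n"
    "BBB,2020-06-30,100.0\n"
)
for name, (unique, n_missing) in summarize_columns(sample).items():
    print(f"{name}: {unique} unique values, {n_missing} missing")
```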

19 properties

Number of instances (rows) of the dataset: 101787
Number of attributes (columns) of the dataset: 45
Number of distinct values of the target attribute (if it is nominal): (no value listed)
Number of missing values in the dataset: 2857964
Number of instances with at least one value missing: 101787
Number of numeric attributes: 42
Number of nominal attributes: 0
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 0
Percentage of binary attributes: 0
Percentage of instances having missing values: 100
Average class difference between consecutive instances: (no value listed)
Percentage of missing values: 62.4
Number of attributes divided by the number of instances: 0
Percentage of numeric attributes: 93.33
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 0
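
The derived percentages in the properties can be reproduced from the listed counts. The short check below uses only numbers that appear on this page; the variable names are illustrative.

```python
rows = 101787            # number of instances
columns = 45             # number of attributes
missing_cells = 2857964  # number of missing values in the dataset
numeric_columns = 42     # number of numeric attributes

# Missing cells as a share of all cells (rows x columns).
pct_missing = 100 * missing_cells / (rows * columns)
# Numeric attributes as a share of all attributes.
pct_numeric = 100 * numeric_columns / columns
# Attributes per instance; far below 0.01, so it displays as 0.
attrs_per_instance = columns / rows

print(round(pct_missing, 1))         # 62.4
print(round(pct_numeric, 2))         # 93.33
print(round(attrs_per_instance, 2))  # 0.0
```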

0 tasks
