{ "data_id": "43038", "name": "dow-jones-index", "exact_name": "dow-jones-index", "version": 1, "version_label": "1", "description": "**Author**: Dr. Michael Brown\n**Source**: [UCI](https:\/\/archive.ics.uci.edu\/ml\/datasets\/dow+jones+index) - 2017\n**Please cite**: [Paper](https:\/\/link.springer.com\/content\/pdf\/10.1007%2F978-3-642-39712-7_3.pdf) \n\n**Dow Jones Index Data Set**\n\nIn our research, each record (row) is data for a week. Each record also has the percentage of return that stock has in the following week (percent_change_next_weeks_price). Ideally, you want to determine which stock will produce the greatest rate of return in the following week. This can help you train and test your algorithm.\n\nSome of these attributes might not be used in your research. They were\noriginally added to our database to perform calculations. (Brown, Pelosi & Dirska, 2013) used percent_change_price, percent_change_volume_over_last_wk, days_to_next_dividend, and percent_return_next_dividend. We left the other attributes in the dataset in case you wanted to use any of them. 
Of course, what you want to maximize is percent_change_next_weeks_price.\n\n**Training data vs Test data**:\nIn (Brown, Pelosi & Dirska, 2013) we used quarter 1 (Jan-Mar) data for training and quarter 2 (Apr-Jun) data for testing.\n\n**Interesting data points**:\nIf you use quarter 2 data for testing, you will notice something interesting: in \nthe week ending 5\/27\/2011, every Dow Jones Index stock lost money.\n\n\n### Attribute information\n\n- quarter: the yearly quarter (1 = Jan-Mar; 2 = Apr-Jun)\n- stock: the stock symbol (see above)\n- date: the last business day of the week (this is typically a Friday)\n- open: the price of the stock at the beginning of the week\n- high: the highest price of the stock during the week\n- low: the lowest price of the stock during the week\n- close: the price of the stock at the end of the week\n- volume: the number of shares of stock that traded hands in the week\n- percent_change_price: the percentage change in price throughout the week\n- percent_change_volume_over_last_wk: the percentage change in the number of shares of stock that traded hands for this week compared to the previous week\n- previous_weeks_volume: the number of shares of stock that traded hands in the previous week\n- next_weeks_open: the opening price of the stock in the following week\n- next_weeks_close: the closing price of the stock in the following week\n- percent_change_next_weeks_price: the percentage change in price of the stock in the following week\n- days_to_next_dividend: the number of days until the next dividend\n- percent_return_next_dividend: the percentage of return on the next dividend", "format": "arff", "uploader": "Meilina Reksoprodjo", "uploader_id": 24140, "visibility": "public", "creator": "Dr. 
Michael Brown", "contributor": null, "date": "2021-06-11 20:43:40", "update_comment": null, "last_update": "2021-06-11 20:43:40", "licence": "Publicly available", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/22045760\/dataset", "default_target_attribute": null, "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "dow-jones-index", "In our research each record (row) is data for a week. Each record also has the percentage of return that stock has in the following week (percent_change_next_weeks_price). Ideally, you want to determine which stock will produce the greatest rate of return in the following week. This can help you train and test your algorithm. Some of these attributes might not be used in your research. They were originally added to our database to perform calculations. (Brown, Pelosi & Dirska, 2013) used per " ], "weight": 5 }, "qualities": { "NumberOfInstances": 750, "NumberOfFeatures": 16, "NumberOfClasses": null, "NumberOfMissingValues": 60, "NumberOfInstancesWithMissingValues": 30, "NumberOfNumericFeatures": 8, "NumberOfSymbolicFeatures": 0, "Dimensionality": 0.021333333333333333, "PercentageOfNumericFeatures": 50, "MajorityClassPercentage": null, "PercentageOfSymbolicFeatures": 0, "MajorityClassSize": null, "MinorityClassPercentage": null, "MinorityClassSize": null, "NumberOfBinaryFeatures": 0, "PercentageOfBinaryFeatures": 0, "PercentageOfInstancesWithMissingValues": 4, "AutoCorrelation": null, "PercentageOfMissingValues": 0.5 }, "tags": [ { "uploader": "38960", "tag": "Computer Systems" }, { "uploader": "38960", "tag": "Human Activities" } ], "features": [ { "name": "quarter", "index": "0", "type": "numeric", "distinct": "2", "missing": "0", "min": "1", "max": "2", "mean": "2", "stdev": "0" }, { "name": "stock", "index": "1", "type": "string", "distinct": "30", "missing": "0" }, { "name": "date", "index": "2", "type": "string", "distinct": "25", "missing": "0" 
}, { "name": "open", "index": "3", "type": "string", "distinct": "722", "missing": "0" }, { "name": "high", "index": "4", "type": "string", "distinct": "713", "missing": "0" }, { "name": "low", "index": "5", "type": "string", "distinct": "711", "missing": "0" }, { "name": "close", "index": "6", "type": "string", "distinct": "711", "missing": "0" }, { "name": "volume", "index": "7", "type": "numeric", "distinct": "750", "missing": "0", "min": "9718851", "max": "1453438639", "mean": "117547801", "stdev": "158438089" }, { "name": "percent_change_price", "index": "8", "type": "numeric", "distinct": "745", "missing": "0", "min": "-15", "max": "10", "mean": "0", "stdev": "3" }, { "name": "percent_change_volume_over_last_wk", "index": "9", "type": "numeric", "distinct": "720", "missing": "30", "min": "-61", "max": "327", "mean": "6", "stdev": "41" }, { "name": "previous_weeks_volume", "index": "10", "type": "numeric", "distinct": "720", "missing": "30", "min": "9718851", "max": "1453438639", "mean": "117387645", "stdev": "159232228" }, { "name": "next_weeks_open", "index": "11", "type": "string", "distinct": "720", "missing": "0" }, { "name": "next_weeks_close", "index": "12", "type": "string", "distinct": "715", "missing": "0" }, { "name": "percent_change_next_weeks_price", "index": "13", "type": "numeric", "distinct": "745", "missing": "0", "min": "-15", "max": "10", "mean": "0", "stdev": "3" }, { "name": "days_to_next_dividend", "index": "14", "type": "numeric", "distinct": "105", "missing": "0", "min": "0", "max": "336", "mean": "53", "stdev": "46" }, { "name": "percent_return_next_dividend", "index": "15", "type": "numeric", "distinct": "729", "missing": "0", "min": "0", "max": "2", "mean": "1", "stdev": "0" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 0, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 0 }