Dataset Description Story
View the ReadMe file in my Github repo for this project.
Check out all the info on my portfolio's webpage for this project.
As I write this, I'm a Data Science student. To add to my portfolio, I wanted to build a web app to predict the prices of airline flight prices: the user would be able to select an origin and a destination. I found a database from The Bureau of Transportation Statistics.
I downloaded their data from Q1, Q2, Q3, and Q4 from 2018 a total of 27M+ rows and 42 columns.
For my pricing prediction purposes, I eliminated unnecessary columns, renamed some columns, and refined it for consistency bringing it to a new total of 9M+ rows and 13 columns.
Have fun and share your kernels, please!
Column Descriptions
1. Unnamed: drop this column (it's a duplicate index column)
2-3. ItinID MktID: vaguely demonstrates the order in which tickets were ordered (lower ID 's being ordered first)
4. MktCoupons: the number of coupons in the market for that flight
5. Quarter: 1, 2, 3, or 4, all of which are in 2018
6. Origin: the city out of which the flight begins
7. OriginWac: USA State/Territory World Area Code
8. Dest: the city out of which the flight begins
9. DestWac: USA State/Territory World Area Code
10. Miles: the number of miles traveled
11. ContiguousUSA: binary column -- (2) meaning flight is in the contiguous (48) USA states, and (1) meaning it is not (ie: Hawaii, Alaska, off-shore territories)
12. NumTicketsOrdered: number of tickets that were purchased by the user
13. Airline Company: the two-letter airline company code that the user used from start to finish (key codes below)
14. PricePerTicket: target prediction column
Airline Company Codes (in order of frequency for this dataset)
WN -- Southwest Airlines Co.
DL -- Delta Air Lines Inc.
AA -- American Airlines Inc.
UA -- United Air Lines Inc.
B6 -- JetBlue Airways
AS -- Alaska Airlines Inc.
NK -- Spirit Air Lines
G4 -- Allegiant Air
F9 -- Frontier Airlines Inc.
HA -- Hawaiian Airlines Inc.
SY -- Sun Country Airlines d/b/a MN Airlines
VX -- Virgin America
USA State/Territory World Area Codes
1 Alaska
2 Hawaii
3 Puerto Rico
4 U.S. Virgin Islands
5 U.S. Pacific Trust Territories and Possessions
11 Connecticut
12 Maine
13 Massachusetts
14 New Hampshire
15 Rhode Island
16 Vermont
21 New Jersey
22 New York
23 Pennsylvania
31 Delaware
32 District of Columbia
33 Florida
34 Georgia
35 Maryland
36 North Carolina
37 South Carolina
38 Virginia
39 West Virginia
41 Illinois
42 Indiana
43 Michigan
44 Ohio
45 Wisconsin
51 Alabama
52 Kentucky
53 Mississippi
54 Tennessee
61 Iowa
62 Kansas
63 Minnesota
64 Missouri
65 Nebraska
66 North Dakota
67 South Dakota
71 Arkansas
72 Louisiana
73 Oklahoma
74 Texas
81 Arizona
82 Colorado
83 Idaho
84 Montana
85 Nevada
86 New Mexico
87 Utah
88 Wyoming
91 California
92 Oregon
93 Washington