Airline ticket prices can be hard to estimate. This project builds a model that predicts ticket prices from several input features, using fares from a number of airlines on routes between several cities for journeys between March and June 2019.
There are two datasets: a training set and a test set.
The training set contains the features along with the flight prices. It has 10683 records, 10 input features and 1 output column, ‘Price’.
The test set contains 2671 records and the same 10 input features; the ‘Price’ column has to be predicted for this set. Because the predicted output is a continuous value, we use regression techniques.
The following columns are available in the dataset: Airline, Date_of_Journey, Source, Destination, Route, Dep_Time, Arrival_Time, Duration, Total_Stops, Additional_Info and Price (the target).
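Several of these columns are stored as text and need to be converted to numbers before a regression model can use them. The sketch below shows one possible way to do that for three of them. The value formats (`2h 50m` for Duration, `non-stop` / `1 stop` for Total_Stops, `DD/MM/YYYY` for Date_of_Journey) are assumptions about the raw data, and the function names and output keys are hypothetical, chosen for illustration only.

```python
import re
from datetime import datetime


def duration_to_minutes(duration: str) -> int:
    """Convert a string such as '2h 50m', '19h' or '45m' to total minutes."""
    hours = re.search(r"(\d+)\s*h", duration)
    minutes = re.search(r"(\d+)\s*m", duration)
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total


def stops_to_int(total_stops: str) -> int:
    """Map 'non-stop' to 0 and 'N stop(s)' to N (assumed formats)."""
    text = total_stops.strip().lower()
    if text == "non-stop":
        return 0
    return int(text.split()[0])


def journey_date_parts(date_of_journey: str) -> dict:
    """Split an assumed 'DD/MM/YYYY' date into day, month and weekday (Monday=0)."""
    parsed = datetime.strptime(date_of_journey, "%d/%m/%Y")
    return {
        "journey_day": parsed.day,
        "journey_month": parsed.month,
        "journey_weekday": parsed.weekday(),
    }


# A made-up row using the assumed formats.
row = {"Date_of_Journey": "24/03/2019", "Duration": "2h 50m", "Total_Stops": "non-stop"}
features = {
    "duration_minutes": duration_to_minutes(row["Duration"]),
    "stops": stops_to_int(row["Total_Stops"]),
    **journey_date_parts(row["Date_of_Journey"]),
}
print(features)
```

With pandas, the same helpers could be applied column-wise, for example through `Series.map`.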
- Exploratory Data Analysis
- Feature Engineering
- Feature Selection
- Model Deployment
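Between feature selection and deployment, a regression model has to be fit on the numeric features. The following is a minimal sketch of that step using scikit-learn. `LinearRegression` is only a stand-in here, since this README does not state which estimator the project uses, and the rows and prices are made up for illustration rather than taken from the dataset.

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Made-up rows for illustration only: [duration_minutes, stops]
X_train = [[170, 0], [445, 2], [1140, 2], [325, 1], [285, 1], [145, 0]]
# Made-up prices generated as 2000 + 5 * duration + 1500 * stops
y_train = [2000 + 5 * d + 1500 * s for d, s in X_train]

# Fit the stand-in regressor on the training rows.
model = LinearRegression()
model.fit(X_train, y_train)

# Predict the price of an unseen flight: 200 minutes, 1 stop.
predicted = model.predict([[200, 1]])
print(round(float(predicted[0]), 2))

# Error on the training rows (near zero here because the data is exactly linear).
train_mae = mean_absolute_error(y_train, model.predict(X_train))
print(train_mae)
```

On the real data, the error should be measured on rows held out from training, not on the training rows themselves.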
The code is written in Python 3.7. To install the required packages, run this command in the project directory after cloning the repository:
pip install -r requirements.txt
- Data: training and test datasets from Kaggle
- Language: Python
- Data analysis and visualization: Pandas, NumPy, Matplotlib, Seaborn
- AI/ML: Scikit-Learn (ML model)
- Web framework: Flask
- Hosting: Heroku
- Tracking and source control: GitHub
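As an illustration of how the Flask part of this stack could expose the model, here is a minimal sketch of a prediction endpoint. The `/predict` route, the JSON field names and the `predict_price` placeholder are all hypothetical; the project's actual routes and model loading may differ.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

REQUIRED_FIELDS = ("duration_minutes", "stops")  # hypothetical input fields


def predict_price(features: dict) -> float:
    """Placeholder for the trained model; returns a made-up linear estimate."""
    return 2000.0 + 5.0 * features["duration_minutes"] + 1500.0 * features["stops"]


@app.route("/predict", methods=["POST"])
def predict():
    # Parse the JSON body; fall back to an empty dict if it is missing or invalid.
    payload = request.get_json(silent=True) or {}
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400
    return jsonify({"price": predict_price(payload)})


# Exercise the endpoint in-process with Flask's test client (no server needed).
with app.test_client() as client:
    ok_response = client.post("/predict", json={"duration_minutes": 170, "stops": 0})
    bad_response = client.post("/predict", json={"stops": 0})
    print(ok_response.get_json(), bad_response.status_code)
```

In a real deployment the placeholder would be replaced by a call to the trained model loaded at start-up.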