Using Different Machine Learning Algorithms to Predict the Prices of Flight Tickets

Authors

J. R. Bollack, J. A. Vincent

DOI:

https://doi.org/10.47611/jsrhs.v12i4.5303

Keywords:

Machine Learning, Artificial Intelligence, Regression, Airlines, Dynamic Pricing, Flight Tickets, Inflation, Linear Regression, Ridge Regression, DecisionTree, ML Models

Abstract

The rising prices of flight tickets and the lack of transparency in airlines' dynamic pricing strategies have led many consumers to wonder what factors actually determine these prices. To investigate this question, a large dataset of flight ticket bookings that includes the most price-defining variables was acquired. The data was preprocessed using discretization, normalization, and principal component analysis, and then used to train five Machine Learning algorithms: Linear Regression, DecisionTree, Ridge Regression, RandomForest, and SVR. The RandomForest and SVR models could not be trained because of runtime errors; the other three models trained as expected. All three trained models performed well, with Linear Regression and Ridge Regression performing identically. Overall, the DecisionTree model performed best at predicting flight prices, and its performance could be increased further by adjusting hyperparameters. The investigation could be continued by using a larger dataset to examine how the model performs with more variables and under broader conditions. Additionally, the model could be repurposed into a user-friendly flight price prediction tool that helps consumers with their purchasing decisions.
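The workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' code: the data is synthetic, the feature names (days_left, duration, distance, stops, business) are hypothetical, and the bin edges, number of principal components, and hyperparameters are assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the booking data (assumption: NOT the paper's dataset).
rng = np.random.default_rng(0)
n = 2000
days_left = rng.integers(1, 50, size=n)                   # hypothetical feature
duration = rng.uniform(1.0, 12.0, size=n)                 # hypothetical feature (hours)
distance = 800.0 * duration + rng.normal(0, 200, size=n)  # correlated with duration
stops = rng.integers(0, 3, size=n)                        # hypothetical feature
business = rng.integers(0, 2, size=n)                     # hypothetical feature (0/1)
price = (3000 + 20000 * business + 800 * stops + 150 * duration
         - 40 * days_left + rng.normal(0, 500, size=n))

# Discretization: bucket the continuous duration into four bins (assumed edges).
duration_bin = np.digitize(duration, bins=[3.0, 6.0, 9.0])

X = np.column_stack([days_left, duration_bin, distance, stops, business]).astype(float)
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

# Three of the five algorithms named in the abstract (the ones that trained).
models = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "DecisionTree": DecisionTreeRegressor(max_depth=8, random_state=0),
}

scores = {}
for name, model in models.items():
    # Normalization, then PCA (5 features reduced to 4), then the regressor.
    pipe = make_pipeline(StandardScaler(), PCA(n_components=4), model)
    pipe.fit(X_train, y_train)
    scores[name] = r2_score(y_test, pipe.predict(X_test))
    print(f"{name}: R^2 = {scores[name]:.4f}")
```

On this synthetic data, Linear Regression and Ridge Regression score almost the same, because a penalty of alpha=1.0 is small relative to 1,600 standardized training rows. Whether the same effect explains the identical results reported in the abstract is not established here. The ranking of the models on this toy data also says nothing about their ranking on the real dataset.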

Published

11-30-2023

How to Cite

Bollack, J. R., & Vincent, J. A. (2023). Using Different Machine Learning Algorithms to Predict the Prices of Flight Tickets. Journal of Student Research, 12(4). https://doi.org/10.47611/jsrhs.v12i4.5303

Issue

Volume 12, Issue 4 (2023)

Section

HS Research Projects