Predicting Running Injuries with Classification Machine Learning Models




F-beta score, Random Forest Classifier, Logistic Regression, classification, running injury, Hyperparameter Tuning, imbalanced dataset


Can running injuries be predicted using only a dataset and machine learning models? This paper explores this question using two classification models: Logistic Regression and the Random Forest Classifier. Ten features from the dataset were used to predict running injuries. To mitigate the class imbalance in the dataset, modified versions of these models were used: Weighted Logistic Regression, and Random Forest Classifiers trained on over-sampled and down-sampled data. The results suggest that Weighted Logistic Regression was the best-performing model and that the F-beta score was the most informative evaluation metric.
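The imbalance-handling ideas named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code. The toy labels, the choice of beta = 2, the "balanced" weighting formula, and the helper names `fbeta`, `balanced_class_weights`, and `downsample_majority` are all assumptions made for illustration; the paper's actual settings are not given here.

```python
import random
from collections import Counter


def fbeta(y_true, y_pred, beta=2.0):
    """F-beta for the positive class (label 1); beta > 1 favours recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return (1 + beta**2) * tp / denom if denom else 0.0


def balanced_class_weights(y):
    """Assumed weighting scheme: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    return {label: len(y) / (len(counts) * n) for label, n in counts.items()}


def downsample_majority(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(labels) if label == 1]
    neg = [i for i, label in enumerate(labels) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy labels: 2 injury events among 20 training days.
y_true = [1, 1] + [0] * 18
y_pred = [1, 0, 1] + [0] * 17  # one hit, one miss, one false alarm

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f2 = fbeta(y_true, y_pred, beta=2.0)
weights = balanced_class_weights(y_true)
rows_ds, y_ds = downsample_majority(list(range(20)), y_true)

print(f"accuracy={accuracy:.2f}  F2={f2:.2f}  weights={weights}")
print(f"down-sampled class counts: {Counter(y_ds)}")
```

On these toy labels, accuracy looks high even though half of the injuries are missed, which is one reason a recall-oriented metric such as F-beta can be preferable on imbalanced data. In practice, scikit-learn provides comparable functionality through `sklearn.metrics.fbeta_score` and the `class_weight="balanced"` option of `LogisticRegression`.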






How to Cite

Vuong, E., & Vincent, J. (2023). Predicting Running Injuries with Classification Machine Learning Models. Journal of Student Research, 12(1).



HS Research Projects