Predicting Running Injuries with Classification Machine Learning Models




F-beta score, Random Forest Classifier, Logistic Regression, classification, running injury, hyperparameter tuning, imbalanced dataset


Can running injuries be predicted from a dataset alone using machine learning models? This paper explores this question using classification models, including Logistic Regression and the Random Forest Classifier. The dataset used provides ten features for predicting running injuries. Because the dataset is imbalanced, slightly modified versions of these models were used to mitigate the imbalance: a Weighted Logistic Regression model and Random Forest Classifier models trained with over-sampling and down-sampling. The results suggest that Weighted Logistic Regression was the best model and that the F-beta score was the best metric for evaluating the models.
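To make the F-beta score concrete, the following is a minimal sketch of how it can be computed from confusion-matrix counts for the positive (injury) class. It uses the standard definition, F-beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP). The function name `fbeta_from_counts`, the default beta = 2, and the counts are illustrative assumptions and are not taken from the paper's experiments.

```python
def fbeta_from_counts(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Return the F-beta score for the positive class from raw counts.

    tp, fp, fn are true positives, false positives, and false negatives.
    beta > 1 weights recall more heavily than precision; beta = 1 gives F1.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # If tp, fp, and fn are all zero the score is undefined;
    # 0.0 is returned here as a convention.
    return 0.0 if denom == 0 else (1 + b2) * tp / denom


# Hypothetical counts for illustration only (not results from the paper):
# a model that catches most injuries but raises many false alarms.
tp, fp, fn = 8, 30, 2
f1 = fbeta_from_counts(tp, fp, fn, beta=1.0)
f2 = fbeta_from_counts(tp, fp, fn, beta=2.0)
print(f"F1 = {f1:.3f}, F2 = {f2:.3f}")
```

With beta greater than 1, a missed injury (false negative) lowers the score more than a false alarm does, which is one reason an F-beta score can be more informative than accuracy on an imbalanced dataset.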


