Predicting and identifying the most important factors on player performance in the National Basketball Association using Machine Learning.

Authors

  • Raghav Singh
  • Michael Stanley

DOI:

https://doi.org/10.47611/jsrhs.v12i3.4715

Keywords:

Machine Learning, Sports Analytics, Data Science

Abstract

The field of data analytics in basketball has been on a meteoric rise in recent years. Improved optical tracking technology has allowed data collection on an unprecedented scale. As a result, a plethora of work on how player performance is affected by various in-game factors has been published. The influx of data has also allowed teams to get a better handle on how they can mitigate certain factors to improve a player's performance. However, no one has been able to capture how extrinsic factors, such as weather, precipitation, or attendance, affect player performance, or to rank which factors have the most impact. In this paper, I describe the process of creating a novel dataset that consists of game attendance, opponent information, weather information, date information, and player form. With these data points, I also identify which group of factors has the most impact on player performance. Finally, I discuss the process of creating a supervised model, which has the potential to predict how well a player will perform in a game based on the external factors for that game.
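
As a rough illustration of what ranking groups of factors by their impact on a supervised model can look like, the sketch below trains a small k-nearest-neighbors regressor and measures how much the mean absolute error (MAE) grows when one group of inputs is shuffled. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic, the group names and column layout are hypothetical, and both the choice of k-nearest neighbors and the shuffle-based importance measure are stand-ins, since the abstract does not say which model or ranking method was used.

```python
import random

# Hypothetical factor groups mapped to column indices in each feature row.
GROUPS = {"player form": [0], "opponent": [1], "weather": [2], "attendance": [3]}


def make_synthetic_games(n, rng):
    """Build synthetic rows; by construction only form and opponent drive the score."""
    rows, scores = [], []
    for _ in range(n):
        x = [rng.random() for _ in range(4)]
        y = 3.0 * x[0] + 1.0 * x[1] + rng.gauss(0.0, 0.1)
        rows.append(x)
        scores.append(y)
    return rows, scores


def knn_predict(train_x, train_y, x, k=5):
    """Average the targets of the k nearest training rows (squared Euclidean distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_x, train_y)
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_absolute_error(train_x, train_y, test_x, test_y):
    """MAE of the k-NN predictions on the test rows."""
    errors = [abs(knn_predict(train_x, train_y, x) - y) for x, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)


def group_importance(train_x, train_y, test_x, test_y, columns, rng):
    """Increase in MAE when the group's columns are shuffled across test rows."""
    baseline = mean_absolute_error(train_x, train_y, test_x, test_y)
    order = list(range(len(test_x)))
    rng.shuffle(order)
    shuffled = [list(row) for row in test_x]
    for i, j in enumerate(order):
        for c in columns:
            shuffled[i][c] = test_x[j][c]
    return mean_absolute_error(train_x, train_y, shuffled, test_y) - baseline


rng = random.Random(0)  # fixed seed so the sketch is repeatable
train_x, train_y = make_synthetic_games(200, rng)
test_x, test_y = make_synthetic_games(60, rng)

# Rank groups from most to least harmful to shuffle.
ranking = sorted(
    ((name, group_importance(train_x, train_y, test_x, test_y, cols, rng))
     for name, cols in GROUPS.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, increase in ranking:
    print(f"{name:12s} MAE increase when shuffled: {increase:+.3f}")
```

Because the synthetic score depends most strongly on the form column, that group should come out on top here; with real game data the ordering would depend entirely on the dataset and the model.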

Published

08-31-2023

How to Cite

Singh, R., & Stanley, M. (2023). Predicting and identifying the most important factors on player performance in the National Basketball Association using Machine Learning. Journal of Student Research, 12(3). https://doi.org/10.47611/jsrhs.v12i3.4715

Issue

Vol. 12 No. 3 (2023)

Section

HS Research Projects