The Impact of X (Formerly Twitter) Sentiment on Stock Returns Using Machine Learning Models

Authors

DOI:

https://doi.org/10.47611/jsrhs.v13i3.7416

Keywords:

Stock Returns, Social Media Sentiment, Random Forest, K Nearest Neighbors, Ridge Regression, Natural Language Processing, Market Prediction, Big Data

Abstract

The financial world is influenced by numerous factors, including social media. Posts on platforms like X (formerly Twitter) may reflect investors’ sentiment and therefore influence stock market movements. In January 2021, Reddit users heavily impacted GameStop stock, as well as the overall stock market, with only a few posts (New York Times, 2021). There are many similar instances in which online statements from individuals or groups changed the direction of a stock (CNBC, 2023). In this study, we analyzed social media data to determine whether different artificial intelligence models can predict the direction of future stock movements. We used Random Forest, K Nearest Neighbors, and Ridge models to analyze 367,666 tweets from X and predict each stock’s direction of change over horizons ranging from 1 day to 1 week. Model performance was assessed with a common metric, the F1 score, which ranges from 0 to 1, where 0 indicates poor performance and 1 indicates perfect performance. The evaluated machine learning models predicted the direction with F1 scores of around 0.8, peaking at 0.9, indicating that tweets and social media posts can be used as a tool to guide financial investment. Further studies can investigate the longevity of the impact of specific tweets and incorporate more tweet-related features such as the counts of retweets, followers, and likes.
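To make the F1 metric mentioned in the abstract concrete, the following is a minimal sketch of how an F1 score could be computed for a binary up/down prediction task. The labels are invented for illustration (1 = price rose, 0 = price fell), the function name `f1_score` is our own, and treating "rose" as the positive class is an assumption; the abstract does not state how the study defined its classes or computed the metric.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return F1 = 2 * precision * recall / (precision + recall)
    for the chosen positive class, or 0.0 if there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # share of predicted "up" days that were up
    recall = tp / (tp + fn)     # share of actual "up" days that were caught
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, not data from the study: 1 = rose, 0 = fell
actual = [1, 1, 0, 1, 0, 1, 0, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
score = f1_score(actual, predicted)
print(f"F1 = {score:.2f}")
```

In this toy example there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.75 and the F1 score is 0.75.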

Author Biography

Paris Zhang, Inspirit AI

Senior Data Scientist at TikTok

References or Bibliography

ChatGPT can forecast stock price movement with better accuracy than humans: Study. Business Today. (2023, April 27). https://www.businesstoday.in/technology/news/story/chatgpt-can-forecast-stock-price-movement-with-better-accuracy-than-humans-study-379071-2023-04-27

Franchin, W. (2023, July 10). ChatGPT enters trading in the financial markets. Medium. https://medium.com/the-investors-handbook/revolutionizing-investment-strategies-chatgpts-ai-takes-on-the-financial-markets-53ef04a9b2c0

Sowinska, K., & Madhyastha, P. (2021). A Tweet-based Dataset for Company-Level Stock Return Prediction [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4662780

Koehrsen, W. (2020, August 18). Random Forest Simple Explanation. Medium. https://williamkoehrsen.medium.com/random-forest-simple-explanation-377895a60d2d

Javed Awan, M., Shafry Mohd Rahim, M., Nobanee, H., Munawar, A., Yasin, A., & Mohd Zain Azlanmz, A. (2021). Social Media and Stock Market Prediction: A big data approach. Computers, Materials & Continua, 67(2), 2569–2583. https://doi.org/10.32604/cmc.2021.014253

Jcunningham. (2023, February 22). The Increasing Influence of Social Media on the Stock Market. Journal of High Technology Law. https://sites.suffolk.edu/jhtl/2023/02/22/the-increasing-influence-of-social-media-on-the-stock-market/

Lopez-Lira, A., & Tang, Y. (2023). Can ChatGPT forecast stock price movements? Return predictability and large language models. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4412788

Mullainathan, S., & Thaler, R. (2000). Behavioral Economics. https://doi.org/10.3386/w7948

Nuñez-Mora, J. A., & Mendoza-Urdiales, R. A. (2023). Social sentiment and impact in US equity market: An automated approach. Social Network Analysis and Mining, 13(1). https://doi.org/10.1007/s13278-023-01116-6

Phillips, M., & Lorenz, T. (2021, January 27). “Dumb money” is on GameStop, and it’s beating Wall Street at its own game. The New York Times. https://www.nytimes.com/2021/01/27/business/gamestop-wall-street-bets.html

R, D. A. (2019, September 24). KNN visualization in just 13 lines of code. Medium. https://towardsdatascience.com/knn-visualization-in-just-13-lines-of-code-32820d72c6b6

Rogoswami. (2023, April 4). Dogecoin jumps more than 30% after Musk changes Twitter logo to image of Shiba Inu. CNBC. https://www.cnbc.com/2023/04/03/dogecoin-jumps-over-30percent-after-twitter-changes-logo-to-doges-symbol.html

Sahayak, V., Shete, V., & Pathan, A. (2015). Sentiment Analysis on Twitter Data. International Journal of Innovative Research in Advanced Engineering, 2(1).

Tellez, E. S., Miranda-Jiménez, S., Graff, M., Moctezuma, D., Suárez, R. R., & Siordia, O. S. (2017). A simple approach to multilingual polarity classification in Twitter. Pattern Recognition Letters, 94, 68–74. https://doi.org/10.1016/j.patrec.2017.05.024

Twitter. (n.d.). The X rules: Safety, privacy, authenticity, and more. Twitter. https://help.twitter.com/en/rules-and-policies/x-rules

What is ridge regression? IBM. (2024, April 10). https://www.ibm.com/topics/ridge-regression

What the finance industry tells us about the future of AI. Harvard Business Review. (2023, August 11). https://hbr.org/2023/08/what-the-finance-industry-tells-us-about-the-future-of-ai

Published

08-31-2024

How to Cite

Zhang, A., & Zhang, Y. (2024). The Impact of X (Formerly Twitter) Sentiment on Stock Returns Using Machine Learning Models. Journal of Student Research, 13(3). https://doi.org/10.47611/jsrhs.v13i3.7416

Issue

Section

HS Research Projects