Preprint / Version 1

Comparing the Effectiveness of Support Vector Classifier and Stochastic Gradient Descent in Hate-Speech Detection

Authors

  • Dania Ali

Keywords:

Support Vector Classifier, Stochastic Gradient Descent, Hate-Speech Detection

Abstract

The growing use of social media, now easily accessible to most people in the world, has given rise to a multitude of problems, with cyberbullying and online hate-speech standing out as significant issues. Because users can remain anonymous and post things that would be considered uncivil in a one-to-one, real-life conversation, online hate-speech has spread widely, posing significant societal challenges and having detrimental effects on individuals’ mental health. In this paper, we explore two simple classifiers, the Support Vector Classifier (SVC) and Stochastic Gradient Descent (SGD), and compare them by their accuracy scores to determine their effectiveness in detecting hate-speech in Twitter data. The models are trained on a publicly available dataset by Analytics Vidhya, which can be found on Kaggle.com and contains 32k tweets, each labelled ‘1’ if it is sexist/racist or ‘0’ if it is not. The goal of this paper is to identify the differences in hate-speech detection performance between the two classifiers.
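The abstract does not state which library, features, or hyperparameters were used, so the following is only a minimal, standard-library sketch of the kind of comparison described. It assumes a linear model with hinge loss (the loss behind a linear support vector classifier) and trains it two ways: with one update per tweet (stochastic gradient descent) and with one update per full pass over the data (a batch stand-in, not the solver of any particular SVC implementation). The six tweets, the bag-of-words features, and all function names and parameter values are hypothetical, and accuracy is measured on the training tweets only to keep the example short.

```python
import random

# Hypothetical toy data standing in for the labelled tweets (1 = hateful, 0 = not).
TWEETS = [
    ("i hate those people they are awful", 1),
    ("those people are awful and should leave", 1),
    ("awful people i hate them all", 1),
    ("lovely day at the park with friends", 0),
    ("great game last night with friends", 0),
    ("enjoying a lovely coffee this morning", 0),
]


def featurize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    counts = [0.0] * len(vocab)
    for token in text.split():
        if token in vocab:
            counts[vocab[token]] += 1.0
    return counts


def hinge_subgradient(w, b, x, y, lam):
    """Subgradient of lam/2*||w||^2 + max(0, 1 - y*(w.x + b)) for one example."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    gw = [lam * wi for wi in w]
    gb = 0.0
    if margin < 1.0:  # example is inside the margin or misclassified
        gw = [g - y * xi for g, xi in zip(gw, x)]
        gb = -y
    return gw, gb


def train_stochastic(X, y, epochs=50, lr=0.1, lam=0.01, seed=0):
    """One weight update per example, visiting examples in shuffled order."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            gw, gb = hinge_subgradient(w, b, X[i], y[i], lam)
            w = [wi - lr * g for wi, g in zip(w, gw)]
            b -= lr * gb
    return w, b


def train_batch(X, y, epochs=50, lr=0.1, lam=0.01):
    """One weight update per pass, using the average subgradient over all examples."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grads = [hinge_subgradient(w, b, xi, yi, lam) for xi, yi in zip(X, y)]
        gw = [sum(g[0][j] for g in grads) / n for j in range(len(w))]
        gb = sum(g[1] for g in grads) / n
        w = [wi - lr * g for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b


def predict(w, b, x):
    """Return label 1 if the example falls on the positive side of the hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


tokens = sorted({tok for text, _ in TWEETS for tok in text.split()})
vocab = {tok: i for i, tok in enumerate(tokens)}
X = [featurize(text, vocab) for text, _ in TWEETS]
labels = [label for _, label in TWEETS]
signed = [1.0 if label == 1 else -1.0 for label in labels]  # hinge loss uses +1/-1

models = {
    "stochastic": train_stochastic(X, signed),
    "batch": train_batch(X, signed),
}
for name, (w, b) in models.items():
    preds = [predict(w, b, x) for x in X]
    print(name, "training accuracy:", accuracy(labels, preds))
```

On a real dataset the tweets would be split into training and held-out portions, and accuracy would be reported on the held-out portion for each classifier.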

Posted

10-25-2023