Analyzing the Performance of TabTransformer in Brain Stroke Prediction

Authors

  • Hao Ming Xia, University Hill Secondary
  • Ramin Ramezani, University of California, Los Angeles

DOI:

https://doi.org/10.47611/jsrhs.v12i1.3935

Keywords:

Tabular Data Analysis, Machine Learning, Transformer Models, TabTransformer, Electronic Health Records, Brain Stroke Prediction

Abstract

The adoption of electronic patient health records has paved the way for machine learning and
deep learning in disease diagnostics and prediction. Although tree-based algorithms have
traditionally performed well on structured data, neural networks are known to perform well on
unstructured data and data with a large number of input features. Furthermore, transformer-based
models such as TabTransformer have been shown to perform competitively with tree-based
algorithms (Huang et al. 2020). In this paper, we compare TabTransformer’s performance with
other state-of-the-art machine learning algorithms such as XGBoost, RandomForest, DecisionTree, and
a feed-forward Multilayer Perceptron (MLP). We found that TabTransformer shows no significant
improvement over MLP and performs worse in certain metrics. Neither TabTransformer nor MLP
performed better than XGBoost, the best-performing algorithm for brain stroke prediction in
Kaggle competitions.

References or Bibliography

Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM. https://doi.org/10.1145/2939672.2939785.

Dev, Soumyabrata, Hewei Wang, Chidozie Shamrock Nwosu, Nishtha Jain, Bharadwaj Veeravalli, and Deepu John. 2022. “A Predictive Analytics Approach for Stroke Prediction Using Machine Learning and Neural Networks.” arXiv. https://doi.org/10.48550/ARXIV.2203.00497.

Huang, Xin, Ashish Khetan, Milan Cvitkovic, and Zohar Karnin. 2020. “TabTransformer: Tabular Data Modeling Using Contextual Embeddings.” arXiv. https://doi.org/10.48550/ARXIV.2012.06678.

Nwosu, Chidozie Shamrock, Soumyabrata Dev, Peru Bhardwaj, Bharadwaj Veeravalli, and Deepu John. 2019. “Predicting Stroke from Electronic Health Records.” arXiv. https://doi.org/10.48550/ARXIV.1904.11280.

Stekhoven, D. J., and P. Buhlmann. 2011. “MissForest–Non-Parametric Missing Value Imputation for Mixed-Type Data.” Bioinformatics 28 (1): 112–18. https://doi.org/10.1093/bioinformatics/btr597.

Published

02-28-2023

How to Cite

Xia, H. M., & Ramezani, R. (2023). Analyzing the Performance of TabTransformer in Brain Stroke Prediction. Journal of Student Research, 12(1). https://doi.org/10.47611/jsrhs.v12i1.3935

Section

HS Research Projects