Using A Stacked-LSTM Architecture to Analyze API Call Sequences from Dynamic Malware Sandbox

Authors

  • Advay Balakrishnan, Valley Christian High School
  • Guillermo Goldsztein, Georgia Tech University

DOI:

https://doi.org/10.47611/jsrhs.v11i3.3708

Keywords:

Malware analysis, AI, Recurrent neural networks, LSTM

Abstract

This paper explores the temporal hierarchy of API call sequences using deep recurrent neural networks (RNNs). Because of their sequential structure, RNNs readily capture the nature of time-series processing, whereas generic (non-recurrent) neural networks cannot find an underlying relationship between data points that are separated by timesteps. The primary goal of the paper is to describe an approach to analyzing malware attacks on networks that accounts for each timestep of data and, by using the features of the Long Short-Term Memory (LSTM) architecture, gains more flexibility in the size of the detection request. We also touch on the more recent Gated Recurrent Unit (GRU) architecture, discuss the network configurations we tested, and identify the best-performing model. Lastly, the paper suggests some applications of this model and how it could be integrated into standard operating systems.
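
As a rough illustration of the stacked-LSTM idea described in the abstract, the sketch below runs an untrained forward pass in plain Python: a sequence of integer API call IDs is embedded, passed through two LSTM layers where the second layer consumes the per-timestep outputs of the first, and reduced to a single score between 0 and 1. This is not the authors' model. The class names, vocabulary size, layer sizes, and random weights are all assumptions chosen for illustration, and a real implementation would use a deep learning framework and trained weights.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_matrix(rows, cols, rng):
    # Small random weights; a trained model would learn these instead.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

class LSTMLayer:
    """One LSTM layer (hypothetical helper, untrained weights)."""
    GATES = ("i", "f", "o", "g")  # input, forget, output, candidate

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # Each gate sees the concatenation [x_t, h_{t-1}].
        self.w = {k: make_matrix(hidden_size, input_size + hidden_size, rng)
                  for k in self.GATES}
        self.b = {k: [0.0] * hidden_size for k in self.GATES}

    def forward(self, inputs):
        h = [0.0] * self.hidden_size  # hidden state
        c = [0.0] * self.hidden_size  # cell state
        outputs = []
        for x in inputs:  # one step per API call, in order
            z = x + h
            pre = {k: [a + b for a, b in zip(matvec(self.w[k], z), self.b[k])]
                   for k in self.GATES}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            g = [math.tanh(v) for v in pre["g"]]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            outputs.append(h)
        return outputs  # one hidden vector per timestep

class StackedLSTMClassifier:
    """Embedding -> stacked LSTM layers -> sigmoid score (illustrative only)."""

    def __init__(self, vocab_size, embed_size, hidden_sizes, seed=0):
        rng = random.Random(seed)
        self.embedding = make_matrix(vocab_size, embed_size, rng)
        sizes = [embed_size] + list(hidden_sizes)
        self.layers = [LSTMLayer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
        self.out_w = [rng.uniform(-0.1, 0.1) for _ in range(sizes[-1])]
        self.out_b = 0.0

    def predict(self, api_call_ids):
        seq = [self.embedding[idx] for idx in api_call_ids]
        for layer in self.layers:  # each layer feeds the next: the "stack"
            seq = layer.forward(seq)
        last = seq[-1]  # final timestep of the top layer
        logit = sum(w * h for w, h in zip(self.out_w, last)) + self.out_b
        return sigmoid(logit)

# Assumed sizes: 300 distinct API calls, 8-dim embedding, two LSTM layers.
model = StackedLSTMClassifier(vocab_size=300, embed_size=8, hidden_sizes=[16, 8])
short_score = model.predict([12, 45, 3])
long_score = model.predict([12, 45, 3, 200, 7, 7, 150, 299])
print(short_score, long_score)
```

The same model accepts both the three-call and the eight-call sequence without padding, which is one way to read the abstract's point about flexibility in the size of the detection request. Because the weights are untrained, the printed scores carry no meaning about whether a sequence is malicious.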

Published

08-31-2022

How to Cite

Balakrishnan, A., & Goldsztein, G. (2022). Using A Stacked-LSTM Architecture to Analyze API Call Sequences from Dynamic Malware Sandbox. Journal of Student Research, 11(3). https://doi.org/10.47611/jsrhs.v11i3.3708

Section

HS Research Projects