Malware Detection Using Byte Stream Using Different File Formats


  • V Himavanth B
  • Mahammad Shafi A
  • Brinda D


Malware detection, byte stream, non-executables, deep learning, convolutional neural networks, Hangul word processor, portable document format.


Malware detection is becoming an increasingly important task as the volume of data on the Internet grows. Web users are vulnerable to non-executable files such as Word files and Hangul Word Processor files because they usually open such files without much caution. As new infected non-executables keep appearing, deep-learning models are drawing attention because they are known to be effective and to generalize well. In particular, deep-learning models have been used to learn arbitrary patterns from byte streams, and they have performed well on the malware detection task. Although there have been malware detection studies using deep-learning models, they commonly targeted a single file format and did not consider combining different formats. In this paper, we assume that samples of different file formats can complement each other, giving deep-learning models a better chance to learn useful patterns and improve detection performance. We support this assumption with experimental results on our annotated datasets of two different file formats: Portable Document Format (PDF) and Hangul Word Processor (HWP).
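To make the byte-stream idea concrete, the following is a minimal sketch of how files of different formats might be encoded into one shared input representation before being passed to a convolutional model. It is not the authors' implementation: the function names `bytes_to_sequence` and `build_mixed_dataset`, the padding token `PAD`, the sequence length, and the toy labels are all assumptions made for illustration. The sample inputs are just a PDF header and the OLE compound-file signature that HWP 5.x documents typically start with.

```python
from typing import List, Tuple

# Hypothetical padding token, chosen outside the 0-255 range of real byte values.
PAD = 256


def bytes_to_sequence(data: bytes, max_len: int = 16) -> List[int]:
    """Truncate or pad a raw byte stream to a fixed-length list of integer tokens."""
    tokens = list(data[:max_len])
    tokens.extend([PAD] * (max_len - len(tokens)))
    return tokens


def build_mixed_dataset(
    samples: List[Tuple[bytes, int]], max_len: int = 16
) -> Tuple[List[List[int]], List[int]]:
    """Encode (raw_bytes, label) pairs from any file format into one training set."""
    features = [bytes_to_sequence(raw, max_len) for raw, _ in samples]
    labels = [label for _, label in samples]
    return features, labels


# Toy inputs with made-up labels (0 = benign, 1 = malicious), for illustration only.
pdf_sample = (b"%PDF-1.7\n", 0)
hwp_sample = (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", 1)

X, y = build_mixed_dataset([pdf_sample, hwp_sample], max_len=12)
```

Because the encoding only looks at raw bytes, the same function handles both formats, which is what allows a single model to be trained on a mixed PDF and HWP dataset. In practice the token sequences would typically feed an embedding layer followed by convolutional layers, and `max_len` would be far larger than in this toy example.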






How to Cite

V Himavanth B, Mahammad Shafi A, & Brinda D. (2023). Malware Detection Using Byte Stream Using Different File Formats. International Journal of Progressive Research in Science and Engineering, 4(5), 123–130. Retrieved from