Robust Malware Detection in Adversarial Environments: Analysis, Evaluation, and Defense Strategies

Project Description

The rapid evolution of malware, combined with increasingly sophisticated evasion techniques such as packing, obfuscation, and polymorphism, presents a significant challenge to conventional security mechanisms. As a result, machine learning (ML)-based malware detection systems are being widely adopted for their ability to generalize and to automate malware identification. However, these systems are themselves susceptible to adversarial attacks, and current solutions struggle to identify evasive or morphed malware reliably.

To address this critical issue, InfoLab at Sungkyunkwan University (SKKU) has led a comprehensive research project spanning three key investigations, each targeting a unique vulnerability in ML-based malware detection pipelines—from data representation and feature manipulation to evasion through software packing.

Core Research Contributions

1. Spectral Analysis of Control Flow Graphs for Malware Detection

We propose a novel approach to malware classification based on spectral representations of control flow graphs (CFGs). Using heat and wave kernels, the approach extracts size- and permutation-invariant graph signatures for malware detection.

Key Insight: Spectral signatures provide a scalable and effective alternative to byte-level feature extraction, especially in adversarial scenarios involving structural manipulation.
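
As a rough illustration of the idea, the sketch below computes a fixed-length spectral signature from a CFG adjacency matrix, using the heat trace and the real part of the wave trace of the normalized graph Laplacian. The function name spectral_signature, the time scales, the symmetrization of the CFG, and the trace-based formulation are all assumptions made for illustration; the project's actual pipeline may differ.

```python
import numpy as np

def spectral_signature(adj, times=(0.1, 1.0, 10.0)):
    """Heat- and wave-trace signature of a graph (illustrative sketch)."""
    a = np.asarray(adj, dtype=float)
    a = np.maximum(a, a.T)        # assumption: treat CFG edges as undirected
    np.fill_diagonal(a, 0.0)      # assumption: ignore self-loops
    deg = a.sum(axis=1)
    # D^{-1/2}, with 0 for isolated nodes to avoid division by zero
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.where(deg > 0, deg, 1.0)), 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(len(a)) - inv_sqrt[:, None] * a * inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(lap)
    # Heat trace (1/n) * sum exp(-t * lambda), one value per time scale
    heat = [np.exp(-t * eigvals).mean() for t in times]
    # Real part of the wave trace (1/n) * sum exp(-i * t * lambda)
    wave = [np.cos(t * eigvals).mean() for t in times]
    return np.array(heat + wave)
```

Because the signature depends only on the Laplacian eigenvalues, relabeling the basic blocks leaves it unchanged, and its length depends on the number of time scales rather than on the size of the graph.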


2. MLxPack: Investigating the Effects of Packers on ML-Based Malware Detection

This study examines how packing techniques—used to disguise malicious intent—affect ML classifier accuracy. Using a large dataset of 107,000 packed and unpacked samples, the research explores both static and dynamic features.

Key Insight: Detection systems must account for packing effects by incorporating diverse feature representations and multi-perspective analysis.
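
To make the effect of packing on static features more concrete, the sketch below computes byte-level Shannon entropy, a static feature that is commonly affected by packing because compressed or encrypted payloads tend to look close to random. Whether MLxPack uses this particular feature is not stated here; the function name byte_entropy and the choice of feature are assumptions for illustration only.

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A classifier trained only on features like this one may end up learning "packed versus unpacked" instead of "malicious versus benign", which is one reason to combine several feature views.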


3. Visualization-Based Malware Analysis Using Feature Fusion

Focusing on Android malware, this study introduces a feature fusion technique that combines handcrafted texture descriptors (GIST, LBP, GLCM) with deep CNN features from grayscale images of malware components (e.g., classes.dex, manifest files).

Key Insight: Visualization-based static analysis offers a powerful and resilient approach to detecting obfuscated and packed Android malware.
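
The following minimal sketch shows one way such a pipeline could look: raw bytes are reshaped into a grayscale image, a basic 8-neighbor LBP histogram is computed as the handcrafted descriptor, and the result is concatenated with a deep feature vector. The function names, the image width of 64, fusion by simple concatenation, and the placeholder CNN embedding are assumptions for illustration; GIST and GLCM are omitted, and the study's actual configuration may differ.

```python
import numpy as np

def bytes_to_grayscale(data: bytes, width: int = 64) -> np.ndarray:
    """Reshape raw bytes into a 2-D uint8 image, zero-padding the last row."""
    buf = np.frombuffer(data, dtype=np.uint8)
    rows = max(1, -(-len(buf) // width))  # ceiling division
    padded = np.zeros(rows * width, dtype=np.uint8)
    padded[:len(buf)] = buf
    return padded.reshape(rows, width)

def lbp_histogram(img: np.ndarray) -> np.ndarray:
    """Normalized 256-bin histogram of basic 8-neighbor LBP codes."""
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit where the neighbor is at least as bright as the center
        codes |= (neighbor >= center).astype(np.uint8) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / max(hist.sum(), 1.0)

def fuse(handcrafted: np.ndarray, deep: np.ndarray) -> np.ndarray:
    """Feature-level fusion by concatenation (an assumed fusion strategy)."""
    return np.concatenate([handcrafted.ravel(), deep.ravel()])
```

In use, the bytes of a file such as classes.dex would be passed to bytes_to_grayscale, and the deep argument of fuse would come from a CNN applied to the same image; here any fixed-length vector can stand in for that embedding.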


Project Objectives

Research Impact

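This project by InfoLab at SKKU presents a multidimensional approach to adversarial malware analysis, bridging the gap between ML robustness and real-world evasion tactics. Its key contributions are spectral CFG signatures for malware classification, the MLxPack study of how packers affect ML-based detection, and visualization-based feature fusion for Android malware analysis.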

Together, these efforts lay a strong foundation for secure, explainable, and adversarially resilient malware detection systems.