A pragmatic Android malware detection procedure

Research output: Contribution to journal, Article, Scientific, peer-review

Research units

  • F-Secure Corp.
  • Arcada University of Applied Sciences

Abstract

The academic security research community has studied the Android malware detection problem extensively. Machine learning methods proposed in previous work typically report high detection performance on fixed datasets, and some also report reasonably fast prediction times. However, most of them are not suitable for real-world deployment because the requirements for malware detection go beyond these figures of merit. In this paper, we introduce several important requirements for deploying Android malware detection systems in the real world. One such requirement is that candidate approaches should be tested against a stream of continuously evolving data. Such streams represent the continuous flow of unknown file objects received for categorization, and they provide a more reliable and realistic estimate of detection performance once a system is deployed in a production environment. As a case study, we designed and implemented an ensemble approach for automatic Android malware detection that meets the real-world requirements we identified. The atomic Naive Bayes classifiers used as inputs to the Support Vector Machine ensemble are based on different APK feature categories, which provides fast prediction and, through diversification, additional reliability against attackers. Our case study with several malware families showed that different families are detected by different atomic classifiers. To the best of our knowledge, our work contains the first publicly available results generated against evolving data streams of nearly 1 million samples, with a model trained on a large set of 120,000 samples.
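
The ensemble idea described in the abstract can be illustrated with a minimal sketch: one Naive Bayes classifier per APK feature category, whose scores feed a linear Support Vector Machine. This is not the authors' implementation. The category names (`permissions`, `api_calls`), the toy samples, the class and function names, and the training hyperparameters are all assumptions made for illustration, and the SVM is a plain subgradient-descent variant trained on the same samples as the atomic classifiers.

```python
import math

class AtomicNaiveBayes:
    """Bernoulli Naive Bayes over one feature category (a set of tokens per APK)."""

    def fit(self, samples, labels):
        # Assumes both classes (0 = benign, 1 = malware) occur in the labels.
        self.vocab = sorted({tok for s in samples for tok in s})
        self.n = [labels.count(0), labels.count(1)]
        self.count = [{}, {}]
        for s, y in zip(samples, labels):
            for tok in s:
                self.count[y][tok] = self.count[y].get(tok, 0) + 1
        return self

    def log_odds(self, sample):
        # log P(malware | x) - log P(benign | x), with Laplace smoothing.
        score = math.log(self.n[1] / self.n[0])
        for tok in self.vocab:
            p1 = (self.count[1].get(tok, 0) + 1) / (self.n[1] + 2)
            p0 = (self.count[0].get(tok, 0) + 1) / (self.n[0] + 2)
            if tok in sample:
                score += math.log(p1 / p0)
            else:
                score += math.log((1 - p1) / (1 - p0))
        return score

def train_linear_svm(rows, labels, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM fitted by subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            t = 1 if y == 1 else -1
            margin = t * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * lam) for wi in w]  # regularization shrinkage
            if margin < 1:  # sample inside the margin: take a hinge-loss step
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
    return w, b

CATEGORIES = ["permissions", "api_calls"]  # hypothetical feature categories

def atomic_scores(models, apk):
    # One score per atomic classifier; these are the inputs to the SVM.
    return [models[c].log_odds(apk[c]) for c in CATEGORIES]

def predict(models, w, b, apk):
    s = atomic_scores(models, apk)
    return 1 if sum(wi * si for wi, si in zip(w, s)) + b > 0 else 0

# Toy training set (invented): label 1 = malware, 0 = benign.
train = [
    ({"permissions": {"SEND_SMS"}, "api_calls": {"sendTextMessage"}}, 1),
    ({"permissions": {"SEND_SMS", "INTERNET"},
      "api_calls": {"sendTextMessage", "openConnection"}}, 1),
    ({"permissions": {"SEND_SMS"}, "api_calls": {"sendTextMessage"}}, 1),
    ({"permissions": {"INTERNET"}, "api_calls": {"openConnection"}}, 0),
    ({"permissions": {"INTERNET", "CAMERA"},
      "api_calls": {"openConnection", "takePicture"}}, 0),
    ({"permissions": {"INTERNET"}, "api_calls": {"openConnection"}}, 0),
]
labels = [y for _, y in train]
models = {c: AtomicNaiveBayes().fit([apk[c] for apk, _ in train], labels)
          for c in CATEGORIES}
rows = [atomic_scores(models, apk) for apk, _ in train]
w, b = train_linear_svm(rows, labels)
```

Calling `predict(models, w, b, apk)` on a dictionary with one token set per category returns 1 for a malware verdict and 0 for benign. Because each atomic classifier sees only its own category, an attacker who disguises one category still has to evade the others, which is the diversification argument made above.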

Details

Original language: English
Pages (from-to): 689-701
Number of pages: 13
Journal: Computers and Security
Volume: 70
Publication status: Published - 1 Sep 2017
MoE publication type: A1 Journal article-refereed

Research areas

  • Android, Classification, Ensemble learning, Feature selection, Machine learning, Malware detection, Static analysis

ID: 15293288