Malware mining

Igor Muttik, McAfee

Heuristic detection methods are well established in the AV industry. They usually end in a single decision: whether or not the analysed object is suspicious enough to be detected.

In contrast to such traditional heuristics, this paper explores the use of 'weak' heuristics based on data-mining technologies in malware analysis. By weak heuristics we mean the use of a large set of simple features which would not generally qualify as elements of traditional malware heuristic routines (examples include the number of sections in a PE file, the number of imports, the depth of the resource tree, file size, time stamps, etc.). In fact, some of them may be 'negative' heuristics. 'Weak' heuristics based on such general features may not be suitable for detection, but they are still very useful in AV operation.

We describe how data-mining techniques can be used to provide such 'weak' heuristics, and we compare the properties of decision trees/forests and support vector machines as applied to malware analysis.

We discuss the selection of the features extracted from a sample (or obtained by a behavioural product when a program runs), the construction of a primitive decision tree from these features, and the properties of the ROC (receiver operating characteristic) curve. The same decision tree can be operated at both its strong and its weak points on the ROC curve to drive different decisions in security products.
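The idea of using one tree at two operating points can be sketched as follows. The tree, its leaf scores, the feature names and the eight labelled samples are all invented for illustration and do not come from the paper; only the ROC computation itself is standard.

```python
def tree_score(f: dict) -> float:
    """Hand-written two-level tree; leaf values are made-up 'malicious' fractions."""
    if f["num_sections"] <= 2:
        return 0.9 if f["num_imports"] < 5 else 0.6
    return 0.3 if f["num_imports"] < 5 else 0.1

def roc_points(scores, labels):
    """Return [(threshold, fpr, tpr)] for the rule 'flag if score >= threshold'."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points

def strong_point(points, max_fpr):
    """Best detection rate among thresholds with an acceptable false-positive rate."""
    return max((p for p in points if p[1] <= max_fpr), key=lambda p: p[2])

def weak_point(points, min_tpr):
    """Fewest false positives among thresholds that still catch min_tpr of malware."""
    return min((p for p in points if p[2] >= min_tpr), key=lambda p: p[1])

# Hypothetical labelled samples: (num_sections, num_imports, is_malware)
samples = [(2, 1, 1), (1, 3, 1), (2, 20, 1), (2, 40, 0),
           (5, 2, 1), (4, 3, 0), (5, 50, 0), (6, 80, 0)]
scores = [tree_score({"num_sections": s, "num_imports": i}) for s, i, _ in samples]
labels = [y for _, _, y in samples]

points = roc_points(scores, labels)
print(strong_point(points, max_fpr=0.0))  # strict threshold, few false alarms
print(weak_point(points, min_tpr=1.0))    # permissive threshold, misses nothing here
```

The strict threshold is the kind of point that might drive a high-confidence action, while the permissive one tolerates false positives and is better suited to low-cost decisions such as queue ordering.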

We list possible uses of weak heuristics:

  • to prioritize and de-prioritize samples in research queues
  • to drive the depth of the sample analysis (emulation, behaviour monitoring) on the endpoint, gateway or a server
  • to check the most suspicious samples using cloud-based security
  • etc.
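A minimal sketch of the first three uses, assuming each sample already carries a weak-heuristic score between 0 and 1. The thresholds, the action names and the functions `analysis_depth` and `prioritize` are hypothetical choices made for this example.

```python
import heapq

def analysis_depth(score: float, low: float = 0.3, high: float = 0.9) -> str:
    """Map a weak-heuristic score to how much effort to spend on a sample."""
    if score >= high:
        return "cloud_lookup"   # most suspicious: worth a cloud query
    if score >= low:
        return "emulate"        # uncertain: run deeper local analysis
    return "static_only"        # probably uninteresting: cheapest path

def prioritize(samples):
    """Yield (name, score) pairs from most to least suspicious."""
    # heapq is a min-heap, so scores are negated to pop the largest first.
    heap = [(-score, name) for name, score in samples]
    heapq.heapify(heap)
    while heap:
        neg_score, name = heapq.heappop(heap)
        yield name, -neg_score

queue = [("a.exe", 0.3), ("b.exe", 0.9), ("c.exe", 0.1)]
for name, score in prioritize(queue):
    print(name, score, analysis_depth(score))
```

Because the score only reorders work and selects analysis depth, a wrong score costs time rather than producing a false detection.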
