Structural Classification Of Malware presented at BlueHat 2007

by Thomas (Halvar Flake) Dullien (SABRE Security)

Tags: Security

Summary: Malware authors are changing. In the past, their motivation was fame; nowadays it is mostly money. With this change of focus, development practices on the side of the malware authors are changing, too: hand-crafted polymorphic assembly code is out, cheap-to-maintain-and-develop C/C++ code is in. Simple 'offline polymorphism' (for example, a clever recompile with small changes) and targeted attacks allow the evasion of traditional AV signatures without giving up on massive code reuse. To deal automatically with the (almost boringly) growing flood of malware, several classification methods have been proposed, ranging from looking at instructions, n-grams, n-perms, and other "features" to generate high-dimensional vectors, to behavioral techniques. These techniques suffer from high "brittleness": they can be easily circumvented without requiring significant skill or time on the side of the malware author.
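To make the brittleness point concrete, here is a minimal Python sketch of an n-gram comparison over instruction mnemonics. The mnemonic sequences, the function names, and the choice of multiset Jaccard similarity are all assumptions made for illustration; they are not taken from the talk or from any particular classifier.

```python
from collections import Counter

def ngrams(seq, n=3):
    """Return a multiset of all length-n windows over an instruction sequence."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))

def jaccard(a, b):
    """Multiset Jaccard similarity between two Counters."""
    inter = sum((a & b).values())
    union = sum((a | b).values())
    return inter / union if union else 1.0

# Hypothetical mnemonic stream of a small routine (illustrative only)
original = ["push", "mov", "sub", "call", "test", "jz", "mov", "call", "ret"]
# Same logic with a do-nothing instruction inserted after every real one
padded = [op for real in original for op in (real, "nop")]

similarity = jaccard(ngrams(original), ngrams(padded))
```

In this toy case every 3-gram of the padded variant contains a `nop`, so the two feature sets share nothing and the similarity is zero, even though the behavior is unchanged.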
This talk will discuss using structural (for example, callgraph- and flowgraph-based) metrics for the automated classification of malware into families. The advantage of the discussed approach is its relative "suppleness": it remains resistant even to drastic measures such as "recompiling a virus for a different architecture". A significant investment of work is needed on the malware authors' side to break the analysis. We will discuss an example implementation of a fully automated malware classification system (VxClass) which automatically unpacks, disassembles, and compares new malware against an existing database. A number of horribly incorrect predictions about the future will be given.
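As a rough illustration of what a structural metric could look like, the Python sketch below summarizes each function by the shape of its flowgraph and compares two programs by how many of those summaries pair up. This is not a description of how VxClass works. The fingerprint (basic-block count, edge count, call count), the similarity score, and the sample data are all assumptions chosen for brevity, and the "recompiled" variant is idealized so that the graph shapes survive unchanged.

```python
from collections import Counter

def fingerprint(func):
    """Summarize a function by flowgraph shape only: (blocks, edges, calls)."""
    return (func["blocks"], len(func["edges"]), len(func["calls"]))

def structural_similarity(prog_a, prog_b):
    """Fraction of functions whose fingerprints can be paired up across two programs."""
    fa = Counter(fingerprint(f) for f in prog_a)
    fb = Counter(fingerprint(f) for f in prog_b)
    matched = sum((fa & fb).values())
    return matched / max(len(prog_a), len(prog_b), 1)

# Hypothetical sample: three functions with flowgraph edges and call targets
sample = [
    {"blocks": 4, "edges": [(0, 1), (0, 2), (1, 3), (2, 3)], "calls": ["decrypt"]},
    {"blocks": 1, "edges": [], "calls": []},
    {"blocks": 3, "edges": [(0, 1), (1, 2), (2, 1)], "calls": ["send", "recv"]},
]
# Idealized variant: functions reordered, call targets renamed, graph shapes kept
variant = [
    {"blocks": 3, "edges": [(0, 1), (1, 2), (2, 1)], "calls": ["sub_401000", "sub_401080"]},
    {"blocks": 4, "edges": [(0, 1), (0, 2), (1, 3), (2, 3)], "calls": ["sub_4010f0"]},
    {"blocks": 1, "edges": [], "calls": []},
]
# Hypothetical unrelated program with different graph shapes
unrelated = [
    {"blocks": 7, "edges": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)], "calls": []},
    {"blocks": 2, "edges": [(0, 1)], "calls": ["exit"]},
]

same_family = structural_similarity(sample, variant)
different = structural_similarity(sample, unrelated)
```

Because the fingerprint ignores instruction bytes, function order, and symbol names, the variant scores as a full match while the unrelated program scores zero. A real system would need a far more discriminating comparison, since many small functions share the same trivial shape.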