How to recover virtualized x86 instructions by Themida presented at Virus Bulletin 2009

by Zhenxiang Jim Wang (Microsoft)

Tags: Security

URL : http://www.virusbtn.com/conference/vb2009/abstracts/Wang.xml

Summary : In recent years, we have started to see the emergence of third-generation packer technology used to obfuscate malware.
Generally, packer technology evolution can be divided into three generations:
First generation: Compressor
Second generation: Protector
Third generation: VM protection system
We observe an upward trend in the prevalence of third-generation packers in the packer distribution statistics
we actively track.
To date, there are two kinds of technology traditionally used to deal with packed malware:
Emulation, also called generic unpacking
Static unpacking
We will show in this paper that third-generation packers are armed with technologies targeted at defeating these traditional
de-obfuscation approaches. Third-generation packers often translate many x86 instructions into equivalent sequences of VM
instructions, which are then executed by a VM interpreter. To resist cracking, the interpreter itself is always heavily obfuscated.
Taking Themida as an example, a single x86 instruction will be implemented by executing about 10,000-25,000 instructions. This
gives third-generation packers a natural anti-emulation ability: emulators built on present emulation technology cannot
decrypt a piece of malware packed by a packer of this kind in a reasonable length of time. Static unpacking is the
alternative, but recovering the virtualized x86 instructions is one of the main difficulties in developing
static unpackers. In this paper, we propose a methodology, based on pattern-matching technology, to recover virtualized x86 instructions,
and thereby de-obfuscate the packer in an efficient manner. Specifically in the case of Themida, we will show how this
approach can:
handle the obfuscation techniques employed by the Themida VM
determine the function of all Themida VM double-byte instructions
determine all random values used in the VM instructions that the Themida packer generates randomly
generate VM instructions
translate VM instructions into x86 instructions
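
The steps above can be pictured with a small Python sketch. It is a toy illustration only, not the method from the paper: the handler bodies, junk patterns, signature table and opcode values are all invented for the example and do not reflect Themida's real VM. It assumes that each VM opcode is served by an obfuscated handler, that the obfuscation can be stripped with simple rules, and that the cleaned handler can then be matched against a known pattern to learn which x86 instruction it stands for.

```python
# Toy illustration only: handler bodies, junk patterns, and opcode values
# are hypothetical and do not describe Themida's real VM.

JUNK = {"nop", "xchg eax, eax"}  # instructions with no effect (assumed junk set)

# Hypothetical signatures: cleaned handler body -> x86 template it implements.
SIGNATURES = {
    ("mov eax, [esi]", "add [edi], eax"): "add {dst}, {src}",
    ("mov eax, [esi]", "mov [edi], eax"): "mov {dst}, {src}",
    ("mov eax, [esi]", "xor [edi], eax"): "xor {dst}, {src}",
}

def strip_junk(handler):
    """Drop junk instructions and cancelling 'push r' / 'pop r' pairs."""
    cleaned = []
    for insn in handler:
        if insn in JUNK:
            continue
        # 'push r' immediately followed by 'pop r' has no net effect.
        if cleaned and cleaned[-1].startswith("push ") \
                and insn == "pop " + cleaned[-1][5:]:
            cleaned.pop()
            continue
        cleaned.append(insn)
    return tuple(cleaned)

def classify_handlers(handlers):
    """Map each randomly assigned VM opcode to an x86 template by pattern match."""
    table = {}
    for opcode, body in handlers.items():
        template = SIGNATURES.get(strip_junk(body))
        if template is not None:
            table[opcode] = template
    return table

def translate(bytecode, table):
    """Translate (opcode, dst, src) VM instructions into x86 text."""
    return [table[op].format(dst=dst, src=src) for op, dst, src in bytecode]

# Opcode values are assumed to differ per packed sample, so they are
# recovered from the handlers rather than hard-coded.
handlers = {
    0x03A7: ["nop", "mov eax, [esi]", "push ebx", "pop ebx", "add [edi], eax"],
    0x011C: ["mov eax, [esi]", "xchg eax, eax", "mov [edi], eax"],
}
table = classify_handlers(handlers)
recovered = translate([(0x011C, "ecx", "5"), (0x03A7, "ecx", "edx")], table)
print(recovered)
```

Because the opcode-to-template table is rebuilt from the handlers of each sample, the same signatures keep working when the packer assigns different opcode values, which is the property a static unpacker for randomized VM instructions would need.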