by Rodrigo Rubira Branco, Gabriel Negreira Barbosa


Summary: Malware is widely acknowledged as a growing threat, with hundreds of thousands of new samples reported each week. Analysis of these samples has to cope not only with this volume but also with the defensive capabilities built into malware. Malware authors use a range of evasion techniques to harden their creations against accurate analysis. These techniques aim to disrupt disassembly, debugging, and analysis in a virtualized environment.
Two years ago, in 2011, we presented (with other researchers) at Black Hat USA a wide range of anti-reverse engineering techniques that malware was employing at the time. For each technique, we documented how it works, created an algorithm to detect its usage, and provided statistics on its prevalence in a database of 4 million samples. We also provided a fully working PoC implementing each of the techniques (in either C or Assembly). Our expectation was that the AV industry would use our ideas (backed by the prevalence numbers) to significantly improve malware prevention coverage. Nothing changed. In the meantime, we improved our detection algorithms, fixed bugs, and expanded the research to more than 12 million samples.
In this talk, we try again and demonstrate the prevalence of more than 50 additional non-defensive characteristics found in modern malware. We have also extended our previous research to show what malware does once it detects that it is being analyzed. The resulting data will help security companies and researchers around the world focus their attention on making their tools and processes more efficient, so they can rapidly overcome the malware authors' countermeasures.
This first-of-its-kind, comprehensive catalog of malware characteristics was compiled by the paper's authors, who researched a number of techniques employed by malware and, in the process, proposed and developed new detections. The underlying malware sample database has an open architecture that allows researchers not only to see the results of the analysis, but also to develop and plug in new analysis capabilities.