Comparing Malicious Files presented at BSidesLjubljana 2019

by Robert Simmons


Summary: Researchers and incident responders ask closely related questions, but those questions are driven by different immediate needs. Two major questions are: “Can I identify a sample as a member of a malware family?” and “Can I find more samples that are related to a sample?” When trying to identify a particular sample, two further problems arise from the various nomenclatures used to identify malware samples, with some names coming from AV vendors and other names coming from marketing departments.

This talk covers a number of techniques for identifying malware samples. Despite the nomenclature problem, methods of working with AV scanner results are also covered. A single solution to the nomenclature problem is tough to agree upon, but a number of historical attempts are reviewed, including Common Malware Enumeration (CME), WildList, CARO, and more. Additionally, I point out how other disciplines, specifically biology, have solved nomenclature problems in ways that still satisfy many of the needs of malware identification, including some marketing problems.

Beyond the process of identification, I cover a number of techniques for finding additional samples related to a particular sample. Topics covered in this area include various types of fuzzy hashes, URL structure analysis, code signing signatures, metadata from static analysis, behavioral data from dynamic analysis, adversary infrastructure analysis (including the Diamond Model of Intrusion Analysis), clustering algorithms, and control flow graph analysis. The listener will come away with a set of techniques, from simple to complex, that can be put to immediate use.
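To give a flavor of the simple end of that range, here is a minimal sketch of similarity-based sample matching. It is not taken from the talk and it is not a real fuzzy hash such as ssdeep or TLSH; it substitutes a plain Jaccard comparison of byte n-grams, which captures the same basic idea that related files share many short byte sequences. The function names, the n-gram size of 4, the 0.5 threshold, and the sample byte strings are all assumptions made for illustration.

```python
from itertools import combinations


def byte_ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of all overlapping n-byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def related_pairs(samples: dict, threshold: float = 0.5, n: int = 4) -> list:
    """Return (name_a, name_b, score) for each pair scoring at or above threshold.

    `samples` maps a hypothetical sample name to its raw bytes.
    """
    grams = {name: byte_ngrams(blob, n) for name, blob in samples.items()}
    pairs = []
    for x, y in combinations(sorted(grams), 2):
        score = jaccard(grams[x], grams[y])
        if score >= threshold:
            pairs.append((x, y, round(score, 3)))
    return pairs


if __name__ == "__main__":
    # Made-up byte strings standing in for file contents.
    samples = {
        "a.bin": b"MZ\x90\x00 connect to c2.example.test and beacon every 60s",
        "b.bin": b"MZ\x90\x00 connect to c2.example.test and beacon every 90s",
        "c.bin": b"completely unrelated content with no shared structure",
    }
    for pair in related_pairs(samples):
        print(pair)
```

Comparing every pair is quadratic in the number of samples, so this only suits small sets; purpose-built fuzzy hashes exist largely to make the comparison cheaper and more robust to insertions and reordering.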