Packer visualisation: a fast entropy scanning algorithm that preserves local detail, presented at Virus Bulletin 2008

by Tim Ebringer (University of Melbourne)


Summary: "Entropy or randomness calculations can be a fast way to estimate whether a file is packed. However, most algorithms we
have seen simply give an overall entropy value for the entire file. We present an algorithm that is not only fast, but
preserves local detail, so a plot can be made of an entire file showing areas of high entropy and low entropy. Areas of
low entropy (code and header, in the case of a PE file) show as lower points on the plot, whereas areas of high entropy
(encrypted or compressed data) show as high areas.
We present local-detail preserving entropy plots for a variety of packers, showing that many appear to pack files in a
distinctive manner. This seems to be because compressed data and the code that unpacks it tend to be placed in the same
relative location in the packed file, leading to a kind of signature based on an 'entropy signal'. We give algorithms
which have shown early promise in comparing the entropy signals of various packed files.
Finally, in what we believe to be a unique presentation, we are able to visualize the work of a packed program as it
unpacks itself by placing breakpoints on decompression/decryption loops, dumping memory, and performing our
detail-preserving entropy analysis on the dump. A YouTube video (link below) shows a UPX-packed file unpacking itself.
The start of the video shows the entropy of a UPX-packed file as it is initially loaded into memory. As the video
progresses, the uncompressed code is written into the empty section created by the loader. Note the interesting
behaviour whereby UPX actually overwrites its compressed data with code during the unpacking process. In this video,
the y-axis is the amount of entropy, and the x-axis is the memory address of the main UPX sections from low to high.
Unpacking video:"
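
The summary does not spell out the scanning algorithm itself, so the following is only a minimal sketch of the general
idea of a local-detail preserving entropy plot, not the author's method. It assumes plain Shannon entropy computed over
fixed-size windows of the file; the function names and the window and step sizes are hypothetical choices made for
illustration.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)  # frequency of each byte value in the block
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(data: bytes, window: int = 256, step: int = 256) -> List[float]:
    """Entropy of each window of `data`, in file order.

    `window` and `step` are illustrative defaults (an assumption), not
    values taken from the talk. A step smaller than the window gives
    overlapping windows and a smoother plot.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if not data:
        return []
    if len(data) <= window:
        return [shannon_entropy(data)]
    return [
        shannon_entropy(data[i:i + window])
        for i in range(0, len(data) - window + 1, step)
    ]


if __name__ == "__main__":
    # Synthetic stand-in for a file: a constant region (low entropy)
    # followed by a region using every byte value equally (high entropy).
    sample = bytes(512) + bytes(range(256)) * 2
    for index, value in enumerate(entropy_profile(sample)):
        print(f"window {index}: {value:.2f} bits/byte")
```

Plotting the returned list against the window offset gives the kind of picture the summary describes: low regions for
headers and code, high regions for compressed or encrypted data. The same function could be run on a memory dump taken
at a breakpoint to produce one frame of an unpacking animation.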