QUANTIFYING MALICIOUSNESS IN ALEXA TOP-RANKED DOMAINS presented at Black Hat Abu Dhabi 2012

by Paul Royal

Tags: Malware Statistics Alexa

Summary: Many people assume that it is safe to visit popular, long-lived websites. While anecdotal examples of popular website compromises (e.g., USAToday.com, PBS.org) contradict this expectation, there exist few comprehensive studies that attempt to systematically quantify maliciousness in top-ranked sites.
To address this gap in understanding, my presentation details the design and results of long-running experiments that identify maliciousness in popular websites in a vulnerability- and exploit-independent manner. To run the experiments, I created a scalable URL analysis system that forces a browser within a sterile virtual machine to visit a given site, then examines the network-level actions of the VM to determine whether a drive-by download occurred. As input to this system, I provided the Alexa top 25,000 most popular domains each day in what became a series of month-long studies.
In combination with reverse engineering parts of the Alexa rankings system, detailed analysis of the results yields cause for concern. My findings show that each month, millions of users are served malicious content from just tens of popular websites, and at least one million users are successfully compromised. In addition to an assessment of the experimentation results (e.g., use of Java or ad networks in drive-by downloads), my presentation will coincide with the release of the collected raw data, to promote a better understanding of this issue.
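
The abstract does not say how the analysis system decides from network-level actions that a drive-by download occurred. As a rough illustration only, the sketch below assumes one simplistic signal: during an unattended visit, the VM receives an HTTP response whose body starts with the Windows PE magic bytes "MZ". The HttpResponse record, the function names, and the heuristic itself are hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class HttpResponse:
    """Hypothetical record of one HTTP response observed for the VM."""
    host: str
    content_type: str
    body_prefix: bytes  # first few bytes of the response body


# Windows PE executables begin with "MZ". Treating this as the only
# signal is an assumption made for illustration, not the talk's method.
PE_MAGIC = b"MZ"


def executable_responses(responses: Iterable[HttpResponse]) -> List[HttpResponse]:
    """Return the responses whose body starts with the PE magic bytes."""
    return [r for r in responses if r.body_prefix.startswith(PE_MAGIC)]


def looks_like_drive_by(responses: Iterable[HttpResponse]) -> bool:
    """Flag a visit if any executable payload arrived without user action."""
    return len(executable_responses(responses)) > 0


# Example: one benign page load, then the same visit plus an executable fetch.
clean_visit = [HttpResponse("example.com", "text/html", b"<html>")]
suspect_visit = clean_visit + [
    HttpResponse("cdn.example.net", "application/octet-stream", b"MZ\x90\x00")
]
```

Because the check looks only at what crossed the network, it does not depend on which browser or plugin vulnerability was used, which is the property the abstract describes as vulnerability- and exploit-independent. A real system would need more signals than this single byte check.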