On scanning the Internet, or the curse of in-the-cloud URL scanning (presented at Virus Bulletin 2010)

by Alexandru Catalin Cosoi (Bitdefender)

It is an undisputed fact that the number of malicious URLs has skyrocketed in the past six months, and we all expect
this phenomenon to evolve into something even worse in the near future. Threats such as URLs that install
malware on the victim's computer, phishing URLs, spam URLs, shortened URLs and even legitimate URLs that have been modified
to perform one of these tasks are just a few of the issues the industry currently faces, each with a high potential of
'exploding' in the near future.
For years now, two major techniques have dominated security industry technologies: blacklists and local content-based
analysis. These two techniques performed very well in their time, but as in the case of Bayesian filtering of spam
messages, the time has come to explore new techniques: the former is highly reactive, while the latter is ineffective
against newly emerging threats.
Cloud computing is one of the new candidates suited to this task, since it offers access to new techniques such as
outbreak detection and data mining, but as with any new technology, it has both advantages and a series of shortcomings.
This paper will explore the benefits and the downsides of scanning URLs in the cloud and propose solutions wherever
possible. We will discuss issues such as geographical differences between websites, reputation, content filtering, encrypted
connections, storage, and scanning and rescanning algorithms, keeping in mind all the good things this technology has to offer.