Scalable Network Recon: Why Port Scans are for Pussies presented at HackInTheBox 2014

by Fred Raynal and Adrien Guinet

Summary: Scanning the Internet is not a new topic. It has been done since forever and is becoming ever easier thanks to network infrastructure improvements. We have worked not on an Internet-wide scanner – good tools such as masscan and nmap already exist, and we rely on them – but on a network reconnaissance framework. The difference is that we follow a five-step process, modeled on the intelligence cycle:
1. Define a target and the information you want to collect
2. Collect information
3. Store information
4. Analyze information
5. Goto step 1 based on results
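The five steps above amount to a feedback loop; a minimal sketch, where collect() and analyze() are hypothetical stand-ins for real probes and result mining, not Ivy's actual API:

```python
# Minimal sketch of the five-step recon loop.
# collect() and analyze() are hypothetical placeholders.

def collect(target):
    # Step 2: probe the target (here: pretend every target has port 22 open).
    return {"target": target, "open_ports": [22]}

def analyze(results):
    # Step 4: mine the results for new targets (here: none, so the loop ends).
    return []

def recon(targets, max_rounds=3):
    db = []                                      # step 3: the datastore
    for _ in range(max_rounds):                  # step 5: goto step 1
        if not targets:
            break
        results = [collect(t) for t in targets]  # step 2: collect
        db.extend(results)                       # step 3: store
        targets = analyze(results)               # step 4: analyze, feed back
    return db
```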
Thus, the network reconnaissance we perform raises different problems than a “simple” port scan:
1. What is a target? If we want to scan a country, how do we define it? Based on the MaxMind database? What is a “network”? Its IP ranges or its AS? What is a company? Its AS numbers, domain names, address ranges, network routes (BGP)?
2. Many organizations still consider a port scan an aggressive action, so they detect and complain about them. How do we avoid detection, or at least minimize the complaints? There are many solutions; we chose one: randomize the IP ranges to scan and distribute the scan across many hosts. That is why we created libleeloo, which is already open source on our GitHub. Other similar tools may appear.
3. What information do we collect? We wanted to identify open ports, but also to run specific scripts to perform further actions, e.g. if IPsec is available, we run ike-scan to retrieve crypto/key information.
4. A scan leaves us with huge amounts of heterogeneous data. How do we store and analyze them? Have you heard of the buzzword “big data”? We analyze the data in 3 ways:
- Statistics: we compute various statistics on the data.
- Temporal: we track how parameters evolve over time, which reveals trends.
- Differential: what changed between 2 scans?
5. Raw performance is not really an issue, as we do not care about scanning the US or China in less than 5 minutes. Zmap and masscan use a very efficient asynchronous communication mode; however, while they can decide quickly whether a port is open, they provide no way to perform further actions. We had to handle a fully asynchronous model during collection. The data are then stored as they arrive, and analysis can start. Performance becomes more critical once the data are there and one wants to browse them.
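The randomize-and-distribute idea from point 2 can be sketched in a few lines. This is a toy version: it materializes the whole range in memory, which libleeloo specifically avoids for Internet-scale inputs, and the function name is ours:

```python
import random

def randomized_chunks(start, end, n_hosts, seed=0):
    """Shuffle every IPv4 address in [start, end) (as integers) and
    split the permutation across n_hosts scanning hosts, so that no
    single host walks a range sequentially."""
    ips = list(range(start, end))
    random.Random(seed).shuffle(ips)          # deterministic for a given seed
    return [ips[i::n_hosts] for i in range(n_hosts)]
```

Each scanning host then works through its own chunk, so consecutive probes hitting a given network are spread out in time and across source addresses.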
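The differential analysis from point 4 boils down to a per-host set difference; a minimal sketch (the data shapes here are our assumption, not Ivy's actual schema):

```python
def diff_scans(old, new):
    """Differential view between two scans, each given as a
    mapping host -> set of open ports."""
    delta = {}
    for h in set(old) | set(new):
        opened = new.get(h, set()) - old.get(h, set())
        closed = old.get(h, set()) - new.get(h, set())
        if opened or closed:                  # keep only hosts that changed
            delta[h] = {"opened": sorted(opened), "closed": sorted(closed)}
    return delta
```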
We will then focus on various use cases to illustrate what kind of useful information can be retrieved, or what can be done. First we will illustrate our results on a country, showing the basic usage of Ivy: collect & analyze. Once the scan is performed, filters allow us to get more precise information and to focus on one aspect of the data.
Then, we will give results on crypto usage on the Internet, based on statistics for certificates and public-key mechanisms (a must-have, it seems, in every talk on Internet scanning). However, we will also see that collecting this information allows us to detect hosts with poor crypto choices, or hosts sharing the same SSH keys.
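Detecting shared SSH keys is essentially a group-by on host-key fingerprints; a minimal sketch (function name and input shape are ours):

```python
from collections import defaultdict

def shared_keys(host_keys):
    """Given (host, fingerprint) pairs from a scan, return the
    fingerprints seen on more than one host -- candidates for key
    reuse (cloned VM images, embedded devices with hardcoded keys)."""
    by_fp = defaultdict(list)
    for host, fp in host_keys:
        by_fp[fp].append(host)
    return {fp: hosts for fp, hosts in by_fp.items() if len(hosts) > 1}
```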
Another obvious usage of such a scanner is to find vulnerable services. This can be done once a vulnerability is disclosed, in order to look specifically for it. We ran tests on the recent TCP-32764 backdoor: we found thousands of hosts answering, but with our scripting mechanism we confirmed only 6000+ vulnerable hosts. We will show how we could patch them all, or monitor the propagation of the official patch. Another way to look for vulnerabilities is to look at what is already in the database … and that is the reason why Ivy will not be made open source. One just has to dig in the database to find many vulnerable servers. We will take the example of widely used services (Mongo, Redis, RabbitMQ) left open, giving access to huge amounts of data. We will also give some figures on the expected performance of such processes, and present our ongoing work on the subject.
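As an illustration of the kind of per-service script this involves, here is a hedged sketch of the banner check for TCP-32764, assuming the publicly documented magic bytes ("ScMM" / "MMcS" depending on the device's endianness); the function name and exact heuristic are ours, not Ivy's:

```python
def looks_like_32764_backdoor(banner: bytes) -> bool:
    """Heuristic check on the first bytes returned by a service
    listening on TCP/32764: the backdoor's replies are assumed to
    start with the signature 'ScMM' (little-endian firmware) or
    'MMcS' (big-endian firmware)."""
    return banner[:4] in (b"ScMM", b"MMcS")
```

In practice a collector would connect to port 32764, read a few bytes, and feed them to a check like this before flagging the host as vulnerable.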
Last but not least, we will show the monitoring capabilities that come with operating at large scale. We will provide real use-case examples, like pay-TV piracy, where we can monitor the activity of CCcam servers, their users, and so on, just from open-source information. We have other work in progress on that too. So, contrary to the first impression, such a network recon tool is not only for pwning the Internet. It also gives lots of clues to the good guys, giving them the opportunity to prevent issues rather than always handling incidents. The bad guys already use this information; why wouldn't the good guys do the same? Having this information is the first step in organizing one's defense.