The difficulties in sizing up botnets
The number of computers a botnet consists of is the main metric security researchers use to gauge how effective and disruptive it is.
Estimating a botnet’s size allows researchers to decide whether concentrating their efforts on disrupting it is better than focusing on “attacking” another, and to gauge what resources they will have to dedicate to the task.
But measuring the size of a botnet once is not enough. It has to be done again and again, as the number of zombie computers it consists of changes with each passing day. This continuous effort also lets researchers see whether, and to what extent, their efforts have been successful.
In a recent blog post, Jose Nazario, senior manager of security research at Arbor Networks, offered insight into a number of measurement methodologies researchers use for the task.
Sinkholing a botnet – identifying its C&C server and redirecting infected computers to a server controlled by the researchers – is currently a very popular technique. Not only can researchers then count the number of unique IP addresses that connect to the sinkhole over a certain period, but they can sometimes also identify whether more than one PC sits behind a single IP address.
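As a rough illustration, counting unique IPs at a sinkhole boils down to de-duplicating the connection log per measurement window. The minimal sketch below assumes a hypothetical log with one connection per line in the form “ISO timestamp, then client IP”; the file name and format are illustrative, not taken from Nazario’s post.

```python
from collections import defaultdict

def unique_ips_per_day(log_path):
    """Return a mapping of date -> set of client IPs seen at the sinkhole."""
    seen = defaultdict(set)
    with open(log_path) as log:
        for line in log:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip malformed lines
            timestamp, ip = parts
            day = timestamp.split("T")[0]  # keep only the date portion
            seen[day].add(ip)
    return seen

if __name__ == "__main__":
    # "sinkhole.log" is a placeholder path for the sinkhole's connection log.
    for day, ips in sorted(unique_ips_per_day("sinkhole.log").items()):
        print(day, len(ips))
```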
Dark IP monitoring takes a completely different approach. “This method takes large unused IP address blocks and then listens for traffic,” explains Nazario. “The collection system is able to fingerprint bots based on specific signs. This could include the exploit traffic or traffic to a specific TCP/IP service used. This then gives you some passive mechanism to watch the botnet and try to spread.”
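In code, the passive side of this approach could be approximated by classifying whatever traffic hits the unused address space against a set of known fingerprints. In the sketch below, the port-based SIGNATURES table and the flow tuples are purely illustrative stand-ins for real exploit or service fingerprints.

```python
from collections import Counter

# Hypothetical fingerprints: destination ports mapped to a traffic label.
SIGNATURES = {445: "SMB exploit scan", 23: "Telnet scan", 3389: "RDP scan"}

def classify_darknet_flows(flows):
    """flows: iterable of (source IP, destination port) pairs seen on dark space."""
    labels = Counter()
    sources = set()
    for src_ip, dst_port in flows:
        labels[SIGNATURES.get(dst_port, "unclassified")] += 1
        sources.add(src_ip)
    return labels, len(sources)

# Example: three scanners probing the unused block.
flows = [("203.0.113.4", 445), ("203.0.113.4", 445),
         ("198.51.100.9", 23), ("192.0.2.77", 3389)]
print(classify_darknet_flows(flows))
```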
A third methodology – counting infected hosts – is more direct than the previous ones, and is often employed by Microsoft, as the company can dip into the reports sent back by its AV solutions such as the Malicious Software Removal Tool. Still, only Microsoft can use this method effectively, as other vendors’ AV solutions are not as widely and uniformly distributed around the world.
Crawling a peer-to-peer botnet – gathering the peer list from every node and walking the botnet recursively – is also a good and direct option, but to do it, researchers must know the P2P protocol the botnet uses, and that protocol must not be strongly encrypted.
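A recursive peer-list walk of that kind is essentially a breadth-first search over the botnet’s overlay network. The sketch below assumes a hypothetical get_peer_list(peer) callable that speaks the (unencrypted) P2P protocol and returns the peers a node knows about.

```python
from collections import deque

def crawl_p2p_botnet(seed_peers, get_peer_list, limit=100_000):
    """Breadth-first walk over peer lists; returns every peer discovered."""
    seen = set(seed_peers)
    queue = deque(seed_peers)
    while queue and len(seen) < limit:
        peer = queue.popleft()
        for neighbour in get_peer_list(peer):  # protocol-specific query
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen
```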
Unfortunately, the effectiveness of each of these methodologies can be thwarted by a number of things.
ISPs may be directing affected computers to their own sinkholes in order to identify and help their own infected customers, so the sinkholes set up by researchers don’t catch all the IPs. And if a bot is offline, it won’t contact the sinkhole and therefore won’t be counted.
Finally, the gathered IP addresses do not always correspond to one infected computer each. A single machine’s IP address can change multiple times per day, leading to an overcount.
On the other hand, Network Address Translation (NAT) can lead to significantly lower numbers being recorded, as in some parts of the network up to a hundred PCs share the same IP address.
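A toy example of how the two effects pull the numbers in opposite directions, assuming the bot reports some hypothetical per-infection identifier (“bot_id”) alongside the IP address it connects from:

```python
def ip_vs_host_counts(records):
    """records: iterable of (ip, bot_id) pairs seen during one measurement window."""
    ips = {ip for ip, _ in records}
    hosts = {bot_id for _, bot_id in records}
    return len(ips), len(hosts)

# Two bots behind one NAT gateway, plus one bot whose IP changed twice that day.
records = [("198.51.100.7", "bot-a"), ("198.51.100.7", "bot-b"),
           ("203.0.113.5", "bot-c"), ("203.0.113.9", "bot-c"),
           ("203.0.113.22", "bot-c")]
print(ip_vs_host_counts(records))  # 4 distinct IPs, but only 3 infected hosts
```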
Obviously, none of these measurement methodologies are perfect.
“We’re trying to identify the causes for the gaps in the methodologies (e.g. network vs host measurements) and provide stronger data by closing those gaps,” Nazario concludes.
“Based on this data, we also work globally to identify working strategies that effectively shut down botnets and drop infection rates. We then want to coordinate these efforts globally to lead to lower infections in each region.”