Feedback loops in the fight against spam
Nearly everybody gets spam. But do you ever wonder what happens when you click that ‘Report Spam’ button on your mail reader? Does it do anything useful, or is it really the same as just clicking ‘Delete’?
The Internet is plagued by messaging abuse such as spam and viruses. Defences such as anti-spam and anti-virus filters are typically deployed against it; however, the simplest of these filters are easily circumvented by mutating the attack, either in form or in origin, just enough to avoid detection. To remain effective, filters must adapt as the threats mutate.
Filters can only adapt when they have new details about what it is they need to filter. In the simplest case, a user issues a complaint to a customer service centre about an undesirable or dangerous piece of spam, and the representative then acts on the complaint by disabling the source, retraining the filter based on the details of the complaint, or both. But in a world of automation and enormous volumes of data, such a manual system simply cannot survive; it doesn’t scale.
Learning is defined as a change in behaviour based on experience. What consumers and service providers need, therefore, is a system capable of learning, with maximum accuracy, what constitutes an illegitimate message that must be kept out and what constitutes legitimate traffic that should be allowed in. To be effective in the face of mutating attacks, the defences must themselves mutate, as quickly and accurately as possible. The filter needs more ‘experience’ in order to learn.
Much effort has been expended to try to define what spam is in order to classify and filter it. However, not only do spam campaigns mutate to avoid detection, but we have also learned that spam is in the eye of the beholder: What one person says is junk might be of some value to someone else, with great consequences if a filter gets it wrong. A career spam fighter once opined, when tasked to define the problem: “Spam is what our users say it is.” So how do we embrace that idea in software?
We have found over time that the most effective systems are those that learn to classify undesirable content based on feedback from users. The user is truly the best judge of what is and isn’t spam. The faster that consistent feedback becomes available, the sooner a filter can be re-trained to detect and respond to new attacks. This is known as a feedback loop.
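As a rough illustration of such a loop, the Python sketch below keeps per-word counts from user reports and scores new messages against them. The class name FeedbackFilter, its methods, and the simple log-odds scoring are assumptions made for this example; real filters use far richer features and this is not any particular product's method.

```python
import math
from collections import Counter


class FeedbackFilter:
    """Toy filter that learns only from user reports (hypothetical design)."""

    def __init__(self, threshold=0.0):
        self.spam_tokens = Counter()  # word counts from 'Report Spam' clicks
        self.ham_tokens = Counter()   # word counts from 'Not Spam' clicks
        self.threshold = threshold

    def report(self, text, is_spam):
        """Feed one user judgement back into the filter."""
        target = self.spam_tokens if is_spam else self.ham_tokens
        target.update(text.lower().split())

    def score(self, text):
        """Sum of per-word log-odds, with add-one smoothing for unseen words."""
        spam_total = sum(self.spam_tokens.values())
        ham_total = sum(self.ham_tokens.values())
        vocab = len(set(self.spam_tokens) | set(self.ham_tokens)) or 1
        total = 0.0
        for tok in text.lower().split():
            p_spam = (self.spam_tokens[tok] + 1) / (spam_total + vocab)
            p_ham = (self.ham_tokens[tok] + 1) / (ham_total + vocab)
            total += math.log(p_spam / p_ham)
        return total

    def is_spam(self, text):
        return self.score(text) > self.threshold


if __name__ == "__main__":
    f = FeedbackFilter()
    f.report("cheap pills online now", is_spam=True)
    f.report("meeting agenda for monday", is_spam=False)
    print(f.is_spam("cheap pills"))   # learned from the first report
    print(f.is_spam("ch3ap p1lls"))   # mutated attack slips through...
    f.report("ch3ap p1lls", is_spam=True)
    print(f.is_spam("ch3ap p1lls"))   # ...until a user reports it
```

The mutated message is missed only until the first report arrives, which is why the speed of feedback matters as much as its accuracy.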
Open solutions that use feedback loops have been attracting attention for several years. In particular, a mechanism called the Abuse Reporting Format (ARF) was created by participants in the Messaging Anti-Abuse Working Group (MAAWG) some years ago. ARF allows peer Internet Service Providers (ISPs) to exchange feedback when spam or other abuse originating at one is received at another. A user clicks a ‘Spam’ button in the mail reader, and an ARF message is generated and sent to the originating service, where automated software quickly processes the complaint. The systems at both ends then have more data from which to learn.
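To make the format concrete, here is a minimal sketch of how a receiving service might assemble an ARF report using Python's standard email package. As standardised in RFC 5965, an ARF report is a multipart/report message with three parts: a human-readable note, a machine-readable message/feedback-report part, and a copy of the offending message. The function name build_arf_report, the addresses, and the User-Agent value are hypothetical, and only the required report fields are shown.

```python
from email import message_from_string
from email.mime.base import MIMEBase
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_arf_report(original_raw, reporter, abuse_desk):
    """Wrap a reported message in an ARF-style multipart/report (sketch)."""
    original = message_from_string(original_raw)

    # The underscore in report_type is emitted as a hyphen: report-type=...
    report = MIMEMultipart("report", report_type="feedback-report")
    report["From"] = reporter
    report["To"] = abuse_desk
    report["Subject"] = "FW: " + (original["Subject"] or "")

    # Part 1: human-readable explanation.
    report.attach(MIMEText(
        "This is an abuse report for a message a user marked as spam.\n",
        "plain"))

    # Part 2: machine-readable fields (the three required ones only).
    fields = MIMEBase("message", "feedback-report")
    fields.set_payload(
        "Feedback-Type: abuse\r\n"
        "User-Agent: ExampleFBL/0.1\r\n"  # hypothetical generator name
        "Version: 1\r\n")
    report.attach(fields)

    # Part 3: the original message, attached as message/rfc822.
    report.attach(MIMEMessage(original))
    return report


if __name__ == "__main__":
    raw = ("From: sender@example.net\n"
           "To: user@example.com\n"
           "Subject: Buy now\n\n"
           "Cheap pills\n")
    print(build_arf_report(raw, "fbl@example.com",
                           "abuse@example.net").as_string())
```

Because the second part is a fixed set of header-style fields, software at the originating service can parse it without human involvement.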
Once proven, the ARF work was taken up by the Internet Engineering Task Force (IETF), which has now published it as a Proposed Standard (RFC 5965). ARF continues to evolve as new categories of email threats emerge.
With the enormous growth of messaging from email into the mobile world, the same problems exist and similar solutions are beginning to appear. Trial systems now exist where a mobile subscriber can forward a piece of SMS spam to a mobile operator for filtering and investigation, while others are working to construct learning systems using user-based feedback loops. A standardization effort has already reached the prototype phase within the Open Mobile Alliance (OMA), a collaborative standards body in the mobile world that creates specifications for mobile handset software. This will define the very language used to communicate among systems when you click a ‘Report Spam’ button on your handset when mobile spam rears its ugly head.
Feedback loops are a proven tool in the fight against abuse. They are key features in a highly responsive, accurate filtering system. So, yes, do click ‘Report Spam’ instead of ‘Delete’. We’ll all be glad you did.