The CURSE of anti-spam testing

Martijn Grooten Virus Bulletin


Although email spam has been pestering end-users for more than a decade, and anti-spam solutions have been helping them keep their inboxes tidy for almost as long, relatively few attempts have been made to test such solutions. Perhaps this is not too surprising: as an expert in the field once said, anti-spam testing is 'fiendishly difficult', not least because of the filters' tendency to block large amounts of spam before the tester can even look at it. Still, we believe that it is possible to run such tests, and this paper describes how to do so.

The paper will consist of two parts. The first part will deal with the general concept of anti-spam testing and will discuss some guidelines for what a good anti-spam test should look like. A representative and reliable anti-spam test should fulfil five conditions:

  • Comparative: results of an anti-spam test should always be seen within the context of that test; hence the most meaningful results are those where various solutions are compared under the same, or very similar, circumstances and corpuses.
  • Unbiased: the test should not favour any filtering method, nor should end-users, when classifying email, have any knowledge of the products' decisions.
  • Real email in real time: the corpuses used in the test should consist of real ham and real spam email, and the email should be sent to the products in real time.
  • Statistically valid: the corpus should contain enough ham and spam to make claims about the products' performance within a reasonable error margin.
  • Explaining what is done: the testing setup, including but not limited to the way the corpuses are obtained and the way a gold standard is set, should be well explained.
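
To make the 'statistically valid' condition concrete, the sketch below estimates an error margin for a measured spam catch rate. It is only an illustration and not the method used in our test: the choice of the Wilson score interval, the 95% confidence level (z = 1.96), and the names wilson_interval, caught and total are all assumptions introduced here. The Wilson interval is chosen because it behaves sensibly for rates close to 100%, which is where anti-spam catch rates tend to lie.

```python
import math


def wilson_interval(caught: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion such as a spam catch rate.

    caught: number of spam messages the filter blocked (hypothetical name)
    total:  number of spam messages in the corpus (hypothetical name)
    z:      normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if total <= 0:
        raise ValueError("corpus must contain at least one message")
    if not 0 <= caught <= total:
        raise ValueError("caught must lie between 0 and total")

    rate = caught / total
    z2 = z * z
    denominator = 1.0 + z2 / total
    centre = (rate + z2 / (2.0 * total)) / denominator
    half_width = (
        z * math.sqrt(rate * (1.0 - rate) / total + z2 / (4.0 * total * total))
    ) / denominator
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    # Same observed catch rate (99%), two different corpus sizes.
    for caught, total in ((990, 1_000), (99_000, 100_000)):
        low, high = wilson_interval(caught, total)
        print(f"{caught}/{total}: between {low:.4%} and {high:.4%}")
```

Running the example with the same observed rate on a small and a large corpus shows the interval narrowing as the corpus grows, which is the sense in which a corpus must be 'large enough' before small differences between products can be reported.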

The second part of the paper will discuss the comparative anti-spam test we have set up - in particular the decisions we have made based on these conditions. We will also discuss our own experience with running such tests.


