The CURSE of anti-spam testing
Martijn Grooten, Virus Bulletin
Although email spam has been pestering end-users for more than a decade and anti-spam solutions have been helping them
keep their inboxes tidy for almost as long, relatively few attempts have been made to test such solutions. Perhaps this
is not too surprising: as an expert in the field once said, anti-spam testing is 'fiendishly difficult', not least
because of the filters' tendency to block large amounts of spam before the tester can even have a look at it.
Still, we believe that it is possible to run such tests, and this paper describes how.
The paper consists of two parts. The first part deals with the general concept of anti-spam testing and discusses
guidelines for what a good anti-spam test should look like. A representative and reliable anti-spam test
should fulfil five conditions:
- Comparative: results of an anti-spam test should always be seen within the context of that test; hence the most
meaningful results are those where various solutions are compared under the same, or very similar, circumstances
and using the same corpuses.
- Unbiased: the test should not be biased towards or against any filtering method, nor should end-users, when
classifying email, have any knowledge of the products' decisions.
- Real email in real time: the corpuses used in the test should consist of real ham and real spam email and should
be sent in real time.
- Statistically valid: the corpus should contain enough ham and spam to make claims about the products' performance
within a reasonable error margin.
- Explaining what is done: the testing setup, including but not limited to the way the corpuses are obtained and the
way a gold standard is set, should be well explained.
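The 'statistically valid' condition can be made concrete with a small calculation. The following is a minimal sketch
only, assuming that each message is an independent trial and using the Wilson score interval as one common way to put
an error margin on a measured rate; the function name wilson_interval and the example figures are hypothetical and
are not taken from our test.

```python
import math


def wilson_interval(hits, total, z=1.96):
    """Return (low, high) Wilson score bounds for the proportion hits/total.

    z=1.96 corresponds to roughly 95% confidence. Assumes independent messages.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = hits / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical figures: the same observed catch rate measured on two corpus sizes.
small = wilson_interval(990, 1000)
large = wilson_interval(99000, 100000)

if __name__ == "__main__":
    print("1,000 spam messages:   %.4f to %.4f" % small)
    print("100,000 spam messages: %.4f to %.4f" % large)
```

The interval for the larger corpus is much narrower, which is the sense in which a corpus must be 'large enough'
before differences between products can be called meaningful.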
The second part of the paper will discuss the comparative anti-spam test we have set up - in particular the decisions we
have made based on these conditions. We will also discuss our own experience with running such tests.