TREC 2006 Spam Track Public Corpora
All rights reserved. Permission is granted for use of this data
only to those who access it through, and agree to the terms of,
the Agreement for use of the 2006 TREC Public Spam Corpora.
There are two corpora: one mostly English (trec06p) and one Chinese (trec06c).
Each participant should submit the results of four standard runs for each filter:
- run.sh trec06p/full/ -- Ideal feedback English corpus
- run.sh trec06p/full-delay/ -- Delayed feedback English corpus
- run.sh trec06c/full/ -- Ideal feedback Chinese corpus
- run.sh trec06c/delay/ -- Delayed feedback Chinese corpus
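As a convenience, the four standard runs listed above could be driven from one small wrapper. The following is only a sketch, not part of the track's tooling: the function name standard_runs and the DRY_RUN variable are hypothetical, and it assumes run.sh sits in the current directory next to the unpacked trec06p and trec06c directories. By default it only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch only: iterate over the four standard run directories named above.
# DRY_RUN and standard_runs are hypothetical names introduced for this example.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

standard_runs() {
    for dir in trec06p/full/ trec06p/full-delay/ trec06c/full/ trec06c/delay/; do
        if [ "$DRY_RUN" = "1" ]; then
            # Show the command that would be run for this corpus directory.
            echo "run.sh $dir"
        else
            # Assumption: run.sh is in the current directory and is executable.
            ./run.sh "$dir"
        fi
    done
}

standard_runs
```

Setting DRY_RUN=0 in the environment before invoking the script would execute the runs rather than list them.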
Participants doing the active learning task should do two additional runs for each filter:
- active.sh trec06p/full/
- active.sh trec06c/full/
The TREC 2005 Corpus and TREC 2007 Corpus are also available.