TREC 2006 Spam Track Notes on Submissions and Robustness

1. This year there is no explicit pilot filter submission.

   If you did not do the task last year or are unsure if
   your implementation will work in the test environment,
   please contact me directly.  Within reason, I can run
   examples of your code on a test system and tell you
   the result.

   I cannot promise to do any such run in the last week
   prior to the submission deadline.

2. This year deadlines will be strict and enforced by NIST.

   All submissions must be made through the NIST submission
   system and must strictly adhere to the submission
   requirements.  This means submitting on time, with one
   file per filter and one file per result run.  There is
   also a filter and run naming convention that must be used.

   The dates have been posted with the guidelines; details
   of the submission system will be announced in due course.

3. If a "classify" or "train" command crashes, the result
   for the message will be recorded as "class=ham score=0" 
   and the filter will be resumed with the next message.
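
   As an illustration of the fallback described above, here is a
   minimal Python sketch of a wrapper that runs one "classify"
   command and substitutes the default result if the command
   crashes.  It is not the evaluation harness itself.  The function
   name, the use of a non-zero exit status to detect a crash, and
   the assumption that the result is the first line of standard
   output are all hypothetical choices made for the example.

```python
import subprocess

# Result recorded for a message when the command crashes (from note 3).
FALLBACK = "class=ham score=0"


def classify_or_fallback(argv):
    """Run one classify command and return its result line.

    Assumptions (not from the track guidelines): a crash shows up as a
    non-zero exit status or a failure to start, and the result is the
    first line written to standard output.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True)
    except OSError:
        # The command could not be started at all.
        return FALLBACK
    if proc.returncode != 0:
        # Treat any non-zero exit (including death by signal) as a crash.
        return FALLBACK
    lines = proc.stdout.splitlines()
    # An empty result is also mapped to the fallback in this sketch.
    return lines[0].strip() if lines else FALLBACK
```

   Wrapping your own commands this way during local testing shows
   how often the fallback would be triggered before you submit.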

   If you are using a client/server system, you should
   consider the possibility that your client or server
   may crash, and your filter should detect and recover
   from this eventuality.

   Also, your "initialize" should work properly even if
   there's a server still running from a previous run
   (whether that run failed or completed successfully).
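
   One possible way for "initialize" to cope with a leftover server
   is sketched below in Python.  It assumes, purely for illustration,
   that your server writes its process id to a pid file when it
   starts; the pid file and the function name are hypothetical.
   Note that a stale pid may have been reused by an unrelated
   process, so a real filter may want a stronger check.

```python
import os
import signal


def stop_stale_server(pidfile):
    """Terminate the server named in pidfile, if there is one.

    Returns True if a termination signal was sent.  The pid-file
    convention is an assumption of this sketch, not a track rule.
    """
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return False          # no previous run left a pid file
    except ValueError:
        os.remove(pidfile)    # unreadable pid file: discard it
        return False
    try:
        os.kill(pid, signal.SIGTERM)
        signalled = True
    except ProcessLookupError:
        signalled = False     # the old server has already exited
    os.remove(pidfile)        # start the new run from a clean state
    return signalled
```

   Calling this at the top of "initialize", before starting a fresh
   server, covers both a failed and a successfully completed
   previous run.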

4. Filters that use more than 2 seconds of wall-clock time
   per message (measured cumulatively, with a minute's grace
   for startup) will be killed, and the result will be recorded
   as "class=ham score=0" for any unprocessed messages.
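
   The time limit can be read as a cumulative budget.  The Python
   sketch below assumes one such reading: after n messages, the
   total elapsed wall-clock time may not exceed 60 + 2n seconds.
   The exact accounting used by the evaluation is not spelled out
   here, so treat the formula and the function names as
   assumptions for local testing only.

```python
GRACE_SECONDS = 60.0        # one minute's grace for startup
SECONDS_PER_MESSAGE = 2.0   # wall-clock allowance per message


def allowed_seconds(messages_done):
    """Cumulative allowance after messages_done messages (assumed formula)."""
    return GRACE_SECONDS + SECONDS_PER_MESSAGE * messages_done


def over_budget(elapsed_seconds, messages_done):
    """True if the filter has used more than its cumulative allowance."""
    return elapsed_seconds > allowed_seconds(messages_done)
```

   Under this reading a slow message is tolerated as long as the
   running total stays within the allowance.  For a corpus of
   38,000 messages the ceiling would be 60 + 2 * 38000 = 76,060
   seconds, or roughly 21 hours.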

5. Each participant may submit up to four filters for
   each task (filtering and active learning).
   At least one filtering submission per group will
   be evaluated by us on several corpora -- some with
   ideal feedback and some with delayed feedback.

6. Each participant must run the same filters on four
   public corpora and submit the results:

     - TREC 2006 English (no delay)   -- 38K messages
     - TREC 2006 English (with delay) -- 38K messages
     - TREC 2006 Chinese (no delay)   -- 65K messages
     - TREC 2006 Chinese (with delay) -- 65K messages