From: "Gordon V. Cormack" <gvcormac@uwaterloo.ca>
To: Multiple recipients of list <trecspam@nist.gov>
Subject: Blind Spam Filter Evaluation

Testing will be blind; NIST will substitute random run identifiers
before forwarding submissions to testers.  After testing,
NIST will reverse the substitution and communicate raw results 
to participants.

Filters will be evaluated mechanically, and there will
be no opportunity for the testers to nurse the filters or to
interact with participants should they fail.  

Please note that:

  (a) Each filter submission is a .zip or .tgz file that
      unpacks into the current directory.  After unpacking,
      the files "initialize", "classify", "train", and "finalize"
      should be present.  (On Windows systems, "initialize.exe",
      "classify.exe", "train.exe", and "finalize.exe" are acceptable
      alternatives.)  Additional files may be included as
      required to run your filter.

      Each active learning shell submission is a .zip or .tgz
      file that unpacks into the current directory.  After unpacking,
      the file "active.cpp" must be present.  Additional files
      may be included as necessary to run your active learning
      shell.

  (b) Each result submission should be a single file containing
      the result of running your filter with the toolkit on a
      particular corpus.

  (c) Your filter must be robust.  If it hangs, it will remain hung
      until the time limit is exceeded.  If it crashes, the run
      script may be able to carry on, but you are taking your
      chances.

  (d) You should test your filter on at least the TREC 2005 public
      corpus and the sample Chinese corpus.  If it does not run
      to completion on these, there is little chance it will do so
      on the TREC 2006 corpora.

      A successful test does not guarantee, however, that your filter
      will run on new data.  In particular, new data may have 
      mangled headers, very large messages, binary data, or other
      anomalies.  For the *most part* the data consists of valid
      email messages with headers similar to those in the public
      corpus, but we cannot guarantee this to be the case for
      every message.

  (e) Test environment is the same as last year:  AMD64 hardware
      with Fedora Linux (64-bit) or Windows XP (32-bit).  Fedora
      Linux can compile and run 32-bit binaries but that is not
      the default.

      Internet access is not allowed; attempts to use it may
      cause filters to hang or crash [see (c)].  TCP/IP may be
      used locally via IP address 127.0.0.1.

      Machines will have 1GB of RAM.

      Time limit is 60+2n seconds, where n is the number of
      messages in the corpus.

  (f) The organizers will be happy to run a test of your filter
      on the evaluation system until July 6 -- one week prior to
      the submission deadline.  Just send a note to
      gvcormac@uwaterloo.ca.
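
As a convenience, the packaging rule in (a) and the time limit in (e)
can be checked before submitting.  The following is only a minimal
sketch and is not part of the toolkit; the function names
(missing_programs, archive_names, time_limit_seconds) are hypothetical,
and it assumes the required programs sit at the top level of the
archive.

```python
import tarfile
import zipfile

# Program names required by item (a); ".exe" variants are accepted on Windows.
REQUIRED = ("initialize", "classify", "train", "finalize")


def missing_programs(names):
    """Return the required programs absent from the archive's top level."""
    # Tar archives often store members as "./name"; strip that prefix.
    top = {n[2:] if n.startswith("./") else n for n in names}
    return [p for p in REQUIRED if p not in top and p + ".exe" not in top]


def archive_names(path):
    """List member names of a .zip or .tgz submission (hypothetical helper)."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as z:
            return z.namelist()
    with tarfile.open(path, "r:gz") as t:
        return t.getnames()


def time_limit_seconds(n_messages):
    """Time limit from item (e): 60 + 2n seconds for n messages."""
    return 60 + 2 * n_messages
```

For example, missing_programs(archive_names("myfilter.tgz")) should
return an empty list for a well-formed filter submission (the file name
here is made up), and time_limit_seconds(10000) gives 20060 seconds for
a corpus of 10,000 messages.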

-- 
Gordon V. Cormack     CS Dept, University of Waterloo, Canada N2L 3G1
gvcormac@uwaterloo.ca             http://cormack.uwaterloo.ca/cormack