From: "Gordon V. Cormack"
To: Multiple recipients of list
Subject: Blind Spam Filter Evaluation

Testing will be blind; NIST will substitute random run identifiers before forwarding submissions to testers. After testing, NIST will reverse the substitution and communicate raw results to participants. Filters will be evaluated mechanically, and there will be no opportunity for the testers to nurse the filters or to interact with participants should the filters fail.

Please note that:

(a) Each filter submission is a .zip or .tgz file that unpacks into the current directory. After unpacking, the files "initialize", "classify", "train", and "finalize" should be present. (On Windows systems, "initialize.exe", "classify.exe", "train.exe", and "finalize.exe" are acceptable alternatives.) Additional files may be included as required to run your filter.

Each active learning shell submission is a .zip or .tgz file that unpacks into the current directory. After unpacking, the file "active.cpp" must be present. Additional files may be included as necessary to run your active learning shell.

(b) Each result submission should be a single file containing the result of running your filter with the toolkit on a particular corpus.

(c) Your filter must be robust. If it hangs, it will remain hung until the time limit is exceeded. If it crashes, the run script may be able to carry on, but you are taking your chances.

(d) You should test your filter on at least the TREC 2005 public corpus and the sample Chinese corpus. If it does not run to completion on these, there is little chance it will do so on the TREC 2006 corpora. A successful test does not guarantee, however, that your filter will run on new data. In particular, new data may have mangled headers, very large messages, binary data, or other anomalies. For the *most part* the data consists of valid email messages with headers similar to those in the public corpus, but we cannot guarantee this to be the case for every message.

(e) The test environment is the same as last year: AMD64 hardware with Fedora Linux (64-bit) or Windows XP (32-bit). Fedora Linux can compile and run 32-bit binaries, but that is not the default. Internet access is not allowed, and attempting it may cause filters to hang or crash [see (c)]. TCP/IP may be used locally using IP address 127.0.0.1. Machines will have 1GB of RAM. The time limit is 60+2n seconds, where n is the number of messages in the corpus.

(f) The organizers will be happy to run a test of your filter on the evaluation system until July 6, one week prior to the submission deadline. Just send a note to gvcormac@uwaterloo.ca

--
Gordon V. Cormack
CS Dept, University of Waterloo, Canada N2L 3G1
gvcormack@uwaterloo.ca
http://cormack.uwaterloo.ca/cormack
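
The file requirements in (a) and the time limit in (e) can be checked locally before submitting. What follows is only a minimal Python sketch under stated assumptions, not part of the TREC toolkit: the function names missing_filter_files and time_limit_seconds are hypothetical, and the check looks only at whether the files exist in the unpacked directory, not whether they run.

```python
import os

# File names taken from item (a); everything else here is a hypothetical helper.
REQUIRED = ("initialize", "classify", "train", "finalize")


def missing_filter_files(directory, windows=False):
    """Return the required filter files that are absent from `directory`."""
    missing = []
    for name in REQUIRED:
        candidates = [name]
        if windows:
            # Item (a) accepts the .exe variants on Windows systems.
            candidates.append(name + ".exe")
        if not any(os.path.isfile(os.path.join(directory, c)) for c in candidates):
            missing.append(name)
    return missing


def time_limit_seconds(n_messages):
    """Time limit from item (e): 60 + 2n seconds for a corpus of n messages."""
    return 60 + 2 * n_messages
```

For example, a corpus of 1000 messages gives time_limit_seconds(1000) = 2060 seconds, and an empty list from missing_filter_files(".") means all four files were found in the current directory.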