Baseline Model Implementation for Automatic Participation in the TREC 2015 Total Recall Track

The Baseline Model Implementation ("BMI") is free software (licence: GPL v3) that participants may use to automate the TREC 2015 Total Recall Track. BMI uses Continuous Active Learning (CAL) to fully automate the task required of Total Recall Track participants. The goal for participants is to achieve better effectiveness than BMI, either by modifying it or by implementing their own solutions from scratch.

Overview of BMI

To use BMI, participants must:

Installing VirtualBox

Installers for Linux, Mac OS X, and Windows are available from Oracle.

Debian and Debian-based Linux distributions provide a "virtualbox" package that can be installed with a package manager or from the command line with "sudo apt-get install virtualbox".

Be aware that VirtualBox will run the 64-bit BMI virtual machine only if the BIOS "Virtualization Technology" and "VT-d" features are enabled. It appears that many desktop-class machines disable these features by default. If you don't enable them, BMI will install but fail to boot.
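On a Linux host, a quick first check is whether the CPU advertises hardware virtualization at all. The sketch below counts the processors in a cpuinfo-style file whose flags include "vmx" (Intel) or "svm" (AMD). The function name and its optional file argument are illustrative assumptions. A count of zero means the CPU does not report the feature; a nonzero count does not prove that the BIOS has enabled it, so the BIOS setup screen remains the authoritative place to look.

```shell
#!/bin/sh
# count_virt_flags [FILE]: print how many "flags" lines in FILE list a
# hardware virtualization flag (Intel: vmx, AMD: svm).
# FILE defaults to /proc/cpuinfo (Linux only); the argument exists so the
# function can be tried against a saved copy of the file.
count_virt_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Flags are space-separated, so match the flag as a whole word.
    grep -E -c '^flags.* (vmx|svm)( |$)' "$cpuinfo"
}

# Example use on a Linux host:
#   count_virt_flags
```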

BMI requires 2GB of RAM to run and may consume up to 100GB of disk space for full runs. The host computer on which you install VirtualBox should have substantially more than this.
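Before starting a full run, it may be worth confirming that the host has enough free disk space. The following is a minimal sketch, not part of BMI: the function name is hypothetical, and the directory you pass should be the one where VirtualBox keeps its virtual disks on your system.

```shell
#!/bin/sh
# has_free_space DIR GIGABYTES: succeed if the filesystem holding DIR has at
# least GIGABYTES GB available. "df -Pk" reports 1024-byte blocks, with the
# available space in column 4 of the second output line.
has_free_space() {
    df -Pk "$1" | awk -v need_gb="$2" '
        NR == 2 { ok = ($4 >= need_gb * 1024 * 1024) }
        END     { exit !ok }'
}

# Example use:
#   has_free_space "$HOME" 100 || echo "less than 100GB free" >&2
```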

Installing BMI

On Linux, ensure that either wget or curl is installed, open a terminal, and type one of these commands:

On Mac OS X, open a terminal and type the following command:

On Windows, download and run the installer at

On all systems, the installer will create a virtual machine named TRECTR, and will install a folder "vmscripts" in the VirtualBox home folder. The name of the VirtualBox home folder is reported by the installer, and can also be determined using VirtualBox.

Configuring BMI

Configuring BMI Parameters

In the folder "vmscripts" you will find the file "TREC_Config.txt", which you may edit to supply your group name, the name of the test you wish to run, and the name of the TREC Total Recall server. The default contents of "TREC_Config.txt" are:
To perform simple testing, you do not need to change any of these settings.

When you receive a participant ID from TREC, you should replace the value of TRUSER with that ID. (The "test" ID is restricted to smaller tests.)

Barring unforeseen circumstances, it should never be necessary to change TRSERVER. TRSERVER is the name of the TREC server from which BMI fetches the datasets, topics, and relevance assessments. The interface to the TREC server is documented here.

TRTEST may (at the time of writing) be one of: "trivial", "test", and "bigtest".
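A typo in TRTEST is easy to make, so a small check before booting the VM can save a wasted run. The sketch below is an illustration only. It ASSUMES that TREC_Config.txt holds one NAME=value setting per line, which should be verified against the actual file, and both function names are hypothetical.

```shell
#!/bin/sh
# read_setting FILE NAME: print the value of NAME from FILE.
# ASSUMPTION: settings are stored one per line as NAME=value; the real
# TREC_Config.txt format may differ.
read_setting() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# is_known_trtest VALUE: succeed only for the test names listed above.
is_known_trtest() {
    case "$1" in
        trivial|test|bigtest) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use, run from the vmscripts folder:
#   is_known_trtest "$(read_setting TREC_Config.txt TRTEST)" ||
#       echo "unrecognized TRTEST value" >&2
```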

Configuring BMI Scripts

The folder "vmscripts/cmd.dir" is shared with the BMI virtual machine. Immediately after the BMI VM boots, it executes "vmscripts/cmd.dir/start" which is a bash script. This script can execute any command installed on the BMI VM (which is Debian Linux with developer tools installed). The script can also execute other scripts or (Debian) executable files contained in vmscripts.

The BMI VM is configured with a 2GB root drive and a 1TB scratch drive mounted as /tmp. All drives are reinitialized every time the BMI VM is rebooted.

Log files and other persistent information may be written to files in the vmscripts folder.

The scripts run as the user named "user".
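Because the VM's own drives are wiped at every boot, anything a start script needs to keep must go to the shared folder. The helper below is a minimal sketch of timestamped logging that a customized start script could use. The function name and the log file name are hypothetical, and the usage comment assumes that the directory containing the script is the shared "cmd.dir" folder as seen from inside the VM.

```shell
#!/bin/sh
# log_line LOGFILE MESSAGE...: append one UTC-timestamped line to LOGFILE.
log_line() {
    logfile="$1"
    shift
    printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$*" >> "$logfile"
}

# Example use inside a start script (ASSUMPTION: the script's own directory
# is the shared folder, so files written there survive a reboot):
#   shared_dir="$(cd "$(dirname "$0")" && pwd)"
#   log_line "$shared_dir/run.log" "VM booted, running as $(id -un)"
```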

Configuring the BMI VM

The password for "user" is "user" and the password for "root" is "root". You can sign on to the VM while it is running. However, since the disks are reinitialized at every boot, you will not be able to permanently install software. You can compile and/or install software into the shared folder.

Participants who need to modify the operating system (or to use an entirely different operating system, such as Windows) will need to create a new root disk (possibly by using VirtualBox's "clonehd" to copy the BMI root disk). However, the shared folder and scratch configurations should be preserved, and the VM must read and honor the contents of the TREC_Config.txt file in the shared folder.
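Cloning the root disk might look like the sketch below, which wraps the "VBoxManage clonehd SOURCE TARGET" command in a function with a dry-run switch. The function name, the DRY_RUN variable, and the disk file names in the usage comment are hypothetical; use "VBoxManage showvminfo TRECTR" to find the actual location of the BMI root disk on your host.

```shell
#!/bin/sh
# clone_root_disk SOURCE TARGET: copy a virtual disk image with
# "VBoxManage clonehd". With DRY_RUN=1 the command is printed, not executed.
clone_root_disk() {
    src="$1"
    dst="$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'VBoxManage clonehd %s %s\n' "$src" "$dst"
    else
        VBoxManage clonehd "$src" "$dst"
    fi
}

# Example use (file names are placeholders, not the real BMI disk names):
#   DRY_RUN=1 clone_root_disk bmi-root.vdi my-root.vdi
```

The cloned disk can then be modified and attached to the VM in place of the original, keeping the shared folder and scratch drive settings unchanged.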