
Gordon V. Cormack

Publications and Recorded Presentations

Grossman M.R. and Cormack G.V., The eDiscovery Medicine Show, Ohio State Technology Law Journal 18:2 (December 2021).

Cormack G.V. and Grossman M.R., Navigating Imprecision in Relevance Assessments on the Road to Total Recall: Roger and Me, SIGIR 2017.

Grossman M.R., Cormack G.V. and Roegiest A., Automatic and Semi-Automatic Document Selection for Technology-Assisted Review, SIGIR 2017.

Clarke C.L.A., Cormack G.V., Lin J. and Roegiest A., Ten Blue Links on Mars, WWW 2017.

Cormack G.V., Grossman M.R. and Roegiest A., TREC 2016 Total Recall Track Overview, The Text Retrieval Conference (TREC 2016).

Grossman M.R. and Cormack G.V., A Tour of Technology-Assisted Review, in Perspectives on Predictive Coding and Other Advanced Search Methods for the Legal Practitioner, American Bar Association, 2016.

Cormack G.V. and Grossman M.R., Scalability of Continuous Active Learning for Reliable High-Recall Text Classification, CIKM 2016.

Clarke C.L.A., Cormack G.V., Lin J. and Roegiest A., Total Recall: Blue Sky on Mars, ICTIR 2016.

Lease M., Cormack G.V., Nguyen A.T., Trikalinos T.A., and Wallace B.C., Systematic Review is e-Discovery in Doctor's Clothing, SIGIR 2016 MedIR Workshop.

Cormack G.V. and Grossman M.R., Engineering Quality and Reliability in Technology-Assisted Review, SIGIR 2016.

Zhang H., Lin J., Cormack G.V., and Smucker M.D., Sampling Strategies and Active Learning for Volume Estimation, SIGIR 2016.

Roegiest A. and Cormack G.V., Impact of Review-Set Selection on Human Assessment for Text Classification, SIGIR 2016.

Roegiest A. and Cormack G.V., An Architecture for Privacy-Preserving and Replicable High-Recall Retrieval Experiments, SIGIR 2016.

Grossman M.R. and Cormack G.V., Continuous Active Learning for TAR, Practical Law April/May 2016.

Roegiest A., Cormack G.V., Grossman M.R. and Clarke C.L.A., TREC 2015 Total Recall Track Overview, The Text Retrieval Conference (TREC 2015).

Hanbury A., Muller H., Balog K., Brodt T., Cormack G.V., Eggel I., Gollub T., Hopfgartner F., Kalpathy-Cramer J., Kando N., Krithara A., Lin J., Mercer S. and Potthast M., Evaluation-as-a-Service: Overview and Outlook, arXiv:1512.07454.

Cormack G.V. and Grossman M.R., Waterloo (Cormack) Participation in the TREC 2015 Total Recall Track, TREC 2015.

Cormack G.V. and Grossman M.R., Multi-Faceted Recall of Continuous Active Learning for Technology-Assisted Review, SIGIR 2015.

Roegiest A., Cormack G.V., Clarke C.L.A., and Grossman M.R., Impact of Surrogate Assessments on High-Recall Retrieval, SIGIR 2015.

Grossman M.R. and Cormack G.V., Comments on "The Implications of Rule 26(g) on the Use of Technology-Assisted Review", 2014 Fed. Cts. L. Rev. 8 (July 2014).

Cormack G.V. and Grossman M.R., Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery, SIGIR 2014.

Grossman M.R. and Cormack G.V., The Grossman-Cormack Glossary of Technology Assisted Review, with Foreword by John M. Facciola, U.S. Magistrate Judge, 2013 Fed. Cts. L. Rev. 7.

Grossman M.R. and Cormack G.V., Inconsistent Responsiveness Determination in Document Review: Difference of Opinion or Human Error?, Pace Law Review 32:2.

Grossman M.R. and Cormack G.V., Inconsistent Assessment of Responsiveness in E-Discovery: Difference of Opinion or Human Error?, ICAIL 2011 Workshop on Setting Standards for Searching Electronically Stored Information in Discovery Proceedings (DESI IV Workshop).

Grossman M.R. and Cormack G.V., Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, Richmond Journal of Law and Technology 17:3.

Buettcher S., Clarke C.L.A. and Cormack G.V., Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.

Clarke C.L.A., Cormack G.V., Lynam T.R., Buckley C. and Harman D., Swapping Documents and Terms, Information Retrieval 12:6 (December 2009), 680-694.

Cormack G.V., Reflections on CEAS 2009, Virus Bulletin (August 2009).

Cormack G.V. and Kolcz A., Spam Filter Evaluation with Imprecise Ground Truth, SIGIR 2009.

Cormack G.V., Clarke C.L.A., and Buettcher S., Reciprocal Rank Fusion outperforms Condorcet and individual Rank Learning Methods, SIGIR 2009.

Cormack G.V. and Jose-Marcio Martins da Cruz, On the Relative Age of Spam and Ham Training Samples for Email Filtering, SIGIR 2009.

Kolcz A. and Cormack G.V., Genre-based Decomposition of Email Class Noise, KDD 2009.

Sculley D. and Cormack G.V., Going Mini: Extreme Lightweight Spam filters, CEAS 2009.

Martins da Cruz J.M. and Cormack G.V., Using Old Spam and Ham Samples to Train Email Filters, CEAS 2009.

Cormack G.V. and Mojdeh M., Autonomous Personal Filtering Improves Global Spam Filter Performance, CEAS 2009.

Cormack G.V., Email Spam Filtering: A Systematic Review, NOW Publishers, 2008.

Clarke C.L.A., Kolla M., Cormack G.V., Vechtomova O., Ashkan A., Buttcher S. and MacKinnon I., Novelty and Diversity in Information Retrieval Evaluation, SIGIR 2008, July 2008.

Mojdeh M. and Cormack G.V., Semi-supervised Spam Filtering: Does it Work?, SIGIR 2008, July, 2008.

Sculley D. and Cormack G.V., Filtering Email Spam in the Presence of Noisy User Feedback, CEAS 2008.

Mojdeh M. and Cormack G.V., A Mail Client Plugin for Privacy-Preserving Spam Filter Evaluation, CEAS 2008.

Kemkes G., Cormack G.V., Munro I. and Vasiga T., New Task Types at the Canadian Computing Competition, Olympiads in Informatics 1 (2007), pp 79-89.

Cormack G.V. and Lynam T.R., On-line Supervised Spam Filter Evaluation, ACM Transactions on Information Systems 25:3 (July 2007).

Goodman J., Cormack G.V. and Heckerman D., Spam and the ongoing battle for the inbox, Communications of the ACM 50(2), February 2007.

Cormack G.V., Hidalgo J.M.G., and Sanz E.P., Spam Filtering for Short Messages, Sixteenth ACM Conference on Information and Knowledge Management (CIKM 2007), November 2007.

Cormack G.V., TREC 2007 Spam Track Overview, Proceedings of TREC 2007: The Sixteenth Text Retrieval Conference, November 2007.

Cormack G.V., University of Waterloo Participation in the TREC 2007 Spam Track, Proceedings of TREC 2007: The Sixteenth Text Retrieval Conference, November 2007.

Buttcher S., Clarke C.L.A., Cormack G.V. and Lynam T.R., MultiText Legal Experiments at TREC 2007, Proceedings of TREC 2007: The Sixteenth Text Retrieval Conference, November 2007.

Lushman B. and Cormack G.V., A larger decidable semiunification problem, 9th ACM Symposium on Principles and Practice of Declarative Programming (PPDP 07), July 2007.

Cormack G.V. and Lynam T.R., Validity and power of t-test for comparing MAP and GMAP, SIGIR 2007, July 2007.

Cormack G.V. and Lynam T.R., Power and bias of subset pooling strategies, SIGIR 2007, July 2007.

Cormack G.V., Gomez-Hidalgo J.M. and Sanz E.P., Feature Engineering for Mobile (SMS) Spam filtering, SIGIR 2007, July 2007.

Cormack G.V., Data Compression Models for Classification and Behavior Prediction, invited address at NATO Advanced Workshop: Security Informatics and Terrorism -- Patrolling the Web, June 2007.

Cormack G.V., Content Based Web Spam Detection, Web Spam Challenge session at AIRWeb 2007, April 2007. (Presentation slides.)

Bratko A., Cormack G. V., Filipic B., Lynam T. R. and Zupan B., Spam Filtering Using Statistical Data Compression Models, Journal of Machine Learning Research 7 (Dec 2006), 2699-2720.

Verhoeff T., Horvath G., Diks K. and Cormack G.V., A proposal for an IOI Syllabus, Teaching Mathematics and Computer Science 4:1, 2006.

Cormack G.V., Random Factors in IOI 2005 Test Case Scoring, Informatics in Education 5:1, 2006, pp 5-14.

Cormack G.V., Munro I., Vasiga T. and Kemkes G., Structure, Scoring and Purpose of Computing Competitions, Informatics in Education 5:1, 2006, pp 15-36.

Cormack G.V., TREC 2005 Spam Track Report, Virus Bulletin, January 2006.

Clarke C.L.A., Cormack G.V., Lynam T.R., Terra E.L. and Laszlo M., Question Answering by Passage Selection, in Advances in Open Domain Question Answering, Strzalkowski and Harabagiu (editors), Springer, 2006.

Cormack G.V., TREC 2006 Spam Track Overview, Proceedings of TREC 2006: The Fifteenth Text Retrieval Conference, November 2006.

Kemkes G., Vasiga T. and Cormack G.V., Objective Scoring for Computing Competition Tasks, ISSEP 06 - Informatics in Secondary Schools Evolution and Perspectives, Vilnius, November 2006.

Cormack G.V., Harnessing Unlabeled Examples through Iterative Application of Dynamic Markov Modeling, Winner of spam filtering performance award - Task B, ECML/PKDD 2006 Discovery Challenge, Berlin, September 2006.

Cormack G.V. and Lynam T.R., Statistical Precision of Information Retrieval Evaluation, SIGIR 2006, August 2006.

Lynam T.R. and Cormack G.V., On-line Spam Filter Fusion, SIGIR 2006, August 2006.

Cormack G.V. and Bratko A., Batch and On-line Spam Filter Evaluation, CEAS 2006 - Third Conference on Email and Anti-Spam, Mountain View, July 2006.

Cormack G.V., A Multi-Corpus Evaluation of Dynamic Markov Coding for Spam Filtering, presentation to Microsoft, April 3, 2006.

Cormack G.V., Spam filters: Do they work? Can you prove it?, presentation to the Computer Science Club of the University of Waterloo, February 15, 2006.

Cormack G.V., Statistical Analysis of IOI Scoring, Perspectives on Computer Science Competitions for (High School) Students, Dagstuhl, January 2006.

Cormack G.V., Kemkes G., Munro I. and Vasiga T., Structure, Scoring and Purpose of Computing Competition, Perspectives on Computer Science Competitions for (High School) Students, Dagstuhl, January 2006.

Cormack G.V., Clarke C.L.A. and Palmer C.R., MultiText Experiments for TREC, in Experiment and Evaluation in Information Retrieval (Ellen M. Voorhees and Donna K. Harman eds), MIT Press, 2005.

Cormack G.V., Challenges in Spam Filter Evaluation, Virus Bulletin, April 2005.

Cormack G.V. and Lynam T.R., TREC 2005 Spam Track Overview, in Proc. TREC 2005 - the Fourteenth Text REtrieval Conference, Gaithersburg, 2005.

Cormack G.V. and Lynam T.R., Spam Corpus Creation for TREC, Proc. CEAS 2005 - The Second Conference on Email and Anti-Spam, Palo Alto, July 2005.

Buttcher S., Clarke C.L.A., and Cormack G.V., Domain-Specific Synonym Expansion and Validation, in Proceedings of TREC 2004 - the Thirteenth Text REtrieval Conference, Gaithersburg, 2004.

Lynam T.R., Buckley C., Clarke C.L.A., and Cormack G.V., A Multi-System Analysis of Document and Term Selection for Blind Feedback. Proceedings of the 13th Conference on Information and Knowledge Management, Washington D.C. (November 2004).

Lushman B. and Cormack G.V., A Proof of Correctness of Ressel's adOPTed Algorithm, Information Processing Letters 18:6, 2003, pp 303-310.

Yeung D.L., Clarke C.L.A., Cormack G.V., Lynam T.R., and Terra E.L., Task-Specific Query Expansion, Proceedings of the Twelfth Text Retrieval Conference (TREC 2003), Gaithersburg, MD (November 2003).

Clarke C.L.A., Cormack G.V., Kemkes G., Laszlo M., Lynam T.R., Terra E.L. and Tilker P.L., Statistical Selection of Exact Answers, Proceedings of the Eleventh Text Retrieval Conference (TREC 2002), Gaithersburg, MD (November 2002). 

Clarke C.L.A., Cormack G.V., Laszlo M., Lynam T.R. and Terra E.L., The Impact of Corpus Size on Question Answering Performance, Proceedings of the ACM SIGIR Conference on Information Retrieval, Tampere (August 2002). 

Clarke C.L.A., Cormack G.V., Lynam T.R., Li C.M. and McLearn G.L., Web Reinforced Question Answering, Proceedings of the Tenth Text Retrieval Conference (TREC 2001), Gaithersburg, MD (November 2001).

Clarke C.L.A., Cormack G.V., Kisman D.I.E. and Lynam T.R., Question Answering by Passage Selection, Proceedings of the Ninth Text Retrieval Conference (TREC 2000), Gaithersburg MD (November 2000). 

Clarke C.L.A., Cormack G.V. and Lynam T.R., Exploiting Redundancy in Question Answering, Proceedings of SIGIR 2001, New Orleans (September 2001).

Charles L. A. Clarke and Gordon V. Cormack. Shortest Substring Retrieval and Ranking. ACM Transactions on Information Systems 18:1, 2000, pp 44-78.

Charles L. A. Clarke, Gordon V. Cormack and Elizabeth A. Tudhope. Relevance Ranking for One to Three Term Queries. Information Processing and Management 36:2, 2000, pp 291-311.

Gordon V. Cormack, Charles L. A. Clarke, Christopher R. Palmer and Samuel S. L. To. Passage-Based Query Refinement. Information Processing and Management 36:1, 1999, pp 133-153.

Cormack G.V., Clarke C.L.A., Palmer C.R. and Kisman D.I.E., Fast Automatic Passage Ranking, Proceedings of the Eighth Text Retrieval Conference (TREC-8), Gaithersburg MD (November 1999).

Cormack G.V., Clarke C.L.A., Palmer C.R., Good R.C., The MultiText Retrieval System, Proceedings of SIGIR 99, the ACM International Conference on Research and Development in Information Retrieval, Berkeley, August 1999.

Cormack G.V., Lhotak O., Palmer C.R., Estimating Precision by Random Sampling, Proceedings of SIGIR 99, the ACM International Conference on Research and Development in Information Retrieval, Berkeley, August 1999.
 
Palmer C.R. and Cormack G.V., Operation Transforms for a Distributed Shared Spreadsheet, Proceedings of CSCW 98, the ACM Conference on Computer Supported Cooperative Work, Seattle, 1998.  

Cormack G.V., Palmer C.R. and Clarke C.L.A., Efficient Construction of Large Test Collections, Proceedings of SIGIR 98, the ACM Annual Conference on Research and Development in Information Retrieval, Melbourne, August 1998. 



Cormack G.V., Palmer C.R., Van Biesbrouck M., Deriving Very Short Boolean Queries for High Precision and Recall (MultiText Experiments for TREC-7), Proceedings of the Seventh Text REtrieval Conference (TREC-7), Gaithersburg, Maryland, November 1998.

Clarke C.L.A. and Cormack G.V., On the use of regular expressions for searching text, ACM Trans. Programming Languages and Systems 19 (May 1997), 413-426. See also CS-95-07.

Cormack G.V., Clarke C.L.A., Palmer C.R. and To S.L., Passage Based Refinement (MultiText Experiments for TREC-6), Proceedings of The Sixth Text REtrieval Conference (TREC-6), Gaithersburg, Maryland, November 1997.

Clarke C.L.A. and Cormack G.V., Relevance Ranking for One-to-Three-Term Queries, Proceedings of 5th RIAO - Recherche d'Information Assistee par Ordinateur sur Internet, Montreal, June 1997.

Duggan D., Cormack G. and Ophel J., Kinded type inference for parametric overloading, Acta Informatica 33:1 (1996), 21-68.

Good R.C., Cormack G.V., Clarke C.L.A. and Taylor D.J., A Robust Storage System Architecture, Journal of Computing and Information 2:1 (1996), 796-808. 

Clarke C.L.A. and Cormack G.V., Interactive Substring Retrieval (MultiText Experiments for TREC-5), Proceedings of the Fifth Text REtrieval Conference (TREC-5), Gaithersburg, Maryland, November 1996.


Clarke C.L.A., Cormack G.V. and Burkowski F.J., An algebra for structured text search and a framework for its implementation, Computer Journal 38:1 (1995), 43-56. See also CS-94-30.

Shen J. and Cormack G.V., On abstraction and sharing in generic modules, in Object-oriented technology for database and software, Proceedings of the Colloquium on Object Orientation in Databases and Software Engineering (COODBSE'94), 16-17 May 1994 (V.S. Alagar and R. Missaoui eds.), World Scientific (1995), 22-41.

Charles L. A. Clarke, Gordon V. Cormack and Forbes J. Burkowski, Shortest Substring Ranking (MultiText Experiments for TREC-4), Proceedings of the Fourth Text REtrieval Conference (TREC-4), Gaithersburg, Maryland, November 1995.

Cormack G.V., A Calculus for Concurrent Update (Abstract), Proc. 14th ACM Symposium on Principles of Distributed Computing (1995). See also CS-95-06.

Clarke C.L.A., Cormack G.V. and Burkowski F.J., Schema-independent retrieval from heterogeneous structured text, Proc. Fourth Annual Symposium on Document Analysis and Information Retrieval, Las Vegas (Apr. 1995), 279-290. See also CS-94-39.



Shen J. and Cormack G., Access control for private declarations in Ada, Comput. Lang. 20:2 (1994), 117-126.

Shen J., Cormack G.V. and Duggan D., Dynamic Polymorphism in Ada, Proc. Joint Modular Language Conference, in Advances in Modular Languages, Universitatsverlag Ulm GmbH (September 1994), 521-539.

Shen J., Cormack G.V., and Duggan D., Local package instances are not equivalent to generic formal package parameters, ACM Ada Letters 12:6 (1992), 47-49.

Shen J. and Cormack G.V., On generic formal package parameters in Ada 9X, ACM Ada Letters 12:3 (1992), 110-116.

Horspool R.N. and Cormack G.V., Constructing Word-Based Text Compression Algorithms, Proc. 2nd IEEE Data Compression Conference, Snowbird (April 1992). 

Shen J. and Cormack G.V., Automatic instantiation in Ada, Proc. ACM Tri-Ada Conference, San Jose (October 1991), 338-346.


Dueck G. and Cormack G.V., Modular attribute grammars, Computer Journal 33:2 (April 1990), 164-172.


Burkowski F.J. and Cormack G.V., Use of perfect hashing in a paged memory management unit, Proc. 1990 International Conference on Parallel Processing, Illinois (1990), 96-100.

Cormack G.V. and Wright A.K., Type-dependent parameter inference, Proc. ACM SIGPLAN 90 Conference on Programming Language Design and Implementation, Sigplan Not. 25:6 (1990), 127-136.

Cormack G.V., An LR substring parser for noncorrecting syntax error recovery, Proc. ACM SIGPLAN 89 Conference on Programming Language Design and Implementation, Sigplan Not. 24:7 (1989), 161-169.

Salomon D. and Cormack G.V., Scannerless Parsing of Programming Languages, Proc. ACM SIGPLAN 89 Conference on Programming Language Design and Implementation, Sigplan Not. 24:7 (1989), 170-178.

Burkowski F., Cormack G. and Dueck G., Architectural support for synchronous task communication, Proc. ASPLOS III - 3rd ACM/IEEE Symposium on Architectural Support for Programming Languages and Operating Systems, in Operating Syst. Rev. 23 (April 1989), 40-53.

Cormack G.V., Data Compression, Encyclopedia of Science and Technology Yearbook, McGraw-Hill (1988).

Cormack G.V., A Micro Kernel for Concurrency in C, Software: Practice and Experience 18:4 (March 1988), 485-491.

Horspool R.N. and Cormack G.V., Hashing as a Compaction Technique for LR Parser Tables, Software: Practice and Experience 17:6 (June 1987), 413-416.

Cormack G.V., and Horspool R.N., Data Compression using Dynamic Markov Modelling, Computer Journal 30:6 (Dec. 1987), 541-550.

Strothotte T., and Cormack G., Structured Program Lookahead, Computer Languages 12:2 (1987), 95-108.

Horspool R.N. and Cormack G.V., Comments on "A locally adaptive data compression scheme," Commun. ACM 30:9 (Sept. 1987), 792-793.

Cormack G.V. and Wright A.K., Polymorphism in the Compiled Language ForceOne, Proc. 20th Hawaii International Conference on System Sciences (Jan. 1987), 284-292.

Horspool R.N. and Cormack G.V., Comments on "Data compression using static Huffman code-decode tables," Commun. ACM 29:2 (Feb. 1986), 150-152.

Cormack G.V. and Burkowski F.J., Distributed synchronous process communication, Congressus Numerantium 62 (1986). 

Burkowski F.J., Cormack G.V., Dyment J.D., Pachl J.K., A message-based architecture for high concurrency, Proc. Conference on Hypercube Multiprocessors, Knoxville Tennessee (Aug. 1985), in Hypercube Multiprocessors (M.T. Heath ed.), SIAM (1986), 27-37.

Horspool R.N. and Cormack G.V., Dynamic Markov Modelling -- A Prediction Algorithm, Proc. 19th Hawaii International Conference on System Sciences Vol. II (Jan. 1986) 700-707.

Cormack G.V., Data Compression on a Database System, Commun. A.C.M. 28:12 (Dec. 1985), 1336-1342.

Cormack G.V., Horspool R.N., and Kaiserswerth M., Practical Perfect Hashing, Computer Journal 28:1 (1985), 54-58.

Karasick M. and Cormack G.V., An efficient parser for reasonably deterministic grammars, Congressus Numerantium 46 (1985), 149-157.

Cormack G.V., and Horspool R.N., Algorithms for Adaptive Huffman codes, Inf. Proc. Letters 18:3 (1984), 159-165.

Horspool R.N., and Cormack G.V., A practical method for data compression with computer applications. CIPS Session 84, Calgary (1984).

Cormack G.V., Extensions to Static Scoping, Proc. ACM SIGPLAN Conference on Programming Language Issues in Software Systems: Sigplan Notices 18:6 (1983), 187-191.

Baillie R., Cormack G.V., and Williams H.C., The problem of Sierpinski concerning k·2^n + 1, Math. Comp. 37:155 (1981), 229-231.

Cormack G.V., An algorithm for the selection of overloaded functions in Ada, ACM SIGPLAN Notices 16:2 (1981), 48-52.

Cormack G.V. and Williams H.C., Some very large primes of the form k·2^n + 1, Math. Comp. 35:152 (1980), 1419-1421.

Williams H.C., Cormack G.V., and Seah E., Calculation of the regulator of a pure cubic field, Math. Comp. 34 (1980), 567-611.

Cormack G.V. and Grätzer G., Using directed graphs for text compression, C. R. Math. Rep. Acad. Sci. Canada 2:4 (1980), 193-198.

King P.R., Cormack G.V. and Dueck G., Maps as concrete data structures, Congress. Numer. XXII (1979), 329-342.

King P.R., Cormack G.V. and Dueck G., Language features of MABEL, Congress. Numer. XXII (1979), 343-366.

King P.R., Cormack G., Dueck G., Jung N., Kusner G. and Melnyk J., MABEL: A Beginner's Programming Language, Algol Bulletin 43 (1978), 54-83.