The Text REtrieval Conference (TREC) is a series of annual workshops designed to build the infrastructure for large-scale evaluation of search systems and thus improve the state of the art. Each workshop is organized around a set of "tracks", challenge problems that focus effort in particular research areas. The most recent TRECs have contained a Medical Records track whose goal is to enable semantic access to the free-text fields of electronic health records. Such access will enhance clinical care and support the secondary use of health records. The specific search task used in the track was a cohort-finding task. A search request described the criteria for inclusion in a (possible, but not actually planned) clinical study, and the systems searched a set of de-identified clinical reports to identify candidates who matched the criteria. As anticipated, the search results demonstrate that language use within electronic health records is sufficiently different from general use to warrant domain-specific processing. Top-performing systems each used some sort of vocabulary normalization device specific to the medical domain to accommodate the array of abbreviations, acronyms, and other informal terminology used to designate medical procedures and findings in the records. The use of negative language is also much more prevalent in health records (e.g., "patient denies pain", "no fever") and thus requires appropriate handling for good search results.
ACM Conference on Bioinformatics, Computational Biology, and Biomedical Informatics 2013