Search Publications by: Ian Soboroff (Fed)

Displaying 1 - 25 of 61

Overview of the NTCIR-17 FairWeb-1 Task

February 13, 2024
Sijie Tao, Nuo Chen, Tetsuya Sakai, Zhumin Chu, Hiromi Arai, Ian Soboroff, Nicola Ferro, Maria Maistro
This paper provides an overview of the NTCIR-17 FairWeb-1 Task. FairWeb-1 is an English web search task that goes beyond ad hoc web search. Our task considers not only document relevance but also group fairness. We designed three types of

The BETTER Cross-Language Information Retrieval Datasets

July 27, 2023
Ian Soboroff
The IARPA BETTER (Better Extraction from Text Through Enhanced Retrieval) program held three evaluations of information retrieval (IR) and information extraction (IE). For both tasks, the only training data available was in English, but systems had to

What Makes a Good Podcast Summary?

July 11, 2022
Rezvaneh Rezapour, Sravana Reddy, Rosie Jones, Ian Soboroff
Abstractive summarization of podcasts is motivated by the growing popularity of podcasts and the needs of their listeners. Podcasting is a markedly different domain from news and other media that are commonly studied in the context of automatic

Overview of TREC 2021

May 6, 2022
Ian Soboroff
TREC 2021 is the thirtieth edition of the Text REtrieval Conference (TREC). The main goal of TREC is to create the evaluation infrastructure required for large-scale testing of retrieval technology. This includes research on best methods for evaluation as

Can Old TREC Collections Reliably Evaluate Modern Neural Retrieval Models?

January 26, 2022
Ellen M. Voorhees, Ian Soboroff, Jimmy Lin
Neural retrieval models are generally regarded as fundamentally different from the retrieval techniques used in the late 1990s when the TREC ad hoc test collections were constructed. They thus provide the opportunity to empirically test the claim that

PSCR 2021: Social Media Incident Streams

October 1, 2021
Ian Soboroff
Monitoring social media for public safety is incredibly challenging. The TREC Social Media Incident Streams project collects social media during emergency events, annotates and labels it for public safety use, and provides a metrics-focused environment

PSCR 2021: Pecha Kucha Portfolio Overviews

September 28, 2021
John Beltz, Scott Ledgerwood, Roger Blalock, Joe Grasso, John S. Garofolo, Jesse Frey, Cara O'Malley, Fernando Cintron, Bill Fisher, Gema Howell, Yee-Yin Choong, Jack Lewis, Paul Merritt, Edmond J. Golden III, Ian Soboroff, Craig Connelly, Gary Howarth, Brianna Vendetti, Katelynn Kapalo, Margaret Pinson
PSCR Research Portfolio Leaders join their staff to provide an overview of the projects housed within their PSCR portfolio. Each portfolio overview is delivered in a traditional Pecha Kucha style presentation, dividing topics into 20 slides that when

Searching for Answers in a Pandemic: An Overview of TREC-COVID

September 1, 2021
Ellen M. Voorhees, Ian Soboroff, Kirk Roberts, Tasmeer Alam, Steven Bedrick, Dina Demner-Fushman, Kyle Lo, Lucy L. Wang, William Hersh
We present an overview of the TREC-COVID Challenge, an information retrieval (IR) shared task to evaluate search on scientific literature related to COVID-19. The goals of TREC-COVID include the construction of a pandemic search test collection and the

TREC Deep Learning Track: Reusable Test Collections in the Large Data Regime

July 11, 2021
Ellen M. Voorhees, Ian Soboroff, Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos
The TREC Deep Learning (DL) Track studies ad hoc search in the large data regime, meaning that a large set of human-labeled training data is available. Results so far indicate that the best models with large data are likely deep neural networks. This paper

TREC 2020 News Track Overview

May 21, 2021
Ian Soboroff, Shudong Huang, Donna Harman
The News track focuses on information retrieval in the service of helping people read the news. In 2018, in cooperation with the Washington Post, we released a new collection of nearly 600,000 news articles, and crafted two tasks related to how news is

TREC-COVID: Constructing a Pandemic Information Retrieval Test Collection

February 19, 2021
Ellen M. Voorhees, Ian Soboroff, Tasmeer Alam, William Hersh, Kirk Roberts, Dina Demner-Fushman, Kyle Lo, Lucy L. Wang, Steven Bedrick
TREC-COVID is a community evaluation designed to build a test collection that captures the information needs of biomedical researchers using the scientific literature during a pandemic. One of the key characteristics of pandemic search is the accelerated

PSCR 2020_Social Media Incident Streams

October 29, 2020
Ian M. Soboroff
The ubiquity of mobile internet-enabled devices combined with widespread social media use during emergencies is posing new challenges for response personnel. In particular, service operators are now expected to monitor these online channels to extract

International Workshop on Deep Video Understanding

October 21, 2020
Keith Curtis, George Awad, Shahzad K. Rajput, Ian Soboroff
This is the introductory paper to the International Workshop on Deep Video Understanding. In recent years, a growing trend towards understanding videos (in particular movies) at a deeper level has started to motivate researchers working in

TREC-COVID: Rationale and Structure of an Information Retrieval Shared Task for COVID-19

July 8, 2020
Ellen M. Voorhees, Ian Soboroff, Tasmeer Alam, Kirk Roberts, William Hersh, Dina Demner-Fushman, Steven Bedrick, Kyle Lo, Lucy L. Wang
TREC-COVID is an information retrieval (IR) shared task initiated to support clinicians and clinical research during the COVID-19 pandemic. IR for pandemics breaks many normal assumptions, which can be seen by examining nine important basic IR research

Overview of the NIST 2016 LoReHLT Evaluation

November 13, 2017
Audrey N. Tong, Lukasz L. Diduch, Jonathan G. Fiscus, Yasaman Haghpanah, Shudong Huang, David M. Joy, Kay Peterson, Ian M. Soboroff
Initiated in conjunction with DARPA's Low Resource Languages for Emergent Incidents (LORELEI) Program, the NIST LoReHLT (Low Resource Human Language Technology) evaluation series seeks to incubate research on fundamental natural language processing tasks

Using Replicates in Information Retrieval Evaluation

August 2, 2017
Ellen M. Voorhees, Daniel V. Samarov, Ian M. Soboroff
This paper explores a method for more accurately estimating the main effect of the system in a typical test-collection-based evaluation of information retrieval systems, and thus increasing the sensitivity of system comparisons. Randomly partitioning the

Promoting Repeatability Through Open Runs

June 7, 2016
Ellen M. Voorhees, Shahzad K. Rajput, Ian M. Soboroff
TREC 2015 introduced the concept of ‘Open Runs’ in response to the increasing focus on repeatability of information retrieval experiments. An Open Run is a TREC submission backed by a software repository such that the software in the repository reproduces

Computing confidence intervals for common IR measures

December 9, 2014
Ian M. Soboroff
Confidence intervals quantify the uncertainty in an average and offer a robust alternative to hypothesis testing. We measure the performance of standard and bootstrapped confidence intervals on a number of common IR measures using several TREC and NTCIR

Overview of the TREC-2012 Microblog Track

June 2, 2014
Ian M. Soboroff, Iadh Ounis, Jimmy Lin, Craig Macdonald
The Microblog track examines search tasks and evaluation methodologies for information seeking behaviours in microblogging environments such as Twitter. It was first introduced in 2011, addressing a real-time adhoc search task, whereby the user

Building Better Search Engines by Measuring Search Quality

March 3, 2014
Ellen M. Voorhees, Paul D. Over, Ian Soboroff
Search engines help users locate particular information within large stores of content developed for human consumption. For example, users expect web search engines to direct searchers to web sites based on the content of the site rather than the site

Overview of the TREC 2011 Microblog Track

August 15, 2013
Ian M. Soboroff, Iadh Ounis, Craig Macdonald, Jimmy Lin
The Microblog track examines search tasks and evaluation methodologies for information seeking behaviors in microblogging environments such as Twitter. It was first introduced in 2011, addressing a real-time adhoc search task, whereby the user wishes to

Evaluating Real-Time Search over Tweets

December 10, 2012
Ian M. Soboroff, Dean P. McCullough, Jimmy Lin, Craig Macdonald, Iadh Ounis, Richard McCreadie
Twitter offers a phenomenal platform for the social sharing of information. We describe new resources that have been created in the context of the Text Retrieval Conference (TREC) to support the academic study of Twitter as a real-time information source

Overview of the TREC-2010 Blog Track

August 15, 2012
Ian M. Soboroff, Iadh Ounis, Craig Macdonald
The Blog track aims to investigate the information seeking behavior in the blogosphere. The track was initiated in 2006, and has used an incremental approach in tackling several search tasks by their level of difficulty. In TREC 2010, the track has