This paper is the track report for the TREC-5 confusion track. In TREC-5, retrieval from corrupted data was studied by retrieving specific target documents from a corpus that had been corrupted by applying OCR techniques to page images of varying quality. Methods that attempted probabilistic estimation of the original clean text fared better than methods that simply accepted corrupted versions of the query text.
and Harman, D.
The Fifth Text Retrieval Conference [TREC-5], [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=151347
(Accessed July 3, 2022)