This paper is the track report for the TREC-5 confusion track. For TREC-5, retrieval from corrupted data was studied through the retrieval of specific target documents from a corpus that had been corrupted by applying OCR techniques to page images of varying quality. Methods that attempted probabilistic estimation of the original clean text fared better than methods that simply accepted corrupted versions of the query text.
and Voorhees, E.
Report on the TREC-5 Confusion Track, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=151345
(Accessed November 30, 2023)