Abstract
Information retrieval performance evaluation is commonly based on the classical recall- and precision-based figures or graphs. However, important information about the causes of variation may remain hidden beneath the average recall and precision figures. Identifying significant causes of variation can help researchers and developers focus on the opportunities for improvement that underlie the averages. This paper presents a case study showing the potential of a statistical repeated-measures analysis of variance for testing the significance of factors in retrieval performance variation. The TREC-9 Query Track performance data are used as a case study, and the factors studied are retrieval method, topic, and their interaction. The results show that retrieval method, topic, and their interaction are all significant. A topic-level analysis is also carried out to examine how the performance of retrieval methods varies across topics. The observed retrieval performance gains of expansion runs are genuinely significant improvements for most of the topics. Analyses of the effect of query expansion on document ranking confirm that expansion affects ranking positively.
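To make the analysis design concrete: a repeated-measures ANOVA treats each topic as a "subject" measured under every retrieval method, so the method effect is tested against the method-by-topic residual rather than raw between-topic variance. The sketch below is a minimal one-way repeated-measures ANOVA in Python; the data are synthetic and illustrative (not the TREC-9 Query Track figures), and the function name `rm_anova` is our own.

```python
import numpy as np
from scipy.stats import f as f_dist

def rm_anova(X):
    """One-way repeated-measures ANOVA.

    X: (n_subjects, k_conditions) score matrix -- here rows are topics
    and columns are retrieval methods, mirroring the paper's design.
    Returns (F, p) for the condition (method) effect.
    """
    n, k = X.shape
    gm = X.mean()
    ss_method = n * ((X.mean(axis=0) - gm) ** 2).sum()
    ss_subject = k * ((X.mean(axis=1) - gm) ** 2).sum()
    ss_total = ((X - gm) ** 2).sum()
    ss_error = ss_total - ss_method - ss_subject   # method-by-topic residual
    df_method, df_error = k - 1, (n - 1) * (k - 1)
    F = (ss_method / df_method) / (ss_error / df_error)
    p = f_dist.sf(F, df_method, df_error)
    return F, p

# Illustrative synthetic data: 8 topics x 3 methods, with per-topic
# difficulty, a built-in method effect, and small deterministic "noise".
n_topics, n_methods = 8, 3
base = np.linspace(0.2, 0.6, n_topics)[:, None]   # per-topic difficulty
effect = np.array([0.0, 0.05, 0.10])[None, :]     # per-method gain
noise = ((7 * np.arange(n_topics)[:, None]
          + 3 * np.arange(n_methods)[None, :]) % 5) * 0.001
X = base + effect + noise

F, p = rm_anova(X)
print(f"F = {F:.2f}, p = {p:.4g}")
```

Because topic variance is removed as a subject effect, even modest per-method gains yield a large F here; with real run data one would also inspect the interaction, as the paper does.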
Keywords
analysis of variance, information retrieval, performance evaluation, query expansion, TREC
Citation
Smeaton, A. (2003). Analysis of Performance Variation Using Query Expansion. Journal of the American Society for Information Science and Technology. (Accessed May 15, 2026)