Improving Weighting Schemes: Experiences Using Pivoted Document Length Normalization
Darrin L. Dimmick, M P. Smith
Today's state-of-the-art document ranking functions for information retrieval contain several tuning parameters. The proper settings of these parameters for a given collection may not be known and may be difficult to derive. This paper reports on efforts to derive a document ranking function with a single tuning parameter that can be set once and will work across many different collections. The performance target for this function is to meet or exceed the performance of the top weighting functions evaluated at the TREC conference. To achieve this goal, experiments were carried out on several simple weighting functions, using pivoted document length normalization to improve their performance. The results show that a simple pivoted normalization function achieves good performance across the collections used in this study.
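
For readers unfamiliar with the technique, the following is a minimal sketch of pivoted document length normalization in its commonly published form, not the specific ranking function derived in this paper. It assumes the pivot is the average document length, so that the only tuning parameter is the slope. The function names, the doubly logarithmic term-frequency component, the idf variant, and the default slope of 0.2 are all illustrative assumptions.

```python
import math


def pivoted_norm(doc_len, avg_doc_len, slope=0.2):
    """Pivoted length normalization factor.

    Assumes the pivot is the average document length, so a document of
    average length gets a factor of 1.0. `slope` is the single tuning
    parameter; the default of 0.2 is an illustrative assumption.
    """
    return (1.0 - slope) + slope * (doc_len / avg_doc_len)


def score(query_terms, doc_tf, doc_len, avg_doc_len, df, n_docs, slope=0.2):
    """Score one document against a query (hypothetical helper).

    doc_tf: dict mapping term -> frequency in the document
    df:     dict mapping term -> number of documents containing the term
    """
    norm = pivoted_norm(doc_len, avg_doc_len, slope)
    total = 0.0
    for term in query_terms:
        tf = doc_tf.get(term, 0)
        if tf == 0 or term not in df:
            continue  # term absent from the document or the collection
        # Dampened term frequency, divided by the length normalization factor.
        tf_part = (1.0 + math.log(1.0 + math.log(tf))) / norm
        # One common idf variant; other variants would work equally well here.
        idf = math.log((n_docs + 1) / df[term])
        total += tf_part * idf
    return total
```

With a slope of 0 the factor is 1.0 for every document, so no length normalization is applied; larger slopes penalize documents longer than average more strongly and favor shorter ones.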