Where do good query terms come from?

Gheorghe Muresan, Dmitri Roussinov

This paper reports the results of an investigation into the quality of query expansion terms coming from different sources: pseudo-relevance feedback, web-based expansion, interactive elicitation from the searchers, and expansion approaches based on query clarity. These sets of expansion terms are compared to an optimal upper-bound set, constructed from existing relevance judgments. The comparisons are made both in terms of set overlap and in terms of the improvement in search effectiveness that each set generates. In a nutshell, the conclusion is that machines are better than humans at doing statistical calculations and at estimating which query terms are more likely to match documents that are relevant for a given topic. This explains why the overall conclusion of the HARD track of TREC 2005 was that interaction with the searcher could not outperform automatic procedures for query improvement. However, because implicit relevance feedback and novel interaction models based on ostention or mediation have shown great potential, our conclusion is optimistic: rather than fight the machine, use it!

