Tuesday, August 5, 2008
We are just back from Singapore, where we attended the extremely well organised SIGIR'08 conference. We presented one full paper and three posters.
Craig presented our full paper entitled Retrieval Sensitivity Under Training Using Different Measures. Through a large-scale empirical evaluation, the paper addresses an important practical issue when deploying a search engine, namely whether the choice of evaluation measure used during training matters, especially when the available training data is very incomplete. Among other results, the paper shows that it is not necessarily appropriate to train by directly optimising the target evaluation measure (e.g. MAP). In particular, it shows that bPref, infAP and nDCG are all better training measures than MAP when the training dataset is incomplete and the evaluation measure is MAP. Interestingly, the same research question was addressed by Stephen Robertson, albeit more theoretically, in his keynote talk at the SIGIR'08 LR4IR workshop, where he justified and illustrated why directly optimising the evaluation measure on the training set is often not a good approach (as we say, "Great minds think alike"!).
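As a toy illustration of why incomplete judgements hurt MAP, here is a minimal sketch (our own, not from the paper, with hypothetical document IDs) contrasting average precision, which treats unjudged documents as non-relevant, with bPref, which simply skips them:

```python
def average_precision(ranking, judged_rel, judged_nonrel):
    """AP over a single query: unjudged documents count as non-relevant."""
    hits, precisions = 0, []
    for rank, doc in enumerate(ranking, start=1):
        if doc in judged_rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(judged_rel) if judged_rel else 0.0

def bpref(ranking, judged_rel, judged_nonrel):
    """bPref over a single query: only judged documents are considered."""
    R, N = len(judged_rel), len(judged_nonrel)
    score, nonrel_seen = 0.0, 0
    for doc in ranking:
        if doc in judged_nonrel:
            nonrel_seen += 1  # judged non-relevant document above later relevant ones
        elif doc in judged_rel:
            # penalise by the fraction of judged non-relevant docs ranked above
            score += 1 - min(nonrel_seen, R) / min(R, N) if min(R, N) else 1
    return score / R if R else 0.0

# Hypothetical incomplete judgements: d2 and d4 are simply unjudged.
rel, nonrel = {"d1", "d3"}, {"d5"}
ranking = ["d2", "d4", "d1", "d3", "d5"]
print(average_precision(ranking, rel, nonrel))  # unjudged d2, d4 drag AP down
print(bpref(ranking, rel, nonrel))              # bPref ignores them entirely
```

The unjudged documents at the top of the ranking pull AP well below 1.0, while bPref stays at 1.0 because no judged non-relevant document outranks a relevant one; this is the kind of divergence that makes the choice of training measure matter under incomplete judgements.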
The Terrier Team also presented three posters at the conference:
Ranking Opinionated Blog Posts Using OpinionFinder (Presented by Ben): The paper proposes an approach that integrates an NLP opinion-identification toolkit, OpinionFinder, into the retrieval process of an IR system, such that opinionated, relevant documents are retrieved in response to a query. This is one of the very few opinion-finding approaches that have been shown to be effective in the TREC Blog track.
Limits of Opinion-Finding Baseline Systems (Presented by Craig/Iadh): The paper investigates how the performance of the underlying baseline retrieval system affects overall opinion-finding performance. Two effective opinion-finding techniques are applied to all the baseline runs submitted to the TREC 2007 Blog track, leading to interesting insights and conclusions.
Automatic Document Prior Feature Selection for Web Retrieval (Presented by PJ): The paper proposes a novel method for selecting the best document prior feature (e.g. PageRank, URL depth, etc.) on a per-query basis, and investigates whether such per-query selection can further enhance the retrieval performance of a Web search engine.
PS: Photos are from the SIGIR'08 website.