Tuesday, August 5, 2008

SIGIR 2008

We are just back from Singapore, where we attended the extremely well organised SIGIR'08 conference. We presented one full paper and three posters.

Craig presented our full paper, entitled Retrieval Sensitivity Under Training Using Different Measures. Through a large-scale empirical evaluation, the paper addresses an important practical issue when deploying a search engine, namely whether it matters which evaluation measure is used during training, especially when the available training data is very incomplete. The paper shows, among other results, that it is not necessarily appropriate to train by directly optimising the target evaluation measure (e.g. MAP). In particular, it shows that bPref, infAP and nDCG are all better training measures than MAP when the training dataset is incomplete and the evaluation measure is MAP. Interestingly, the same research question was addressed by Stephen Robertson, albeit more theoretically, in his keynote talk at the SIGIR'08 LR4IR workshop, where he justified and illustrated why directly optimising the evaluation measure on the training set is often not a good approach (as they say, "Great minds think alike"!).
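For readers less familiar with these measures, here is a minimal sketch of how average precision (the per-query component of MAP) and bPref treat unjudged documents. It is not code from the paper. The function names and the toy document ids are made up for illustration, and the bPref formula follows the commonly used trec_eval-style definition as we understand it.

```python
def average_precision(ranking, relevant):
    """AP for one query: any document not judged relevant counts as non-relevant."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant) if relevant else 0.0


def bpref(ranking, relevant, nonrelevant):
    """bPref for one query: only judged documents influence the score."""
    r, n = len(relevant), len(nonrelevant)
    if r == 0:
        return 0.0
    nonrel_seen = 0  # judged non-relevant documents ranked so far
    total = 0.0
    for doc in ranking:
        if doc in nonrelevant:
            nonrel_seen += 1
        elif doc in relevant:
            if n == 0:
                total += 1.0
            else:
                total += 1.0 - min(nonrel_seen, r) / min(r, n)
        # unjudged documents are skipped entirely
    return total / r


# Toy judgments (hypothetical): d1 and d5 relevant, d2 judged non-relevant.
relevant = {"d1", "d5"}
nonrelevant = {"d2"}

judged_only = ["d1", "d2", "d5"]
with_unjudged = ["d1", "u1", "u2", "d2", "d5"]  # u1, u2 were never judged

for name, ranking in [("judged only", judged_only), ("with unjudged", with_unjudged)]:
    print(name, average_precision(ranking, relevant), bpref(ranking, relevant, nonrelevant))
```

In this toy case, inserting the two unjudged documents lowers AP, while bPref stays the same because it ignores documents without a judgment. This only illustrates the mechanics of the two measures; it does not reproduce the paper's experiments.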

The Terrier Team also presented three posters at the conference:

Ranking Opinionated Blog Posts Using OpinionFinder (Presented by Ben): The paper proposes an approach that integrates an NLP opinion-identification toolkit, OpinionFinder, into the retrieval process of an IR system, such that opinionated, relevant documents are retrieved in response to a query. This is one of the very few opinion-finding approaches that were shown to be effective in the TREC Blog Track.

Limits of Opinion-Finding Baseline Systems (Presented by Craig/Iadh): The paper investigates how the performance of the underlying baseline retrieval system affects the overall opinion-finding performance. Two effective opinion-finding techniques are applied to all the baseline runs submitted to the TREC 2007 Blog Track, leading to interesting insights and conclusions.

Automatic Document Prior Feature Selection for Web Retrieval (Presented by PJ): The paper investigates whether the retrieval performance of a Web search engine can be further enhanced by selecting the best document prior feature (e.g. PageRank or URL depth) on a per-query basis, and proposes a novel method for making this selection.

PS: Photos are from the SIGIR'08 website.

Monday, August 4, 2008

Welcome to the Terrier Team Blog

It has been a while since we started thinking about having a blog for the Terrier Team. In fact, the idea dates back to 2006, when we became involved in the organisation of the TREC Blog Track.

Recently, we have been encouraged by the very informative and interesting information retrieval-related discussions, taking place in blogs such as

Having been mere regular readers of information retrieval blogs, we thought that now is the right time to become more actively involved in blogging. Hence the creation of this new forum, where we intend to post news about our research work and activities. We hope to share our thoughts on information retrieval research, and to engage in a dialogue with our colleagues and friends.

We do hope that many of you will join us in this forum.