Tuesday, February 23, 2010

TREC Blog Track 2010

The TREC Blog track will be continuing in 2010. In 2009, the Blog track was markedly revamped, addressing more refined Blog search scenarios using the new Blogs08 collection, a large sample of the blogosphere covering the period from 14th January 2008 to 10th February 2009.

A summary of the TREC Blog track 2009 edition was presented by Iadh Ounis at the main TREC conference (Slides). The Blog track 2009 overview paper will be available on the TREC website shortly, once it has been updated and reviewed.

The details of the TREC 2010 Blog track are still being finalised by the organisers. However, following the discussions at the TREC 2009 Blog track workshop, here are some salient details (see also the TREC 2009 Wrap-up Slides):

1. Faceted blog search task will run again in 2010: The task addresses the quality aspect of the retrieved blogs. It is a feed search task.
  • We will adopt a two-stage submission procedure: (1) a participating group submits "topically-relevant" blogs for each query; (2) a few standard baselines will be distributed to participants, so that they can re-rank them with respect to various facet inclinations (e.g. opinionated, in-depth, personal).
  • Groups can participate in stage 2 without stage 1, and vice-versa. Stage 1 is akin to an adhoc blog search task.
  • More topics for various facet inclinations.
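
As a rough illustration of stage 2, the sketch below re-ranks a distributed baseline for one facet inclination. It is not the track's official tooling. It assumes the baselines arrive in the usual six-column TREC run layout (topic, Q0, document id, rank, score, run tag), and the topic number, feed ids, facet scores, run tags and mixing weight are all made up for the example. It also assumes the baseline and facet scores are on comparable scales.

```python
from collections import defaultdict


def parse_run(lines):
    """Parse run lines assumed to look like: topic Q0 feed_id rank score tag."""
    run = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        topic, _q0, feed_id, _rank, score, _tag = parts
        run[topic].append((feed_id, float(score)))
    return run


def rerank(run, facet_score, weight=0.5):
    """Mix each baseline score with a facet score and sort per topic.

    facet_score is a hypothetical callable mapping a feed id to a number,
    e.g. the output of an opinion classifier; weight is an arbitrary choice.
    """
    reranked = {}
    for topic, feeds in run.items():
        scored = [
            (feed_id, (1 - weight) * base + weight * facet_score(feed_id))
            for feed_id, base in feeds
        ]
        reranked[topic] = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return reranked


def format_run(reranked, tag):
    """Write the re-ranked feeds back out in the same six-column layout."""
    lines = []
    for topic in sorted(reranked):
        for rank, (feed_id, score) in enumerate(reranked[topic], start=1):
            lines.append(f"{topic} Q0 {feed_id} {rank} {score:.4f} {tag}")
    return lines


# Toy data only: topic 1101, the feed ids and the scores are invented.
baseline = [
    "1101 Q0 feed-A 1 0.90 base1",
    "1101 Q0 feed-B 2 0.80 base1",
    "1101 Q0 feed-C 3 0.70 base1",
]
toy_opinion = {"feed-A": 0.1, "feed-B": 0.9, "feed-C": 0.5}

result = rerank(parse_run(baseline), lambda feed: toy_opinion.get(feed, 0.0))
output = format_run(result, "opinionated1")
for line in output:
    print(line)
```

With these toy numbers, feed-B moves to the top because its high facet score outweighs its slightly lower baseline score. A group taking part in stage 2 only would replace the dictionary lookup with its own facet model.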

2. Top news story identification task will run again in 2010: The task addresses the news-related dimension of the blogosphere. In particular, it investigates whether the blogosphere can be used to identify the most important news stories of the day.
  • Real-time news search task rather than retrospective.
  • A much larger and more comprehensive headlines sample, provided by a major news organisation.
  • A two-stage submission procedure: (1) groups submit a ranking of top stories for some days per category (e.g. sport, politics, business); (2) we will then select some top relevant stories, for which we will ask the participating groups to identify the related blog posts, in a manner that covers the diverse aspects of each story.
  • Groups can participate in stage 2 without stage 1. In the latter case, it is an adhoc diversity blog post search task, where the headline is the query.
We welcome any feedback and comments on the tasks above to trecblog-organisers (at) dcs.gla.ac.uk

Finally, note that if you wish to participate in TREC 2010, you should answer the TREC 2010 call for participation. We will update the Blog track wiki as things become more refined - keep following the Blog track developments as they happen on our dedicated Wiki web site.

9 comments:

All said...

How many further Facets this year are to be added ???

Iadh Ounis said...

To allow people to use last year's topics for training, we will very likely be using the same facets as in 2009, i.e. Opinionated, Indepth, and Personal. However, if people want to see more facets, then we are happy to consider it.

In all cases, we will have many more topics this year than in 2009.

Gul said...

I don't think people would like to have more facets. Let us get more used to the facets we are currently using. Now that we are understanding things better, we should not go for more facets :=)

Iadh Ounis said...

@Gul. Indeed, that was the general consensus during the blog track workshop at TREC. People thought that there is much more to do to have a good understanding of the current three facets, and that it would make sense to stick to them in the meantime, especially as there are now some training topics associated with these three facets.

All said...

A question on TREC Blog 2010:

Are you providing baselines this year, like you did for TREC 2008, so that people without the facilities to index such a large collection can also participate? If yes, when will you provide them? Will it be one baseline or many?

Thanks

All said...

Ahh, also one thing to add: of course these baselines would be for the topics of both TREC 2009 and TREC 2010, no?

Iadh Ounis said...

@All Yes, the introduction of the two-stage submission procedure is meant to facilitate the participation of groups who cannot necessarily index the Blogs08 collection. Like in TREC 2008, several runs will be distributed as common baselines.

We have not worked out the precise timescale yet (this is progressing), but we expect that these baselines should be available during early summer.

Indeed, it is a good idea to have the baseline runs for both the 2009 and 2010 topics.

Note that you can obtain last year's submitted runs from the TREC web site.

random said...

Can you please tell me when the baseline runs will be available?

Iadh Ounis said...

@random The Blog track wiki provides a precise timetable:

http://ir.dcs.gla.ac.uk/wiki/TREC-BLOG

In particular, it indicates when the runs are due, and when the common baseline runs will be made available.