We have been very busy recently with the TREC 2008 Blog track. Now that all runs have been submitted and the relevance assessments are ongoing, it is the time of the year when we start planning for the future of the track at TREC 2009! Indeed, TREC operates a policy where existing tracks are renewed on an annual basis, following the submission of a proposal.
Back in 2006, when we first proposed the Blog track, our aim was to have a long-term objective for the track, recognising that the richness of the blogosphere and its peculiarities will require several years of investigation before reaching a full understanding of the different blog search tasks, and how they should be effectively addressed. In particular, we proposed to adopt an incremental approach, where we begin with basic blog search tasks and progressively move to more complex search scenarios.
In the first three years of the track (2006-2008), we addressed two main blog search tasks:
- Opinion finding: involves locating blog posts that express an opinion about a given target.
- Blog distillation: involves locating blogs that are principally devoted to a topic X over the timespan of the feed.
The first task tackles an important aspect of blogs, namely their opinionated/subjective nature, and the tendency of bloggers to express views, thoughts and feelings towards named entities. This task helps users to find out what the bloggers think about X. The second search task addresses a scenario where the user would like to find a blog to follow or read in their RSS reader. Our main findings and conclusions from the first two years of the Blog track at TREC are summarised in the ICWSM 2008 paper, entitled On the TREC Blog Track. The Blog track 2006 and 2007 overview papers provide further detailed analysis and results.
We are now proposing to move to a second phase of the Blog track, where more refined and complex search scenarios should be investigated. In particular, we are thinking of using a new and larger collection of blogs, which has a much longer timespan than the 11-week period covered in the Blog06 collection. This would allow us to investigate another important characteristic of the blogosphere, namely the temporal/chronological aspect of blogging, and various related search tasks such as story identification and tracking.
While we were thinking about such possible future tasks, we came across a position paper by Marti Hearst, Matthew Hurst and Susan Dumais, entitled "What Should Blog Search Look Like?", which will be presented at the forthcoming Search in Social Media (SSM 2008) workshop at CIKM 2008.
In particular, Hearst et al. propose that the blog distillation task should be further refined by taking into account a number of dimensions or attributes such as the authority of the blog, the trustworthiness of its authors, the genre of the blog and its style of writing. For example, a user might be interested in blogs to read about a topic X, but where the blogger expresses in-depth viewpoints, backed up by a scientific methodology or evidence. The Cranfield evaluation paradigm adopted by TREC requires deeper thought about how relevance assessments should be conducted in such a scenario.
Unsurprisingly for a strong advocate of the importance of user interfaces and visualisation tools for information retrieval, Hearst, together with her co-authors, proposes a faceted blog search interface to help the user explore the attributes of the blogs before choosing those they wish to follow or read, i.e. exploratory search at its best! The conclusion of the paper provides a good summary of Hearst et al.'s views:
For the problem of selecting a blog to read, we propose a faceted interface which highlights different attributes of interest, with a focus on people and on matching the taste preferences of the reader. For the task of “taking the pulse of the blogosphere,” we suggest that blog data be integrated with other social media and that the existing work on tracking trends and aggregating views is heading in the right direction.
As we are trying to wrap up our proposal for TREC 2009, we would like to hear other suggestions and comments about what blog search should look like. Please feel free to post your thoughts and comments on this post, or to email them privately, if you so wish.
8 comments:
Thank you for taking a look at this paper! I'm glad you like the ideas within. And a shout out to Jon Elsas who alerted me to your excellent blog.
I look forward to the new corpus and track opportunities.
I posted some of my thoughts on the possible 2009 blog track tasks.
Thanks a lot.
We are currently compiling a set of possible blog search tasks (with pros and cons). Usually, a TREC track does not run more than two search tasks. If a proposal gets accepted, the TREC conference participants are polled to see which tasks get the most interest.
Iadh --
First of all, we all appreciate the tremendous amount of work that goes into creating & maintaining these collections, and organizing the track. As you know, we've really enjoyed being a part of the blog track research and look forward to what's coming next.
I would love to see more complex tasks addressed, moving beyond topical relevance. Most ad-hoc search tasks at TREC really just target this aspect of retrieval. I'm sure we all agree that this is a necessary but not sufficient component in any effective search engine. As you said, nobody really knows how to go beyond topical relevance, especially in the TREC-style Cranfield paradigm.
Evaluating topical relevance is easy compared to things like authority. For topical relevance, the ground truth is a relevance judgement made by a single person. What is the gold standard for these more complex tasks? And, how do we find a task that's possible to evaluate, but also non-trivial to perform? Most importantly, how do we ensure these tasks are reflective of real-world information needs?
Defining a gold standard on real-world usage data may be the only way to really accomplish this. I don't mean interactive track-style usage, I mean real usage from real blog search engines, feed readers, or something similar. Without that, how can we define a search task that has some basis in reality and a tractable evaluation?
Hi Jon,
Many thanks for the comment.
An objective for the second phase of the Blog track is indeed to address more complex search scenarios.
The main difficulty is how to go beyond topical relevance, and how to evaluate features such as quality or authority, while still being able to conduct relevance assessments within the Cranfield paradigm setting, and with reasonable resources.
I have just completed a draft for the proposal. We suggested a way to get around this problem, which allows for the task to be operationalised in a Cranfield paradigm setting, while having reasonable assessment costs.
We do welcome up-to-date query logs from commercial search engines or feed readers. That would be great, especially if the logs are suitably aligned with the time-span of the new blog collection.
If you have any pointers to who might be willing to provide us with such query logs, they would be very much appreciated.
It is great to know that a new and larger collection is being built. This will truly be a valuable asset for research on the impact of temporal features in search.
Considering this, keeping the standard ad-hoc search task would be important so that it is possible to evaluate the impact of a longer time-span of past data in search.
On a side note, I think that the cost of the TREC-BLOG collection is really detrimental to further participation. Would any of the big companies be interested in sponsoring this collection?
Thanks again for all your hard work putting the track together.
I look forward to participating.
As far as cost goes, I know the open-source community has been struggling to create its own test collections because the average developer is not part of an organization with access to TREC data.
Tom White suggested hosting large data collections on Amazon's S3. The cost for storage/transfer is quite reasonable. What do you think of this?