Comments on TerrierTeam: About Blog Search Tasks
Terrier Team @ Glasgow
Updated: 2023-08-09T10:46:25.007+01:00

jeff.dalton (2008-09-15T19:08:00.000+01:00)

Thanks again for all your hard work putting the track together.

I look forward to participating.

As far as cost, I know the open-source community has been struggling to create its own test collections because the average developer is not part of an organization with access to TREC data.

Tom White suggested hosting large data collections on Amazon's S3 (http://www.lexemetech.com/2008/09/hosting-large-public-datasets-on-amazon.html). The cost for storage/transfer is quite reasonable. What do you think of this?

ssn (2008-09-14T12:32:00.000+01:00)

It is great to know that a new and larger collection is being built. This will truly be a valuable asset for research on the impact of temporal features in search.

Considering this, keeping the standard ad-hoc search task would be important so that it is possible to evaluate the impact of a longer time-span of past data in search.

On a side note, I think that the cost of the TREC-BLOG collection is really detrimental to further participation. Would any of the big companies be interested in sponsoring this collection?

Iadh Ounis (2008-09-13T19:58:00.000+01:00)

Hi Jon,

Many thanks for the comment.

An objective for the second phase of the Blog track is indeed to address more complex search scenarios.

The main difficulty is how to go beyond topical relevance, and how to evaluate features such as quality or authority, while still being able to conduct relevance assessments within the Cranfield paradigm setting, and with reasonable resources.

I have just completed a draft for the proposal. We suggested a way to get around this problem, which allows for the task to be operationalised in a Cranfield paradigm setting, while having reasonable assessment costs.

We do welcome up-to-date query logs from commercial search engines or feed readers. That would be great, especially if the logs are suitably associated to the time-span of the new blog collection.

If you have any pointer about who might be willing to provide us with such query logs, it would be very much appreciated.

Jon (2008-09-12T18:31:00.000+01:00)

Iadh --

First of all, we all appreciate the tremendous amount of work that goes into creating & maintaining these collections, and organizing the track. As you know, we've really enjoyed being a part of the blog track research and look forward to what's coming next.

I would love to see more complex tasks addressed, moving beyond topical relevance. Most ad-hoc search tasks at TREC really just target this aspect of retrieval. I'm sure we all agree that this is a necessary but not sufficient component in any effective search engine. As you said, nobody really knows how to go beyond topical relevance, especially in the TREC-style Cranfield paradigm.

Evaluating topical relevance is easy compared to things like authority. In that case, the ground truth is a relevance judgement made by a single person. What is the gold standard for these more complex tasks? And, how do we find a task that's possible to evaluate, but also non-trivial to perform? Most importantly, how do we ensure these tasks are reflective of real-world information needs?

Defining a gold standard on real-world usage data may be the only way to really accomplish this. I don't mean interactive track-style usage, I mean real usage from real blog search engines, feed readers, or something similar. Without that, how can we define a search task that has some basis in reality and a tractable evaluation?

Iadh Ounis (2008-09-11T21:33:00.000+01:00)

Thanks a lot.

We are currently compiling a set of possible blog search tasks (with pros and cons). Usually, a TREC track does not run more than two search tasks. If a proposal gets accepted, the TREC conference participants are polled to see which tasks get the most interest.

jeff.dalton (2008-09-11T05:03:00.000+01:00)

I look forward to the new corpus and track opportunities.

I posted some of my thoughts on the possible 2009 blog track tasks (http://www.searchenginecaffe.com/2008/09/trec-2009-blog-track-thoughts.html).

Unknown (2008-09-10T17:22:00.000+01:00)

Thank you for taking a look at this paper! I'm glad you like the ideas within. And a shout out to Jon Elsas who alerted me to your excellent blog.