Tuesday, August 4, 2009

AcademTech: Faceted People Search

AcademTech is a Computing Science-specific expert search engine based on the Terrier IR Platform. Persons working at Computing Science departments in Scottish Universities are considered as candidate experts by the system. Profiles of their expertise evidence are then mined from their homepages, publicly available digital libraries (e.g. DBLP) and related information found on the Web through Yahoo! BOSS. The ranking of experts is provided by a variant of the Voting Model expert search approach.

The system is integrated with a novel faceted search interface to allow users to browse and explore the results using a number of categories such as Location or Conference/Journal publications. Each expert in the system has a profile page containing a number of elements including query specific supporting publications, most informative associated terms displayed as a tag cloud, co-authors and web links. Although the system is currently applied in the context of Scottish Computing Science Academia, it can easily be expanded to go beyond its current Scottish scope, cover other academic fields, and people in general.

I was lucky enough to be able to demo AcademTech at SIGIR 2009 in Boston on July 20th. Thankfully, I spoke to a large number of attendees receiving largely very helpful feedback.

A popular suggestion was to utilize AcademTech's core system in the scope of biology. This would meet the medical field's need for finding related organisms, diseases etc. Possible facets in the area would likely be biological classifications such as species and genus.

Daniel Tunkelang from The Noisy Channel suggested providing profile page-located facets, allowing filtering of search results by features present in a selected expert's page such as co-authors. This would satisfy an example scenario such as "Show me co-authors of this expert who work for the University of Glasgow." Profile facets could also allow the experts publications list to be filtered by a number of fields such as co-author location, conference etc.

Much of the feedback mirrored that of intended future work. Name disambiguation is a high priority update as a current problem with AcademTech is the publication mismatch when multiple experts have the same name. In fact, the system is specifically designed to allow for expansion of facets, and name disambiguation. With a large amount of publication collaborators working in industry a useful move would be to expand to accommodate these experts.

AcademTech Sigir 2009 PosterAcademTech is now publicly accessible from http://www.terrier.org/academtech
A description of the system is available in the SIGIR'09 proceedings.

Thank you to all those who spoke to me and gave me some great feedback.