Archive for the 'Course IR' Category

ImagePile: an Alternative for Vertical Results Lists

Tuesday, May 17th, 2011, posted by Djoerd Hiemstra

by Saskia Akkersdijk, Merel Brandon, Hanna Jochmann-Mannak, Djoerd Hiemstra, and Theo Huibers

Recent work shows that children are very capable of searching with Google, due to their familiarity with the interface. However, children do have difficulties with the vertical list representation of the results. In this paper, we present an alternative result representation for a touch interface, the ImagePile. The ImagePile displays the results as a pile of images through which the user navigates via horizontal swiping. This representation was tested on a search engine for the Emma child hospital’s library. Using a within-subject experiment, both representations were tested with children to compare the usability of both systems. The vertical representation was perceived as easier to use, but the ImagePile system was considered more fun to use. Also, with the ImagePile system more relevant results were chosen by the children, and they were more aware of the number of results.

[download pdf]

Guest lecture by Arjen de Vries

Monday, October 18th, 2010, posted by Djoerd Hiemstra

How search logs can help improve future searches

In the European project Vitalas, we had the opportunity to analyze the search log data from a commercial picture portal of a European news agency, which offers access to photographic images to professional users. I will discuss how these logs can be used in various ways to improve image search: to expand the image representation, to make suggestions of alternative queries, to adapt the search results to user context, and to automatically build concept detectors for content-based image retrieval. I also present recent work on using the semantic information that has become publicly available in the form of linked data to improve the search log analysis. The results show that bringing in linked data gives insights beyond the more common term-based analysis, since queries related in the most frequent ways do not usually share terms. I conclude with a discussion of the implications of our findings for improving log analysis, image collection management, and search engine design.

The guest lecture takes place on 20 October 2010 at 13.45 h. in ZI-2126.

Guest lecture by Thijs Westerveld

Tuesday, October 5th, 2010, posted by Djoerd Hiemstra

Automatically Analyzing Word of Mouth

Thijs Westerveld from Teezir B.V., Utrecht, will give a guest lecture on 6 October 2010 in ZI-2126. Teezir uses advanced search technology to aggregate views and opinions found on review sites, in discussion groups or blogs. This way, we create statistics and interpretations about what people are saying. Querying this data allows decision makers to slice and dice the content, and learn what people say, either at the very aggregated level: “what is the share of positive versus negative views about our new product?”, or at the very detailed level: “which sources reflect this negative sentiment, and what exactly are people saying?”

In this talk I will demonstrate Teezir’s Opinion Analysis dashboards and discuss the underlying technology. For collecting content from web sites we developed advanced crawling technology that automatically identifies relevant news, blog and forum pages and extracts the relevant content and metadata. The collected content is then further analyzed to identify the main sentiments before everything is indexed to be disclosed in the online dashboards. Various sentiment analysis variants that have proven successful in an academic setting have been evaluated on our live collections. I will demonstrate that success on academic test collections does not necessarily imply the practical use of a sentiment analysis algorithm.

See also: Who rules?

New room for lectures IR

Wednesday, September 15th, 2010, posted by Djoerd Hiemstra

All following Information Retrieval lectures will be held in room ZI-2126. The lecture of 22 September is canceled to give you the opportunity to visit the Interactief Symposium Predict 2010. See you 29 September, or at Predict 2010!

More information on Blackboard.

Tangible Information Retrieval for Children

Sunday, May 16th, 2010, posted by Djoerd Hiemstra

by Michel Jansen, Wim Bos, Paul van der Vet, Theo Huibers and Djoerd Hiemstra

Despite several efforts to make search engines more child-friendly, children still have trouble using systems that require keyboard input. We present TeddIR: a system using a tangible interface that allows children to search for books by placing tangible figurines and books they like/dislike in a green/red box, causing relevant results to be shown on a display. This way, issues with spelling and query formulation are avoided. A fully functional prototype was built and evaluated with children aged 6-8 at a primary school. The children understood TeddIR to a large extent and enjoyed the playful interaction.

TeddIR in the set-up used during evaluation.

TeddIR will be presented at the 9th International Conference on Interaction Design and Children, Barcelona, June 9-11, 2010.

[download pdf]

Guest lecture by Pavel Serdyukov

Friday, October 16th, 2009, posted by Djoerd Hiemstra

Pavel Serdyukov from TU Delft will give a guest lecture for the course Information Retrieval

When: Wednesday, October 21, 2009
Where: HO-B1212
Title: Faceted and Expert Search in the Enterprise


Enterprise Search problems recently received a considerable amount of attention from academia, mainly due to the increasing demand for industrial solutions supporting various search tasks in intranets. In this lecture I will give the research perspective on two core aspects of search in the Enterprise: Faceted and Expert search. I will demonstrate typical search scenarios, visualization approaches and ranking techniques. In the first part, I will give an overview of the ways to support faceted search in typical cases, from easiest to hardest: with the availability of structured or unstructured document metadata and with no document metadata available. In the second part, I will talk about the latest developments in expert finding, namely, language model and graph-based methods. I will also show ways to acquire expertise evidence outside of the Enterprise.

Guest lecture by Thijs Westerveld

Wednesday, October 7th, 2009, posted by Djoerd Hiemstra

Thijs Westerveld from Teezir will give a guest lecture for the course Information Retrieval

When: Wednesday, October 14, 2009
Where: HO-B1212
Title: Automatically Analyzing Word of Mouth And Focused Crawling

Teezir is a young and innovative technology company that develops and deploys comprehensive search solutions. Teezir lets companies take advantage of large and diverse amounts of documents or texts, using breakthrough search technology. Teezir’s search platform provides functionality for the entire process of disclosing data: from gathering content, analyzing documents and building indexes for efficient access, to effective querying and ranking of information. Teezir’s framework is based on full-text retrieval techniques.

Handouts for practical work

Monday, October 5th, 2009, posted by Paul van der Vet

The handout for the practical part of the course Information Retrieval has been added under Course Materials on Blackboard. Additionally, you will find two useful handouts there that help you to write your report and to insert citations in it.

Deadline to form groups: 30 September

Tuesday, September 29th, 2009, posted by Djoerd Hiemstra

Deadline to form pairs for the Information Retrieval Course Project is 30 September. Please send names and email addresses to the course staff. Groups will be numbered and listed (under Email) on Blackboard.

Information Retrieval Models Tutorial

Thursday, August 20th, 2009, posted by Djoerd Hiemstra

Many applications that handle information on the internet would be completely inadequate without the support of information retrieval technology. How would we find information on the world wide web if there were no web search engines? How would we manage our email without spam filtering? Much of the development of information retrieval technology, such as web search engines and spam filters, requires a combination of experimentation and theory. Experimentation and rigorous empirical testing are needed to keep up with increasing volumes of web pages and emails. Furthermore, experimentation and constant adaptation of technology are needed in practice to counteract the effects of people who deliberately try to manipulate the technology, such as email spammers. However, if experimentation is not guided by theory, engineering becomes trial and error. New problems and challenges for information retrieval come up constantly. They cannot possibly be solved by trial and error alone. So, what is the theory of information retrieval? There is not one convincing answer to this question. There are many theories, here called formal models, and each model is helpful for the development of some information retrieval tools, but not so helpful for the development of others. In order to understand information retrieval, it is essential to learn about these retrieval models. In this chapter, some of the most important retrieval models are gathered and explained in a tutorial style.

The tutorial will be published in Ayse Goker and John Davies (eds.), Information Retrieval: Searching in the 21st Century, Wiley, 2009.

[download draft]

[download exercise solutions]