Archive for the 'IR for children' Category

What and How Children Search on the Web

Thursday, August 18th, 2011, posted by Djoerd Hiemstra

by Sergio Duarte Torres and Ingmar Weber (Yahoo! Research)

The Internet has become an important part of the daily life of children as a source of information and leisure activities. Nonetheless, given that most of the content available on the web is aimed at the general public, children are constantly exposed to inappropriate content, either because the language goes beyond their reading skills, because their attention span differs from that of grown-ups, or simply because the content is not targeted at children, as is the case with ads and adult content. In this work we employed a large query log sample from a commercial web search engine to identify the struggles and search behavior of users ranging from children aged 6 to young adults aged 18. Concretely, we hypothesized that the large and complex volume of information to which children are exposed leads to ill-defined searches and to disorientation during the search process. For this purpose, we quantified their search difficulties based on query metrics (e.g. fraction of queries posed in natural language), session metrics (e.g. fraction of abandoned sessions) and click activity (e.g. fraction of ad clicks). We also used the search logs to retrace stages of child development. Concretely, we looked for changes in user interests (e.g. distribution of topics searched), language development (e.g. readability of the content accessed) and cognitive development (e.g. sentiment expressed in the queries) among children and adults. We observed that these metrics clearly demonstrate an increased level of confusion and unsuccessful search sessions among children. We also found a clear relation between the reading level of the clicked pages and the demographic characteristics of the users, such as age and the average educational attainment of the zone in which the user is located.
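To make the three example metrics concrete, here is a minimal Python sketch that computes them over a toy log. Everything in it is an assumption for illustration: the `Query` and `Session` records, the question-word heuristic for "natural language" queries, and the definition of an abandoned session as one without any click. The post does not describe the actual log format or the definitions used in the paper.

```python
from dataclasses import dataclass, field

# Hypothetical log schema; the real query log format is not described in the post.
@dataclass
class Query:
    text: str
    clicks: list = field(default_factory=list)  # each click is "ad" or "organic"

@dataclass
class Session:
    queries: list

QUESTION_WORDS = {"what", "how", "why", "who", "where", "when", "which"}

def is_natural_language(text):
    """Crude heuristic (an assumption): starts with a question word or ends with '?'."""
    words = text.lower().split()
    return bool(words) and (words[0] in QUESTION_WORDS or text.strip().endswith("?"))

def search_metrics(sessions):
    """Return the three example fractions for a non-empty list of sessions."""
    queries = [q for s in sessions for q in s.queries]
    clicks = [c for q in queries for c in q.clicks]
    # Assumption: a session counts as abandoned when none of its queries got a click.
    abandoned = [s for s in sessions if not any(q.clicks for q in s.queries)]
    return {
        "natural_language_queries": sum(is_natural_language(q.text) for q in queries) / len(queries),
        "abandoned_sessions": len(abandoned) / len(sessions),
        "ad_clicks": sum(c == "ad" for c in clicks) / len(clicks) if clicks else 0.0,
    }

toy_log = [
    Session([Query("how do volcanoes erupt", ["organic"])]),
    Session([Query("elmo games")]),
    Session([Query("coloring pages", ["ad", "organic"]), Query("why is the sky blue?")]),
]
metrics = search_metrics(toy_log)
```

Comparing such fractions across age groups is the kind of analysis the abstract describes; the heuristics above are placeholders, not the paper's method.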

The paper will be presented at the 20th ACM International Conference on Information and Knowledge Management (CIKM) in Glasgow, 24-28 October 2011

[download pdf]

ImagePile: an Alternative for Vertical Results Lists

Tuesday, May 17th, 2011, posted by Djoerd Hiemstra

by Saskia Akkersdijk, Merel Brandon, Hanna Jochmann-Mannak, Djoerd Hiemstra, and Theo Huibers

Recent work shows that children are well capable of searching with Google, due to their familiarity with the interface. However, children do have difficulties with the vertical list representation of the results. In this paper, we present an alternative result representation for a touch interface, the ImagePile. The ImagePile displays the results as a pile of images through which the user navigates by horizontal swiping. This representation was tested on a search engine for the library of the Emma Children's Hospital. In a within-subject experiment, both representations were tested with children to compare their usability. The vertical representation was perceived as easier to use, but the ImagePile system was considered more fun to use. Also, with the ImagePile system the children chose more relevant results, and they were more aware of the number of results.

[download pdf]

Visual Exploration of Health Information for Children

Monday, May 9th, 2011, posted by Djoerd Hiemstra

by Frans van der Sluis, Sergio Duarte, Djoerd Hiemstra, Betsy van Dijk and Frea Kruisinga

Children experience several difficulties retrieving information using current Information Retrieval (IR) systems. In particular, children struggle to find the right keywords to construct queries, given their lack of domain knowledge. This problem is even more critical in the specialized health domain. In this work we present a novel method to address this problem using a cross-media search interface in which the textual data is searched through visual images. This solution aims to solve the recall and recognition problem, which is salient for health information, by replacing the need for a vocabulary with the easy task of recognising the different body parts.

[download pdf]

Tangible Information Retrieval for Children

Sunday, May 16th, 2010, posted by Djoerd Hiemstra

by Michel Jansen, Wim Bos, Paul van der Vet, Theo Huibers and Djoerd Hiemstra

Despite several efforts to make search engines more child-friendly, children still have trouble using systems that require keyboard input. We present TeddIR: a system using a tangible interface that allows children to search for books by placing tangible figurines and books they like/dislike in a green/red box, causing relevant results to be shown on a display. This way, issues with spelling and query formulation are avoided. A fully functional prototype was built and evaluated with children aged 6-8 at a primary school. The children understood TeddIR to a large extent and enjoyed the playful interaction.

TeddIR in the set-up used during evaluation.
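
The green/red box interaction can be read as a simple form of relevance feedback. The Python sketch below is only one way to illustrate that idea: the `rank_books` function, the tag-based catalogue, and the scoring rule (liked matches minus disliked matches) are all assumptions, since the post does not say how TeddIR computes its results.

```python
def rank_books(catalogue, green_box, red_box):
    """Rank books by tags shared with liked objects minus tags shared with disliked ones.

    catalogue maps a book title to a set of tags (hypothetical data model).
    green_box and red_box list the tags of the objects placed in each box.
    """
    liked, disliked = set(green_box), set(red_box)
    scored = []
    for title, tags in catalogue.items():
        score = len(tags & liked) - len(tags & disliked)
        if score > 0:  # only show books that match more likes than dislikes
            scored.append((score, title))
    # Highest score first; ties broken alphabetically for a stable display order.
    return [title for score, title in sorted(scored, key=lambda p: (-p[0], p[1]))]

catalogue = {
    "The Brave Knight": {"knight", "castle", "dragon"},
    "Dragon School": {"dragon", "school"},
    "Farm Friends": {"cow", "farm"},
}
results = rank_books(catalogue, green_box=["dragon", "castle"], red_box=["school"])
```

Because the child only adds or removes objects, no typing is involved, which is the point the abstract makes about avoiding spelling and query formulation problems.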

TeddIR will be presented at the 9th International Conference on Interaction Design and Children, Barcelona, June 9-11, 2010.

[download pdf]

Query log analysis for children

Tuesday, May 11th, 2010, posted by Djoerd Hiemstra

Query log analysis in the context of Information Retrieval for children

by Sergio Duarte Torres, Djoerd Hiemstra, and Pavel Serdyukov

In this paper we analyze queries and sessions intended to satisfy children's information needs using a large-scale query log. The aim of this analysis is twofold: i) to identify differences between such queries and sessions, and general queries and sessions; ii) to enhance the query log by including annotations of queries, sessions, and actions for future research on information retrieval for children. We found statistically significant differences between the set of general-purpose queries and the set of queries seeking content intended for children. We show that our findings are consistent with previous studies on the physical behavior of children using Web search engines.

most frequent queries
nickjr.com
elmo
nick jr
coloring pages
postopia
candystand
the wiggles
starfall.com
dora the explorer

The paper will be presented at the ACM SIGIR 2010 Conference, 19-23 July 2010 in Geneva, Switzerland

[download preprint]

Automatic Reformulation of Children’s Search Queries

Monday, May 3rd, 2010, posted by Djoerd Hiemstra

by Maarten van Kalsbeek, Joost de Wit, Dolf Trieschnigg, Paul van der Vet, Theo Huibers and Djoerd Hiemstra

The number of children who have access to an Internet connection (at home or at school) is large and growing fast. Many of these children search the web by using a search engine. However, these search engines do not take children's skills and preferences into account, which makes searching difficult. This paper tries to uncover methods and techniques that can be used to automatically improve search results for queries formulated by children. To this end, a prototype of a query expander was built that implements several of these techniques. The paper concludes with an evaluation of the prototype and a discussion of the promising results.

[download pdf]

SIGIR Workshop on Accessible Search Systems

Monday, March 29th, 2010, posted by Djoerd Hiemstra

We organize a workshop on an exciting new theme at SIGIR on 23 July 2010 in Geneva, Switzerland.

Current search systems are not adequate for individuals with specific needs: children, older adults, people with visual or motor impairments, and people with intellectual disabilities or low literacy. Search services are typically created for average users (young or middle-aged adults without physical or mental disabilities) and information retrieval methods are based on their perception of relevance as well. The workshop will be the first ever to raise the discussion on how to make search engines accessible for different types of users, including those with problems in reading, writing or comprehension of complex content. Search accessibility means that people whose abilities are considerably different from those that average users have will be able to use search systems with the same success.

The objective of the workshop is to provide a forum and initiate collaborations between academics and industrial practitioners interested in making search more usable for users in general and for users with specific needs in particular. We encourage presentation and participation from researchers working at the intersection of information retrieval, natural language processing, human-computer interaction, ambient intelligence and related areas. The workshop will be a mix of oral presentations for long papers (maximum of 8 pages), a session for posters (maximum of 2 pages) and a panel discussion. All submissions will be reviewed by at least two PC members. Workshop proceedings will be available at the workshop. The workshop welcomes, but is not limited to, contributions on a range of the following key issues:

  • Understanding of search behavior of users with specific needs
  • Understanding of relevance criteria of users with specific needs
  • Understanding the effects of domain expertise, age, user experience and cognitive abilities on search goals and results evaluation
  • Non-topical aspects of relevance: text style, readability, appropriateness of language (harassment and explicit content detection)
  • Development of test collections for evaluation of accessible search systems
  • Collaborative search techniques for assisting users with specific needs (e.g. parents helping children)
  • Potential of search personalization techniques to satisfy users with specific needs
  • Search interfaces and result representation for people with specific needs
  • Using assistive technologies for interaction with search systems, e.g. speech recognition or eye tracking software for querying and browsing.

See the Workshop website.

New DB group member: Sergio Duarte Torres

Monday, October 12th, 2009, posted by Djoerd Hiemstra

Today, Sergio Duarte Torres joined our group to work on PuppyIR, a European project that will develop an open source environment to construct information services for children. Welcome Sergio!

First PuppyIR search architecture

Tuesday, September 29th, 2009, posted by Djoerd Hiemstra

PuppyIR: Designing an Open Source Framework for Interactive Information Services for Children

by Leif Azzopardi, Richard Glassey, Mounia Lalmas, Tamara Polajnar, and Ian Ruthven

One of the main aims of the PuppyIR project is to provide an open source framework for the development of Interactive Information Retrieval Services. The main focus of the project is directed towards developing such services for children, which introduces a number of novel and challenging issues to address (such as language development, security, moderation, etc).

In this poster paper, we outline the preliminary high-level design of the open source framework. The framework uses a layered architecture to minimize dependencies between the user-side concerns of interaction and presentation, and the system-side concerns of aggregating content from multiple sources and processing information appropriately. Each layer will consist of a series of interchangeable components, which can be interconnected to form a complete service. To facilitate the construction of diverse information services, a dataflow language is proposed to enable the assembly of the components in an intuitive and visual manner. One of the design goals of the architecture, and an ultimate measure of its success, is to provide a “lego” style building block environment in which researchers and developers of any age can build their own information service. The poster provides the starting point for the design of the framework and aims to seek comments, feedback and suggestions from the community in order to improve and refine the architecture.
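
As a rough illustration of the "interchangeable components" idea, the Python sketch below chains three small components into one service. It is not the PuppyIR framework or its dataflow language: the names `pipeline`, `aggregate`, `moderate` and `present`, the dictionary-based items, and the word-blocking rule are all hypothetical and chosen only to show how components with a shared interface can be swapped and recombined.

```python
from typing import Callable, List

# Assumed shared interface: every component maps a list of items to a list of items.
Component = Callable[[List[dict]], List[dict]]

def pipeline(*components: Component) -> Component:
    """Connect components so that the output of one feeds the next."""
    def run(items: List[dict]) -> List[dict]:
        for component in components:
            items = component(items)
        return items
    return run

def aggregate(*sources: List[dict]) -> Component:
    """System-side component: merge content from several sources."""
    def component(_items: List[dict]) -> List[dict]:
        return [dict(item) for source in sources for item in source]
    return component

def moderate(blocked_words: set) -> Component:
    """System-side component: drop items containing a blocked word."""
    def component(items: List[dict]) -> List[dict]:
        return [i for i in items if not (set(i["text"].lower().split()) & blocked_words)]
    return component

def present(items: List[dict]) -> List[dict]:
    """User-side component: shape items for display."""
    return [{"title": i["text"].title()} for i in items]

source_a = [{"text": "dinosaur facts"}]
source_b = [{"text": "online casino bonus"}, {"text": "volcano pictures"}]

service = pipeline(aggregate(source_a, source_b), moderate({"casino"}), present)
page = service([])
```

Replacing `moderate` or `present` with another function of the same shape changes the service without touching the other layers, which is the dependency-minimizing property the abstract aims for.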

[download paper]

Jobs: Three PhD student positions

Monday, March 9th, 2009, posted by Djoerd Hiemstra

Position: Distributed Information Retrieval

The Database Group of the University of Twente offers a job opening in the NWO Vidi Project “Distributed Information Retrieval by means of Keyword Auctions”. The project’s aim is to distribute internet search functionality in such a way that communities of users and/or federations of small search systems provide search services in a collaborative way. Instead of getting all data to a centralized point and processing queries centrally, as is done by today’s search systems, the project will distribute queries over many small autonomous search systems and process them locally. In this project, the PhD student will research a new approach to distributed search: distributed information retrieval by means of keyword auctions. Keyword auctions like Google’s AdWords give advertisers the opportunity to provide targeted advertisements by bidding on specific keywords. Analogous to these keyword auctions, local search systems will bid for keywords at a central broker. They “pay” by serving queries for the broker. The broker will send queries to those local search systems that optimize the overall effectiveness of the system, i.e., local search systems that are willing to serve many queries, but are also able to provide high quality results. The PhD student will work within a small team of researchers that approaches the problem from three different angles: 1) modeling the local search system, including models for automatic bidding and multi-word keywords, 2) modeling the search broker’s optimization using the bids, the quality of the answers, and click-through rates, and 3) integration of structured data typically available behind web forms of local search systems with text search.
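
The broker idea can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `Bid` record, the `Broker` class, the capacity-times-quality scoring rule, and single-word keyword matching are all invented for illustration. The project description leaves the actual bidding and optimization models open as research questions.

```python
from dataclasses import dataclass

@dataclass
class Bid:
    system: str     # name of the local search system
    keyword: str    # single-word keyword the system bids on
    capacity: int   # number of queries the system offers to serve ("payment")
    quality: float  # broker's estimate of result quality, between 0 and 1

class Broker:
    def __init__(self, bids):
        self.bids = list(bids)
        self.served = {}  # (system, keyword) -> queries routed so far

    def route(self, query):
        """Return the system that should serve the query, or None if nobody bid."""
        terms = set(query.lower().split())
        candidates = [
            b for b in self.bids
            if b.keyword in terms
            and self.served.get((b.system, b.keyword), 0) < b.capacity
        ]
        if not candidates:
            return None
        # Assumed scoring rule: favor systems offering many queries AND high quality.
        best = max(candidates, key=lambda b: b.capacity * b.quality)
        key = (best.system, best.keyword)
        self.served[key] = self.served.get(key, 0) + 1
        return best.system

broker = Broker([
    Bid("library", "dinosaur", capacity=2, quality=0.9),
    Bid("museum", "dinosaur", capacity=5, quality=0.3),
])
routed = [broker.route("dinosaur bones") for _ in range(3)]
```

Here the library wins the first two queries and the museum takes over once the library's offered capacity is used up. Multi-word keywords and click-through feedback, both named in the description, are left out of this sketch.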

See official announcement. (Deadline: 19 April 2009)

Two positions: PuppyIR, Information Retrieval for Children

The Human Media Interaction and Databases groups of the University of Twente offer two job openings in the European Project PuppyIR. Current Information Retrieval (IR) systems are designed for adults: they return information that is unsuitable for children, present information in lists that children find difficult to manage and make it difficult for children to ask for information. PuppyIR will create information search services that are tailored to the specific needs of children, giving children the opportunity to fully and safely exploit the power of the Internet. PuppyIR will develop new interaction paradigms to allow children to easily express their information need, to have results presented in an intuitive way and to engage children in system interaction. It will develop a set of Information Services: components to summarise textual and audiovisual content for children, to help children safely explore new information, to moderate information for children at different ages, to build new social networks and to intelligently aggregate and present information to children. PuppyIR will offer an open source platform that enables system designers to construct useful and usable information retrieval systems for children. The project will demonstrate the effectiveness of the PuppyIR modules through demonstrator systems constructed in collaboration with the Netherlands Public Library Association and the Emma Children’s Hospital. At the University of Twente, a team of six senior researchers and three PhD students will cooperate in PuppyIR. One PhD student will work on user interaction design. The other two positions are described below.

Position 1: Analyzing and structuring textual information (at Human Media Interaction)

Analyzing and structuring textual information studies how natural language processing tools can assist the organization of information in a way that enables children to easily access the information. The PhD student at Human Media Interaction will focus on information extraction, text classification, and story understanding and summarization on written and spoken data, for instance for questions or comments created by children (e.g., chats, blogs) and content created explicitly for children (e.g., stories).

Position 2: Multimedia content mining (at Databases)

Multimedia content mining will develop database search technology that enables better understanding of the individual behavior of the child and consequently his/her information need. The PhD student at Databases will focus on concept retrieval, faceted search, query formulation assistance, and intuitive relevance feedback mechanisms that allow children to easily access the content of multimedia data sources, for instance for content sharing within online groups including moderated discovery.

See official announcement. (Deadline: 15 April 2009)