Rutger Varkevisser graduates on Large Scale Online Readability Assessment

by Rutger Varkevisser

The internet is an incredible resource for information and learning. With search engines like Google, information is usually just a click away. Unless you are a child, in which case most of the information on the web is either (way) too difficult to read and understand, or impossible to find. This research aims to combine the areas of readability assessment and gamification in order to provide a technical and theoretical foundation for an automatic, large-scale readability assessment system based on feedback from children, in which correctly assessing the readability level of online (textual) content for children is the central focus. Correct readability scores for online content matter because they give children a guideline on the difficulty of textual content on the web. They also allow external programs, e.g. search engines, to take readability scores into account based on the known age or proficiency of the user. Having children actively participate in determining readability levels should improve on current systems, which usually rely on fully automated algorithms or on human (adult) perception.
The first step in creating such a tool is to make sure the underlying process is scientifically valid. This research adapted the Cloze-test as a method of determining the readability of a text. The Cloze-test is an established and well-researched method of readability assessment, which works by omitting certain words from a text and tasking the reader with filling in the gaps with the correct words. The resulting overall score determines the readability level. For this research we wanted to digitize and automate this process. However, while the validity of the Cloze-test and its results in an offline (paper) environment has been established, this is not the case for any digital adaptation. Therefore the first part of this research focuses on this central issue. Validity was examined by combining the areas of readability assessment (the Cloze-test), gamification (the creation of a digital online adaptation of the Cloze-test) and child-computer interaction (a user-test of the developed tool with the target audience). In the user-test the participants completed several different Cloze-test texts, half of them offline (on paper) and the other half in a recreated online environment. This made it possible to measure the correlation between the online scores and the offline scores, which we already know are valid. Results of the user-test confirmed the validity of the online version by showing significant correlations between the offline and online versions, using both the Pearson correlation coefficient and Spearman’s rank-order correlation.
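To make the Cloze procedure described above concrete, here is a minimal Python sketch. It assumes the classic fixed-ratio variant in which every n-th word is blanked, and it scores only exact (case-insensitive) matches. The deletion rule and scoring used in the thesis are not given in this summary, so the function names `make_cloze` and `score_cloze` and the parameter `every_nth` are illustrative assumptions.

```python
import re


def make_cloze(text, every_nth=5, blank="____"):
    """Blank out every n-th word; return the gapped text and the answer key.

    Assumption: fixed-ratio deletion. Tokens that are not a plain word
    followed by optional punctuation (e.g. "don't") are left untouched.
    """
    answers = []
    gapped = []
    for position, token in enumerate(text.split(), start=1):
        match = re.match(r"^(\w+)(\W*)$", token)
        if position % every_nth == 0 and match:
            answers.append(match.group(1))
            # Keep trailing punctuation outside the blank.
            gapped.append(blank + match.group(2))
        else:
            gapped.append(token)
    return " ".join(gapped), answers


def score_cloze(answers, responses):
    """Fraction of blanks filled with the original word (case-insensitive).

    Missing responses count as wrong because the score is divided by the
    number of blanks, not the number of responses.
    """
    if not answers:
        return 0.0
    correct = sum(
        answer.lower() == response.strip().lower()
        for answer, response in zip(answers, responses)
    )
    return correct / len(answers)
```

Calling `make_cloze("Children read short stories online because they enjoy learning new things.")` blanks the fifth and tenth words, so the answer key is `["online", "new"]`. A lower score across readers would then indicate a text that is harder for them to read.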
With the knowledge that the online adaptation of the Cloze-test is valid for determining readability scores, the next step was to automate the process of creating Cloze-tests from texts, since the goal of the project was to provide the basis for a scalable gamified approach, and scalable in this context means automated. Several methods were developed to mimic the human process of creating a Cloze-test (i.e. looking at the text and selecting which words to omit given a set of general guidelines). These methods included TF.IDF and NLP approaches for finding suitable words to extract for a Cloze-test. They were tested by comparing the classification performance of each method against a baseline of manually marked texts. The final versions of these methods achieved performance scores of around 50%, where the score expresses how well they emulated human performance in the creation of Cloze-tests. A combination of automated methods achieved a higher score of 63%. The best performing individual method was put to the test in a small Turing-test style user-test, which showed promising results. Presented with two manually created Cloze-tests and one automatically created Cloze-test, participants attained similar scores across all tests. Participants also gave contradictory responses when asked which of the three Cloze-tests was automated. This research concludes the following:

  1. Results of offline and online Cloze-tests are highly correlated.
  2. Automated methods are able to correctly identify 63% of suitable Cloze-test words as marked by humans.
  3. Users gave conflicting reports when asked to identify the automated test in a mix of automated and human-made Cloze-tests.
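The TF.IDF-based word selection and the comparison against human-marked words can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis implementation: the tokenizer, the choice of background texts, the cut-off `k`, and the use of recall (the share of human-marked words that were also selected) as the performance score are all hypothetical.

```python
import math
from collections import Counter

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    words = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [word for word in words if word]


def tfidf_scores(target, background):
    """Score each distinct word of `target` by term frequency times IDF.

    The document frequency is counted over the target plus the background
    texts, so it is always at least 1.
    """
    documents = [set(tokenize(doc)) for doc in [target] + list(background)]
    term_frequency = Counter(tokenize(target))
    scores = {}
    for word, count in term_frequency.items():
        document_frequency = sum(word in doc for doc in documents)
        scores[word] = count * math.log(len(documents) / document_frequency)
    return scores


def select_gap_words(target, background, k):
    """Pick the k highest-scoring words as candidate Cloze gaps."""
    scores = tfidf_scores(target, background)
    ranked = sorted(scores, key=lambda word: (-scores[word], word))
    return set(ranked[:k])


def agreement(selected, human_marked):
    """Share of human-marked words that the automatic method also selected."""
    if not human_marked:
        return 0.0
    return len(selected & human_marked) / len(human_marked)
```

Words that occur in every document, such as "the", receive a score of zero and are never preferred as gaps, which matches the intuition that function words make poor Cloze items.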

[download pdf]

Seminar on Cyberbullying

On Friday 12 September the 11th SIKS/Twente Seminar of Searching and Ranking (SSR) takes place, on the topic of cyberbullying. The goal of the seminar is to bring together researchers from academia and organizations working on the development of strategies and solutions to understand, detect and prevent cyberbullying incidents among adolescents. Invited speakers are:

  • Prof. Debra Pepler (York University, Canada)
  • Prof. Veronique Hoste (Ghent University, Belgium)

More information at: SSR-11.

Analysis of Search and Browsing Behavior of Young Users on the Web

by Sergio Duarte Torres, Ingmar Weber, and Djoerd Hiemstra

The Internet is increasingly used by young children for all kinds of purposes. Nonetheless, there are not many resources especially designed for children on the Internet and most of the content online is designed for grown-up users. This situation is problematic if we consider the large differences between young users and adults since their topic interests, computer skills, and language capabilities evolve rapidly during childhood. There is little research aimed at exploring and measuring the difficulties that children encounter on the Internet when searching for information and browsing for content. In the first part of this work, we employed query logs from a commercial search engine to quantify the difficulties children of different ages encounter on the Internet and to characterize the topics that they search for. We employed query metrics (e.g., the fraction of queries posed in natural language), session metrics (e.g., the fraction of abandoned sessions), and click activity (e.g., the fraction of ad clicks). The search logs were also used to retrace stages of child development. Concretely, we looked for changes in interests (e.g., the distribution of topics searched) and language development (e.g., the readability of the content accessed and the vocabulary size).

[download pdf]

Published in ACM Transactions on the Web (TWEB) Volume 8 Issue 2.

Query Recommendation in the Information Domain of Children

by Sergio Duarte Torres, Djoerd Hiemstra, Ingmar Weber, and Pavel Serdyukov

Children represent an increasing part of web users. Key problems that hamper their search experience are their limited vocabulary, their difficulty in choosing the right keywords, and the inappropriateness of general-purpose query suggestions. In this work we propose a method that utilizes tags from social media to suggest queries related to children's topics. Concretely, we propose a simple yet effective approach to bias a random walk defined on a bipartite graph of web resources and tags through keywords that are more commonly used to describe resources for children. We evaluate our method using a large query log sample of queries submitted by children. We show that our method outperforms by a large margin the query suggestions of modern search engines and state-of-the-art query suggestions based on random walks. We further improve the quality of the ranking by combining the score of the random walk with topical and language modeling features to emphasize even more the child-related aspects of the query suggestions.
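The idea of biasing a random walk on a bipartite graph of resources and tags can be illustrated with a small Python sketch. It is not the method from the paper: it assumes a random walk with restart, a uniform step from a tag to the resources carrying it, and a step back from a resource to its tags in proportion to a hypothetical `child_weight` per tag. The function name and all parameters are illustrative.

```python
def biased_walk(resource_tags, seed_tag, child_weight, steps=20, restart=0.3):
    """Random walk with restart over tags, passing through shared resources.

    resource_tags: dict mapping a resource to its list of tags (the
        bipartite graph).
    child_weight: dict mapping a tag to a positive weight; tags that are
        missing get weight 1.0. Higher weights pull the walk towards tags
        assumed to describe resources for children.
    Returns a dict mapping each tag to its visiting probability.
    """
    tags = sorted({tag for tag_list in resource_tags.values() for tag in tag_list})
    if seed_tag not in tags:
        raise ValueError(f"unknown seed tag: {seed_tag!r}")
    tag_resources = {
        tag: [res for res, tag_list in resource_tags.items() if tag in tag_list]
        for tag in tags
    }
    prob = {tag: 1.0 if tag == seed_tag else 0.0 for tag in tags}
    for _ in range(steps):
        nxt = {tag: 0.0 for tag in tags}
        for tag, mass in prob.items():
            if mass == 0.0:
                continue
            resources = tag_resources[tag]
            for res in resources:
                # Tag -> resource: uniform over resources carrying the tag.
                res_mass = mass / len(resources)
                # Resource -> tag: proportional to the child-related weight.
                total = sum(child_weight.get(t, 1.0) for t in resource_tags[res])
                for t in resource_tags[res]:
                    nxt[t] += res_mass * child_weight.get(t, 1.0) / total
        # Restart: return to the seed tag with probability `restart`.
        prob = {
            tag: (1 - restart) * nxt[tag] + (restart if tag == seed_tag else 0.0)
            for tag in tags
        }
    return prob
```

Ranking the tags other than the seed by their visiting probability gives candidate suggestions. With equal graph structure, tags carrying a higher `child_weight` end up ranked higher, which is the intended effect of the bias.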

To appear in the Journal of the American Society for Information Science and Technology (JASIST).

[download preprint]

What is information?

Computers are used to store and send information, but what is information anyway? How much information does a book of 100 pages contain? Which book series contains more information: “The Saga of Darren Shan” or “The Horror Bus” by Paul van Loon? How do you measure this?

This lecture for the Museum Jeugduniversiteit for children aged 8 to 12 is based on the wonderful Computer Science Unplugged activities by Tim Bell, Ian Witten and Mike Fellows. In the lecture I explain the theories of Claude Shannon, talk about statistical language models, and we play the Twenty Guesses quiz.
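One simple way to put a number on "how much information is in a text" is Shannon entropy over character frequencies. The Python sketch below is an illustration, not material from the lecture. It assumes an order-0 model in which every character is treated as independent of its neighbours, so a model that uses context would give a lower estimate. The function names are hypothetical.

```python
import math
from collections import Counter


def entropy_bits_per_char(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum(
        (count / total) * math.log2(count / total) for count in counts.values()
    )


def estimated_information_bits(text):
    """Entropy per character times length: a rough total in bits."""
    return entropy_bits_per_char(text) * len(text)
```

A text that repeats a single letter carries zero bits per character under this model, while a text that uses four letters equally often carries two bits per character. Comparing two book series would then amount to comparing their estimated totals.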

Marije de Heus graduates on Recommender Systems for High School Courses

Design and Evaluation of a Recommender System for High School Courses in The Netherlands

by Marije de Heus

This thesis presents a newly developed recommender system for recommending high school courses in The Netherlands. The recommender system recommends a complete set of courses to a student, based on the choices of similar students that have already completed high school. A large historical database containing information on more than 20% of all new Dutch high school students was used for this recommender. The methodology consists of a structured literature review, interviews to gather requirements, the design of the system, offline experiments (with a historical dataset containing grades from tens of thousands of students) and online experiments (on-site at 4 high schools). The main findings of this report are the following:

  • Students and school counselors have a definite need for objective recommendations of high school courses;
  • The recommendations are not accurate;
  • The recommendations received good reviews in the online experiment;
  • The recommendations did not outperform the random recommendation in the online experiment;
  • A serendipitous result: the offline tests have shown that recommenders can predict future exam grades with high accuracy.

Our recommendation to Topicus, based on these findings, is not to implement the recommender system. Instead, a broader search could be started to find other possible solutions for the need for objective recommendations. One technique that could be explored further is the prediction of grades for single courses. We expect that school counselors will find such a tool helpful in advising students which courses to take.
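The general idea of recommending a complete course set from the choices of similar former students can be sketched as a nearest-neighbour vote. This Python example is only an illustration of that idea. The thesis summary does not say how similarity was computed, so the grade-based Euclidean distance, the record layout with `"grades"` and `"courses"` keys, and the parameter `k` are all assumptions.

```python
import math
from collections import Counter


def distance(grades_a, grades_b):
    """Euclidean distance over the subjects both students have grades for."""
    shared = set(grades_a) & set(grades_b)
    if not shared:
        return math.inf
    return math.sqrt(sum((grades_a[s] - grades_b[s]) ** 2 for s in shared))


def recommend_course_set(student_grades, graduates, k=3):
    """Return the course set chosen most often among the k nearest graduates.

    graduates: list of dicts with hypothetical keys "grades" (subject ->
        grade) and "courses" (list of chosen courses).
    Returns an empty set when there are no graduates to compare with.
    """
    if not graduates:
        return set()
    nearest = sorted(
        graduates, key=lambda grad: distance(student_grades, grad["grades"])
    )[:k]
    # Vote on complete course sets rather than on individual courses.
    votes = Counter(frozenset(grad["courses"]) for grad in nearest)
    return set(votes.most_common(1)[0][0])
```

Voting on whole sets keeps the recommendation a combination that some former student actually took, at the cost of ignoring partial overlap between sets.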

[download pdf]

Vertical Selection in the Information Domain of Children

Sergio Duarte Torres' paper on vertical selection for search for children is nominated for the JCDL Best Student Paper Award.

Vertical Selection in the Information Domain of Children

by Sergio Duarte Torres, Djoerd Hiemstra and Theo Huibers

In this paper we explore vertical selection methods in aggregated search in the specific domain of topics for children between 7 and 12 years old. We built a test collection consisting of 25 verticals, 3.8K queries, and relevance assessments that map relevant verticals to queries for a large sample of these queries. We gather relevance assessments by envisaging two aggregated search systems: one in which the Web vertical is always displayed, and one in which each vertical is assessed independently of the Web vertical. We show that both approaches lead to a different set of relevant verticals and that the former is prone to bias of visually oriented verticals. In the second part of this paper we estimate the size of the verticals for the target domain. We show that employing the global size and domain-specific size estimation of the verticals leads to significant improvements when using state-of-the-art methods of vertical selection. We also introduce a novel vertical and query representation based on tags from social media and we show that its use leads to significant performance gains.

Presented on 23 July at the joint ACM/IEEE conference on Digital Libraries (JCDL 2013) in Indianapolis, USA.

[download pdf]

Query Recommendation for Children

by Sergio Duarte Torres, Djoerd Hiemstra, Ingmar Weber (Yahoo), Pavel Serdyukov (Yandex)

One of the biggest problems that children experience while searching the web occurs during the query formulation process. Children have been found to struggle to formulate keyword queries, given their limited vocabulary and their difficulty in choosing the right keywords. In this work we propose a method that utilizes tags from social media to suggest queries related to children's topics. Concretely, we propose a simple yet effective approach to bias a random walk defined on a bipartite graph of web resources and tags through keywords that are more commonly used to describe resources for children. We evaluate our method using a large query log sample of queries aimed at retrieving information for children. We show that our method outperforms query suggestions of state-of-the-art search engines and state-of-the-art query suggestions based on random walks.

To be presented at the 21st ACM International Conference on Information and Knowledge Management, CIKM 2012.

[download pdf]

Initial Evaluation of EmSe

EmSe: Initial Evaluation of a Child-friendly Medical Search System

by PuppyIR

When undergoing medical treatment in combination with extended stays in hospitals, children have been frequently found to develop an interest in their condition and the course of treatment. A natural means of searching for related information would be to use a web search engine. The medical domain, however, imposes several key challenges on young and inexperienced searchers, such as difficult terminology, potentially frightening topics or non-objective information offered by lobbyists or pharmaceutical companies. To address these problems, we present the design and usability study of EmSe, a search service for children in a hospital environment.

The paper will be presented at the fourth Information Interaction in Context Symposium, IIiX 2012 on August 21-24, 2012 in Nijmegen, the Netherlands.

[download pdf]