Twente at NTCIR 2013

An API-based Search System for One Click Access to Information

by Dan Ionita, Niek Tax, and Djoerd Hiemstra

This paper proposes a prototype One Click access system, based on previous work in the field and the related 1CLICK-2@NTCIR10 task. The proposed solution integrates methods from previous attempts into a three-tier algorithm (query categorization, information extraction, and output generation) and offers suggestions on how each tier can be implemented. Finally, a thorough user-based evaluation concludes, based on a paired sign test, that such an information retrieval system outperforms the textual previews collected from Google search results. Based on the validation results, suggestions for future improvements are proposed.
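For readers unfamiliar with the paired sign test mentioned in the abstract, the following is a minimal sketch of the general technique, not the evaluation code used in the paper. The function name `paired_sign_test` and the idea of comparing two lists of per-user ratings are assumptions made for illustration. Ties are dropped, and the two-sided p-value comes from the exact binomial distribution with p = 0.5.

```python
from math import comb


def paired_sign_test(scores_a, scores_b):
    """Two-sided exact sign test on paired observations.

    Returns (wins_a, wins_b, p_value). Pairs with equal scores are ignored,
    which is the usual convention for the sign test.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems need one score per paired observation")

    wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
    wins_b = sum(a < b for a, b in zip(scores_a, scores_b))
    n = wins_a + wins_b  # number of non-tied pairs

    if n == 0:
        return wins_a, wins_b, 1.0  # only ties: no evidence either way

    # Probability of a split at least as lopsided as the observed one,
    # under the null hypothesis that either system wins a pair with p = 0.5.
    k = min(wins_a, wins_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2 * tail)
```

Called with two equally long lists of ratings (one per user and system), the function reports how often each system was preferred and how likely such a split would be if both systems were equally good.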

To be presented at the Japanese National Institute of Informatics (NII) Testbeds and Community for Information access Research (NTCIR-10) Conference at the National Center of Sciences, Tokyo, Japan on June 18-21

[download pdf]

Traitor: Associating Concepts using the WWW

by Wanno Drijfhout, Oliver Jundt, and Lesley Wevers

Traitor uses Common Crawl's 25TB data set of web pages to construct a database of associated concepts using Hadoop. The database can be queried through a web application with two query interfaces. A textual interface allows searching for similarities and differences between multiple concepts using a query language similar to set notation, and a graphical interface lets users visualize similarity relationships between concepts in a force-directed graph.
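The idea of querying similarities and differences with set-like notation can be illustrated with a small sketch. This is not Traitor's query language or data model, neither of which is specified here: the `ASSOCIATIONS` table, its contents, and the function names `similarities` and `differences` are all hypothetical, and plain Python sets stand in for the database.

```python
# Hypothetical toy data: each concept maps to the set of terms associated with it.
ASSOCIATIONS = {
    "coffee": {"drink", "caffeine", "hot", "bean"},
    "tea": {"drink", "caffeine", "hot", "leaf"},
    "juice": {"drink", "fruit", "cold"},
}


def similarities(first, *others):
    """Terms associated with every given concept (set intersection)."""
    result = set(ASSOCIATIONS[first])
    for concept in others:
        result &= ASSOCIATIONS[concept]
    return result


def differences(concept, *others):
    """Terms associated with `concept` but with none of `others` (set difference)."""
    result = set(ASSOCIATIONS[concept])
    for other in others:
        result -= ASSOCIATIONS[other]
    return result
```

With this toy data, `similarities("coffee", "tea")` corresponds to the intersection of the two association sets, and `differences("coffee", "tea")` to what remains of the first set after removing the second.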

To be presented at the 13th Dutch-Belgian Information Retrieval Workshop DIR 2013 on 26 April in Delft, The Netherlands

[download pdf]

Try Traitor at http://traitor.imperamus.eu.

Readability of the Web

A study on 1 billion web pages.

by Marije de Heus

Automated Readability Index for the Web

We have performed a readability study on more than 1 billion web pages. The Automated Readability Index was used to determine the average grade level required to easily comprehend a website. Among the results: a 16-year-old can easily understand 50% of the web, and an 18-year-old can easily understand 77% of the web. This information can be used in a search engine to filter out websites that are likely to be incomprehensible for younger users.
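As a rough illustration, the Automated Readability Index is computed as 4.71 × (characters per word) + 0.5 × (words per sentence) − 21.43, which yields an approximate US grade level. The sketch below applies that standard formula to a plain-text string. The tokenization rules (regular expressions for words and sentence boundaries) are simplifying assumptions and are not necessarily those used in the study.

```python
import re


def automated_readability_index(text: str) -> float:
    """Return the ARI score (approximate US grade level) of a plain-text string."""
    # Assumption: a sentence ends at one or more '.', '!' or '?' characters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: a word is a run of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    # ARI counts letters and digits only, so apostrophes are excluded here.
    characters = sum(ch.isalnum() for word in words for ch in word)

    return (
        4.71 * characters / len(words)
        + 0.5 * len(words) / len(sentences)
        - 21.43
    )
```

A search engine could compare this score against a threshold grade level to decide whether a page is likely to be readable for a given age group.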

To be presented at the 13th Dutch-Belgian Information Retrieval Workshop DIR 2013 on 26 April in Delft, The Netherlands

[download pdf]

Google Online Marketing Challenge

Interested in online advertising and marketing? Together with Inter-Actief we will run a second science challenge in the next quarter, from 11 February to 4 April. With a US$250 budget provided by Google, students will develop an online advertising strategy for a real business or non-profit organization that has not used Google's AdWords in the last six months. The winners will receive a trip to the Google Headquarters in Mountain View, California to meet with the AdWords team. For more information, and to enroll, visit http://challenges.inter-actief.net.

Also, see the Google Online Marketing Challenge page.

Join TREC FedWeb’13

FedWeb'13 is the new TREC (Text Retrieval Conference) Federated Web Search task. It will provide a test collection that organizes and stimulates research in many areas related to federated search, including aggregated search, distributed search, peer-to-peer search and meta-search engines. The track will evaluate federated and aggregated search in a large heterogeneous setting using the search results of existing search engines.

Join the mailing list and keep up-to-date with FedWeb'13.

OLC-IT Annual Report 2011-2012

The IT programme committee (OLC-IT) deals with the examination regulations and the curriculum of the bachelor's programmes Technische Informatica and Telematica and the master's programmes Computer Science and Telematics. It has the statutory right to give solicited and unsolicited advice to the programme director and the dean. Every year the OLC writes an annual report. In this year's report:

  • Curriculum changes
  • Quality assurance
  • University-wide OER (Teaching and Examination Regulations)
  • Measures to speed up study progress
  • Twente educational model
  • Interaction between students and lecturers

Read the full annual report 2011-2012.