Archive for the 'Colloquia' Category

DB colloquium: Suzan Verberne of Radboud University

Sunday, September 21st, 2008, posted by Djoerd Hiemstra

Using Structural Information for Improving Why-Question Answering

Who: Suzan Verberne (Radboud University Nijmegen)
When: Tuesday September 30, 2008
Where: ZI-3126

My PhD research project “In Search of the Why” aims at developing a system for answering why-questions. Today I will present my recent work on extending a simple passage retrieval approach with structural information. The starting point is Lemur’s TFIDF, which retrieves a relevant answer in the top 150 for 79% of the test questions. However, only 45% of the questions are answered in the top 10. We aim to improve the ranking by adding a re-ranking module. For re-ranking we consider a set of 31 features representing structural information of the question and answer candidate: syntactic structure as well as document structure. We find a significant improvement over the baseline for both MRR and Success@10, the latter of which is now 55%. The most important features for re-ranking are TFIDF (the baseline score), the presence of cue words, the question’s main verb, and the relation between question focus and document title.
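
The two evaluation measures named in the abstract, MRR and Success@10, can be illustrated with a small sketch. It is not taken from the talk: the function names and the example ranks are assumptions made for illustration, and each question is represented only by the rank of its first relevant answer (or `None` if no relevant answer was retrieved).

```python
from typing import List, Optional


def mean_reciprocal_rank(first_relevant_ranks: List[Optional[int]]) -> float:
    """Average of 1/rank over all questions; a question with no relevant answer scores 0."""
    total = sum(1.0 / r for r in first_relevant_ranks if r is not None)
    return total / len(first_relevant_ranks)


def success_at_k(first_relevant_ranks: List[Optional[int]], k: int = 10) -> float:
    """Fraction of questions with a relevant answer at rank k or better."""
    hits = sum(1 for r in first_relevant_ranks if r is not None and r <= k)
    return hits / len(first_relevant_ranks)


# Hypothetical ranks for five questions (1-based); None means no relevant answer was found.
example_ranks = [1, 3, None, 12, 10]
example_mrr = mean_reciprocal_rank(example_ranks)
example_success = success_at_k(example_ranks, k=10)
```

A re-ranking module that moves a relevant answer from rank 12 to rank 5, for example, raises both measures: the reciprocal rank grows from 1/12 to 1/5, and the question now counts as a success at 10.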

Joost de Wit graduates on evaluating recommender systems

Wednesday, May 21st, 2008, posted by Djoerd Hiemstra

Recommender systems use knowledge about a user’s preferences (and those of others) to recommend items that the user is likely to enjoy. Recommender system evaluation has proven to be challenging, since a recommender system’s performance depends on, and is influenced by, many factors. The data set on which a recommender system operates, for example, has great influence on its performance. Furthermore, the goal for which a system is evaluated may differ and therefore require different evaluation approaches. Another issue is that the quality of a system recorded by the evaluation is only a snapshot in time, since it may change gradually. Although there is no consensus among researchers on which recommender system attributes to evaluate, accuracy is by far the most popular dimension to measure. However, some researchers believe that user satisfaction is the most important quality attribute of a recommender and that greater user satisfaction is not achieved by ever-increasing accuracy. Other dimensions for recommender system evaluation that are described in the literature are coverage, confidence, diversity, learning rate, novelty and serendipity. It is believed that these dimensions contribute in some way to the user satisfaction achieved by a recommender system.

Joost performed a user study for which 133 people subscribed to an evaluation application specially designed and built for this purpose. The user study consisted of two phases. During the first phase users had to rate TV programmes they were familiar with or that they had recently watched. This phase resulted in 36,353 programme ratings for 7,844 TV programmes. Based on this data, the recommender system that was part of the evaluation application could start generating recommendations. In phase two of the study the application displayed recommendations for tonight’s TV programmes to its users. These recommendation lists were deliberately varied with respect to the accuracy, diversity, novelty and serendipity dimensions. Another dimension that was altered was programme overlap. Users were asked to provide feedback on how satisfied they were with the list. Over a period of four weeks 70 users provided 9,762 ratings for the recommendation lists. For each of the recommendation lists that were rated in the second phase of the user study, the five dimensions (accuracy, diversity, novelty, serendipity and programme overlap) were measured using 15 different metrics. For each of these metrics its correlation with user satisfaction was determined using Spearman’s rank correlation. These correlation coefficients indicate whether there is a relation between that metric and user satisfaction and how strong this relation is. It appeared that accuracy is indeed the most important dimension in relation to user satisfaction. Other metrics that had a strong correlation were user’s diversity, series level diversity, user’s serendipity and effective overlap ratio. This indicates that diversity, serendipity and programme overlap are important dimensions as well, although to a lesser extent.
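
As an illustration of how a metric can be correlated with user satisfaction using Spearman’s rank correlation, here is a minimal sketch. It is not the code used in the thesis: the function names and the sample values are assumptions, and the coefficient is computed as the Pearson correlation of the ranks, with tied values receiving their average rank.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: one metric value and one satisfaction rating per recommendation list.
metric_values = [0.2, 0.5, 0.5, 0.9]
satisfaction = [2, 3, 4, 5]
rho = spearman(metric_values, satisfaction)
```

A coefficient close to 1 or -1 indicates a strong monotonic relation between the metric and the satisfaction ratings; a value near 0 indicates none.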

[more info] [download pdf]

DB Master Students Colloquium

Friday, April 18th, 2008, posted by Djoerd Hiemstra

Next Friday, 25 April, there will be a DB master students colloquium at 13.45 h. in ZI-3126 with two speakers:

  • Alex van Oostrum will talk about: “The design of an object- and aspect-oriented framework to facilitate software development of enterprise components”
  • Matthijs Ooms will talk about: “Provenance of Biomedical data”

DB Colloquium of Tuesday 29 January

Monday, January 28th, 2008, posted by Djoerd Hiemstra

The DB Colloquium of Tuesday 29 January, 14:00 h.-15:00 h. in ZI-3126 consists of two small presentations.

Comprehending historical election programs using XML and XRPC

by Douwe van der Meij

Party programs for elections can be incomprehensible, not to mention the comparison of current party programs to those of a decade ago. This paper focuses on a way to comprehend the latter. It shows how to use XML to store election programs and to query them. This paper also comes with a proof of concept (PoC). In retrospect we look at this PoC, and we discuss the design choices made.
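
To give a rough idea of what storing and querying election programs in XML can look like, here is a minimal sketch. It does not reproduce the paper’s proof of concept, which uses XRPC: the element and attribute names (`programs`, `program`, `section`, `party`, `year`, `topic`), the sample content and the helper `sections_on` are all hypothetical, and Python’s standard `xml.etree.ElementTree` stands in for an XQuery engine.

```python
import xml.etree.ElementTree as ET

# Made-up example data in a hypothetical schema; not real party programs.
PROGRAMS_XML = """
<programs>
  <program party="Party A" year="1998">
    <section topic="education">Smaller classes in primary schools.</section>
    <section topic="transport">More investment in railways.</section>
  </program>
  <program party="Party A" year="2006">
    <section topic="education">Free school books for all pupils.</section>
  </program>
</programs>
"""

root = ET.fromstring(PROGRAMS_XML.strip())


def sections_on(root, party, topic):
    """Return (year, text) pairs for one party and topic, oldest program first."""
    results = []
    for program in root.findall("program"):
        if program.get("party") != party:
            continue
        for section in program.findall("section"):
            if section.get("topic") == topic:
                results.append((int(program.get("year")), section.text.strip()))
    return sorted(results)


# Compare what one party wrote about education in different election years.
education_history = sections_on(root, "Party A", "education")
```

Putting the party, year and topic in attributes keeps the comparison of a current program with an older one down to a single filtered query.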

Finding books without an age category faster (Boeken zonder leeftijdscategorie sneller vinden)

by Wout Maaskant

In this research, a system was developed that lets users more quickly find books to which no age category has been assigned, by using properties of similar books that do have an age category. The similarity between books is determined using the vector space model.
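
The idea of borrowing an age category from similar books via the vector space model can be sketched as follows. This is an illustration under assumptions, not the system from the study: books are represented by raw term counts of a short description, similarity is the cosine of the two term vectors, and the function names, descriptions and category labels are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def term_vector(text: str) -> Counter:
    """Represent a text as a bag of lower-cased terms with their counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two term vectors; 0.0 if either vector is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def guess_age_category(description: str, labelled: List[Tuple[str, str]]) -> str:
    """Return the age category of the most similar labelled book."""
    query = term_vector(description)
    best = max(labelled, key=lambda item: cosine(query, term_vector(item[0])))
    return best[1]


# Hypothetical labelled books: (description, age category).
labelled_books = [
    ("a dragon and a brave young knight", "9-12"),
    ("tax law for small companies", "adult"),
]
guessed = guess_age_category("the knight fights a dragon", labelled_books)
```

A practical system would more likely weight terms (for example with TF-IDF) and combine several similar books rather than take only the nearest one; the sketch keeps to the simplest form of the technique.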

Robert Zwerus graduates on storing “PIM” data

Wednesday, November 28th, 2007, posted by Djoerd Hiemstra

Storing Personal Information Management (PIM) data is not trivial, because of the variety in content types. Existing PIM storage systems have shortcomings in performance, data consistency and/or concurrency. In this thesis, we propose several optimisations and test them in Akonadi, KDE’s new central PIM data access manager. The optimisations include using the D-Bus protocol for transmitting short commands and notifications and an IMAP-compatible protocol for data access and modification. The PIM data is kept in its native format, but compressed and split up into separate, frequently used parts for increased performance. Both the synthetic and the use-case-based evaluation results show that the proposed modifications perform well and help maintain data consistency in Akonadi.

Read more on E-prints

DB master students colloquium: Fri 30 Nov, 13.45 h.

Friday, November 23rd, 2007, posted by Djoerd Hiemstra

We have two excellent presentations at the DB master students colloquium of Friday 30 November:

“Storing Personal Information Management Data”
by Robert Zwerus

“Cooperative Intelligent Transport Systems”
by Bobby Nijssen

When: Friday 30 November, 13.45 h. - 15.30 h. Where: ZI-3126

DB colloquium: Volker Krause of

Thursday, November 22nd, 2007, posted by Djoerd Hiemstra


KDE, The K Desktop Environment: Conquer your desktop
Who: Volker Krause of
When: Wednesday, 28 November 2007, 14.30 h. - 15.15 h.
Where: ZI-4126

Volker Krause will give an overview of what’s new in KDE4. He will talk about Akonadi, the Personal Information Management (PIM) Storage Service of KDE. Furthermore, Volker will talk about the ongoing collaborations between various universities and KDE (students working on KDE in practical courses, theses on KDE topics, EU-funded research projects).

[Advanced Databases]: Guest lecture 17 Oct at 10.40 h in WA-204: GIS and Geodatabases by Martin Engels of ESRI

Wednesday, October 10th, 2007, posted by Djoerd Hiemstra

The time and place of next week’s lecture on 17 October have changed to:

WA-204, at 10.40h. - 12.25h.

I would like to ask everyone to be present at the guest lecture. Martin Engels comes all the way from Leiden to talk about GIS. If you are unable to come, please let me know as soon as possible.

Title: GIS and Geodatabases: the ESRI approach
Speaker: Martin Engels, ESRI Netherlands
Room: WA-204
When: Wednesday 17 October 2007, 10:40 h. - 12:25 h.

About ESRI:
ESRI designs and develops the world’s leading geographic information system (GIS) technology. GIS is an important tool - one that helps shape the world around us. GIS technology helps fight forest fires, determine new national boundaries during peace negotiations, find promising sites for fast-growing companies, rebuild cities around the world, support optimal land-use planning, route emergency vehicles, monitor rain forest depletion, contain oil spills, and perform countless other vital tasks every day. Today, ESRI has more than 4,000 skilled employees worldwide who work with hundreds of business partners and tens of thousands of users.

[Information Retrieval]: Guest lecture by Wessel Kraaij (TNO-ICT) Wednesday 4 October, 13.45 h. in LA-1812

Tuesday, October 3rd, 2006, posted by Djoerd Hiemstra

Title: The evaluation of information retrieval systems
Speaker: Wessel Kraaij (TNO-ICT)
When: Wednesday 4 October, 13.45 h.
Where: LA-1812

This lecture provides the tools and methodology for comparing the effectiveness of two or more information retrieval systems in a meaningful way. Several aspects of information retrieval systems can be evaluated without consulting the potential users or customers of the system, such as the query processing time (measured, for instance, in milliseconds per query) or the query throughput (measured, for instance, as the number of queries per second). This lecture, however, focuses on aspects of the system that influence the quality of the retrieved results. In order to measure the quality of search results, one must at some point consult the potential user of the system. For what are the correct results for the query “black jaguar”? Cars, or cats? Ultimately, the user has to decide…

New IR colloquium

Friday, September 15th, 2006, posted by Henning Rode
We are going to start a new colloquium on information retrieval related topics. It should bring together people working in this field (from different floors of this building) to discuss their newest research as well as new developments in IR in general.

For the first session, which will be held on Tuesday 26 September at 11:00 in our meeting room (No. 3126), Claudia has volunteered to give a report on the CLEF conference in Alicante, just 2 days after she returns from Spain, so you will get the newest information possible. We will further discuss how we are going to continue these colloquium meetings.

So if you are interested in IR topics, put this date in your agenda…