Ranking Learning-to-Rank Methods

Slides of the keynote at the 1st International Workshop on LEARning Next gEneration Rankers (LEARNER 2017), held on 1 October 2017 in Amsterdam, are now available:

Slides of Learner 2017 keynote
learner2017.pdf

Download the paper: Niek Tax, Sander Bockting, and Djoerd Hiemstra. “A cross-benchmark comparison of 87 learning to rank methods”, Information Processing and Management 51(6), Elsevier, pages 757–772, 2015 [download pdf]

Dutch-Belgian Information Retrieval Workshop 2017

Send in your DIR 2017 submissions (novel research, dissemination, or demo contributions) before 15 October.

16th Dutch-Belgian Information Retrieval Workshop
Friday 24th of November 2017
Netherlands Institute for Sound and Vision,
Hilversum, the Netherlands
http://dir2017.nl


DIR 2017 aims to serve as an international platform (with a special focus on the Netherlands and Belgium) for exchange and discussions on research & applications in the field of information retrieval as well as related fields. We invite quality research contributions addressing relevant challenges. Contributions may range from theoretical work to descriptions of applied research and real-world systems. We especially encourage doctoral students to present their research.

This year’s edition is co-organized by the CLARIAH project, which is developing a Research Infrastructure for the Arts and Humanities in the Netherlands. Use cases in this infrastructure cover a wide range of IR-related topics. To foster discussions between the IR community and CLARIAH researchers and developers, DIR 2017 organizes a special session on IR related to data-driven research and data critique.


MTCB: A Multi-Tenant Customizable database Benchmark

by Wim van der Zijden, Djoerd Hiemstra, and Maurice van Keulen

We argue that there is a need for Multi-Tenant Customizable OLTP systems. Such systems need a Multi-Tenant Customizable Database (MTC-DB) as a backing. To stimulate the development of such databases, we propose the benchmark MTCB. Benchmarks for OLTP exist and multi-tenant benchmarks exist, but no MTC-DB benchmark exists that accounts for customizability. We formulate seven requirements for the benchmark: realistic, unambiguous, comparable, correct, scalable, simple, and independent. The benchmark focuses on performance aspects and produces nine metrics: Aulbach compliance, size on disk, tenants created, types created, attributes created, transaction data type instances created per minute, transaction data type instances loaded by ID per minute, conjunctive searches per minute, and disjunctive searches per minute. We present a specification and an example implementation in Java 8, which can be accessed from the following public repository. The same repository also contains a naive implementation of an MTC-DB in which each tenant has its own schema. We believe that this benchmark is a valuable contribution to the community of MTC-DB developers, because it provides objective comparability as well as a precise definition of the concept of an MTC-DB.
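To give a flavour of what one of the throughput metrics measures, here is a minimal sketch (in Python, for illustration only; the actual MTCB implementation is in Java 8) of counting how many insert transactions complete per minute. The insert_instance function is a hypothetical placeholder for a real MTC-DB insert.

```python
import time

def insert_instance(tenant_id, data_type, values):
    """Hypothetical placeholder for a real MTC-DB insert transaction
    that creates one data type instance for the given tenant."""
    pass  # e.g. an INSERT into the tenant's (virtual) schema

def instances_created_per_minute(duration_seconds=60):
    """Count how many insert transactions finish within the time budget,
    mimicking the 'transaction data type instances created per minute' metric."""
    count = 0
    deadline = time.monotonic() + duration_seconds
    while time.monotonic() < deadline:
        insert_instance(tenant_id=1, data_type="invoice", values={"amount": 100})
        count += 1
    # Scale to a per-minute rate in case a shorter budget was used.
    return count * 60 / duration_seconds

if __name__ == "__main__":
    print(instances_created_per_minute(duration_seconds=1))
```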

The Multi-Tenant Customizable database Benchmark will be presented at the 9th International Conference on Information Management and Engineering (ICIME 2017) on 9-11 October 2017 in Barcelona, Spain.

[download pdf]

Alexandru Serban graduates on Personalized Ranking in Academic Search

Context Based Personalized Ranking in Academic Search

by Alexandru Serban

A common criticism of search engines is that they return the same results to all users who issue the same query, even when those users have distinct information needs. Personalized search is considered a solution: search results are re-ranked based on user preferences or activity. Instead of relying on the unrealistic assumption that people will precisely specify their intent when searching, the user profile is exploited to re-rank the results. This thesis focuses on two problems related to academic information retrieval systems. The first part is dedicated to data sets for search engine evaluation. Test collections consist of documents, a set of information needs (also called topics), queries that represent those needs as they are sent to the retrieval system, and relevance judgements for the top documents retrieved from the collection. Relevance judgements are difficult to gather because the process involves manual work. We propose an automatic method to generate queries from the content of a scientific article and to evaluate the results retrieved for them. A test collection is generated, but its power to discriminate between relevant and non-relevant results is limited. In the second part of the thesis, the performance of Scopus is improved through personalization. We focus on the academic background of researchers who interact with Scopus, since information about their academic profile is already available. Two methods for personalized search are investigated.
First, the connections between academic entities, expressed as a graph structure, are used to evaluate how relevant a result is to the user. We use SimRank, a similarity measure for entities based on their relationships with other entities. Second, the semantic structure of documents is exploited to evaluate how meaningful a document is for the user. A topic model is trained to reflect the user’s interests in research areas and to estimate how relevant the search results are to those interests.
Finally, both methods are merged with the initial Scopus ranking. The results of a user study show a consistent performance increase for the first 10 results.
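For readers unfamiliar with SimRank, the first method mentioned above, here is a minimal, illustrative implementation over a toy graph of academic entities. The node names, decay factor, and iteration count are assumptions for the example and are not taken from the thesis.

```python
from itertools import product

def simrank(graph, decay=0.8, iterations=10):
    """Naive SimRank over a directed graph given as {node: [in-neighbours]}.
    Returns a dict mapping (node_a, node_b) to a similarity score."""
    nodes = list(graph)
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            in_a, in_b = graph[a], graph[b]
            if not in_a or not in_b:
                new_sim[(a, b)] = 0.0
                continue
            total = sum(sim[(i, j)] for i in in_a for j in in_b)
            new_sim[(a, b)] = decay * total / (len(in_a) * len(in_b))
        sim = new_sim
    return sim

# Toy academic graph: each entity maps to the entities linking to it.
graph = {
    "paper_A": ["author_X", "author_Y"],
    "paper_B": ["author_X"],
    "author_X": [],
    "author_Y": [],
}
scores = simrank(graph)
print(scores[("paper_A", "paper_B")])
```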

[download pdf]

Bas Niesink graduates on biomedical information retrieval

Improving biomedical information retrieval with pseudo and explicit relevance feedback

by Bas Niesink

The HERO project aims to increase the quality of supervised exercise during cancer treatment by making use of a clinical decision support system. In this research, concept-based information retrieval techniques were developed and tested to find relevant medical publications for such a system. These techniques were designed to search multiple document collections without the need to store copies of the collections.
The influence of pseudo and explicit relevance feedback using the Rocchio algorithm was explored. The underlying retrieval models that were tested are TFIDF and BM25.
The tests were conducted using the TREC Clinical Decision Support (CDS) datasets of the 2014 and 2015 editions. The TREC CDS relevance judgements were used to simulate explicit feedback. The NLM Medical Text Indexer was used to extract MeSH terms from the TREC CDS topics, to enable concept-based queries. Furthermore, the difference in performance between inverse document frequencies calculated on the entire PMC dataset and those calculated on a collection of several thousand intermediate search results was measured.
The results show that both pseudo and explicit relevance feedback have a strong positive influence on the inferred NDCG. Additionally, the performance difference when using IDF values calculated on a very small document collection is limited.
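As an illustration of the feedback step described above, here is a minimal sketch of the Rocchio update over term-weight vectors represented as dictionaries. The alpha, beta, and gamma values are common textbook defaults and not necessarily those used in the thesis.

```python
from collections import defaultdict

def rocchio(query_vec, relevant_docs, non_relevant_docs,
            alpha=1.0, beta=0.75, gamma=0.15):
    """Return a new query vector: alpha*q + beta*centroid(relevant docs)
    - gamma*centroid(non-relevant docs). Vectors are {term: weight} dicts."""
    new_query = defaultdict(float)
    for term, weight in query_vec.items():
        new_query[term] += alpha * weight
    if relevant_docs:
        for doc in relevant_docs:
            for term, weight in doc.items():
                new_query[term] += beta * weight / len(relevant_docs)
    if non_relevant_docs:
        for doc in non_relevant_docs:
            for term, weight in doc.items():
                new_query[term] -= gamma * weight / len(non_relevant_docs)
    # Negative weights are usually clipped to zero.
    return {t: w for t, w in new_query.items() if w > 0}

# Illustrative feedback round with made-up TFIDF-style weights.
query = {"exercise": 1.0, "cancer": 1.0}
relevant = [{"exercise": 0.8, "oncology": 0.5}]
non_relevant = [{"exercise": 0.1, "nutrition": 0.9}]
print(rocchio(query, relevant, non_relevant))
```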

[download pdf]

Term Extraction paper in Computing Reviews’ Best of 2016

The paper “Evaluation and analysis of term scoring methods for term extraction”, with Suzan Verberne, Maya Sappelli and Wessel Kraaij, has been selected as one of ACM Computing Reviews' 2016 Best of Computing notable articles. Computing Reviews is published by the Association for Computing Machinery (ACM); its editor-in-chief is Carol Hutchins (New York University).

In the paper, we evaluate five term scoring methods for automatic term extraction on four different types of text collections. We show that extracting relevant terms using unsupervised term scoring methods is possible in diverse use cases, and that the methods are applicable in more contexts than those they were originally designed for.
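To illustrate what an unsupervised term scoring method does, here is a small, generic sketch that ranks candidate terms by comparing their frequency in a domain collection against a background collection. It is an illustrative example only, not one of the five methods evaluated in the paper.

```python
import math
from collections import Counter

def score_terms(domain_texts, background_texts):
    """Rank terms by a simple likelihood-ratio-style score: terms that are
    much more frequent in the domain collection than in the background
    collection score high."""
    domain = Counter(w for t in domain_texts for w in t.lower().split())
    background = Counter(w for t in background_texts for w in t.lower().split())
    d_total = sum(domain.values())
    b_total = sum(background.values())
    scores = {}
    for term, d_freq in domain.items():
        d_rel = d_freq / d_total
        b_rel = (background[term] + 1) / (b_total + len(background))  # add-one smoothing
        scores[term] = d_freq * math.log(d_rel / b_rel)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

domain = ["term extraction scores candidate terms", "term scoring methods rank terms"]
background = ["the cat sat on the mat", "a general text about everyday things"]
print(score_terms(domain, background)[:5])
```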

[download pdf]

SIGIR Test of Time Awardees 1978-2001

Overview of Special Issue

by Donna Harman, Diane Kelly (Editors), James Allan, Nicholas J. Belkin, Paul Bennett, Jamie Callan, Charles Clarke, Fernando Diaz, Susan Dumais, Nicola Ferro, Donna Harman, Djoerd Hiemstra, Ian Ruthven, Tetsuya Sakai, Mark D. Smucker, Justin Zobel (Authors)

This special issue of SIGIR Forum marks the 40th anniversary of the ACM SIGIR Conference by showcasing papers selected for the ACM SIGIR Test of Time Award from the years 1978-2001. These papers document the history and evolution of IR research and practice, and illustrate the intellectual impact the SIGIR Conference has had over time.
The ACM SIGIR Test of Time Award recognizes conference papers that have had a long-lasting influence on information retrieval research. When the award guidelines were created, eligible papers were identified as those published 10 to 12 years prior to the year of the award. This meant that in the first year this award was given, 2014, eligible papers came from the years 2002-2004. To identify papers published during the period 1978-2001 that might also be recognized with the Test of Time Award, a committee was created, which was led by Keith van Rijsbergen. Members of the committee were: Nicholas Belkin, Charlie Clarke, Susan Dumais, Norbert Fuhr, Donna Harman, Diane Kelly, Stephen Robertson, Stefan Rueger, Ian Ruthven, Tetsuya Sakai, Mark Sanderson, Ryen White, and Chengxiang Zhai.
The committee used citation counts and other techniques to build a nomination pool. Nominations were also solicited from the community. In addition, a sub-committee was formed of people active in the 1980s to identify papers from the period 1978-1989 that should be recognized with the award. As a result of these processes, a nomination pool of papers was created and each paper in the pool was reviewed by a team of three committee members and assigned a grade. The 30 papers with the highest grades were selected to be recognized with an award.
To commemorate the 1978-2001 ACM SIGIR Test of Time awardees, we invited a number of people from the SIGIR community to contribute write-ups of each paper. Each write-up consists of a summary of the paper, a description of the main contributions of the paper and commentary on why the paper is still useful. This special issue contains reprints of all the papers, with the exception of a few whose copyrights are not held by ACM (members of ACM can access these papers at the ACM Digital Library as part of the original conference proceedings).
As members of the selection committee, we really enjoyed reading the older papers. The style was very different from today’s SIGIR papers: the writing was simple and unpretentious, with an equal mix of creativity, rigor and openness. We encourage everyone to read at least a handful of these papers and to consider how things have changed, and if, and how, we might bring some of the positive qualities of these older papers back to the SIGIR program.

To be published in SIGIR Forum 51(2), Association for Computing Machinery, July 2017

[download pdf]

Exploring the Query Halo Effect in Site Search

Leading People to Longer Queries

by Djoerd Hiemstra, Claudia Hauff, and Leif Azzopardi

People tend to type short queries; however, the belief is that longer queries are more effective. Consequently, a number of attempts have been made to encourage and motivate people to enter longer queries. While most have failed, a recent attempt, conducted in a laboratory setup, in which the query box has a halo or glow effect that changes as the query becomes longer, has been shown to increase query length by one term on average. In this paper, we test whether a similar increase is observed when the same component is deployed in a production system for site search and used by real end users. To this end, we conducted two separate experiments in which the rate at which the color of the halo changes was varied. In both experiments users were assigned to one of two conditions: halo and no-halo. The experiments were run over a fifty-day period, with 3,506 unique users submitting over six thousand queries. In both experiments, however, we observed no significant difference in query length. We also did not find longer queries to result in greater retrieval performance. While we did not reproduce the previous findings, our results indicate that the query halo effect appears to be sensitive to performance and task, limiting its applicability to other contexts.
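For readers curious how such a between-condition comparison can be made, here is a minimal sketch using a two-sample t-test on synthetic query lengths. The numbers are made up, and the paper does not necessarily use this particular test.

```python
from scipy import stats

# Synthetic query lengths (in terms) for the two conditions; the real study
# logged queries from 3,506 unique users over a fifty-day period.
halo_lengths = [2, 3, 2, 4, 3, 2, 3, 2]
no_halo_lengths = [2, 2, 3, 3, 2, 2, 4, 2]

# Two-sided Welch t-test: a large p-value means no significant difference in
# mean query length, which is what the paper reports for both experiments.
t_stat, p_value = stats.ttest_ind(halo_lengths, no_halo_lengths, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```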

To be presented at SIGIR 2017, the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval in Tokyo, Japan on August 7-11, 2017

Also to be presented at DIR2017, the 16th Dutch-Belgian Information Retrieval Workshop in Hilversum, The Netherlands, on November 24, 2017

[download pdf]