Fieke Hillerström graduates on Deep Verification Learning

December 13th, 2016, posted by Djoerd Hiemstra

by Fieke Hillerström

Deep Verification Learning

Deep learning for biometrics has gained increasing attention in recent years. Thanks to growing computational power and ever-larger datasets, performance has surpassed that of humans on certain verification tasks. However, large datasets are not available for every application. We therefore introduce Deep Verification Learning, which reduces network complexity and can be trained on smaller datasets with more modest hardware. Deep Verification Learning takes the two images to be verified as input to a deep network and trains directly towards a verification score. This topology enables the network to learn differences and similarities in its first layer, and to exploit verification signals during training. Training directly towards a verification score significantly reduces the number of trainable parameters. We applied Deep Verification Learning to the face verification task, although it could be extended to other biometric modalities. We compared our face verification topology with a network trained for multi-class classification on the FRGC dataset, which contains only 568 subjects. Deep Verification Learning performs substantially better.
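
The pair-input idea can be illustrated with a deliberately tiny sketch. Everything below is a hypothetical stand-in: absolute pixel differences replace the learned first layer, a single logistic unit replaces the deep network, and the "images" are synthetic vectors. Only the principle is the same: the model is trained directly towards a verification score rather than a per-subject class label.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # size of a flattened toy "image"

def make_pair(same):
    a = rng.normal(size=D)
    b = a + rng.normal(scale=0.1, size=D) if same else rng.normal(size=D)
    # absolute difference as a crude stand-in for a learned first layer
    return np.abs(a - b), float(same)

pairs = [make_pair(i % 2 == 0) for i in range(400)]
X = np.array([f for f, _ in pairs])
y = np.array([t for _, t in pairs])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

# A single logistic unit trained directly towards a verification score:
# one weight per feature plus a bias, instead of a per-subject output layer.
w, b, lr = np.zeros(D), 0.0, 0.1
for _ in range(300):
    scores = sigmoid(X @ w + b)      # verification scores in (0, 1)
    grad = scores - y                # gradient of the log-loss
    w -= lr * (X.T @ grad) / len(y)
    b -= lr * grad.mean()

scores = sigmoid(X @ w + b)
accuracy = ((scores > 0.5) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

A real implementation would learn the pairing inside convolutional layers; the point of the toy is only that a single verification output needs far fewer parameters than a 568-way classification head.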

[download]

Vacation days for societal impact

December 12th, 2016, posted by Djoerd Hiemstra

by Joe Laufer, Mariëlle Winkler, Djoerd Hiemstra, and Susanne de Gooijer (Free Spirits UniTe)

Many employees of the University of Twente spend part of their free time on volunteer projects that directly benefit society. They use their expertise and professional knowledge, for instance, by teaching children, lecturing students in developing countries, supporting the elderly with new technology, or rewriting an NGO’s strategic plan. These employees struggle to allocate enough free time for their volunteer work, whereas others might not need all their vacation days. Our proposal is simple: employees can take their vacation days and give them to colleagues who need them for their volunteer work. Our proposal extends the university’s vision of “the entrepreneurial university” to explicitly support projects with societal impact. We envision the following steps:

  • Employees initiate projects that have societal impact and ask support of the campus community;
  • Employees can donate one or more vacation days to become part of the community, and to form a pool of additional free time, to be used by the project initiators;
  • Project initiators pitch their ideas to the community, similar to pitches on crowdfunding platforms like Kickstarter;
  • The community can vote for the projects of their choice;
  • Once enough vacation days are donated to an initiative, the project initiator can use the extra time off to carry out the initiative (in addition to the free time they already invest in it);
  • Students can participate in projects for credits (starting with building an on-line community platform);
  • Alumni can sponsor initiatives financially, share their network, and coach the project initiators;
  • Project initiators share their experience and accomplishments with the community, for instance by blogging about their project;
  • Initiatives should be done in cooperation with an NGO.

We are proud that our proposal was accepted as one of the Living Smart Campus projects.

Rutger Varkevisser graduates on Large Scale Online Readability Assessment

November 30th, 2016, posted by Djoerd Hiemstra

by Rutger Varkevisser

The internet is an incredible resource for information and learning. Using search engines like Google, information is usually just a click away, unless you are a child, in which case most of the information on the web is either (way) too difficult to read and/or understand, or impossible to find. This research aims to combine the areas of readability assessment and gamification in order to provide a technical and theoretical foundation for an automatic, large-scale, child-feedback readability assessment system, in which correctly assessing the readability level of online (textual) content for children is the central focus. The importance of correct readability scores for online content is that they give children a guideline on the difficulty of textual content on the web. They also allow external programs, e.g. search engines, to take readability scores into account based on the known age or proficiency of the user. Having children actively participate in the process of determining readability levels should improve current systems, which usually rely on fully automated algorithms or human (adult) perception.
The first step in the creation of the aforementioned tool is to make sure the underlying process is scientifically valid. This research adapted the Cloze-test as a method of determining the readability of a text. The Cloze-test is an established and well-researched method of readability assessment, which works by omitting certain words from a text and tasking the user with filling in the open spots with the correct words; the resulting overall score determines the readability level. For this research we wanted to digitize and automate this process. However, while the Cloze-test and its results have been proven valid in an offline (paper) environment, the same has not been shown for a digital adaptation. Therefore the first part of this research focuses on this central issue. By combining the areas of readability assessment (the Cloze-test), gamification (a digital online adaptation of the Cloze-test) and child-computer interaction (a user test of the developed tool on the target audience), this validity was examined and tested. In the user test, participants completed several different Cloze-test texts, half of them offline (on paper) and the other half in a recreated online environment. This was done to measure the correlation between the online scores and the offline scores, which we already know are valid. Results of the user test confirmed the validity of the online version, showing significant correlations between the offline and online versions in both a Pearson correlation coefficient and a Spearman rank-order analysis.
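
The offline/online comparison boils down to two standard statistics that can be computed in plain Python. The scores below are made-up examples and the rank step ignores ties; this is a generic sketch, not the thesis's analysis code.

```python
def pearson(x, y):
    # Pearson product-moment correlation of two equal-length sequences.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    # Spearman rank-order correlation: Pearson computed on the ranks.
    def ranks(v):
        order = sorted(range(len(v)), key=v.__getitem__)
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    return pearson(ranks(x), ranks(y))

offline = [12, 15, 9, 20, 17]   # hypothetical paper Cloze scores
online = [11, 16, 8, 21, 18]    # hypothetical online Cloze scores
print(pearson(offline, online), spearman(offline, online))
```
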
With the knowledge that the online adaptation of the Cloze-test is valid for determining readability scores, the next step was to automate the process of creating Cloze-tests from texts, given that the goal of the project was to provide the basis of a scalable gamified approach, and scalable in this context means automated. Several methods were developed to mimic the human process of creating a Cloze-test (i.e. looking at the text and selecting which words to omit given a set of general guidelines). These methods included TF.IDF and NLP approaches for finding suitable words to omit for the purposes of a Cloze-test. They were tested by comparing the classification performance of each method against a baseline set of manually classified/marked texts. The final versions of these methods achieved performance scores of around 50%, i.e. that is how well they emulated human performance in creating Cloze-tests. A combination of automated methods achieved a higher performance score of 63%. The best-performing individual method was put to the test in a small Turing-test-style user test, which showed promising results: presented with two manually and one automatically created Cloze-test, participants attained similar scores across all tests. Participants also gave contradicting responses when asked which of the three Cloze-tests was automated. This research concludes the following:

  1. Results of offline and online Cloze-tests are highly correlated.
  2. Automated methods are able to correctly identify 63% of suitable Cloze-test words as marked by humans.
  3. Users gave conflicting reports when asked to identify the automated test in a mix of automated and human-made Cloze-tests.
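
To give an impression of the TF.IDF approach, the sketch below scores the words of a target text against a tiny background corpus and blanks out the highest-scoring ones. The corpus, the scoring formula, and the blanking rule are simplified assumptions, not the methods evaluated in the thesis.

```python
import math
import re
from collections import Counter

def tfidf_cloze_words(target, corpus, top_k=3):
    # Score each word of the target text by TF.IDF against the corpus.
    docs = [set(re.findall(r"[a-z']+", d.lower())) for d in corpus]
    words = re.findall(r"[a-z']+", target.lower())
    tf = Counter(words)
    df = Counter(w for d in docs for w in d)
    n = len(docs)
    score = {w: tf[w] * math.log((n + 1) / (df[w] + 1)) for w in tf}
    return [w for w, _ in sorted(score.items(), key=lambda kv: -kv[1])[:top_k]]

def make_cloze(text, omit):
    # Blank out each selected word with underscores of matching length.
    for w in omit:
        text = re.sub(rf"\b{re.escape(w)}\b", "_" * len(w), text, flags=re.I)
    return text

corpus = ["the cat sat on the mat", "the dog sat on the rug",
          "a bird flew over the house"]
target = "the small penguin waddled over the frozen ice"
omit = tfidf_cloze_words(target, corpus)
print(make_cloze(target, omit))
```

Common words score low (they occur in many background documents), so the blanks fall on content words, which roughly mimics the human guideline of omitting informative words.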

[download pdf]

IP&M Best Paper Award for A cross-benchmark comparison of 87 learning to rank methods

November 7th, 2016, posted by Djoerd Hiemstra

We are proud to receive the Information Processing & Management Best Paper Award 2015 for our paper: A cross-benchmark comparison of 87 learning to rank methods.

IPM Best Paper Award Certificate

Published in Information Processing and Management 51(6), pages 757–772

[download preprint]

Inoculating Relevance Feedback Against Poison Pills

November 4th, 2016, posted by Djoerd Hiemstra

by Mostafa Dehghani, Hosein Azarbonyad, Jaap Kamps, Djoerd Hiemstra, and Maarten Marx

Relevance Feedback (RF) is a common approach for enriching queries, given a set of explicitly or implicitly judged documents, to improve retrieval performance. Although it has been shown that on average the overall retrieval performance improves after relevance feedback, for some topics employing certain relevant documents may decrease the average precision of the initial run. This is mostly because a feedback document is only partially relevant and contains off-topic terms; adding these to the query as expansion terms hurts retrieval performance. Relevant documents that hurt retrieval performance after feedback are called “poison pills”. In this paper, we discuss the effect of poison pills on relevance feedback and present significant words language models (SWLM) as an approach for estimating a feedback model that tackles this problem.
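
The intuition behind significant words, keeping terms that are supported by all feedback documents rather than by a single partially relevant one, can be sketched roughly as follows. This is an illustrative approximation, not the SWLM estimation procedure of the paper; the `mu` weight and the support factor are assumptions of the sketch.

```python
from collections import Counter

def significant_words(feedback_docs, collection_lm, top_k=5, mu=0.5):
    """Rank terms that are common across ALL feedback documents, while
    discounting terms that are frequent in the whole collection or that
    come from a single (possibly off-topic) document, the "poison pill"."""
    n = len(feedback_docs)
    tfs = [Counter(d) for d in feedback_docs]
    vocab = set().union(*tfs)
    scores = {}
    for w in vocab:
        # p(w | feedback): average relative frequency across documents
        p_fb = sum(tf[w] / sum(tf.values()) for tf in tfs) / n
        # support: fraction of feedback docs containing w; terms backed
        # by only one partially relevant document get low support
        support = sum(w in tf for tf in tfs) / n
        p_bg = collection_lm.get(w, 1e-6)  # collection (background) model
        scores[w] = support * max(p_fb - mu * p_bg, 0.0)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

feedback_docs = [["retrieval", "model", "apple"],
                 ["retrieval", "model", "banana"],
                 ["retrieval", "model", "cherry"]]
collection_lm = {"the": 0.1, "apple": 0.001}
print(significant_words(feedback_docs, collection_lm, top_k=3))
```

Here "apple", "banana" and "cherry" each occur in only one feedback document, so they rank below the terms shared by all three documents.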

To be presented at the 15th Dutch-Belgian Information Retrieval Workshop, DIR 2016 on 25 November in Delft.

[download pdf]

Dutch-Belgian Information Retrieval workshop in Delft

November 2nd, 2016, posted by Djoerd Hiemstra

The Dutch-Belgian Information Retrieval workshop DIR 2016 will be held in Delft on 25 November. The preliminary workshop program contains 2 keynotes, 12 oral presentations and 7 poster presentations. Max Wilson from the University of Nottingham will provide a Human-Computer Interaction perspective on Information Retrieval. Carlos Castillo from Eurecat will talk about the detection of algorithmic discrimination.

DIR 2016

Register at http://dir2016.nl.

Data Science Platform Netherlands

October 7th, 2016, posted by Djoerd Hiemstra

Data Science Platform Netherlands

The Data Science Platform Netherlands (DSPN) is the national platform for ICT research within the Data Science domain. Data Science is the collection and analysis of so-called ‘Big Data’ according to academic methodology. DSPN unites all Dutch academic research institutions where Data Science is carried out from an ICT perspective, specifically the computer science or applied mathematics perspectives. The objectives of DSPN are to:

  • Highlight the importance of ICT research in Big Data and Data Science, especially in national discussions about research and education.
  • Exchange and disseminate information about Data Science research and education.
  • Build and maintain a network of ICT researchers active in the field of Data Science.

DSPN was launched as part of the ICT Research Platform Netherlands (IPN) to give a voice to the Data Science initiatives of the Dutch ICT research organisations. For more information, see the website at: http://www.datascienceplatform.org/.

#WhoAmI in 160 Characters?

October 5th, 2016, posted by Djoerd Hiemstra

Classifying Social Identities Based on Twitter

by Anna Priante, Djoerd Hiemstra, Tijs van den Broek, Aaqib Saeed, Michel Ehrenhard, and Ariana Need

We combine social theory and NLP methods to classify English-speaking Twitter users’ online social identity in profile descriptions. We conduct two text classification experiments. In Experiment 1 we use a 5-category online social identity classification based on identity and self-categorization theories. While we are able to automatically classify two identity categories (Relational and Occupational), automatic classification of the other three identities (Political, Ethnic/religious and Stigmatized) is challenging. In Experiment 2 we test a merger of such identities based on theoretical arguments. We find that by combining these identities we can improve the predictive performance of the classifiers in the experiment. Our study shows how social theory can be used to guide NLP methods, and how such methods provide input to revisit traditional social theory that is strongly consolidated in offline settings.
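
In practice, Experiment 2's merger amounts to relabeling the data before training. The mapping below is purely hypothetical (the paper's actual merger is theory-driven and may group the categories differently); it only shows how merging sparse classes increases per-class training support.

```python
from collections import Counter

# Hypothetical merger: the three hard-to-classify identities are mapped
# onto one coarser label before training the classifier.
MERGE = {"Political": "Sociopolitical",
         "Ethnic/religious": "Sociopolitical",
         "Stigmatized": "Sociopolitical"}

def merge_labels(labels):
    return [MERGE.get(label, label) for label in labels]

labels = ["Relational", "Political", "Stigmatized",
          "Occupational", "Ethnic/religious"]
print(Counter(merge_labels(labels)))  # the merged class has more examples
```
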

To be presented at the EMNLP Workshop on Natural Language Processing and Computational Social Science (NLP+CSS) on November 5 in Austin, Texas, USA.

[download pdf]

Download the code book and classifier source code from github.

Data Science guest lectures

September 26th, 2016, posted by Djoerd Hiemstra

On 12 October we organize another Data Science Day in the Design Lab with guest lectures by Thijs Westerveld (Chief Science Officer at WizeNoze, Amsterdam), and Iadh Ounis (Professor of Information Retrieval in the School of Computing Science at the University of Glasgow). For more information and registration, see: http://ml.ewi.utwente.nl/ds2016/.

Resource Selection for Federated Search on the Web

September 22nd, 2016, posted by Djoerd Hiemstra

by Dong Nguyen, Thomas Demeester, Dolf Trieschnigg, and Djoerd Hiemstra

A publicly available dataset for federated search reflecting a real web environment has long been absent, making it difficult for researchers to test the validity of their federated search algorithms for the web setting. We present several experiments and analyses on resource selection on the web using a recently released test collection containing the results from more than a hundred real search engines, ranging from large general web search engines such as Google, Bing and Yahoo to small domain-specific engines.
First, we experiment with estimating the size of uncooperative search engines on the web using query-based sampling and propose a new method using the ClueWeb09 dataset. We find the size estimates to be highly effective in resource selection. Second, we show that an optimized federated search system based on smaller web search engines can be an alternative to a system using large web search engines. Third, we provide an empirical comparison of several popular resource selection methods and find that these methods are not readily suitable for resource selection on the web. Challenges include the sparse resource descriptions and the extremely skewed collection sizes.
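
Size estimation for an uncooperative engine is often introduced via the classical capture-recapture estimator, sketched below on simulated samples. Note that the paper proposes a different, ClueWeb09-based estimator; this is only the textbook baseline.

```python
import random

def capture_recapture(sample_a, sample_b):
    """Estimate an engine's index size from two independent document
    samples (e.g. obtained by query-based sampling): if the samples are
    random, size is approximately |A| * |B| / |A intersect B|."""
    a, b = set(sample_a), set(sample_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("samples do not overlap; sample more documents")
    return len(a) * len(b) / overlap

# Simulate a hidden index of 10,000 documents and two samples of 1,000.
rng = random.Random(42)
index = range(10_000)
est = capture_recapture(rng.sample(index, 1000), rng.sample(index, 1000))
print(f"estimated index size: {est:.0f}")
```

In reality query-based samples are biased towards highly ranked documents, which is one reason dedicated estimators such as the one in this paper are needed.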

[download pdf]