The Apers Tree

Monday, February 19th, 2018, posted by Djoerd Hiemstra

To celebrate Peter Apers’ retirement, we created The Apers Tree, which displays the academic genealogy of Peter Apers. The tree is inspired by the wonderful Mathematics Genealogy Project and is a gift from the Database Group of the University of Twente on the occasion of Peter’s retirement on 16 February 2018.

Check out the Apers Tree on GitHub.

Christel Geurts graduates on Cross-Domain Authorship Attribution

Friday, January 12th, 2018, posted by Djoerd Hiemstra

Cross-Domain Authorship Attribution as a Tool for Digital Investigations

by Christel Geurts

On the dark web, sites promoting illegal content are abundant and new sites are constantly being created. At the same time, law enforcement is working hard to take these sites down and track down the persons involved. Often, after a site is taken down, users change their names and move to a different site. But what if law enforcement could track users across sites? Each site or source of information is called a domain. As the domain changes, the context of a message often changes too, making it challenging to track users simply by the words they use. The aim of this thesis is to develop a system that can link the written text of authors in a cross-domain setting. The system was tested on a blog corpus and verified on police data. Tests show that multinomial logistic regression and Support Vector Machines with a linear kernel perform well. Character 3-grams work well as features, and combining multiple feature sets increases performance. Logistic regression models with a combined feature set performed best (accuracy = 0.717, MRR = 0.7785 for 1000 authors in the blog corpus). On the police data, the logistic regression model had an accuracy of 0.612 and an MRR of 0.6883 for 521 authors.
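To make the features and metrics mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the thesis implementation: the function names and the toy data are hypothetical, and it only illustrates how character 3-grams can be counted and how accuracy and mean reciprocal rank (MRR) are commonly computed from a ranked list of candidate authors.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a text (n=3 gives 3-grams)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def accuracy(rankings, true_authors):
    """Fraction of texts whose top-ranked candidate is the true author."""
    hits = sum(1 for ranked, truth in zip(rankings, true_authors)
               if ranked[0] == truth)
    return hits / len(true_authors)


def mean_reciprocal_rank(rankings, true_authors):
    """Average of 1/rank of the true author; 0 if the author is not ranked."""
    total = 0.0
    for ranked, truth in zip(rankings, true_authors):
        if truth in ranked:
            total += 1.0 / (ranked.index(truth) + 1)
    return total / len(true_authors)


# Toy data (hypothetical): for each text, candidate authors ordered from
# most to least likely, as a classifier's scores might rank them.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
true_authors = ["a", "a", "a"]

print(char_ngrams("banana"))
print(accuracy(rankings, true_authors))
print(mean_reciprocal_rank(rankings, true_authors))
```

In this toy example the true author is ranked first, second, and third respectively, so accuracy only rewards the first text while MRR also gives partial credit to the other two. That is why the MRR values reported above are higher than the accuracy values.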

Supporting the Exploration of Online Cultural Heritage Collections

Wednesday, January 10th, 2018, posted by Djoerd Hiemstra

The Case of the Dutch Folktale Database

by Iwe Muiser, Mariët Theune, Ruud de Jong, Nigel Smink, Dolf Trieschnigg, Djoerd Hiemstra, and Theo Meder

This paper demonstrates the use of a user-centred design approach for the development of generous interfaces/rich prospect browsers for an online cultural heritage collection, determining its primary user groups and designing different browsing tools to cater to their specific needs. We set out to solve a set of problems faced by many online cultural heritage collections: lack of accessibility, limited functionality for exploring the collection through browsing, and the risk of lesser-known content being overlooked. The object of our study is the Dutch Folktale Database, an online collection of tens of thousands of folktales from the Netherlands. Although this collection was designed as a research commodity for folktale experts, its primary user group consists of casual users from the general public. We present the new interfaces we developed to facilitate browsing and exploration of the collection by both folktale experts and casual users. We focus on the user-centred design approach we adopted to develop interfaces that would fit the users’ needs and preferences.

Screen Shot of the Dutch Folktale Database

Published in Digital Humanities Quarterly 11(4), 2017

Access the Folktale Database at: