Wednesday, April 16th, 2014 | Author:

Today I’m going to give a presentation about my fraud detection research for the SCS chair.

Information Combination and Enrichment for Data-Driven Fraud Detection

Abstract
Governmental organizations responsible for keeping certain types of fraud under control often use data-driven methods both for immediate detection of fraud and for fraud risk analysis aimed at targeting inspections more effectively. A blind spot in such methods is that the source data often represents a ‘paper reality’. Fraudsters will attempt to disguise themselves in the data they supply, painting a world in which they do nothing wrong. This blind spot can be counteracted by enriching the data with traces and indicators from more ‘real-world’ sources such as social media and the internet. One of the crucial data management problems in accomplishing this enrichment is how to capture and handle uncertainty in the data. The presentation will start with a real-world example, which also serves as the starting point for generalizing the problem in terms of information combination and enrichment (ICE). We then present the ICE technology we have developed and a few more applications in which it has been or is intended to be applied. In terms of the 3 V’s of big data (volume, velocity, and variety), this presentation focuses on the third V: variety.

Date: Wednesday, April 16th, 2014
Room: ZI 2042
Time: 12:30-13:30 hrs

Monday, April 07th, 2014 | Author:

Last year we won the #Microposts2013 challenge; this year we came in second in the new #Microposts2014 challenge called NEEL, “Named Entity Extraction and Linking”. Unlike last year's challenge, it also involves entity disambiguation (by linking to DBpedia).
Named Entity Extraction and Linking Challenge: University of Twente at #Microposts2014 [Download]
Mena Badieh Habib, Maurice van Keulen, Zhemin Zhu

Monday, December 23rd, 2013 | Author:

Andreas Wombacher and I received a subsidy to valorize some of the research results of the COMMIT/TimeTrails project. The companies involved are Arcadis and Nspyre. The functionality of the proof-of-concept product can be summarized as follows:

  • A back-end system for collecting, managing and summarizing information from external sources, which includes the novel pre-aggregation technology from COMMIT/TimeTrails
  • A visualization component providing a unique view of aggregated information in a map-based application (Geographical Information System). It is geared towards supporting online decision making by providing interactive visualizations of the huge amounts of available information.

Besides the proof-of-concept product, we will organize and execute a few pilot projects with customers of Arcadis and Nspyre, develop product training material, and conduct several dissemination activities.

Wednesday, December 04th, 2013 | Author:

Today, PhD student Haihan Yin defended his PhD thesis. I served on his defense committee.
“Defusing the Debugging Scandal – Dedicated Debugging Technologies for Advanced Dispatching Languages”[download]

Friday, November 08th, 2013 | Author:

Mathematics lecturer Dick Meijer wrote a piece about TOM: ‘De Treurige TOM Top Tien’ (The Sad TOM Top Ten). I am, on the contrary, predominantly positive about TOM. In my role as module coordinator of ‘Parels der Informatica’ (Pearls of Computer Science), the first module of Technische Informatica (TI), I wrote a piece as a counterweight to Dick's: De Fleurige TOM Top Tien (The Cheerful TOM Top Ten)

Tuesday, October 01st, 2013 | Author:

I was interviewed for E-Novation4U, the company magazine of Unit4.
“Big data … Big brothergevoel of juist kans voor de accountant?” (Big data: a Big Brother feeling, or rather an opportunity for the accountant?)

Wednesday, August 28th, 2013 | Author:

My PhD student, Victor de Graaff, has a poster paper at SIGSPATIAL 2013.
Point of interest to region of interest conversion [details]
Victor de Graaff, Rolf A. de By, Maurice van Keulen, and Jan Flokstra
The paper will be presented at the ACM SIGSPATIAL GIS, 5-8 November 2013, Orlando, Florida, USA

Wednesday, June 26th, 2013 | Author:

ACM TechNews picked up the UT homepage news item “Gauging the Risk of Fraud From Social Media” on Henry Been’s master's project “Finding you on the Internet”.

Thursday, June 20th, 2013 | Author:

On 20 June 2013, Ben Companjen defended his MSc thesis on matching author names on publications to researcher profiles at the scale of the Netherlands. He carried out this research at DANS, where he applied and validated his techniques on a coupling between the NARCIS scholarly database and the researcher profile database VSOI.
“Probabilistically Matching Author Names to Researchers”[download]
Publications are the most important form of scientific communication, but science also consists of researchers, research projects and organisations. The goal of NARCIS (National Academic Research and Collaboration Information System) is to provide a complete and concise view of current science in the Netherlands.
Connecting publications in retrospect to the researchers, projects and organisations that created them is hard, because author identifiers are rarely used in publications and researcher profiles. There is too much data to identify all researchers in NARCIS manually, so an automatic method is needed to assist in completing the view of science in the Netherlands.
In this thesis the problems that limit automatic connection of author names in publications to researchers are explored and a method to automatically connect publications and researchers is developed and evaluated.
In an experiment with two test sets, using only the author names themselves found the correct researcher for around 80% of the author names. However, none of the correct matches were given the highest confidence of the returned matches. Over 90% of the correct matches were ranked second by confidence. Other correct matches were ranked lower. Using probabilistic results makes it possible to work with the correct results even if they are not the best match. Many names that should not match were included in the matches. The matching algorithm can be optimised to assign confidence to matches differently.
Including a matching function that compares publication titles and researchers’ project titles did not improve the results, but better results are expected when more context elements are used to assign confidences.
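To give a feel for what "probabilistic results" means here, the following is a minimal sketch of ranking candidate researchers for an author name and keeping all of them with a confidence, rather than only the top hit. It is not the method from the thesis: the string similarity (Python's difflib), the normalization step, the threshold, and the names normalize and rank_candidates are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a name, drop periods and commas, and sort its parts.

    Sorting the parts is an illustrative assumption so that
    'Vries, J. de' and 'J. de Vries' end up as the same string.
    """
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(sorted(cleaned.split()))


def rank_candidates(author_name, researchers, threshold=0.5):
    """Return (researcher, confidence) pairs, highest confidence first.

    Every candidate whose similarity reaches the (hypothetical) threshold
    is kept; the similarities are then scaled so the confidences sum to 1.
    """
    target = normalize(author_name)
    scored = []
    for researcher in researchers:
        score = SequenceMatcher(None, target, normalize(researcher)).ratio()
        if score >= threshold:
            scored.append((researcher, score))
    total = sum(score for _, score in scored)
    ranked = [(researcher, score / total) for researcher, score in scored]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Made-up example data, not taken from NARCIS or VSOI.
researchers = ["Jan de Vries", "J. de Vries", "Piet Jansen"]
ranked = rank_candidates("Vries, J. de", researchers)
for researcher, confidence in ranked:
    print(f"{researcher}: {confidence:.2f}")
```

Because the whole ranked list is returned, a later step (for example one that uses publication or project context) can still pick up a correct researcher who was ranked second or lower.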

Tuesday, June 18th, 2013 | Author:

The news feed of the UT homepage features an item on the master's project of Henry Been.
Gauging the risk of fraud from social media.