Wednesday, February 25th, 2015 | Author:

Today I gave a presentation at the SIKS Smart Auditing workshop at Tilburg University.

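Thursday, May 15th, 2014 | Author:

Several news outlets picked up the UT homepage news item on the research of my PhD student Mena Badieh Habib on Named Entity Extraction and Named Entity Disambiguation (headlines in Dutch):

  • UT laat politiecomputers tweets ‘begrijpen’ voor veiligheid bij evenementen (“UT lets police computers ‘understand’ tweets for safety at events”)
  • Universiteit Twente laat computers beter begrijpend lezen (“University of Twente makes computers better at reading comprehension”)
  • Twentse computer leest beter (“Twente computer reads better”)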

Wednesday, May 14th, 2014 | Author:

The news feed of the UT homepage features an item on the research of my PhD student Mena Badieh Habib.
Computers leren beter begrijpend lezen dankzij UT-onderzoek (“Computers learn to read with better comprehension thanks to UT research”; article in Dutch).
Mena defended his PhD thesis, entitled “Named Entity Extraction and Disambiguation for Informal Text – The Missing Link”, on May 9th.

Friday, May 09th, 2014 | Author:

Today, a PhD student of mine, Mena Badieh Habib Morgan, defended his thesis.
Named Entity Extraction and Disambiguation for Informal Text – The Missing Link
Social media content represents a large portion of all textual content appearing on the Internet. These streams of user-generated content (UGC) present both an opportunity and a challenge for media analysts: to analyze huge amounts of new data and use them to infer and reason with new information. A main challenge of natural language is its ambiguity and vagueness. In the informal language widely used in social media, the language becomes even more ambiguous and thus more challenging for automatic understanding.
Named Entity Extraction (NEE) is a subtask of Information Extraction (IE) that aims to locate phrases (mentions) in the text that represent names of entities such as persons, organizations or locations, regardless of their type. Named Entity Disambiguation (NED) is the task of determining which person, place, event, etc. a mention actually refers to.
The main goal of this thesis is to mimic the human way of recognizing and disambiguating named entities, especially for domains that lack formal sentence structure. We propose a robust combined framework for NEE and NED in semi-formal and informal text. The achieved robustness has been shown to hold across languages and domains and to be independent of the selected extraction and disambiguation techniques. It is also shown to be robust against scarcity of labeled training data and against the informality of the language used.
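
To make the two tasks concrete, here is a minimal toy sketch in Python. It is not the framework from the thesis: the knowledge base, the entity identifiers, and the keyword-overlap scoring are all hypothetical choices made purely for illustration. The extraction step (NEE) is a plain dictionary lookup, and the disambiguation step (NED) picks the candidate whose keywords overlap most with the rest of the message.

```python
import re

# Hypothetical toy knowledge base: surface form -> candidate entities,
# each with a few context keywords. All entries are made up for illustration.
KNOWLEDGE_BASE = {
    "paris": [
        {"id": "Paris_(France)", "keywords": {"france", "eiffel", "city", "seine"}},
        {"id": "Paris_Hilton", "keywords": {"hilton", "celebrity", "hotel", "party"}},
    ],
    "twente": [
        {"id": "University_of_Twente", "keywords": {"university", "research", "phd"}},
        {"id": "FC_Twente", "keywords": {"football", "match", "goal"}},
    ],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_mentions(text):
    """NEE step (toy): keep the tokens that match a known surface form."""
    return [tok for tok in tokenize(text) if tok in KNOWLEDGE_BASE]


def disambiguate(mention, text):
    """NED step (toy): pick the candidate with the largest keyword overlap
    with the other words in the message. Ties go to the first candidate."""
    context = set(tokenize(text)) - {mention}
    candidates = KNOWLEDGE_BASE[mention]
    best = max(candidates, key=lambda c: len(c["keywords"] & context))
    return best["id"]


def link_entities(text):
    """Run extraction, then disambiguation, and map mentions to entity ids."""
    return {m: disambiguate(m, text) for m in extract_mentions(text)}


if __name__ == "__main__":
    print(link_entities("omg paris at the hilton party again"))
    print(link_entities("watching the twente match, what a goal"))
```

The same mention resolves to a different entity depending on the surrounding words, which is exactly the ambiguity that short, informal messages make hard: they often contain too little context for a simple overlap score like this one to work.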

Wednesday, April 16th, 2014 | Author:

Today I’m going to give a presentation about my fraud detection research for the SCS chair.

Information Combination and Enrichment for Data-Driven Fraud Detection

Governmental organizations responsible for keeping certain types of fraud under control often use data-driven methods, both for immediate detection of fraud and for fraud risk analysis aimed at targeting inspections more effectively. A blind spot in such methods is that the source data often represents a ‘paper reality’. Fraudsters will attempt to disguise themselves in the data they supply, painting a world in which they do nothing wrong. This blind spot can be counteracted by enriching the data with traces and indicators from more ‘real-world’ sources such as social media and the internet. One of the crucial data management problems in accomplishing this enrichment is how to capture and handle uncertainty in the data. The presentation will start with a real-world example, which also serves as the starting point for a generalization of the problem in terms of information combination and enrichment (ICE). We then present the ICE technology we have developed and a few more applications in which it has been, or is intended to be, applied. In terms of the 3 V’s of big data (volume, velocity, and variety), this presentation focuses on the third V: variety.

Date: Wednesday, April 16th, 2014
Room: ZI 2042
Time: 12:30-13:30 hrs

Monday, April 07th, 2014 | Author:

Last year we won the #Microposts2013 challenge; this year we came in second in the new #Microposts2014 challenge, called NEEL (“Named Entity Extraction and Linking”), which, unlike last year’s challenge, also involves Entity Disambiguation (by linking to DBpedia).
Named Entity Extraction and Linking Challenge: University of Twente at #Microposts2014 [Download]
Mena Badieh Habib, Maurice van Keulen, Zhemin Zhu

Monday, December 23rd, 2013 | Author:

Andreas Wombacher and I received a subsidy to valorize some of the research results of the COMMIT/TimeTrails project. The companies involved are Arcadis and Nspyre. The functionality of the proof-of-concept product can be summarized as follows:

  • A back-end system for collecting, managing and summarizing information from external sources, which includes the novel pre-aggregation technology from COMMIT/TimeTrails.
  • A visualization component providing a unique view of aggregated information in a map-based application (Geographical Information System). It is geared towards supporting online decision making by providing interactive visualizations of the huge amounts of available information.

Besides the proof-of-concept product, we will organize and execute a few pilot projects with customers of Arcadis and Nspyre, develop product training material, and conduct several dissemination activities.

Wednesday, December 04th, 2013 | Author:

Today, PhD student Haihan Yin defended his PhD thesis. I served on his defense committee.
“Defusing the Debugging Scandal – Dedicated Debugging Technologies for Advanced Dispatching Languages” [download]

Friday, November 08th, 2013 | Author:

Mathematics lecturer Dick Meijer wrote a piece about TOM: ‘De Treurige TOM Top Tien’ (“The Sad TOM Top Ten”). I, on the other hand, am predominantly positive about TOM. In my role as module coordinator of ‘Parels der Informatica’ (“Pearls of Computer Science”), the first module of Technische Informatica (TI), I wrote a piece as a counterweight to Dick’s: ‘De Fleurige TOM Top Tien’ (“The Cheerful TOM Top Ten”).

Tuesday, October 01st, 2013 | Author:

I was interviewed for E-Novation4U, the company magazine of Unit4:
“Big data … Big brothergevoel of juist kans voor de accountant?” (“Big data … Big Brother feeling, or rather an opportunity for the accountant?”; in Dutch)