Tag-Archive for » PayDIBI «

Monday, December 10th, 2012 | Author:

Brend Wanders, a PhD student of mine, presents a poster at the BeNeLux Bioinformatics Conference (BBC 2012) in Nijmegen.
Pay-as-you-go data integration for bio-informatics
Brend Wanders
Background: Scientific research in bio-informatics is often data-driven and supported by biological databases. In a growing number of research projects, researchers like to ask questions that require the combination of information from more than one database. Most bio-informatics papers do not detail the integration of different databases. As roughly 30% of all tasks in workflows are data transformation tasks, database integration is an important issue. Integrating multiple data sources can be difficult. As data sources are created, many design decisions are made by their creators.
Methods: Our research is guided by two use cases: homologues (the representation and integration of groupings) and metabolomics integration, with a focus on the TCA cycle.
Results: We propose to approach the time-consuming problem of integrating multiple biological databases through the principles of ‘pay-as-you-go’ and ‘good-is-good-enough’. By assisting the user in defining a knowledge base of data mapping rules, trust information, and other evidence, we allow the user to focus on the work and put in only as much effort as the integration requires. Through user feedback on query results and trust assessments, the integration can be improved over time.
Conclusions: We conclude that this direction of research is worthy of further exploration.

Thursday, September 01st, 2011 | Author:

A master's student carried out a problem exploration for the PayDIBI project. This is the report he wrote.
Integration of Biological Sources – Exploring the Case of Protein Homology
Tjeerd W. Boerman, Maurice van Keulen, Paul van der Vet, Edouard I. Severing (Wageningen University)
Data integration is a key issue in the domain of bioinformatics, which deals with huge amounts of heterogeneous biological data that grows and changes rapidly. This paper serves as an introduction to the field of bioinformatics and the biological concepts it deals with, and as an exploration of the integration problems a bioinformatics scientist faces. We examine ProGMap, an integrated protein homology system used by bioinformatics scientists at Wageningen University, and several use cases related to protein homology. A key issue we identify is the huge manual effort required to unify source databases into a single resource. Uncertain databases are able to contain several possible worlds, and it has been proposed that they can be used to significantly reduce initial integration efforts. We propose several directions for future work where uncertain databases can be applied to bioinformatics, with the goal of furthering the cause of bioinformatics integration.

Monday, January 24th, 2011 | Author:

I have a vacancy for a PhD position in a project called “Pay-As-You-Go Data Integration for Bio-Informatics” (PayDIBI). In short, the objective is to develop data coupling and integration technology to support bio-informatics scientists in quickly constructing targeted data sets for researching questions that require the combination of information from more than one biological database. More information and a webform to apply can be found here.