Archive for » September, 2009 «

Friday, September 18th, 2009

We have developed a demonstration that shows what happens behind the scenes of our ROX approach to run-time query optimization.
The Robustness of a Run-Time XQuery Optimizer against Correlated Data
Riham Abdel Kader, Peter Boncz, Stefan Manegold, Maurice van Keulen
We demonstrate ROX, a run-time optimizer of XQueries, which focuses on finding the best execution order of the XPath steps and relational joins in an XQuery. The problem of join ordering has been extensively researched, but the proposed techniques remain unsatisfactory: they either rely on a cost model, which may produce inaccurate estimates, or explore only a restricted part of the search space. ROX is designed to tackle these problems. It needs no cost model and defers query optimization to run-time, interleaving optimization and execution steps. In every optimization step, sampling techniques estimate the cardinality of the as-yet unexecuted steps and joins, to decide which sequence of operators to process next. Each execution step then provides updated, accurate knowledge about the intermediate results, which is used in the next optimization round. The demonstration focuses on: (i) illustrating the steps that ROX follows and the decisions it makes to choose a good join order, (ii) showing ROX’s robustness in the face of data with different degrees of correlation, (iii) comparing the performance of the plan chosen by ROX to that of other plans from the search space, and (iv) showing that ROX’s run-time overhead is restricted to a small fraction of the execution time.

The paper will be presented at the 26th International Conference on Data Engineering (ICDE2010), 1-6 Mar 2010, Long Beach, California, USA [details]
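The interleaving of optimization and execution described in the abstract can be sketched as follows. This is a toy illustration, not the ROX implementation (which operates on XPath steps inside a real XQuery engine): at each round, the cardinality of every remaining join is estimated by running it over a random sample of the current intermediate result, the cheapest join is then executed in full, and the actual result feeds the next round, so the estimates stay accurate even when the data is correlated. All names here are hypothetical.

```python
# Toy sketch of run-time join ordering with sampling (illustrative only).
import random

def estimate_cardinality(intermediate, join, sample_size=100):
    """Estimate the output size of applying `join` to `intermediate`
    by executing it over a random sample and scaling up."""
    if len(intermediate) <= sample_size:
        sample = intermediate
    else:
        sample = random.sample(intermediate, sample_size)
    hits = sum(len(join(row)) for row in sample)
    return hits * len(intermediate) / max(len(sample), 1)

def runtime_join_order(start, joins):
    """Interleave optimization (sampling) and execution (full join).
    `joins` maps a name to a function: row -> list of output rows."""
    intermediate = start
    order = []
    remaining = dict(joins)
    while remaining:
        # Optimization step: pick the join with the smallest estimate.
        best = min(remaining, key=lambda name:
                   estimate_cardinality(intermediate, remaining[name]))
        order.append(best)
        # Execution step: run the chosen join over the full input;
        # the result is exact knowledge for the next round.
        join = remaining.pop(best)
        intermediate = [out for row in intermediate for out in join(row)]
    return order, intermediate

# Example: a selective join ("a") is correctly scheduled before an
# expanding one ("b"), keeping intermediate results small.
start = [1, 2, 3, 4]
joins = {"a": lambda r: [r] if r % 2 == 0 else [],  # keeps even rows
         "b": lambda r: [r, r]}                     # doubles every row
order, result = runtime_join_order(start, joins)
print(order)   # ['a', 'b']
```

Because the inputs here are smaller than the sample size, the "estimates" are exact and the selective join is always chosen first; on real data the sampling trades a small run-time overhead for estimates that reflect the actual intermediate results.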

Monday, September 7th, 2009

In cooperation with ITC (International Institute for Geo-Information Science and Earth Observation), we have a PhD position available on Neogeography: the challenge of channeling large and ill-behaved data streams. In neogeography, geographic information is derived from end-users, not from official bodies. The technology is meant to reach a substantial user community in the less-developed world through content provision and delivery via cell phone networks. Exploiting such neogeographic data requires, among other things, extracting the where and when from textual descriptions. This comes with intrinsic uncertainty in space and time, but also thematically, in terms of entity identification: which is the restaurant, bus stop, farm, market, or forest mentioned in this information source? Anyone with an MSc degree interested in doing PhD research on this topic is welcome to apply before October 10 (see the vacancy for details).

Monday, September 7th, 2009

On August 28, Ander de Keijzer and I organized another MUD workshop (Management of Uncertain Data). We had five presentations of research papers, two invited talks (Olivier Pivert on possibilistic databases and Christoph Koch on probabilistic databases), and a discussion on these two approaches to managing uncertain data. I regard this year’s MUD as very successful: we counted about 30 participants and the aforementioned discussion was lively. We plan to submit a workshop report to SIGMOD Record.

Monday, September 7th, 2009

At the August 28 meeting of the IFIP Working Group 2.6 on Databases in Lyon, I was elected a full member of the working group.
