It happens too infrequently with students for my taste, but Irma Veldman, a student of mine, got a paper accepted for the SUM conference about her research project.
Compression of Probabilistic XML documents
Irma Veldman, Ander de Keijzer, Maurice van Keulen
Database techniques to store, query and manipulate data that contains uncertainty receive increasing research interest. Such uncertain DBMSs (UDBMSs) can be classified according to their underlying data model: relational, XML, or RDF. We focus on uncertain XML DBMSs, taking as a representative example the Probabilistic XML model (PXML) of . The size of a PXML document is obviously a factor in performance. There are PXML-specific techniques to reduce the size, such as a push-down mechanism that produces equivalent but more compact PXML documents. It can only be applied, however, where possibilities are dependent. For normal XML documents there also exist several techniques for compressing a document. Since Probabilistic XML is (a special form of) normal XML, it might benefit from these methods even more. In this paper, we show that existing compression mechanisms can be combined with PXML-specific compression techniques. We also show that the best compression rates are obtained by combining a PXML-specific technique with a rather simple generic DAG-compression technique.
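To give a rough idea of what generic DAG compression of an XML document means, here is a minimal sketch in Python. It is not the implementation from the paper: the function names `dag_compress` and `tree_size` and the example document are made up for illustration, and the sketch assumes that two subtrees count as identical when their tag, attributes, text and children all match. Identical subtrees are stored once and referred to by id, so the tree becomes a directed acyclic graph.

```python
import xml.etree.ElementTree as ET

def dag_compress(root):
    """Share identical subtrees (illustrative sketch only).

    Returns (root_id, table), where table[i] is the structural key
    (tag, attributes, text, child ids) of the unique node with id i.
    """
    ids = {}    # structural key -> node id
    table = []  # node id -> structural key

    def visit(elem):
        # Children are visited first, so a node's key refers to child ids.
        key = (
            elem.tag,
            tuple(sorted(elem.attrib.items())),
            (elem.text or "").strip(),
            tuple(visit(child) for child in elem),
        )
        if key not in ids:          # first time this subtree is seen
            ids[key] = len(table)
            table.append(key)
        return ids[key]

    return visit(root), table

def tree_size(elem):
    """Number of element nodes in the uncompressed tree."""
    return 1 + sum(tree_size(child) for child in elem)

# Hypothetical example document with repeated subtrees.
doc = ET.fromstring(
    "<persons>"
    "<person><name>John</name><tel>1111</tel></person>"
    "<person><name>John</name><tel>1111</tel></person>"
    "<person><name>John</name><tel>2222</tel></person>"
    "</persons>"
)
root_id, table = dag_compress(doc)
print(tree_size(doc), len(table))  # 10 6
```

In this toy document the ten element nodes collapse to six unique ones, because the repeated `person` and `name` subtrees are shared. A probabilistic document that lists many alternative possibilities tends to contain this kind of repetition, which is presumably why such a simple technique can pay off.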
The paper will be presented at the Third International Conference on Scalable Uncertainty Management (SUM 2009), 28-30 September 2009, Washington, DC, USA.