Pathfinder/MonetDB : a Relational Runtime for XQuery

2005

Rittinger, Jan

This master's thesis proposes the use of a relational database as a special query processor for the XML query language XQuery. We chose MonetDB, an extensible RDBMS, as our relational back-end. Its low-level interpreter language MIL, which combines a relational algebra with a procedural language, became our target language for the XQuery compilation. The thesis first sketches the concepts of the two languages as well as the general ideas behind the MonetDB DBMS and the Pathfinder compiler. This overview is followed by a description of the storage structures for XML documents and XQuery item sequences.

The mapping from normalized XQuery Core to relational algebra, expressed as inference rules, formalizes the compilation scheme and serves as a basis for explaining the concepts of the transformation. From the inference rules we also derive the mapping from normalized XQuery Core to our target language MIL. Several optimizations improve the performance of the semantically correct but sometimes inefficient translation. Among others, an extension of the staircase join algorithm, which efficiently evaluates XPath location steps, enables us to exploit its techniques in the domain of XQuery. Another important optimization is join recognition, which detects relational joins based on normalized XQuery Core patterns and emits appropriate join plans.

Experiments not only justify the optimizations but also demonstrate the outstanding scaling of our approach. Furthermore, an extensive performance comparison with other XQuery processors (using the XMark benchmark) underlines the effectiveness of the approach. Finally, a conclusion sums up the ideas developed in this thesis and provides an outlook on future topics in the course of the Pathfinder project.

