RMIT INEX experiments: XML Retrieval using Lucy and eXist

This paper reports on the RMIT group's approach to XML retrieval while participating in INEX 2003. We indexed XML documents using Lucy, a compact and fast text search engine designed and written by the Search Engine Group at RMIT University. For each INEX topic, up to 1000 highly ranked documents were then loaded and indexed by eXist, an open source native XML database. A query translator converts the INEX topics into corresponding Lucy and eXist query expressions. These query expressions may represent traditional information retrieval tasks (unconstrained, CO topics), or may focus on retrieving and ranking specific document components (constrained, CAS topics). For both expression types, we used eXist to extract final answers (either full documents or document components) from those documents that Lucy judged highly relevant. Several extraction strategies were used, each influencing the ranking order of the final answers differently. The final INEX results show that our choice of translation method and extraction strategy leads to very effective XML retrieval for the CAS topics. For CO topics, we observed a system limitation, so that the same or similar choices had little or no impact on retrieval performance.
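
The two-stage pipeline described in the abstract (rank whole documents first, then extract components from the top-ranked ones) can be illustrated with a minimal sketch. This is not the Lucy or eXist API: both engines are replaced here by hypothetical stand-ins built on the Python standard library, the scoring is a toy term count, and the function names, the sample documents, and the `.//sec` path are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def score_document(xml_text, terms):
    """Toy relevance score: total occurrences of the query terms in the text."""
    root = ET.fromstring(xml_text)
    words = Counter(" ".join(root.itertext()).lower().split())
    return sum(words[t.lower()] for t in terms)


def rank_documents(docs, terms, limit=1000):
    """Stage 1 (stand-in for the text search engine): keep up to `limit`
    documents with a non-zero score, best first."""
    scored = [(score_document(text, terms), doc_id) for doc_id, text in docs.items()]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:limit]]


def extract_components(docs, ranked_ids, path, terms):
    """Stage 2 (stand-in for the XML database): return the elements matching
    `path` that mention any query term, in document rank order."""
    answers = []
    for doc_id in ranked_ids:
        root = ET.fromstring(docs[doc_id])
        for elem in root.iterfind(path):
            text = " ".join(elem.itertext()).lower().strip()
            if any(t.lower() in text for t in terms):
                answers.append((doc_id, elem.tag, text))
    return answers


# Hypothetical sample collection, not taken from the INEX test data.
docs = {
    "d1": "<article><sec>XML retrieval with ranking</sec><sec>Unrelated text</sec></article>",
    "d2": "<article><sec>Database systems</sec></article>",
    "d3": "<article><sec>retrieval retrieval of XML</sec></article>",
}
terms = ["xml", "retrieval"]

ranked = rank_documents(docs, terms)
answers = extract_components(docs, ranked, ".//sec", terms)
```

In this sketch the order of the final answers is inherited from the document ranking, which is only one possible extraction strategy; the abstract notes that several strategies were tried and that they affected the final ranking differently.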

Additional Info

Field Value
Source 2nd Workshop of the Initiative for the Evaluation of XML Retrieval (INEX'03)
Author Pehcevski, Jovan; Thom, James A.; Vercoustre, Anne-Marie
Maintainer CCSD
Last Updated May 8, 2026, 05:08 (UTC)
Created May 8, 2026, 05:08 (UTC)
Identifier inria-00090569
Language en
Rights https://about.hal.science/hal-authorisation-v1/
contributor Computer Science and Information Technology (CSIT) ; Royal Melbourne Institute of Technology University (RMIT University)
creator Pehcevski, Jovan
date 2003-12-08T00:00:00
metadata_modified 2024-11-17T00:00:00
set_spec type:COMM