As business and enterprises generate and exchange XML data more often, there is an increasing need for efficient processing of queries on XML data. Searching for the occurrences of a tree pattern query in an XML database is a core operation in XML query processing. Prior works demonstrate that holistic twig pattern matching algorithm is an efficient technique to answer an XML tree pattern with parent-child (P-C) and ancestor-descendant (A-D) relationships, as it can effectively control the size of intermediate results during query processing.
However, XML query languages (e.g. XPath, XQuery) define more axes and functions such as negation function, order-based axis and wildcards.Here we research a large set of XML tree pattern, called extended XML tree pattern , which may include P-C, A-D relationships, negation functions, wildcards and order restriction. We establish a theoretical framework about " matching cross " which demonstrates the intrinsic reason in the proof of optimality on holistic algorithms. Based on our theorems, we propose a set of novel algorithms to efficiently process three categories of extended XML tree patterns. A set of experimental results on both real-life and synthetic data sets demonstrate the effectiveness and efficiency of our proposed theories and algorithms
We build a theoretical framework on optimal processing of XML tree pattern queries. We show that "matching cross" is the key reason to result in the sub-optimality of holistic algorithms. Intuitively, matching cross describes a dilemma where holistic algorithms have to decide whether to output useless intermediate result or to miss useful results. The fact that TwigStack is optimal for queries with only A-D relationships can be explained that no matching cross can be found for any XML document with respect to queries with A-D edges.
We classify matching cross to bound and unbounded matching cross (BMC and UMC). We develop theorems to show that only part of UMC (i.e. UMC with mediator) can force holistic algorithms to potentially output useless intermediate results. Based on the theoretical analysis, we develop a series of holistic algorithms TreeMatch to achieve a large optimal query class for Q/,//,*. Our main technique is to use a concise encoding to present matching results, which leads to the reduction of useless intermediate results.
We conducted an extensive set of experiment on synthetic and real data set for performance comparison. We compared TreeMatch with previous four holistic XML tree pattern matching algorithms. The experimental results show that our algorithm can correctly process extended XML tree patterns, achieving performance speedup for tested queries and data sets, even in their restricted focus. The improvement mainly owes to the reduction of the size of intermediate results
- Operating System :Windows95/98/2000/XP
- Application Server : Tomcat5.0/6.X
- Front End : HTML, Java, Jsp
- Server side Script : Java Server Pages.
- Database Connectivity : JDBC.
- Processor - Pentium -III
- Speed - 1.1 Ghz
- RAM - 256 MB(min)
- Hard Disk - 20 GB
- Floppy Drive - 1.44 MB
- Key Board - Standard Windows Keyboard
- Mouse - Two or Three Button Mouse
- Monitor - SVGA
Related Projects : SMS Based Mobile Banking with Security ,SMS Based Student Intimation ,Social Networking With Advertisement ,Software Project Management ,SPAF: Stateless FSA-based Packet Filters ,Staying Connected in a Mobile Healthcare System ,Stealthy Attacks In Wireless Ad Hoc Networks ,Steganography In Audio Files ,Tanrox Work Force ,Text Encryption And Decryption ,Twitter Client For Android Based Smart Phone ,VAS for Hand Held Device ,Virtual Office Management ,Virtual Router Using Destination-Sequenced Distance Vector ,Voyage Management ,Web Enabled Automated Manufacturing System ,Web Alerts ,Web Blossom Bazzar ,Wireless Health Care System ,World Recipe ,Zeppelin Reservation