Thursday, April 26, 2012

Choose between XPath and jQuery




Some useful reference notes from the web:
XML is a well-supported Internet standard for encoding structured data in a way that can be easily decoded by practically any programming language and even read or written by humans using standard text editors. 

Many applications, especially modern standards-compliant Web browsers, can deal directly with XML data.

Frequently used acronyms

  • Ajax: Asynchronous JavaScript + XML
  • API: Application Programming Interface
  • DOM: Document Object Model
  • W3C: World Wide Web Consortium
  • XHTML: Extensible Hypertext Markup Language
  • XML: Extensible Markup Language
  • XSLT: Extensible Stylesheet Language Transformations
XPath (the XML Path Language) is a powerful query language for selecting nodes in an XML document. 
Version 1.0 of the XPath standard is implemented in a wide range of languages such as Java™, C#, and JavaScript.

jQuery is a de-facto standard cross-browser JavaScript library for selecting and manipulating nodes in an XHTML document (and in XML documents loaded through Ajax). 



If XPath is a W3C standard, and implementations exist in JavaScript, why bother using jQuery instead?

XPath is a generalized XML standard, while jQuery is a lightweight library designed to deal with the intricacies of cross-browser compatibility, so you don't have to worry about which browser your users are running.

It's flexible enough to work within the browser's DOM using standard JavaScript idioms, and it provides additional features that make Web application development much less painful, such as powerful Ajax and animation support.

You should, however, always use the right tool for the job at hand; knowing more about these two tools will definitely help you pick the right technology for your next project.




Listing 1. A sample XML document (book.xml)


<?xml version="1.0" encoding="utf-8"?>
<catalog>
    <book format="trade">
        <name>Jennifer Government</name>
        <author>Max Barry</author>
        <price curr="CAD">15.00</price>
        <price curr="USD">12.00</price>
    </book>

    <book format="textbook">
        <name>Unity Game Development Essentials</name>
        <author>Will Goldstone</author>
        <price curr="CAD">52.00</price>
        <price curr="USD">45.00</price>
    </book>

    <book format="textbook">
        <name>UNIX Visual QuickPro</name>
        <author>Chris Herborth</author>
        <price curr="CAD">15.00</price>
        <price curr="USD">10.00</price>
    </book>
</catalog>


Note that I have no affiliation with the authors and/or publishers, except for the obvious one there. 
The prices are entirely made up and you should check your favorite book store for actual pricing.

XPath assumptions
For the XPath code in this article, you're going to make these assumptions:
  • You've loaded the book.xml file into a format that your XPath implementation can use.
  • You're starting your searches with an object representing the root of the document. That is, the object that has the <catalog> element as its child. You'll call this root because it's the root of the XML document hierarchy.
Because there are so many XPath implementations on so many different platforms, you'll focus on the XPath statements themselves and use a pseudocode similar to JavaScript to show them in context; check the class library of your favorite development platform for information about loading XML documents and the specific XML node objects you have available.

A sample of basic XPath usage:

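var findPattern = "//table[2]//table[2]//table[2]//tr/td[4]/font/a";

// Evaluate the expression against the whole document and
// take a static, ordered snapshot of the matching nodes
var resultLinks = document.evaluate( findPattern, document, null,
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null );

var i = 0;
var res;

// snapshotItem() returns null once the index runs past the last match
while ( ( res = resultLinks.snapshotItem( i ) ) != null ) {
  // do something to the link
  i++;
}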
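
The snapshot loop above can be wrapped in a small helper that returns a plain array, which is roughly the role the pseudocode find() plays in the XPath listings of this article. This is only a sketch: the name xpathFind and its parameters are made up for illustration, and it assumes a DOM document object that implements the standard evaluate() method.

```javascript
// Hypothetical helper (not part of any library): run an XPath expression
// and return the matching nodes as an array.
// Assumes `doc` implements the DOM XPath method evaluate().
function xpathFind( doc, expression, contextNode ) {
    // Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
    var ORDERED_NODE_SNAPSHOT_TYPE = 7;

    var snapshot = doc.evaluate( expression, contextNode || doc, null,
        ORDERED_NODE_SNAPSHOT_TYPE, null );

    // Copy the snapshot into an ordinary array
    var nodes = [];
    for ( var i = 0; i < snapshot.snapshotLength; i++ ) {
        nodes.push( snapshot.snapshotItem( i ) );
    }
    return nodes;
}
```

In a browser, a call such as xpathFind( document, findPattern ) should return the same links the loop above visits, collected into an array.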

jQuery assumptions
The jQuery code in this article makes these assumptions:
  • You're using jQuery version 1.4.0 (the latest release when this was written).
  • You've loaded the book.xml file through the jQuery.get() or jQuery.post() method and have stored the resulting XML document in a variable named root (to be the same as your XPath examples).


Listing 2. Loading the XML sample with jQuery

 
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
               "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Book Catalog</title>
<script type="text/javascript"
src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.0/jquery.min.js"></script>
<script type="text/javascript">// <![CDATA[
var root = null;

$(document).ready( function(){
    $.get( "http://localhost/~chrish/book.xml", 
        function( data ) {
            root = data;

            $("p#status").text( "Loaded." );
        } );
} );
// ]]></script>
</head>

<body>
<p id="status">
Loading book.xml...
</p>
</body>
</html>


In the $(document).ready() function, you use the jQuery get() method to load the sample XML file from the local Web server, store the resulting document object in the root variable, and set the text of the paragraph with the status ID to indicate that the XML is done loading.

Selecting nodes

The fundamental purpose of both XPath and jQuery is to select nodes from a document.

Once you select a node (or a collection of nodes), you can find the data you're looking for and manipulate the document when you need to.

XPath is designed to return exactly the nodes you've asked for; it's generally very specific. 

jQuery, on the other hand, makes it very easy to operate on large collections of nodes, so sometimes you'll have to be careful to narrow down the matches before you start to work through the nodes.

Selecting a node by name

When you search for a specific node, you often know its name, or the name of its parent element.
To find a specific element, you use its name as in Listing 3.

Listing 3. Selecting nodes by name


/* Find all <book> elements through XPath: */
var result = root.find( "//book" );

/* Find all <book> elements through jQuery: */
var result = $(root).find( "book" );


The XPath statement to select all of the <book> elements (//book) starts with two forward slashes (//) to specify that matching nodes at any depth in the document (root in the example) are to be found.

The jQuery find() method already searches all descendants, so you don't need to include anything else. In both cases, the result will be all three <book> elements from Listing 1.

You can often narrow the search results by specifying a path of elements; the results will be matching nodes from the end of the path (see Listing 4).

Listing 4. Selecting nodes by path—these don't behave the same

 
/* Be more specific (XPath): */
var result = root.find( "/catalog//book" );

/* Be more specific (jQuery): */
var result = $(root).find( "catalog book" );


Starting from the document root (/), this XPath statement looks for the top-level <catalog> element (the first one, and in a well-formed XML document the only root element) and then returns all of the <book> elements inside it.

The jQuery statement behaves a little differently: it returns all <book> elements from all <catalog> elements, wherever they appear in the document (see Listing 5).

With the example book.xml file, the result is the same set of nodes, but what if you wanted to get all of the <author> elements from the <book> elements? 
You'd start the XPath expression with two forward slashes (//) like you did in Listing 3.

Listing 5. Pulling out embedded nodes by path—these examples behave the same


/* Get all authors from all books (XPath): */
var result = root.find( "//book//author" );

/* Get all authors from all books (jQuery): */
var result = $(root).find( "book author" );
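
The pairing in Listing 5 suggests a simple rule of thumb: a space in a jQuery selector corresponds to // in XPath. A minimal sketch of that rule follows. The function name descendantSelectorToXPath is hypothetical, and it only handles plain element names (or *) separated by spaces; it is not a general selector translator.

```javascript
// Hypothetical helper: convert a jQuery descendant selector made only of
// element names (or *) separated by spaces into an XPath expression.
function descendantSelectorToXPath( selector ) {
    var steps = selector.trim().split( /\s+/ );
    for ( var i = 0; i < steps.length; i++ ) {
        // Reject anything beyond plain names; pseudo-classes, attributes and
        // other combinators are deliberately out of scope for this sketch.
        if ( !/^(\*|[A-Za-z_][A-Za-z0-9_.-]*)$/.test( steps[ i ] ) ) {
            throw new Error( "Unsupported selector step: " + steps[ i ] );
        }
    }
    // A leading // and a // between steps both mean "at any depth".
    return "//" + steps.join( "//" );
}

descendantSelectorToXPath( "book author" );   // "//book//author"
descendantSelectorToXPath( "catalog book" );  // "//catalog//book"
```

Note that "catalog book" maps to //catalog//book rather than the /catalog//book used in Listing 4, which is the behavioral difference between the two statements in that listing.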


To make jQuery return the <book> elements from the first <catalog>, like the XPath sample in Listing 4, you have to instruct it to use the first <catalog> it finds (see Listing 6).

Listing 6. Matching the books in the first catalog—these examples behave the same


/* All books from the first catalog (XPath): */
var result = root.find( "/catalog//book" );

/* All books from the first catalog (jQuery): */
var result = $(root).find( "catalog:first book" );


Finding the last occurrence of an element, such as the last list item in a bulleted list, or the last option in a selection list, is also a common operation. To properly append something to the end of the list, you'll need to know the location of that end (see Listing 7).

Listing 7. Finding the last book in the catalog


/* The last book from the first catalog (XPath): */
var result = root.find( "/catalog/book[last()]" );

/* The last book from the first catalog (jQuery): */
var result = $(root).find( "catalog:first book:last" );


In both cases, you get the last <book> element from the first <catalog> element, which is what you were looking for. In the XPath example, the last() function returns the index of the last matched element, which you use in square brackets.


Sometimes you don't know the name of the element you're looking for, or you need to find an element that might be inside of several different elements. In both XPath and jQuery, you can use an asterisk (*) to match any element (see Listing 8).

Listing 8. The any element


/* Find all authors in all elements inside of <catalog> (XPath): */
var result = root.find( "/catalog//*//author" );

/* Find all authors in all elements inside of <catalog> (jQuery): */
var result = $(root).find( "catalog:first * author" );


Note that I've used :first in the jQuery sample to make it work exactly like the XPath version.

Similar elements often have unique attributes, such as the id attribute used in XHTML elements to give them a unique reference ID (see Listing 9). Sometimes you don't care as much about the specific element as you do about it having an attribute with a specific value.

Listing 9. Find those pesky textbooks


/* Find all books that are textbooks (XPath): */
var result = root.find( "//book[@format='textbook']" );

/* Find all books that are textbooks (jQuery): */
var result = $(root).find( "book[format='textbook']" );


Both examples will return all <book> elements that have a format attribute set to textbook (there are two in the book.xml file from Listing 1). 

XPath's syntax uses an at sign (@) to match attributes (jQuery just encloses them in square brackets), and you need to include two forward slashes (//) to match all <book> elements, but the two queries are very similar and straightforward.
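
The correspondence between the two attribute syntaxes is mechanical enough to sketch as a string conversion. The function below is hypothetical and intentionally narrow: it accepts only the form name[attribute='value'] used in Listing 9 and rejects everything else.

```javascript
// Hypothetical helper: convert a selector of the form name[attr='value']
// (as in Listing 9) into the equivalent XPath expression.
function attributeSelectorToXPath( selector ) {
    var pattern = /^(\*|[A-Za-z_][\w.-]*)\[([A-Za-z_][\w.-]*)=(['"])([^'"]*)\3\]$/;
    var match = pattern.exec( selector );
    if ( !match ) {
        throw new Error( "Unsupported selector: " + selector );
    }
    var quote = match[ 3 ];   // reuse whichever quote character was given
    // XPath needs the leading // and the @ before the attribute name
    return "//" + match[ 1 ] + "[@" + match[ 2 ] + "=" + quote + match[ 4 ] + quote + "]";
}

attributeSelectorToXPath( "book[format='textbook']" );  // "//book[@format='textbook']"
```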

jQuery includes a couple of shortcuts for the two most commonly matched-against attributes (id and class) in XHTML. 

In XPath, you'll have to write them out explicitly (see Listing 10).

Listing 10. Matching XHTML based on the id and class attributes


/* Find the "status" <p>, then the highlighted elements (XPath) */
var result1 = xhtml_root.find( "//p[@id='status']" );
var result2 = xhtml_root.find( "//*[@class='highlight']" );

/* Find the "status" <p>, then the highlighted elements (jQuery) */
var result1 = $( "p#status" );
var result2 = $( ".highlight" );


Assuming that your XHTML document is valid (and it is, right?), the ID matching queries will only return one element, because IDs must be unique in a valid XML document.
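
The class test in Listing 10 comes with a caveat: the XPath predicate [@class='highlight'] compares the whole attribute value, so it misses an element whose class attribute is "note highlight", while the jQuery selector .highlight matches one class among several. A common XPath 1.0 idiom pads the attribute with spaces and looks for the padded class name. The helper below (hypothetical name classTokenXPath) just builds that expression as a string.

```javascript
// Hypothetical helper: build an XPath 1.0 expression that matches elements
// having `className` as one of the space-separated values in @class.
function classTokenXPath( className ) {
    // Keep the sketch safe: only simple class names, no spaces or quotes
    if ( !/^[A-Za-z_][A-Za-z0-9_-]*$/.test( className ) ) {
        throw new Error( "Unsupported class name: " + className );
    }
    return "//*[contains(concat(' ', normalize-space(@class), ' '), ' " +
        className + " ')]";
}

classTokenXPath( "highlight" );
// "//*[contains(concat(' ', normalize-space(@class), ' '), ' highlight ')]"
```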

If you're a fan of Cascading Style Sheets (CSS), you might notice that the jQuery selectors are pretty much the same as CSS selectors. This is handy, because you only need to remember one standard for finding the elements you want through jQuery and for styling them with CSS.


Both XPath and jQuery let you combine more than one selector to retrieve every node that matches any of the queries (that is, you'll get the union of the results). In XPath, you'll combine statements with the vertical bar (|) character, while in jQuery you'll use a comma (,) (see Listing 11).

Listing 11. Finding the results of multiple selectors


/* Find all book names and all authors (XPath) */
var result = root.find( "//name|//author" );

/* Find all book names and all authors (jQuery) */
var result = $(root).find( "name,author" );


In both cases, the result will be a list of all <name> and <author> elements from anywhere in the document. 
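
When the list of element names is only known at run time, both union queries can be assembled from the same array. This is a small sketch under the assumption that the names are plain element names; unionQueries is a made-up helper, not a jQuery or XPath API.

```javascript
// Hypothetical helper: build the union query for both notations from a list
// of plain element names.
function unionQueries( names ) {
    var xpathParts = [];
    for ( var i = 0; i < names.length; i++ ) {
        if ( !/^[A-Za-z_][A-Za-z0-9_.-]*$/.test( names[ i ] ) ) {
            throw new Error( "Unsupported element name: " + names[ i ] );
        }
        xpathParts.push( "//" + names[ i ] );
    }
    return {
        xpath: xpathParts.join( "|" ),   // XPath joins alternatives with |
        jquery: names.join( "," )        // jQuery joins alternatives with ,
    };
}

var queries = unionQueries( [ "name", "author" ] );
// queries.xpath  is "//name|//author"
// queries.jquery is "name,author"
```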


In addition to selecting nodes, you often need to traverse the structure of a document, either to find related data or to perform complex manipulations. XPath and jQuery have you covered when you need to get around in your documents.
Given what you've learned previously, you can use these traversal methods to help find ancestors (that is, elements that contain the current element) or descendants (elements contained by the current element).

For example, Listing 12 allows you to find the <catalog> that contains the last <book> you've already found.

Listing 12. What catalog lists the last book?


/* Find the catalog for the last book you know about (XPath) */
var result = root.find( "//book[last()]/ancestor::catalog" );

/* Find the catalog for the last book you know about (jQuery) */
var result = $(root).find( "book:last" ).closest( "catalog" );


Figure 2 shows the result.

Figure 2. The catalog ancestor of the last book
[Screen capture: the <catalog> tag highlighted as the ancestor of the last book in book.xml]


One thing to note is that the jQuery closest() method works more like XPath's ancestor-or-self; it will include the current node if it matches. In this case, it won't, but it's something to keep in mind if you can nest elements with the same name, or if you're matching on attributes.
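
To make the "or-self" behavior concrete, here is a toy model of the upward walk on plain JavaScript objects. It is not jQuery's implementation: the node shape ({ name, parent }) and the function closestByName are assumptions made for this sketch, and it returns null where jQuery would return an empty set.

```javascript
// Plain-object model of a node: { name: "...", parent: node or null }.
function closestByName( node, name ) {
    // Start with the node itself (the "or-self" part), then walk upward.
    for ( var current = node; current; current = current.parent ) {
        if ( current.name === name ) {
            return current;   // stop at the nearest match
        }
    }
    return null;
}

var catalog = { name: "catalog", parent: null };
var book = { name: "book", parent: catalog };
var author = { name: "author", parent: book };

closestByName( author, "catalog" ) === catalog;  // true
closestByName( book, "book" ) === book;          // true: the start node itself matches
closestByName( author, "price" );                // null: no such node on the way up
```

The second call is the case to watch for: a plain ancestor search starting at book would skip book itself.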

If you need to go the other way and find elements that might be deeply nested from the one you have, you can do that too (see Listing 13).

Listing 13. Find the prices listed in the catalog


/* Find the prices of everything in the catalog. (XPath) */
var result = root.find( "//catalog/descendant::price" );

/* Find the prices of everything in the catalog. (jQuery) */
var result = $(root).find( "catalog price" );


Like the ancestor axis in XPath, the descendant axis has a descendant-or-self variant for those special cases where the selected node itself might match what you're looking for (see Figure 3).

Figure 3. All the prices, selected
[Screen capture: the <price> tags highlighted for the books listed in book.xml]
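
The difference between descendant and descendant-or-self only shows up when the starting node can match the test, for example with nested elements of the same name. The following toy model on plain objects illustrates it; the node shape ({ name, children }), the function collectByName, and the nested "section" elements are all hypothetical and not taken from book.xml.

```javascript
// Plain-object model of a node: { name: "...", children: [ ... ] }.
// includeSelf = false models the descendant axis,
// includeSelf = true models descendant-or-self.
function collectByName( node, name, includeSelf ) {
    var found = [];
    if ( includeSelf && node.name === name ) {
        found.push( node );
    }
    var children = node.children || [];
    for ( var i = 0; i < children.length; i++ ) {
        // Below the starting node every node is a candidate,
        // so the recursion always includes the child itself.
        found = found.concat( collectByName( children[ i ], name, true ) );
    }
    return found;
}

var inner = { name: "section", children: [] };
var outer = { name: "section", children: [ inner ] };

collectByName( outer, "section", false ).length;  // 1: only inner
collectByName( outer, "section", true ).length;   // 2: outer and inner
```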




