XPath in JavaScript

There are 10 different result types, each represented by a constant on the XPathResult object. They are:

XPathResult.ANY_TYPE – Returns the type of data appropriate for the XPath expression
XPathResult.ANY_UNORDERED_NODE_TYPE – Returns a node set with only one node, which is any one of the matching nodes and not necessarily the first in the document
XPathResult.BOOLEAN_TYPE – Returns a Boolean value
XPathResult.FIRST_ORDERED_NODE_TYPE – Returns a node set with only one node, which is the first matching node in the document
XPathResult.NUMBER_TYPE – Returns a number value
XPathResult.ORDERED_NODE_ITERATOR_TYPE – Returns a node set of matching nodes in the order in which they appear in the document. This is the most commonly used result type.
XPathResult.ORDERED_NODE_SNAPSHOT_TYPE – Returns a node set snapshot, capturing the nodes outside of the document so that any further document modification doesn't affect the result set. The nodes in the result set are in the same order as they appear in the document.
XPathResult.STRING_TYPE – Returns a string value
XPathResult.UNORDERED_NODE_ITERATOR_TYPE – Returns a node set of matching nodes, although the order may not match the order of the nodes within the document
XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE – Returns a node set snapshot, capturing the nodes outside of the document so that any further document modification doesn't affect the node set. The nodes in the node set are not necessarily in the same order as they appear in the document.

The information returned from evaluate() depends wholly on the result type requested.

The simplest results return a single value (Boolean, Node, Number, and String) while the more complex ones return multiple nodes. When called, evaluate() returns an XPathResult object.

This object’s properties contain the result of the evaluation. There is a property for each type of simple result: booleanValue, singleNodeValue, numberValue, and stringValue. Additionally, there is a resultType property whose value maps to one of the XPathResult constants. This is useful in determining the type of result when using XPathResult.ANY_TYPE. If no node matches, evaluate() still returns an XPathResult object: singleNodeValue is null, iterateNext() returns null on the first call, and snapshotLength is 0.
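
As a minimal sketch of how resultType can guide which property to read after requesting XPathResult.ANY_TYPE, the helper below switches on the type. The name readResult is hypothetical and not part of the DOM API. The numeric fallback table mirrors the constant values defined by DOM Level 3 XPath and is only there so the function can run where no XPathResult global exists.

```javascript
// Use the browser's XPathResult constants when present; otherwise fall back
// to the numeric values from the DOM Level 3 XPath specification.
var TYPES = (typeof XPathResult !== "undefined") ? XPathResult : {
    NUMBER_TYPE: 1,
    STRING_TYPE: 2,
    BOOLEAN_TYPE: 3,
    ANY_UNORDERED_NODE_TYPE: 8,
    FIRST_ORDERED_NODE_TYPE: 9
};

// Hypothetical helper: returns the simple value held by an XPathResult,
// or undefined when the result holds multiple nodes (iterator or snapshot).
function readResult(result) {
    switch (result.resultType) {
        case TYPES.NUMBER_TYPE:  return result.numberValue;
        case TYPES.STRING_TYPE:  return result.stringValue;
        case TYPES.BOOLEAN_TYPE: return result.booleanValue;
        case TYPES.ANY_UNORDERED_NODE_TYPE:
        case TYPES.FIRST_ORDERED_NODE_TYPE:
            return result.singleNodeValue;
        default:
            return undefined;
    }
}
```

In a browser, passing the result of an expression such as "count(//div)" evaluated with XPathResult.ANY_TYPE should take the NUMBER_TYPE branch, since count() produces a number.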

To perform an XPath query, you’ll need to use an XPathEvaluator object.

You can either create a new instance or use a built-in one. Creating your own means instantiating XPathEvaluator (Opera only implemented this as of version 9.5):

var evaluator = new XPathEvaluator();

//get first div
var result = evaluator.evaluate("//div", document.documentElement, null,
                 XPathResult.FIRST_ORDERED_NODE_TYPE, null);
alert("First div ID is " + result.singleNodeValue.id);

In Firefox, Safari, Chrome, and Opera, all instances of Document also implement the XPathEvaluator interface, which means you can access document.evaluate() if you want to query the HTML page.

If you load an XML document via XMLHttpRequest or another mechanism, the evaluate() method is also available. For example:

//get first div
var result = document.evaluate("//div", document.documentElement, null,
                 XPathResult.FIRST_ORDERED_NODE_TYPE, null);

alert("First div ID is " + result.singleNodeValue.id);

Note that you cannot use document.evaluate() on nodes outside of that document; you can use an instance of XPathEvaluator on any document.
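
Because Document and XPathEvaluator expose the same evaluate() method, a lookup can be written once and handed whichever evaluator is available. The sketch below assumes a hypothetical helper named firstNode. The literal 9 is the value of XPathResult.FIRST_ORDERED_NODE_TYPE in DOM Level 3 XPath, spelled out only so the function does not depend on a browser global.

```javascript
// Hypothetical helper: returns the first node matching `expression`,
// or null if nothing matches. `evaluator` is anything with an evaluate()
// method, such as `document` or `new XPathEvaluator()`.
function firstNode(evaluator, expression, contextNode) {
    var FIRST_ORDERED_NODE_TYPE = 9; // same value as XPathResult.FIRST_ORDERED_NODE_TYPE
    var result = evaluator.evaluate(expression, contextNode, null,
                     FIRST_ORDERED_NODE_TYPE, null);
    return result ? result.singleNodeValue : null;
}
```

In a browser this might be called as firstNode(document, "//div", document.documentElement), or with new XPathEvaluator() as the first argument when the context node belongs to a separately loaded XML document.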

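There are two ways to return multiple nodes: via iterator or via snapshot. Iterator results are still tied to the document, so changes made to the document while you iterate affect the result: under DOM Level 3 XPath the iterator becomes invalid (its invalidIteratorState property turns true) and further calls to iterateNext() throw an error.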

Snapshot results, on the other hand, capture the matching nodes at that point in time and are not affected by further document modification. Both result types require you to iterate over the results. For iterator results, you’ll need to use the iterateNext() method, which will either return a node or null (this works for both ordered and unordered iterator results):

//get all divs - iterator style
var result = document.evaluate("//div", document.documentElement, null,
                 XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
if (result){
    var node = result.iterateNext();
    while(node) {
        alert(node.id);
        node = result.iterateNext();
    }
}
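
The iteration pattern can be wrapped in a small reusable function. This is only a sketch: eachNode is a hypothetical name, and it assumes the document is not modified while the iteration is in progress.

```javascript
// Hypothetical helper: calls `callback` once for every node in an
// iterator-type XPathResult and returns how many nodes were visited.
function eachNode(result, callback) {
    var count = 0;
    var node = result.iterateNext();
    while (node) {
        callback(node, count);
        count++;
        node = result.iterateNext(); // iterateNext() lives on the result object
    }
    return count;
}
```

With a result obtained using XPathResult.ORDERED_NODE_ITERATOR_TYPE, a call might look like eachNode(result, function (node) { alert(node.id); }).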

For snapshot results, you can use the snapshotLength property to determine how many results were returned and the snapshotItem() method to retrieve a result in a specific position. Example (this works for both ordered and unordered snapshot results):

//get all divs - snapshot style
var result = document.evaluate("//div", document.documentElement, null,
                 XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
if (result){
    for (var i=0, len=result.snapshotLength; i < len; i++) {
        alert(result.snapshotItem(i).id);
    }
}
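
Since a snapshot is a static list, it can be copied into a plain array and handled with ordinary array methods. A minimal sketch, with snapshotToArray as a hypothetical helper name:

```javascript
// Hypothetical helper: copies every node of a snapshot-type XPathResult
// into a regular array, preserving the snapshot's order.
function snapshotToArray(result) {
    var nodes = [];
    for (var i = 0, len = result.snapshotLength; i < len; i++) {
        nodes.push(result.snapshotItem(i));
    }
    return nodes;
}
```

The returned array is independent of the XPathResult object, so it can be sorted, filtered, or sliced without touching the original snapshot.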

In most cases, a snapshot result is preferable to an iterator result because the connection with the document has been severed; every call to iterateNext() re-executes the XPath query on the document and so is much slower.

In short, iterator results have the same performance implications as using HTMLCollection objects, which also query the document repeatedly.

If you’re simply using XPath to query an HTML document, then the namespace resolver argument for evaluate() will always be null; if you intend to use XPath on an XML document containing namespaces, then you’ll need to learn how to create and use namespace resolvers.

Every namespace URI is mapped to a specific prefix defined in the XML document with the exception of the default namespace, which doesn’t require a prefix. A namespace resolver performs the mapping between namespace prefix and namespace URI for the XPath engine. There are two ways to create namespace resolvers. The first is to create a function that accepts the namespace prefix as an argument and returns the appropriate URI. For example:

function resolver(prefix){
    switch(prefix){
        case "wrox": return "http://www.wrox.com/";
        case "ncz": return "http://www.nczonline.net/";
        default: return "http://www.yahoo.com/";
    }
}

This approach may work if you already have the prefixes and namespace URIs handy. When the default namespace is going to be resolved, an empty string is passed into the function.

The second approach is to create a namespace resolver using a node that contains namespace information, such as:

<books xmlns:wrox="http://www.wrox.com/" xmlns="http://www.amazon.com/">
    <wrox:book>Professional JavaScript</wrox:book>
</books>

The <books> element contains all of the namespace information for this XML snippet. You can pass a reference to this node into the XPathEvaluator object’s createNSResolver() method and get a namespace resolver automatically created:

var evaluator = new XPathEvaluator();
var resolver = evaluator.createNSResolver(xmldoc.documentElement);

This approach is more useful when the namespace information is embedded in the XML document, in which case it doesn’t make sense to duplicate that information and too tightly couple the JavaScript to the XML document.

Using either approach, you can easily evaluate XPath expressions on XML documents that have namespaces:

var evaluator = new XPathEvaluator();
var resolver = evaluator.createNSResolver(xmldoc.documentElement);
var result = evaluator.evaluate("wrox:book", xmldoc.documentElement,
                 resolver, XPathResult.FIRST_ORDERED_NODE_TYPE, null);
if (result){
    alert(result.singleNodeValue.firstChild.nodeValue);
}
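
The function-based approach plugs in the same way: the function is passed in place of the object returned by createNSResolver(). When there are many prefixes, a lookup table may be easier to maintain than a switch statement. The sketch below uses a hypothetical createResolver helper and returns null for unknown prefixes, which is an assumption about how unmapped prefixes should be reported.

```javascript
// Hypothetical helper: builds a resolver function from a prefix-to-URI table.
function createResolver(namespaces) {
    return function (prefix) {
        // hasOwnProperty guards against inherited keys such as "toString"
        if (Object.prototype.hasOwnProperty.call(namespaces, prefix)) {
            return namespaces[prefix];
        }
        return null; // assumption: unknown prefixes resolve to null
    };
}

var resolver = createResolver({
    wrox: "http://www.wrox.com/",
    ncz: "http://www.nczonline.net/"
});
```

The resulting function would then be supplied as the third argument to evaluate(), in the same position as the resolver created from a node.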

If you don’t provide a namespace resolver when an XPath query is run against a document that uses namespaces, then an error will occur.

Once again, this information is valid for Firefox, Safari, Chrome, and Opera; Internet Explorer does not natively support DOM Level 3 XPath. It does remain an option in other browsers, though, for super fast DOM querying.

Disclaimer: Any viewpoints and opinions expressed in this article are those of Nicholas C. Zakas and do not, in any way, reflect those of my employer, my colleagues, Wrox Publishing, O'Reilly Publishing, or anyone else. I speak only for myself, not for them.

Android Dev & Splinters

Friday, April 27, 2012
