XPath

XPath is a query language for locating nodes and fragments in XML trees. It is closely related to XPointer.

XPath provides a common syntax for functionality shared by XPointer and XSLT.

Location paths

A location path is a sequence of location steps. The location steps are separated by a slash (/).

Location step

A location step looks like

axis::node-test[predicate]

axis

The axis selects a set of nodes that are candidates for the result.

node-test

The node-test examines the candidates and filters them based on node type (element, character data etc.) and name (e.g. element name, attribute name).

predicate

The predicate filters the remaining nodes further: only nodes for which the expression in the square brackets holds are kept.
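
To make the three parts of a location step concrete, here is a minimal Python sketch that splits an unabbreviated step into axis, node-test and predicate. The function name split_step and the regular expression are illustrative assumptions, not part of XPath or of any library; the pattern handles at most one predicate and none of the abbreviations.

```python
import re

# Simplified grammar (an assumption for illustration): optional "axis::",
# a node test, and at most one optional "[predicate]".
STEP = re.compile(
    r"(?:(?P<axis>[a-z-]+)::)?(?P<test>[^\[\]]+)(?:\[(?P<pred>.*)\])?"
)

def split_step(step):
    """Return (axis, node_test, predicate) for one unabbreviated step."""
    m = STEP.fullmatch(step)
    if m is None:
        raise ValueError(f"not a simple location step: {step!r}")
    # child is the default axis when no axis is written
    return (m.group("axis") or "child", m.group("test"), m.group("pred"))

print(split_step("child::para[position()=5]"))  # ('child', 'para', 'position()=5')
print(split_step("attribute::x"))               # ('attribute', 'x', None)
print(split_step("x"))                          # ('child', 'x', None)
```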

Available axes

/

In an XPath expression / denotes the root of the document tree.

The slash is also used as a path separator to identify the child nodes of any given node. Consider the following document:

<a>
  <b><x>one</x></b>
  <c><x>two</x></c>
  <d><x>three</x></d>
</a>

Given this document, the following expression

/a/c/x

returns

<x>two</x>
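
The same lookup can be tried with Python's standard xml.etree.ElementTree module. This is only a sketch: ElementTree implements a limited subset of XPath, and its paths are evaluated relative to the root element, so /a/c/x is written as ./c/x here. The helper name select_texts is made up for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<a>
  <b><x>one</x></b>
  <c><x>two</x></c>
  <d><x>three</x></d>
</a>"""

def select_texts(xml_text, path):
    """Parse xml_text and return the text of every element matching path."""
    root = ET.fromstring(xml_text)  # root is the <a> element
    return [el.text for el in root.findall(path)]

print(select_texts(DOC, "./c/x"))  # ['two']
```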

child::

Returns the children of the context node.

Can be abbreviated by leaving child:: out entirely. That is, child is the default axis.

descendant::

All descendants, not only children, but also the children's children and so on.

parent::

The parent, or an empty node-set if the context node is the document root.

ancestor::

The parent as well as the parent's parent, as well as the parent's parent's parent ....
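
ElementTree elements do not store a reference to their parent, so the parent:: and ancestor:: axes can only be emulated. The following sketch builds a child-to-parent map and walks it upwards; the function name ancestors is an assumption for this example. Note that ElementTree has no separate document root node, so the walk stops at the outermost element.

```python
import xml.etree.ElementTree as ET

DOC = "<a><b><x>one</x></b><c><x>two</x></c></a>"

def ancestors(root, node):
    """Element ancestors of node, nearest first (parent, grandparent, ...)."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result

root = ET.fromstring(DOC)
x_two = root.find("./c/x")
print([el.tag for el in ancestors(root, x_two)])  # ['c', 'a']
print(ancestors(root, root))                      # []
```

The first entry of the returned list corresponds to parent::, the whole list to ancestor::.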

following-sibling::

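Siblings after the context node in document order (brothers and sisters to the right).

preceding-sibling::

Siblings before the context node in document order (brothers and sisters to the left).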
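
The two sibling axes can be emulated in plain Python as a sketch, assuming ElementTree as the tree model. The function name siblings is hypothetical; it returns both directions at once, each in document order.

```python
import xml.etree.ElementTree as ET

DOC = "<a><b/><c/><d/><e/></a>"

def siblings(root, node):
    """Return (preceding, following) element siblings of node."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    parent = parent_of.get(node)
    if parent is None:  # the outermost element has no siblings
        return [], []
    children = list(parent)
    i = children.index(node)
    return children[:i], children[i + 1:]

root = ET.fromstring(DOC)
before, after = siblings(root, root.find("./c"))
print([el.tag for el in before])  # ['b']
print([el.tag for el in after])   # ['d', 'e']
```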

following::

All following nodes in the document minus descendants.

preceding::

All previous nodes in the document minus ancestors.
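
The difference between following:: and preceding:: is easiest to see on a small tree. This is a sketch that emulates both axes on ElementTree elements only (text, comment and attribute nodes are not modelled); the function names are assumptions. Both functions return their result in document order, whereas XPath treats preceding:: as a reverse axis.

```python
import xml.etree.ElementTree as ET

DOC = "<a><b><x/></b><c><y/><z/></c><d><w/></d></a>"

def following(root, node):
    """Elements after node in document order, without its descendants."""
    order = list(root.iter())      # all elements in document order
    i = order.index(node)
    size = len(list(node.iter()))  # node itself plus its descendants
    return order[i + size:]

def preceding(root, node):
    """Elements before node in document order, without its ancestors."""
    order = list(root.iter())
    i = order.index(node)
    parent_of = {child: parent for parent in root.iter() for child in parent}
    ancestor_ids = set()
    cur = node
    while cur in parent_of:
        cur = parent_of[cur]
        ancestor_ids.add(id(cur))
    return [el for el in order[:i] if id(el) not in ancestor_ids]

root = ET.fromstring(DOC)
c = root.find("./c")
print([el.tag for el in following(root, c)])  # ['d', 'w']
print([el.tag for el in preceding(root, c)])  # ['b', 'x']
```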

attribute::

<a>
  <b x="1">one</b>
  <c x="2">two</c>
</a>

Given the preceding document, the following XPath expression will return 2:

/a/c/attribute::x

Given the same document, attribute:: can also be used in a predicate:

/a/*[attribute::x=2]

It will return:

<c x="2">two</c>

attribute:: can be abbreviated with the @ symbol.
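
A short sketch of both attribute lookups with Python's ElementTree, under the assumption that its XPath subset is sufficient: it cannot return attribute nodes directly, so the first case reads the attribute with get(), and it only understands the @ abbreviation inside predicates.

```python
import xml.etree.ElementTree as ET

DOC = """<a>
  <b x="1">one</b>
  <c x="2">two</c>
</a>"""

root = ET.fromstring(DOC)

# Counterpart of /a/c/attribute::x: find the element, then read the attribute.
x_of_c = root.find("./c").get("x")

# Counterpart of /a/*[attribute::x=2], written with the @ abbreviation.
# ElementTree compares attribute values as strings, hence the quotes.
matches = [el.tag for el in root.findall("./*[@x='2']")]

print(x_of_c)   # 2
print(matches)  # ['c']
```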

namespace::

The namespace nodes of the context node.

self::

The context node itself.

descendant-or-self::

Descendants plus the context node itself.

Consider the following document:

<a>
  <x>one</x>
  <b><x>two</x></b>
  <c><d><x>three</x></d></c>
  <e><f><g><x>four</x></g></f></e>
  <e><f><g><x><!-- some comment --><z>five</z></x></g></f></e>
</a>

Given the document above, the following XPath expression

/a/descendant-or-self::x

returns

<x>one</x>
<x>two</x>
<x>three</x>
<x>four</x>
<x>
  <!-- some comment -->
  <z>five</z>
</x>
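
Python's ElementTree does not accept the spelled-out descendant-or-self:: axis, but for this document its .//x path selects the same elements, because the context element a is not itself named x. A minimal sketch, assuming the default parser (which discards comments):

```python
import xml.etree.ElementTree as ET

DOC = """<a>
  <x>one</x>
  <b><x>two</x></b>
  <c><d><x>three</x></d></c>
  <e><f><g><x>four</x></g></f></e>
  <e><f><g><x><!-- some comment --><z>five</z></x></g></f></e>
</a>"""

root = ET.fromstring(DOC)
xs = root.findall(".//x")  # every x below <a>, at any depth

print(len(xs))                   # 5
print([x.text for x in xs[:4]])  # ['one', 'two', 'three', 'four']
print(xs[4].find("z").text)      # five
```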

ancestor-or-self::

Ancestors plus the context node itself.

Node Tests

text(), comment(), processing-instruction() and node() test the node's type.

text()

Text nodes.

comment()

Comment nodes.

processing-instruction()

Processing instruction nodes.

node()

Nodes of any type. On the child axis this excludes attributes and namespace declarations, because they are not children of their element.
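
To see which node types these tests distinguish, the following sketch walks the children of an element with Python's xml.dom.minidom and labels each child with the node test that would match it. The document and the names KIND and classify_children are assumptions made up for this illustration.

```python
from xml.dom import minidom, Node

DOC = "<a>text<!-- note --><?target data?><b/></a>"

# Maps DOM node types to the XPath node test that matches them.
KIND = {
    Node.TEXT_NODE: "text()",
    Node.COMMENT_NODE: "comment()",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction()",
    Node.ELEMENT_NODE: "element (matched by name or *)",
}

def classify_children(xml_text):
    """Label every child of the document element with its node test."""
    root = minidom.parseString(xml_text).documentElement
    return [KIND[child.nodeType] for child in root.childNodes]

for label in classify_children(DOC):
    print(label)
```

All four children would also be matched by node().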

name

Tests for the name.

*

The star is a wildcard that matches any element, but only for exactly one location step (one level of the tree). On the attribute axis it matches any attribute instead.

Consider the following XML document:

<a>
  <x>zero</x>
  <b><x>one</x></b>
  <c><y>two</y></c>
  <d><x>three</x></d>
  <e><f><x>four</x></f></e>
</a>

Then this XPath expression:

/a/*/x

returns

<x>one</x>
<x>three</x>

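Note: neither zero nor four is returned although they are in an x tag. The x containing zero is a direct child of a, and the x containing four is three levels below a, whereas /a/*/x requires exactly one element between a and x.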
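
The wildcard behaves the same way in Python's ElementTree, which makes the example easy to check. A small sketch, with the path written relative to the root element a:

```python
import xml.etree.ElementTree as ET

DOC = """<a>
  <x>zero</x>
  <b><x>one</x></b>
  <c><y>two</y></c>
  <d><x>three</x></d>
  <e><f><x>four</x></f></e>
</a>"""

root = ET.fromstring(DOC)

# ./*/x: exactly one element of any name between <a> and <x>
texts = [x.text for x in root.findall("./*/x")]
print(texts)  # ['one', 'three']
```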

Functions

last

position

A document full of things:

<things>
  <numbers><item>1</item><item>59</item></numbers>
  <animals><item>bird</item><item>cat</item><item>dog</item></animals>
</things>

Now, let's find the 2nd animal:

//animals/item[position()=2]
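
Python's ElementTree does not implement the position() function, but it does accept the abbreviated forms [2] and [last()], which is enough to sketch the same lookup:

```python
import xml.etree.ElementTree as ET

DOC = """<things>
  <numbers><item>1</item><item>59</item></numbers>
  <animals><item>bird</item><item>cat</item><item>dog</item></animals>
</things>"""

root = ET.fromstring(DOC)

# item[2] is the abbreviated form of item[position()=2]
second = root.find(".//animals/item[2]").text
# item[last()] selects the last item child
last = root.find(".//animals/item[last()]").text

print(second)  # cat
print(last)    # dog
```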

count

id

local-name

namespace-uri

name

string

concat

starts-with

contains

substring-before

substring-after

substring

See Substring in XPath.

string-length

normalize-space

translate

boolean

not

true

Returns true

false

Returns false

lang

number

sum

floor

ceiling

round

Abbreviations

nothing

Writing no axis at all is an abbreviation for child::.

@

The @ symbol is an abbreviation for attribute::. The following two expressions are equivalent:

/a/c/@x

/a/c/attribute::x

//

// is an abbreviation for /descendant-or-self::node()/.

See here for an example.

.

. is an abbreviation for self::node().

Location paths starting with a slash (/) are evaluated starting at the root.

..

.. is an abbreviation for parent::node().

Examples

elem
matches any element named elem.
*
matches any element.
elem_1|elem_2
matches any elem_1 element and any elem_2 element.
elem_parent/elem_child
matches any elem_child element whose parent is an elem_parent element.
/
matches the document root.
elem_ancestor//elem
matches any element named elem with an ancestor element named elem_ancestor.
text()
matches any text node.
processing-instruction()
matches any processing instruction.
node()
matches any node that is not an attribute node or the root node.
id("foo")
matches the element with unique ID foo.
elem[1]
matches any element named elem that is the first elem child element of its parent.
*[position()=1 and self::para]
matches any para element that is the first child element of its parent.
para[last()=1]
matches any para element that is the only para child element of its parent.
elem_parent/elem_child[position()>1]
matches any element named elem_child that has a parent element named elem_parent and that is not the first elem_child child of its parent.
elem[position() mod 2 = 1]
matches any element named elem that is an odd-numbered elem child of its parent.
div[@attr="appendix"]//p
matches any p element with a div ancestor element that has an attribute named attr with value appendix.
@attr
matches any attribute named attr. It does not match elements that have an attribute named attr.
@*
matches any attribute.
id("foo")/child::para[position()=5]
matches the fifth para child of the element with unique ID foo.
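
Two of these patterns can be tried directly with Python's ElementTree, whose XPath subset covers attribute predicates, // and numeric positions (but not, for example, the | operator or position()). The sample document below is an assumption made up for this sketch.

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
  <div attr="appendix"><p>first</p><section><p>second</p></section></div>
  <div attr="chapter"><p>third</p></div>
</doc>"""

root = ET.fromstring(DOC)

# div[@attr="appendix"]//p: p elements at any depth below a matching div
in_appendix = [p.text for p in root.findall(".//div[@attr='appendix']//p")]

# p[1]: p elements that are the first p child of their parent div
first_p = [p.text for p in root.findall(".//div/p[1]")]

print(in_appendix)  # ['first', 'second']
print(first_p)      # ['first', 'third']
```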

Links

XPath visualizer written in JavaScript.

Overview of XSLT and XPath.


	René Nyffenegger's collection of things on the web