Supported XPath Expressions
Hpricot gets its XPath support from jQuery, so much of what's here comes straight from jQuery's XPath docs.
Here are some samples:
#!ruby
require 'hpricot'
require 'open-uri'
doc = Hpricot(URI.parse("http://google.com/").read)
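doc.search("/html/body//p")            # every p anywhere under /html/body
doc.search("//p")                      # every p in the document
doc.search("//p/a")                    # a elements that are children of a p
doc.search("//a[@href]")               # a elements that have an href attribute
doc.search("//a[@href='google.com']")  # a elements whose href equals google.com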
Location Paths
Absolute Paths
#!ruby
doc.search("/html/body//p")
doc.search("/*/body//p")
doc.search("//p/../div")
Relative Paths
#!ruby
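# Relative paths are evaluated against a context element rather than the document root.
div = doc.at("//div")   # context element: the first div in the document
div.search("a")
div.search("p/a")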
Supported Axes
descendant
Element has a descendant element.
#!ruby
doc.search("//div/descendant::p")
Identical to doc.search("//div//p").
child
Element has a child element.
#!ruby
doc.search("//div/child::p")
Which is identical to: doc.search("//div/p").
preceding-sibling
Element has a sibling element before it, under the same parent.
#!ruby
doc.search("//div/preceding-sibling::form")
parent
Selects the parent element of the element.
#!ruby
doc.search("//div/parent::div")
Which is identical to doc.search("//div/../div").
self
Selects the element itself.
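The axes above can be checked against a standards-based XPath engine. The sketch below uses REXML, which ships with Ruby, as a stand-in for Hpricot; that Hpricot returns the same nodes is an assumption, not something this example proves. The markup and the names `AXES_SAMPLE`, `AXES_DOC` and `axis_ids` are hypothetical.

```ruby
require 'rexml/document'

# Hypothetical sample markup; REXML needs well-formed XML, unlike Hpricot.
AXES_SAMPLE = <<~MARKUP
  <html>
    <body>
      <form id="f1"/>
      <div id="d1">
        <p id="p1">one</p>
        <blockquote>
          <p id="p2">two</p>
        </blockquote>
      </div>
    </body>
  </html>
MARKUP

AXES_DOC = REXML::Document.new(AXES_SAMPLE)

# Returns the sorted, de-duplicated id attributes of the elements matched by xpath.
def axis_ids(xpath)
  REXML::XPath.match(AXES_DOC, xpath).map { |el| el.attributes['id'] }.uniq.sort
end

# descendant:: and // reach p elements at any depth below the div.
puts axis_ids('//div/descendant::p').inspect
puts axis_ids('//div//p').inspect

# child:: and / reach only direct children of the div.
puts axis_ids('//div/child::p').inspect
puts axis_ids('//div/p').inspect

# preceding-sibling:: looks at earlier elements under the same parent.
puts axis_ids('//div/preceding-sibling::form').inspect

# parent:: keeps a parent only if it matches the name test (here: div).
puts axis_ids('//p/parent::div').inspect

# self:: keeps the context element itself if it matches the name test.
puts axis_ids('//div/self::div').inspect
```

Each long-form axis is printed next to its abbreviated form so the two result lists can be compared directly.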
Supported Predicates
- [@*] Has any attribute
#!ruby
doc.search("//div[@*]")
- [@foo] Has an attribute named foo
#!ruby
doc.search("//input[@checked]")
- [@foo='test'] Attribute foo is equal to test
#!ruby
doc.search("//a[@rel='nofollow']")
- [Nodelist] Element contains a node list, for example:
#!ruby
doc.search("//div[p]")
doc.search("//div[p/a]")
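To see what each predicate form selects, here is a minimal sketch that runs the same expressions through REXML (bundled with Ruby) on a small hypothetical document. REXML is used only as a reference XPath engine; that Hpricot matches it on these expressions is an assumption. `PREDICATE_SAMPLE`, `PREDICATE_DOC` and `predicate_ids` are names made up for this example.

```ruby
require 'rexml/document'

# Hypothetical sample markup; the second div deliberately has no attributes.
PREDICATE_SAMPLE = <<~MARKUP
  <body>
    <div id="d1" class="note"><p id="p1"><a id="a1" rel="nofollow" href="/x">x</a></p></div>
    <div><span>bare</span></div>
    <div id="d3"><p id="p3">plain</p></div>
    <input id="i1" checked="checked"/>
    <input id="i2"/>
  </body>
MARKUP

PREDICATE_DOC = REXML::Document.new(PREDICATE_SAMPLE)

# Returns the sorted, de-duplicated id attributes of the elements matched by xpath.
def predicate_ids(xpath)
  REXML::XPath.match(PREDICATE_DOC, xpath).map { |el| el.attributes['id'] }.uniq.sort
end

puts predicate_ids('//div[@*]').inspect               # divs with at least one attribute
puts predicate_ids('//input[@checked]').inspect       # inputs that carry a checked attribute
puts predicate_ids("//a[@rel='nofollow']").inspect    # links whose rel equals nofollow
puts predicate_ids('//div[p]').inspect                # divs that have a p child
puts predicate_ids('//div[p/a]').inspect              # divs that have a p child containing an a
```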
Predicates Supported with a Different Syntax
- [last()] or [position()=last()] becomes :last
#!ruby
doc.search("p:last")
- [0] or [position()=0] becomes :eq(0) or :first
#!ruby
doc.search("p:first")
doc.search("p:eq(0)")
- [position() < 5] becomes :lt(5)
#!ruby
doc.search("p:lt(5)")
- [position() > 2] becomes :gt(2)
#!ruby
doc.search("p:gt(2)")
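The positional selectors count from zero, as in jQuery, whereas position() in standard XPath 1.0 starts at 1. The sketch below shows the intended index arithmetic on a plain Ruby array that stands in for an ordered list of matched p elements. It does not call Hpricot, and it assumes the index is taken over the matched list; Hpricot may compute positions differently (for example, relative to siblings).

```ruby
# Hypothetical stand-in for the ordered list of matched p elements.
paragraphs = %w[p0 p1 p2 p3 p4 p5 p6]

last_one  = paragraphs.last                                # p:last
first_one = paragraphs.first                               # p:first, same as p:eq(0)
below_5   = paragraphs.select.with_index { |_, i| i < 5 }  # p:lt(5), indexes 0..4
above_2   = paragraphs.select.with_index { |_, i| i > 2 }  # p:gt(2), indexes 3 and up

puts last_one           # prints p6
puts first_one          # prints p0
puts below_5.inspect    # prints ["p0", "p1", "p2", "p3", "p4"]
puts above_2.inspect    # prints ["p3", "p4", "p5", "p6"]
```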
