Ticket #64 (assigned defect)
Amazon.com XPath problem
| Reported by: | scrubber | Owned by: | why |
|---|---|---|---|
| Priority: | major | Milestone: | 0.5 |
| Component: | ext/hpricot_scan | Version: | 0.4 |
| Keywords: | amazon xpath | Cc: | |
Description
This code:
require 'rubygems'
require 'hpricot'
require 'open-uri'
doc = Hpricot(open('http://www.amazon.com/s/ref=sr_nr_i_0/105-4295811-9538065?ie=UTF8&keywords=ruby&rh=i:aps,k:ruby,i:stripbooks&page=1'))
titles = doc/"/html/body/table/tr/td/table/tr/td/div/table/tr/td/table/tr/td/table/tr/td/table/tr/td/table/tr/td/table/tr/td/table/tr/td/a/span"
p titles.size
prints "4".
However, in Firefox, this XPath:
/html/body/table/tbody/tr/td/table/tbody/tr/td/div/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td/a/span
returns 12 results. (It is the same XPath, except that the Firefox version has the additional tbody steps.) Since your goal is to make Hpricot as Firefox-compliant as possible, I guess this result is not right, is it?
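To make it easier to compare the two paths, here is a minimal sketch of how a path copied from Firefox could be converted to the tbody-less form used in the Hpricot call. The helper name strip_tbody and the shortened sample path are assumptions for illustration only; neither is part of Hpricot.

```ruby
# Hypothetical helper (not part of Hpricot): remove the tbody steps that
# appear in XPaths copied from Firefox, leaving every other step untouched.
def strip_tbody(xpath)
  xpath.split('/').reject { |step| step == 'tbody' }.join('/')
end

# Shortened sample path, for illustration only.
firefox_xpath = '/html/body/table/tbody/tr/td/a/span'
puts strip_tbody(firefox_xpath)
# prints "/html/body/table/tr/td/a/span"
```

The converted string can then be passed to doc/"..." as in the code above, so that both counts come from what is otherwise the same path.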
