Hpricot

A Fast, Enjoyable HTML Parser for Ruby

Hpricot is a very flexible HTML parser, based on Tanaka Akira's HTree and John Resig's jQuery, but with the scanner recoded in C (using Ragel for scanning). I've borrowed what I believe to be the best ideas from these wares to make Hpricot heaps of fun to use.

 #!ruby
 require 'hpricot'
 require 'open-uri'
 # load the RedHanded home page
 doc = Hpricot(open("http://redhanded.hobix.com/index.html"))
 # change the CSS class on the permalink spans
 (doc/"span.entryPermalink").set("class", "newLinks")
 # remove the sidebar
 (doc/"#sidebar").remove
 # print the altered HTML
 puts doc

A Proper Start

The Tougher Things

If you're on a machine with a compiler, you can give the Hpricot quickstart a try: http://balloon.hobix.com/hpricot.

The Hpricot Mailing List

To join:

Send a message to hpricot@…
Cc: why@…


 #!html
 <p style="margin-left: 140px;"><strong>Want to follow Hpricot development? <a href="/hpricot.xml"><img src="/camping/chrome/site/images/rss.gif" /></a></strong></p>