Ticket #101 (closed defect: fixed)

Opened 11 months ago

Last modified 5 months ago

[PATCH] Fix get_elements_by_tag_name

Reported by: wycats
Owned by: why
Priority: major
Milestone:
Component: ext/hpricot_scan
Version:
Keywords:
Cc:

Description

In browser DOMs, getElementsByTagName returns only Elements, not text nodes. This patch:

  • Selects only Elements in get_elements_by_tag_name
  • Adds support for get_elements_by_tag_name("*"). Selecting every element already works when no argument is passed, but accepting "*" brings the method in line with the browser DOM APIs.
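
To illustrate the intended behaviour without depending on Hpricot itself, here is a minimal sketch in plain Ruby. The names DocNode, ElemNode, TextNode, each_descendant and elements_by_tag_name are hypothetical stand-ins invented for this example; they are not Hpricot's classes or its implementation. The sketch only shows the two rules from the list above: text nodes are skipped, and "*" behaves like passing no tag at all.

```ruby
# Hypothetical stand-ins for a parsed tree; these are NOT Hpricot's classes.
DocNode = Struct.new(:children) do
  def elem?; false; end
end

ElemNode = Struct.new(:name, :children) do
  def elem?; true; end
end

TextNode = Struct.new(:content) do
  def elem?; false; end
  def children; []; end
end

# Depth-first walk over every descendant of +node+ (the node itself is skipped).
def each_descendant(node, &block)
  node.children.each do |child|
    yield child
    each_descendant(child, &block)
  end
end

# Collect element nodes only. "*" is dropped from the tag list, so
# elements_by_tag_name(doc, "*") behaves like elements_by_tag_name(doc).
def elements_by_tag_name(root, *tags)
  wanted = tags - ["*"]   # same effect as the patch's a.delete("*")
  list = []
  each_descendant(root) do |node|
    next unless node.elem?   # skip text nodes
    list << node if wanted.empty? || wanted.include?(node.name)
  end
  list
end

# Same shape as the document used in the ticket's test:
# <div><p>First</p><p>Second</p></div>
doc = DocNode.new([
  ElemNode.new("div", [
    ElemNode.new("p", [TextNode.new("First")]),
    ElemNode.new("p", [TextNode.new("Second")])
  ])
])

p elements_by_tag_name(doc, "*").map(&:name)   # => ["div", "p", "p"]
p elements_by_tag_name(doc, "p").size          # => 2
```

As in the patch, "*" is simply removed from the argument list, so mixing it with a concrete tag name (for example "*" together with "p") narrows the result to that tag rather than matching everything.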

Change History

Changed 11 months ago by wycats

Index: lib/hpricot/traverse.rb
===================================================================
--- lib/hpricot/traverse.rb	(revision 154)
+++ lib/hpricot/traverse.rb	(working copy)
@@ -513,8 +513,9 @@
 
     def get_elements_by_tag_name(*a)
       list = Elements[]
+      a.delete("*")
       traverse_element(*a.map { |tag| [tag, "{http://www.w3.org/1999/xhtml}#{tag}"] }.flatten) do |e|
-          list << e
+        list << e if e.elem?
       end
       list
     end

Changed 11 months ago by wycats

A better patch with tests:

Index: test/test_parser.rb
===================================================================
--- test/test_parser.rb	(revision 154)
+++ test/test_parser.rb	(working copy)
@@ -47,6 +47,13 @@
     assert_equal 'link1', @basic.get_elements_by_tag_name('a')[0].get_attribute('id')
     assert_equal 'link1', @basic.get_elements_by_tag_name('body')[0].get_element_by_id('link1').get_attribute('id')
   end
+  
+  def test_get_elements_by_tag_name_star
+    simple = Hpricot.parse("<div><p id='first'>First</p><p id='second'>Second</p></div>")
+    assert_equal 3, simple.get_elements_by_tag_name("*").size
+    assert_equal 1, simple.get_elements_by_tag_name("div").size
+    assert_equal 2, simple.get_elements_by_tag_name("p").size
+  end
 
   def test_output_basic
     @basic = Hpricot.parse(TestFiles::BASIC)
Index: lib/hpricot/traverse.rb
===================================================================
--- lib/hpricot/traverse.rb	(revision 154)
+++ lib/hpricot/traverse.rb	(working copy)
@@ -513,8 +513,9 @@
 
     def get_elements_by_tag_name(*a)
       list = Elements[]
+      a.delete("*")
       traverse_element(*a.map { |tag| [tag, "{http://www.w3.org/1999/xhtml}#{tag}"] }.flatten) do |e|
-          list << e
+        list << e if e.elem?
       end
       list
     end

Changed 5 months ago by why

  • status changed from new to closed
  • resolution set to fixed

Super, the patch is applied in [161]. Yesssss.
