Ticket #118 (new defect)

Opened 13 months ago

CDATA being selectively swallowed

Reported by: pistos
Owned by: why
Priority: major
Milestone:
Component: lib/hpricot
Version:
Keywords: xml cdata
Cc:

Description

The CDATA section of one particular node in my XML comes back empty after parsing. This happens only about once every few hundred nodes. Here is some unit test code with sample XML that causes the problem (I could not attach files to this ticket). The failure is 100% repeatable with this unit test.

To use the test code, unpack it, then "cp full.xml /tmp" and run "./test.rb".

Test output:

% ./test.rb
Loaded suite ./test
Started
Event 10690997 ... OK
Event 10691011 ... OK
Event 10691039 ... OK
Event 10691054 ... F
Finished in 9.7764 seconds.

  1) Failure:
test_hpricot(TC_HpricotXMLParser)
    [./test.rb:66:in `test_hpricot'
     ./test.rb:51:in `each'
     ./test.rb:51:in `test_hpricot']:
XML for event 10691054 was not parsed properly.
--- /tmp/raw.xml        2007-10-01 10:51:50.000000000 -0400
+++ /tmp/hraw.xml       2007-10-01 10:51:50.000000000 -0400
@@ -1,7 +1,7 @@
 <event id='10691054'>
-<html><![CDATA[<a href='/game/23680' class='public_game'>view game</a><a href='/user/TallyHo' class='player'><img src='/images/profile/9532_w4j2uF.jpg' width='25' height='25' /> TallyHo</a> has repaired a Light Infantry <span class='battle'><img src='/images/yellow_001.png' width='32' height='34' alt='Light Infantry' /><img class='strength' src='/images/6.png' width='32' height='34' alt='6' /></span>]]></html>
+<html><![CDATA[]]></html>
 <forwardXML><response>
 <fields>
 <field x='12' y='7'>
 <unit quantity='6'>yellow_001.png</unit>
 <dark/>.
<false> is not true.

1 tests, 12 assertions, 1 failures, 0 errors
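
The check that fails above compares each event's raw XML with the parser's output. A much smaller version of that comparison might look like the sketch below. It is not the test.rb from the archive: the helper names and the regex are hypothetical, and it assumes every event has the <event id='...'> followed by <html><![CDATA[...]]></html> layout shown in the diff. The parser is supplied as a block so the sketch runs without Hpricot installed. For a real run, the block would look up the same event in a document parsed with Hpricot.XML.

```ruby
# Hypothetical helpers, not the reporter's test.rb.
# Assumes each event looks like the one in the diff above:
#   <event id='N'> followed by <html><![CDATA[...]]></html>
CDATA_IN_EVENT = %r{<event id='(\d+)'>\s*<html><!\[CDATA\[(.*?)\]\]></html>}m

# Returns a hash of event id => raw CDATA text, read straight from the
# source string without going through any XML parser.
def raw_cdata_by_event(xml)
  xml.scan(CDATA_IN_EVENT).to_h
end

# Yields each event id to the block, which must return the parser's view of
# that event's <html> text. Returns the ids where parser and source disagree.
def swallowed_events(xml)
  raw_cdata_by_event(xml).reject { |id, raw| yield(id) == raw }.keys
end
```

Any id returned by swallowed_events is an event whose CDATA the parser changed or dropped, which narrows a large file such as full.xml down to the nodes worth inspecting.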