Ticket #162 (new enhancement)

Opened 3 months ago

fixup_tags doesn't handle nested p tags

Reported by: juozasg
Owned by: why
Priority: minor
Milestone:
Component: ext/hpricot_scan
Version:
Keywords:
Cc:

Description

Here's some sample HTML:

<p>
<b>Description</b><br>

Basic concepts, theories and methods used in the comparative study of socio-cultural systems. Includes cultural ecology and change; political, economic and kinship systems; language, art and religion; cultural perspectives on contemporary issues.

<p>
<b>Grading</b><br>
Normal Grade Rules

<p>
<b>Units</b><br>

3


Doing this:

require 'hpricot'

doc = Hpricot(open("TEST.HTML"), :fixup_tags => true)
doc.to_html

will produce something like this:

<p>
<b>Description</b><br>

Basic concepts, theories and methods used in the comparative study of socio-cultural systems. Includes cultural ecology and change; political, economic and kinship systems; language, art and religion; cultural perspectives on contemporary issues.

<p>
<b>Grading</b><br>
Normal Grade Rules

<p>
<b>Units</b><br>

3

</p></p></p>

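I think fixup_tags should close each open paragraph before the next <p> starts, rather than nesting them and piling up the closing tags at the end. In older HTML the </p> end tag is optional and a new <p> implicitly ends the previous paragraph, so that's the usual interpretation of markup like this.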
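
In the meantime, one possible workaround is to preprocess the markup before handing it to Hpricot. This is only a minimal sketch: close_open_paragraphs is a hypothetical helper, not part of Hpricot, and it is regex-based, so it knows nothing about comments, scripts, or other block-level tags that can also end a paragraph.

```ruby
# Hypothetical helper (not part of Hpricot): inserts a </p> before each
# <p> start tag that appears while a paragraph is still open, and closes
# the last open paragraph at the end of the input.
def close_open_paragraphs(html)
  in_paragraph = false
  # Match <p>, <p attr="...">, and </p>, but not <pre>, <param>, etc.
  out = html.gsub(%r{<(/?)p(?=[\s>/])[^>]*>}i) do |tag|
    if $1 == "/"
      # Explicit end tag: the paragraph is closed, pass it through.
      in_paragraph = false
      tag
    else
      # Start tag: close the previous paragraph first if it is still open.
      prefix = in_paragraph ? "</p>\n" : ""
      in_paragraph = true
      prefix + tag
    end
  end
  in_paragraph ? out + "</p>" : out
end

sample = "<p>\n<b>Grading</b><br>\nNormal Grade Rules\n\n<p>\n<b>Units</b><br>\n\n3\n"
fixed = close_open_paragraphs(sample)
```

The string in fixed has each paragraph closed before the next one opens, so it could be passed to Hpricot(fixed) in place of the raw file contents.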
