HTML::Parser: how to remove "//<![CDATA[ ... //]]>"?

Gerwin - 31 Jan 2007 11:00 GMT
Hi,

I'm using HTML::Parser to strip HTML tags from my files. I noticed
that //<![CDATA[ ... //]]> and the JavaScript inside it are not
stripped. Any idea how to do this?

-Gerwin
Andy - 31 Jan 2007 19:15 GMT
The CDATA marker can be treated much like a comment in HTML.

According to the documentation at http://search.cpan.org/~gaas/HTML-Parser-3.56/Parser.pm
you have to leave the strict_comment switch disabled (it is off by default) to strip such tags:

$p->strict_comment( $bool )
By default, comments are terminated by the first occurrence of "-->".
This is the behaviour of most popular browsers (like Mozilla, Opera
and MSIE), but it is not correct according to the official HTML
standard. Officially, you need an even number of "--" tokens before
the closing ">" is recognized and there may not be anything but
whitespace between an even and an odd "--".

The official behaviour is enabled by enabling this attribute.

Enabling of 'strict_comment' also disables recognizing these forms as
comments:

 </ comment>
 <! comment>

Notice how <! comment> matches the first two characters and the last
character of <![CDATA[ ... //]]>.
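
For the original question, here is a minimal sketch of one way to drop the JavaScript together with its //<![CDATA[ ... //]]> wrapper. It does not use the strict_comment switch discussed above. Instead it assumes the wrapper always sits inside a <script> element and tells the parser to skip that element with ignore_elements. The helper name strip_html and the sample input are made up for illustration.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper (not from the thread): return the text of an HTML
# string with the tags removed and <script>/<style> content dropped.
sub strip_html {
    my ($html) = @_;
    my $text = '';

    my $p = HTML::Parser->new(
        api_version => 3,
        # Only text events get a handler, so tags, comments and
        # declarations are silently discarded.
        text_h => [ sub { $text .= $_[0] }, 'dtext' ],
    );

    # Skip these elements and everything inside them, which includes
    # any //<![CDATA[ ... //]]> wrapper around inline JavaScript.
    $p->ignore_elements(qw(script style));

    $p->parse($html);
    $p->eof;
    return $text;
}

# Made-up sample input for illustration.
my $html = <<'HTML';
<p>Hello</p>
<script type="text/javascript">
//<![CDATA[
alert("hi");
//]]>
</script>
<p>World</p>
HTML

print strip_html($html);
```

With this input the printed text should contain "Hello" and "World" but none of the script body. A CDATA marker that appears outside <script> or <style> is not covered by this sketch.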
 