Posted on May 12, 2005, 6:26 pm
I'm a scripting and programming novice, and I'm wondering what it would
take to use Perl to strip selected tags, attributes, and optionally the
content between those tags from HTML files.
Initially I investigated learning regexes, since they are supported by my
editor (Homesite). But after skimming through Jeffrey E.F. Friedl's
"Mastering Regular Expressions" I'm inclined to think that regexes are
actually poorly suited to this task. As far as I can see, the ideal would
be to manipulate the content via a parser that understands HTML and SGML.
I didn't manage to find such a tool with a "dummy/GUI" interface. But I
am intrigued by a search result suggesting that Perl has an HTML and
SGML parser, along with a module that seems designed
specifically for this job:
I'm looking for estimates on how much work it would be to learn how to
use Perl for this specific purpose.
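
For a rough sense of what the parser-based approach described above
involves, here is a minimal sketch. It is not the Perl module referred to
in the post (its name is missing from the text); it is written in Python
with the standard library's html.parser as a stand-in. The names
TagStripper and strip_html and the three option names are made up for this
illustration, and the sketch assumes reasonably well-formed input.

```python
from html import escape
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Re-emit HTML while dropping selected tags, attributes, and content.

    Hypothetical helper for illustration only. Tags listed in drop_content
    are assumed to be non-void elements that have a closing tag.
    """

    def __init__(self, drop_tags=(), drop_attrs=(), drop_content=()):
        # convert_charrefs=False keeps entities such as &amp; intact.
        super().__init__(convert_charrefs=False)
        self.drop_tags = set(drop_tags)        # tag removed, inner text kept
        self.drop_attrs = set(drop_attrs)      # attribute removed everywhere
        self.drop_content = set(drop_content)  # tag removed with its content
        self.out = []
        self.skip_depth = 0  # > 0 while inside a drop_content element

    def _emit_open(self, tag, attrs, end=">"):
        parts = [tag]
        for name, value in attrs:
            if name in self.drop_attrs:
                continue
            if value is None:  # attribute without a value, e.g. "checked"
                parts.append(name)
            else:  # the parser unescapes values, so escape them again
                parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + end)

    def handle_starttag(self, tag, attrs):
        if tag in self.drop_content:
            self.skip_depth += 1
            return
        if self.skip_depth or tag in self.drop_tags:
            return
        self._emit_open(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        if self.skip_depth or tag in self.drop_tags or tag in self.drop_content:
            return
        self._emit_open(tag, attrs, end=" />")

    def handle_endtag(self, tag):
        if tag in self.drop_content:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in self.drop_tags:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_html(source, **options):
    """Return source with the requested tags, attributes, and content removed."""
    parser = TagStripper(**options)
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


sample = ('<p class="x" id="a">Hi <font color="red">there</font>'
          '<script>alert(1)</script></p>')
result = strip_html(sample, drop_tags={"font"}, drop_attrs={"class"},
                    drop_content={"script"})
print(result)  # <p id="a">Hi there</p>
```

The point of going through a parser rather than regexes is that the code
only ever sees whole tags, attribute lists, and text runs, so the three
kinds of removal reduce to a few set lookups. Comments and doctype
declarations are silently dropped by this sketch; a real tool would need
to decide what to do with them.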
Re: Feasibility of using Perl to strip selected html tags and/or attributes (scripting/programming novice)
If you want to learn Perl, you should do so for proper edification.
Approaching the topic as "what is the least amount of learning
I need for one specific problem" is counterproductive.