Regular Expression Help

Can someone help create one/two regular expression(s) for this?

<table width=100%% class='list'
cellpadding=2 cellspacing=0>
<tr bgcolor='#E5ECF9'>
<td colspan=3><font size=-1>&nbsp;<b>Languages</b></font></td></tr><tr
[...]
border-left: hidden; border-bottom: hidden;'>
<font size=-1>1.</font></td>
<td align=left valign=center  style='border-bottom: hidden;'
style='padding: 5 3 5 3'><font size=-1>English</font></td><td
valign=center style='padding: 2 3 2 3; border-right: hidden; border-
bottom: hidden;' ><table class=bar cellspacing=0 width=100
height=4><td bgcolor=4684ee ></table>
<table class=bar cellspacing=0 width=30 height=4><td bgcolor=dc3912 ></

I want to pull out the width value located here:

<table class=bar cellspacing=0 width=100 height=4>


<table class=bar cellspacing=0 width=30 height=4>

It doesn't matter whether it's done in one expression with the results
separated by a comma (i.e. results = 100,30) or in two expressions, one
pulling the width value of the first table and the other pulling the
width value of the second.

If anyone can help, it would be greatly appreciated.


Re: Regular Expression Help

I forgot to mention: the code

 <table class=bar cellspacing=0 width=xx height=4>

appears multiple times throughout the page, so the expression can't
just match that code on its own and extract the value.

It has to somehow reference the section (English) of the page as well.
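
A minimal sketch of one way to meet both requirements, written in Python because the thread does not name a language. It anchors the match on the label text (English) and then captures the width of the next two class=bar tables. The helper name bar_widths, the variable page, and the French row with its widths are made up for illustration, and the closing tags after the last bar table are assumed because the pasted markup cuts off there.

```python
import re

# First row is the fragment from the question (closing tags assumed).
# The French row is hypothetical, added only to show that the match
# stays tied to the requested label.
page = """<font size=-1>1.</font></td>
<td align=left valign=center  style='border-bottom: hidden;'
style='padding: 5 3 5 3'><font size=-1>English</font></td><td
valign=center style='padding: 2 3 2 3; border-right: hidden; border-
bottom: hidden;' ><table class=bar cellspacing=0 width=100
height=4><td bgcolor=4684ee ></table>
<table class=bar cellspacing=0 width=30 height=4><td bgcolor=dc3912 ></table></td></tr>
<tr><td><font size=-1>2.</font></td>
<td align=left valign=center><font size=-1>French</font></td><td
valign=center><table class=bar cellspacing=0 width=80 height=4><td bgcolor=4684ee ></table>
<table class=bar cellspacing=0 width=12 height=4><td bgcolor=dc3912 ></table></td></tr>
"""


def bar_widths(html, label):
    """Return the two bar widths that follow `label`, or None if not found."""
    pattern = re.compile(
        # The label as the complete text of an element, e.g. >English<
        r">\s*" + re.escape(label) + r"\s*<"
        # Shortest run up to the next bar table, then its width attribute
        r".*?<table\s+class=bar\b[^>]*?\bwidth=(\d+)"
        # Same again for the second bar table
        r".*?<table\s+class=bar\b[^>]*?\bwidth=(\d+)",
        re.DOTALL | re.IGNORECASE,  # DOTALL: tags in the sample span lines
    )
    match = pattern.search(html)
    return match.groups() if match else None


print(",".join(bar_widths(page, "English")))  # prints 100,30
```

Because the pattern simply takes the next two bar tables after the label, it would run on into the following row if a row ever had fewer than two bars, and it depends on the attribute order and quoting staying as they are in the sample.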


Re: Regular Expression Help

S. Cole wrote:
[...]

I've not used this module, but perhaps HTML::DOM (available on CPAN)
might be a more appropriate fit than using regular expressions.

If you're trying to pull this information programmatically, that suggests
to me that you're probably not authoring the source itself, so you have
no control over its precise format (whitespace, use of FONT elements, etc.).

Using HTML::DOM might be your best bet for ensuring that you can pull
out the relevant information without getting broken by changes in the HTML.
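
For comparison, here is a sketch of the parser-based approach. It uses html.parser from the Python standard library instead of HTML::DOM, whose API is not shown in this thread, so treat it as an analogue and not as that module's interface. It assumes each language sits in its own <tr>. The class name BarWidthParser, the helper bar_widths_parsed, and the sample rows (including the French row, its widths, and all closing tags) are illustrative.

```python
from html.parser import HTMLParser

# Simplified rows modelled on the question's markup. The <tr> wrappers,
# closing tags, and the French row are assumptions for illustration.
page = """<tr><td><font size=-1>1.</font></td>
<td align=left valign=center><font size=-1>English</font></td>
<td valign=center><table class=bar cellspacing=0 width=100
height=4><td bgcolor=4684ee ></table>
<table class=bar cellspacing=0 width=30 height=4><td bgcolor=dc3912 ></table></td></tr>
<tr><td><font size=-1>2.</font></td>
<td align=left valign=center><font size=-1>French</font></td>
<td valign=center><table class=bar cellspacing=0 width=80 height=4><td bgcolor=4684ee ></table>
<table class=bar cellspacing=0 width=12 height=4><td bgcolor=dc3912 ></table></td></tr>
"""


class BarWidthParser(HTMLParser):
    """Collect width attributes of class=bar tables in the row holding `label`."""

    def __init__(self, label):
        super().__init__()
        self.label = label
        self.in_row = False  # True once the label text has been seen in this row
        self.widths = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # A new row starts, so stop collecting.
            self.in_row = False
        elif tag == "table" and self.in_row:
            attributes = dict(attrs)
            if attributes.get("class") == "bar" and "width" in attributes:
                self.widths.append(attributes["width"])

    def handle_data(self, data):
        if data.strip() == self.label:
            self.in_row = True


def bar_widths_parsed(html, label):
    parser = BarWidthParser(label)
    parser.feed(html)
    parser.close()
    return parser.widths


print(",".join(bar_widths_parsed(page, "English")))  # prints 100,30
```

The parser reads attributes by name, so changes in whitespace, attribute order, or quoting do not affect the result, which is the robustness argument made above.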
