a regular expression inquiry


for an array element $fields[$j] containing:

gb:AF367205.1  DB_XREF=gi:17225988  TID=Os.1005.1  CNT=36  FEA=FLmRNA  
TIER=FL+Stack  STK=9  UG=Os.1005  DEF=Oryza sativa 1-deoxy-D-xylulose  
5-phosphate reductoisomerase precursor, mRNA, complete cds; nuclear gene for  
plastid product.  PROD=1-deoxy-D-xylulose 5-phosphate  
reductoisomeraseprecursor  FL=gb:AK059692.1 gb:AK099702.1 gb:AF367205.1  
REP_ORG=O. sativa

I try to extract the useful content with:

if (preg_match("/PROD=(.+)\s/", $fields[$j], $match ) )
  $fields[$j] = $match[1];
else if (preg_match("/UG_TITLE=(.+)\s/", $fields[$j], $match ) )
  $fields[$j] = $match[1];
else if (preg_match("/DEF=(.+)\s/", $fields[$j], $match ) )
  $fields[$j] = $match[1];

I have confirmed that the separator is 2 spaces (i.e. not a tab, linefeed or
newline). I just don't know why it sometimes gives me:

PROD=1-deoxy-D-xylulose 5-phosphate reductoisomeraseprecursor  
FL=gb:AK059692.1 gb:AK099702.1 gb:AF367205.1

or more (i.e. run-on matching). I don't know if it has anything to do with
matching as much as possible. I also tried:

"/DEF=(.+)\s[A-Z].*/" but it still doesn't work. BTW, because I know that
what I want is sometimes at the end, I also use "/DEF=(.+)$|\s/", but
that doesn't work either.

I would really appreciate it if anybody could help with this. I feel really
desperate, as I find it difficult to find relevant documentation that explains
why this special case fails. Thanks a lot.

Re: a regular expression inquiry

In reply to vito:

Yeah, that's it.

(.+) is a greedy match: it will match up to the last whitespace it finds,
not the first.
Adding a ? after the brackets makes it a non-greedy match.



Re: a regular expression inquiry

In reply to Tim:

Good explanation, totally correct, but not the right regex. A ? after the
brackets, as in (.+)?, means:
(a random character 1 or more times) zero or once.

The explanation was good, but the regex should have the ? inside the
brackets, as in (.+?).

As I've learned (correct me if I'm wrong), the ? that makes a selector
non-greedy should come directly after the selector itself.

Rik Wasmus
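
To make the difference between the three placements concrete, here is a minimal PHP sketch run against a shortened version of the sample line from the question. The capture() helper and the $field variable are made up for illustration; only preg_match() is real PHP.

```php
<?php
// Shortened sample based on the data in the question (fields separated by two spaces).
$field = "PROD=1-deoxy-D-xylulose 5-phosphate reductoisomeraseprecursor  "
       . "FL=gb:AK059692.1 gb:AK099702.1 gb:AF367205.1  REP_ORG=O. sativa";

// Hypothetical helper: returns capture group 1, or null if nothing matched.
function capture(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) ? ($m[1] ?? null) : null;
}

$greedy   = capture('/PROD=(.+)\s/', $field);  // greedy: runs on to the last whitespace
$optional = capture('/PROD=(.+)?\s/', $field); // optional group, still greedy inside
$lazy     = capture('/PROD=(.+?)\s/', $field); // lazy: stops at the first whitespace

var_dump($greedy === $optional); // bool(true): "(.+)?" does not change the result
echo $greedy, "\n";              // everything up to and including "REP_ORG=O."
echo $lazy, "\n";                // 1-deoxy-D-xylulose
```

Note that the lazy form stops at the first whitespace character, so with a single \s as the delimiter it returns only the first word of a value that itself contains spaces.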

Re: a regular expression inquiry


You're correct! Thanks, it works.

Re: a regular expression inquiry


Thanks, but it seems it doesn't work. Adding one more ? at the end doesn't
help either. :(
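
One possible explanation, assuming the values themselves contain single spaces (as the PROD value in the question does): \s matches a single whitespace character, so a lazy (.+?)\s stops after the first word, and appending another ? does not change that. Separately, in "/DEF=(.+)$|\s/" the | splits the whole pattern, so it reads as "DEF=(.+)$" or "\s" rather than "end of string or whitespace". The sketch below reads up to two whitespace characters or the end of the string instead. extract_field() is a hypothetical helper, not something from the thread, and it assumes the field contains no line breaks.

```php
<?php
// Hypothetical helper: tries each key in order and returns the value of the
// first one found, or null if none of the keys is present.
function extract_field(string $text, array $keys): ?string
{
    foreach ($keys as $key) {
        // (.+?) is lazy; the lookahead (?=\s{2}|$) stops it at the first run
        // of two whitespace characters, or at the end of the string, without
        // consuming the delimiter.
        $pattern = '/' . preg_quote($key, '/') . '=(.+?)(?=\s{2}|$)/';
        if (preg_match($pattern, $text, $match)) {
            return $match[1];
        }
    }
    return null;
}

// Shortened sample based on the data in the question.
$sample = "UG=Os.1005  DEF=Oryza sativa mRNA  "
        . "PROD=1-deoxy-D-xylulose 5-phosphate reductoisomeraseprecursor  "
        . "FL=gb:AK059692.1 gb:AK099702.1  REP_ORG=O. sativa";

// Same priority order as the if / else if chain in the question.
echo extract_field($sample, ['PROD', 'UG_TITLE', 'DEF']), "\n";
// 1-deoxy-D-xylulose 5-phosphate reductoisomeraseprecursor
```

In the original loop this would be used as $fields[$j] = extract_field($fields[$j], ['PROD', 'UG_TITLE', 'DEF']) ?? $fields[$j]; so that a field with none of the keys is left unchanged.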

