Help: Replace Help

Subject: Help: Replace Help
Author: Amy Lee
Date: 05-01-2008
Posted by Ben Bullock on May 1, 2008, 9:02 am
On Thu, 01 May 2008 20:29:38 +0800, Amy Lee wrote:

> So how to solve this kind of order problem? I suppose that the
> replacement must process at the same time.

For single letters you can use

tr/ACGU/CAUG/;

If the strings to swap are longer than a single character,

s/A/unlikely/g;
s/C/A/g;
s/unlikely/C/g;
s/G/unlikely/g;
s/U/G/g;
s/unlikely/U/g;

where "unlikely" is a string which is unlikely to occur in your data.
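
To illustrate the order problem and the two approaches above, here is a minimal sketch. The sample data, the variable names and the placeholder `__SWAP__` are assumptions made for the example, not part of the original question; the placeholder approach only works if that string never occurs in the data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $rna = 'ACGU';

# Naive sequential substitution: the second s/// also rewrites
# the characters that the first one has just produced.
my $naive = $rna;
$naive =~ s/A/C/g;    # now "CCGU"
$naive =~ s/C/A/g;    # now "AAGU", the original A/C distinction is lost

# tr/// translates every character in a single pass.
my $swapped = $rna;
$swapped =~ tr/ACGU/CAUG/;

# Placeholder approach for longer strings.
my $text        = 'cat chases dog';
my $placeholder = '__SWAP__';    # assumed never to occur in the data
die "placeholder occurs in data\n" if index($text, $placeholder) >= 0;
$text =~ s/cat/$placeholder/g;
$text =~ s/dog/cat/g;
$text =~ s/$placeholder/dog/g;

print "naive:   $naive\n";      # AAGU
print "swapped: $swapped\n";    # CAUG
print "text:    $text\n";       # dog chases cat
```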

Posted by A. Sinan Unur on May 1, 2008, 9:46 am
@ml.accsnet.ne.jp:

> On Thu, 01 May 2008 20:29:38 +0800, Amy Lee wrote:
>
>> So how to solve this kind of order problem? I suppose that the
>> replacement must process at the same time.
>
> For single letters you can use
>
> tr/ACGU/CAUG/;
>
> If the strings to swap are longer than a single character,
>
> s/A/unlikely/g;
> s/C/A/g;
> s/unlikely/C/g;
> s/G/unlikely/g;
> s/U/G/g;
> s/unlikely/U/g;
>
> where "unlikely" is a string which is unlikely to occur in your data.

A simple lookup table driven solution would obviate the need to make
assumptions about the unlikeliness of a given character as well as
getting rid of the multiple substitutions.

Sinan

--
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/

Posted by Ben Bullock on May 1, 2008, 9:40 pm

>> If the strings to swap are longer than a single character,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> s/A/unlikely/g;
>> s/C/A/g;
>> s/unlikely/C/g;
>> s/G/unlikely/g;
>> s/U/G/g;
>> s/unlikely/U/g;
>>
>> where "unlikely" is a string which is unlikely to occur in your data.
>
> A simple lookup table driven solution would obviate the need to make
> assumptions about the unlikeliness of a given character as well as
> getting rid of the multiple substitutions.

And a simple tr/// based solution would obviate the need for you to
write a lookup table solution. But if the strings to swap are longer than
a single character, the lookup table solution is going to be somewhat
complex.

Here is an example of a badly-written lookup table solution:

#!/usr/bin/perl

use strict;
use warnings;

my %subst = qw( A C C A G U U G );
my @strings = qw( ACGU GUACCGU );

print "Before:\t@strings\n";

# Replace each matched letter with its entry in the lookup table.
s/([ACGU])/$subst{$1}/g for @strings;

print "After:\t@strings\n";

__END__

The problem here is that the writer has put the same data, the list of
stuff to swap, in three different places. Maybe that kind of clumsy
solution is OK for an example program, but for the real world it's
not. If one uses a lookup table, then the swapping data should only be
in exactly one place:

my %subst = qw/A C G U/; # Do not repeat this data anywhere!!!!!
%subst = (%subst, reverse %subst);
my $substkeys = join ('|',keys %subst); # We want to swap strings so use |
my @strings = qw( ACGU GUACCGU );
s/($substkeys)/$subst{$1}/g for @strings;

If one uses the badly-written lookup table solution above, then as the
list of data to swap changes (and remember, the strings consist of more
than one character), bugs will occur unless the programmer is extremely
careful to update both halves of the list of stuff to swap and the
left-hand side of the substitution.

So I don't recommend a lookup table, unless one knows what one is doing.
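
As a rough sketch of the "data in exactly one place" idea for strings longer than one character, the table and the pattern can both be derived from a single list of pairs. The helper name `make_swapper` and the sample pairs are hypothetical, and the longest-first sort and `quotemeta` are precautions added for this example rather than something taken from the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a swapping closure from one list of pairs.
sub make_swapper {
    my %pairs = @_;
    my %table = (%pairs, reverse %pairs);    # add the reverse mappings

    # Longest keys first, so a key that is a prefix of another key
    # cannot shadow it; quotemeta protects regex metacharacters.
    my $pattern = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) or $a cmp $b } keys %table;
    my $re = qr/($pattern)/;

    return sub {
        my ($text) = @_;
        $text =~ s/$re/$table{$1}/g;
        return $text;
    };
}

# The swap data appears here and nowhere else.
my $swap = make_swapper(qw( A C  G U  cat dog ));

print $swap->('ACGU'), "\n";           # CAUG
print $swap->('cat and dog'), "\n";    # dog and cat
```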


Posted by A. Sinan Unur on May 1, 2008, 10:16 pm
benkasminbullock@gmail.com (Ben Bullock) wrote in

>
>>> If the strings to swap are longer than a single character,
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> s/A/unlikely/g;
>>> s/C/A/g;
>>> s/unlikely/C/g;
>>> s/G/unlikely/g;
>>> s/U/G/g;
>>> s/unlikely/U/g;
>>>
>>> where "unlikely" is a string which is unlikely to occur in your
>>> data.
>>
>> A simple lookup table driven solution would obviate the need to make
>> assumptions about the unlikeliness of a given character as well as
>> getting rid of the multiple substitutions.
>
> And a simple tr/// based solution would obviate the need for you to
> write a lookup table solution. But if the strings to swap are longer
> than a single character, the lookup table solution is going to be
> somewhat complex.

Granted.

> Here is an example of a badly-written lookup table solution:
>

<snipped for brevity>

>
> The problem here is that the writer has put the same data, the list of
> stuff to swap, in three different places. Maybe that kind of clumsy
> solution is OK for an example program,

and that was the spirit in which those lines were written.

> but for the real world it's not. If one uses a lookup table, then the
> swapping data should only be in exactly one place:
>
> my %subst = qw/A C G U/; # Do not repeat this data anywhere!!!!!
> %subst = (%subst, reverse %subst);
> my $substkeys = join ('|',keys %subst); # We want to swap strings so use |
> my @strings = qw( ACGU GUACCGU );
> s/($substkeys)/$subst{$1}/g for @strings;
>
> If one uses the original solution proposed above, as the list of data
> to swap changes, (and since the strings consist of more than one
> character, remember), bugs will occur if the programmer is not
> extremely careful about updating both parts of the list of stuff to
> swap and the left hand side of the substitution.
>
> So I don't recommend a lookup table, unless one knows what one is
> doing.

Well, if one uses the solution you proposed above and the list of data
to swap changes to

my %subst = qw( A|C C|A G|U U|G );

there will be issues with the way you build the search string.
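
To make the issue concrete, here is a small sketch; the names `$unescaped`, `$escaped` and `$string` are illustrative only. Joining the raw keys with `|` turns the `|` inside each key into a regex alternation, so the pattern matches single letters that are not keys of the table at all.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %subst = qw( A|C C|A G|U U|G );

# Raw keys: the "|" inside each key becomes an alternation operator.
my $unescaped = join '|', keys %subst;

# Escaped keys: each key is matched literally.
my $escaped = join '|', map { quotemeta($_) } keys %subst;

my $string = 'A|C';
my @plain  = $string =~ /($unescaped)/g;    # single letters
my @quoted = $string =~ /($escaped)/g;      # the whole key

print "unescaped matches: @plain\n";     # A C
print "escaped matches:   @quoted\n";    # A|C
```

With the unescaped pattern, the captured text ("A" or "C") has no entry in %subst, so the substitution would insert undef and trigger warnings.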

So:

#!/usr/bin/perl

use strict;
use warnings;

my %replace = qw( A|C C|A G|U U|G A$A Z$Z);
%replace = (%replace, reverse %replace);

my $search = join ('|', map { "(?:\Q$_\E)" } keys %replace);
my @strings = qw( A|C G|U G|UA|CC|AG|U Z$Z A$A );

print "Before:\t@strings\n";

s/($search)/$replace{$1}/g for @strings;

print "After:\t@strings\n";

__END__


Posted by Ben Bullock on May 1, 2008, 10:32 pm

> Well, if one uses the solution you proposed above and the list of data
> to swap changes to
>
> my %subst = qw( A|C C|A G|U U|G );
>
> there will be issues with the way you build the search string.

> my $search = join ('|', map { "(?:\Q$_\E)" } keys %replace);

So you agree that the lookup table driven solution isn't simple?

I think my original method of substituting in an unlikely string,
which you objected to, was fairly appropriate for this particular
question. I often use this kind of method for quick jobs.


