# how do I chop out a piece of a string surrounded by two hex codes?

Hi All,

Please forgive me, I know I am being a mooch here.

I have a string.  Inside it is a substring I want
to cut out.

Problem, the substring is surrounded on both sides by
two hex codes: C2 and A0

"junk" . 0xC2 . 0xA0 . "my substring" . 0xC2 . 0xA0 . "junk"

How do I cut out "my substring"?

Many thanks
-T

## Re: how do I chop out a piece of a string surrounded by two hex codes?

Use a regular expression with capture:

\$x = "junk\xc2\xa0my substring\xc2\xa0junk";
if( \$x =~ m{\xc2\xa0(.+?)\xc2\xa0} ) {
print "\$1\n";
}

--
Jim Gibson

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/30/2015 05:06 PM, Jim Gibson wrote:

I am doing something wrong.

\$ perl -e '\$x = "junk\xc2\xa0my substring\xc2\xa0junk"; ( \$y, \$x ) =~
m{\xc2\xa0(.+?)\xc2\xa0}; print "\$y\n"';

:'(
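
A likely reason the attempt above prints nothing useful, offered as a sketch rather than a diagnosis: `=~` binds a match to its left operand and does not assign the captures into it, so `$y` is never set. Evaluating the match in list context and assigning the result is one way to collect the capture. The pattern and the helper name `between_markers` are assumptions for illustration, not code from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between the first pair of
# C2 A0 byte sequences, or undef when there is no such pair.
sub between_markers {
    my ($string) = @_;
    # A match in list context returns its captures, so the single
    # capture group lands in $inner (or $inner stays undef on failure).
    my ($inner) = $string =~ m{\xc2\xa0(.+?)\xc2\xa0}s;
    return $inner;
}

my $x = "junk\xc2\xa0my substring\xc2\xa0junk";
my $y = between_markers($x);
print defined $y ? "$y\n" : "no match\n";
```

The parentheses around `$inner` matter: without them the match runs in scalar context and returns only true or false.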

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/30/2015 06:08 PM, T wrote:

In case it helps, the actual string looks like this (hexedit):

69 6F 6E C2  A0 37 2E 35  2E 39 3C 0A  3C 70 20 73  74 79 6C 65
3D 22 74 65  78 74 2D 61  6C 69 67 6E  3A 20 63 65  6E 74 65 72
3B 22 3E 4E

ion..7.5.9<.<p style="text-align: center;">N
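
The bytes C2 A0 are the UTF-8 encoding of U+00A0 (NO-BREAK SPACE), so one option is to decode the input first and match a single character instead of two bytes. Note also that in this dump the value `7.5.9` is followed by `<`, not by a second C2 A0 pair. A minimal sketch, assuming the input really is UTF-8; the sub name `text_after_nbsp` is hypothetical:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode UTF-8 bytes, then return whatever sits
# between the first U+00A0 and the next "<", or undef if not found.
sub text_after_nbsp {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);   # C2 A0 becomes one character
    return $text =~ /\x{A0}([^<]+)</ ? $1 : undef;
}

# The bytes from the hex dump, written as a Perl string literal.
my $bytes = "ion\xC2\xA07.5.9<\n<p style=\"text-align: center;\">N";
my $found = text_after_nbsp($bytes);
print defined $found ? "$found\n" : "no match\n";
```

If the data is handled as raw bytes throughout, matching `\xc2\xa0` directly, as the other replies do, works just as well.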

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 8/30/2015 18:12, T wrote:

I never test things from the commandline - introduces additional variables
to the equation.

my \$x = "junk\xc2\xa0my substring\xc2\xa0junk";

(my \$y = \$x) =~ s/^.*?\xc2\xa0(.+)\xc2\xa0.*\$/\$1/is;
print "\$y\n";

# or

\$x =~ /^.*?\xc2\xa0(.+)\xc2\xa0.*\$/is;
\$y = \$1 // '??';
print "\$y\n";

# or (you said 'chop out', not extract), so:

(\$y = \$x) =~ s/^(.*?\xc2\xa0).*(\xc2\xa0.*)\$/\$1\$2/is;
print "\$y\n";

# or

\$y = \$x =~ /^(.*?\xc2\xa0).*(\xc2\xa0.*)\$/is ? "\$1\$2" : '??';
print "\$y\n";

__END__

my substring
my substring
junk┬á┬ájunk
junk┬á┬ájunk

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/30/2015 07:24 PM, \$Bill wrote:

Hi Bill,

\$ perl -e 'my \$x = "junk\xc2\xa0my substring\xc2\xa0junk"; ( my \$y = \$x
) =~ s/^.*?\xc2\xa0(.+)\xc2\xa0.*\$/\$1/is; print "\$y\n"';
my substring

Thank you!

Is there a way to do it with {} instead of // ?

-T

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On Monday, 31 August 2015 09:16:11 UTC+5:30, T  wrote:
[snip]

Look at Meth-4 in the following code for a way to do the s/// using
the alternate delimiter pair {}

perl -Mstrict -Mwarnings -le '

{my \$kount=0;sub incr { return sub { ++\$kount } } }

my %h = (
0 => sub { "The original string" },
1 => sub { "Meth-\$_[0]: using capturing parentheses" },
2 => sub { "Meth-\$_[0]: w/o   capturing parentheses" },
3 => sub { "Meth-\$_[0]: with capturing parentheses but modifying the original" },
4 => sub { "Meth-\$_[0]: with capturing parentheses but modifying the original & making use of alternate delimiters for the s///" },
5 => sub { "Meth-\$_[0]: using negative lookaround" },
6 => sub { "Meth-\$_[0]: with a pure nonregex-based approach" },
7 => sub { "Meth-\$_[0]: with a mixed regex-based + string approach" },
8 => sub { "Meth-\$_[0]: using split" },
"-1" => sub { "-" x 3 },
);

local \$_ = "junk\xC2\xA0my substring\xC2\xA0junk";

print \$h{ 0 }->();
print;
print \$h{ -1 }->();

# Meth-1: with capturing parentheses
\$a=incr->();
print \$h{ \$a }->( \$a );
/\xc2\xa0(.+)\xc2\xa0/ and print \$1;
print \$h{ -1 }->();

# Meth-2: w/o  capturing parentheses
\$a=incr->();
print \$h{ \$a }->( \$a );
print for /(?<=\xc2\xa0).+(?=\xc2\xa0)/g;
print \$h{ -1 }->();

# Meth-3: with capturing parentheses but modifying the original
\$a=incr->();
print \$h{ \$a }->( \$a );
(my \$t = \$_) =~ s/\xc2\xa0(.+)\xc2\xa0// and print \$1;
print \$h{ -1 }->();

# Meth-4: with capturing parentheses but modifying the original & making
#         use of alternate delimiters for the s///
\$a=incr->();
print \$h{ \$a }->( \$a );
(my \$T = \$_) =~ s{\xc2\xa0(.+)\xc2\xa0}
{} and print \$1;
print \$h{ -1 }->();

# Meth-5: with negative lookaround
\$a=incr->();
print \$h{ \$a }->( \$a );
my \$neighbor = "\xc2\xa0";
/\$neighbor((?:(?!\$neighbor).)+)/ and print \$1;
print \$h{ -1 }->();

# Meth-6: with a pure nonregex-based approach
\$a=incr->();
print \$h{ \$a }->( \$a );
if ( (my \$start = index(\$_, \$neighbor)) > -1 ) {
if ( (my \$end = rindex(\$_, \$neighbor)) > -1 ) {
if ( \$end > \$start ) {
my \$len = length \$neighbor;
print substr(\$_, \$start+\$len, \$end-\$start-\$len);
}
}
}
print \$h{ -1 }->();

# Meth-7: with a mixed regex-based + string approach
\$a=incr->();
print \$h{ \$a }->( \$a );
my (\$s, @A) = (\$_);
push @A,pos() while /\$neighbor/g;
\$#A-- if @A % 2; # remove dangling element s.t. @A has even elements.
print
for
map { substr(\$s, \$A[\$_], \$A[1+\$_]-\$A[\$_]-length(\$neighbor)) }
grep { \$_ % 2 == 0 } 0 .. \$#A-1
;
print \$h{ -1 }->();

# Meth-8: using split
\$a=incr->();
print \$h{ \$a }->( \$a );
my @B = split /\$neighbor/;
while ( @B > 1 ) {
shift @B;
print \$B[0];
shift @B;
}
print "/Fini.";
'
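
Reduced to the single point asked about, the paired delimiters for `s///` look like this. It is a sketch only: the sub name `extract_with_braces` is hypothetical, and the pattern is the one from the earlier `s/.../.../is` reply with the slashes swapped for braces.

```perl
use strict;
use warnings;

# Hypothetical helper: same substitution as s/^.*?\xc2\xa0(.+)\xc2\xa0.*$/$1/is,
# written with {} as the delimiters for both halves of s///.
sub extract_with_braces {
    my ($string) = @_;
    (my $copy = $string) =~ s{^.*?\xc2\xa0(.+)\xc2\xa0.*$}{$1}is;
    return $copy;
}

my $x = "junk\xc2\xa0my substring\xc2\xa0junk";
print extract_with_braces($x), "\n";
```

With bracketing delimiters the pattern and the replacement each get their own pair, and whitespace is allowed between the two pairs, which is why Meth-4 can put `{}` on a separate line.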

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/31/2015 02:31 AM, sharma__r@hotmail.com wrote:

I copied it down to try to figure out later.

Thank you for helping me with this.

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/30/2015 07:24 PM, \$Bill wrote:

What do the "\$1" and the "is" do in the above?

## Re: how do I chop out a piece of a string surrounded by two hex codes?

See 'perldoc perlreref' for what the i and s options do.

\$1 will contain the first capture group after a successful match
against a regular expression that contains capture groups ( () ).

--
Jim Gibson
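
To make the role of the capture variables concrete, here is a small sketch. The sub name `capture_parts` and the three-group pattern are assumptions for illustration; the thread's own patterns use a single group.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (before, inside, after) for a string that
# contains a C2 A0 ... C2 A0 pair, or an empty list when it does not.
sub capture_parts {
    my ($string) = @_;
    # Each pair of parentheses is a capture group, numbered from the
    # left: the first fills $1, the second $2, the third $3.
    if ($string =~ /^(.*?)\xc2\xa0(.+)\xc2\xa0(.*)$/s) {
        return ($1, $2, $3);
    }
    return;   # no match: the capture variables must not be trusted
}

my @parts = capture_parts("junk\xc2\xa0my substring\xc2\xa0junk");
if (@parts) {
    print "before: $parts[0]\n";
    print "inside: $parts[1]\n";
    print "after:  $parts[2]\n";
}
```

Checking that the match succeeded before reading `$1` matters, because after a failed match the capture variables still hold values from the last successful one.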

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/30/2015 10:29 PM, Jim Gibson wrote:

I will and thank you!

Believe it or not, I thought "is" was a function, like "die"

:-)

-T

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On Monday, 31 August 2015 12:54:38 UTC+5:30, T  wrote:
[snip]

How "is", which occurs at the end of the statement, could be
taken for a function is hard to fathom  :-/

Are you telling me that you write something like:
"Error could not open file" die;

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/31/2015 01:23 AM, sharma__r@hotmail.com wrote:

I thought "is" meant "the answer is".

It is after midnight here and I am loopy with exhaustion.
I couldn't tell you which way is up at the moment.

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On Monday, 31 August 2015 14:48:46 UTC+5:30, T  wrote:

On 08/31/2015 01:23 AM, sharma__r@hotmail.com wrote:
> On Monday, 31 August 2015 12:54:38 UTC+5:30, T  wrote:
> [snip]
>
>>
>>>>> (my \$y = \$x) =~ s/^.*?\xc2\xa0(.+)\xc2\xa0.*\$/\$1/is;
>>
>> Believe it or not, I thought "is" was a function, like "die"
>>
>> -T
>
> How "is", which occurs at the end of the statement, could be
> taken for a function is hard to fathom  :-/
>
> Are you telling me that you write something like:
>     "Error could not open file" die;
>

>I thought "is" meant "the answer is".

Still not convinced.

>It is after midnight here and I am loopy with exhaustion.
>I couldn't tell you which way is up at the moment.

What is the point in working so late when you
don't even know whether you're coming or going?

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/31/2015 03:36 AM, sharma__r@hotmail.com wrote:

You were supposed to have a chuckle at my expense.  :-)

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/31/2015 03:36 AM, sharma__r@hotmail.com wrote:

I work my ass off. I have to stick things in where I can.

## Re: how do I chop out a piece of a string surrounded by two hex codes?

sharma__r@hotmail.com wrote:

In OO land it's not farfetched to write
"Error could not open file".print, e. g.

--
MartinS

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/31/2015 11:40 PM, Martin Strömberg wrote:

You were all supposed to get a chuckle off me.

:-)

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On 08/30/2015 10:29 PM, Jim Gibson wrote:

Is this what you mean?

s  match as a Single line - . matches \n
i  case-Insensitive

## Re: how do I chop out a piece of a string surrounded by two hex codes?

On Tuesday, 1 September 2015 03:12:12 UTC+5:30, T  wrote:
[snip]

The /s and /i are regex pattern match modifiers: they alter how the
regex engine goes about its job of matching.

The /i makes the engine match case-insensitively, and /s lets . match
a newline, for example:

perl -le '
# /i
\$_ = "TOdd";
print "found me only when typed exactly." if /odd/;  # will not print: the string has an uppercase "O"
print "found me even inexactly." if /odd/i;          # will print: /i ignores case

# /s
\$_ = "a\nb\nc";
print "ok without /s" if /.b/; # will not print: . does not match a newline
print "ok using /s" if /.b/s;  # will print: with /s, . matches a newline too
'
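
Applied to the pattern from this thread, /s only matters when the wanted text itself contains a newline. A minimal sketch under that assumption; the sub name `inner_text` and the two-line sample string are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: extract the text between the C2 A0 markers,
# with or without the /s modifier, so the two can be compared.
sub inner_text {
    my ($string, $dot_matches_newline) = @_;
    my $re = $dot_matches_newline
        ? qr/\xc2\xa0(.+)\xc2\xa0/s   # . may cross a newline
        : qr/\xc2\xa0(.+)\xc2\xa0/;   # . stops at a newline
    my ($inner) = $string =~ $re;
    return $inner;
}

my $x = "junk\xc2\xa0line one\nline two\xc2\xa0junk";
for my $flag (1, 0) {
    my $got    = inner_text($x, $flag);
    my $label  = $flag ? "with /s" : "without /s";
    my $result = defined $got ? "matched" : "no match";
    print "$label: $result\n";
}
```

Without /s the `.+` cannot get past the newline to reach the closing marker, so the match fails outright rather than returning a shorter piece.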