# Is my algorithm wrong?

Here is what I am trying to do:

For a given string, break it into two parts. The first part should be as
long as possible but shorter than 25 characters, and the split point is a
whitespace. The second part is whatever is left of the string.

My algorithm is to break the string into many small parts, then join them
with spaces.

@parts= split (/\W/, \$original);

\$thelength=0;\$newpart1='';
foreach (@parts) {
\$thelength=\$thelength+length(\$_)+1;
if (\$thelength > 25)
else
}

I think my algorithm has a problem. Such a simple task should be doable
within 2 lines of regex.

I am thinking of substr(), but I don't know how to find the position of the
last whitespace before character #25. Or I can do substr(\$original, 0, 25),
then check whether character 25 is a whitespace, and if not, go on to #24,
and so on.

Both my algorithms take n*n time, which is really slow.
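One way to find the last space before a given position without a loop is Perl's built-in `rindex`, which searches backwards from an offset. The following is only a sketch: the helper name `split_at_last_space` and its `$max` parameter (the longest allowed first part, so 24 for "less than 25") are made up for illustration, and it treats only a literal space as the split character.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text so that the first part is at most
# $max characters long and ends just before a space. Returns ($head, $tail).
sub split_at_last_space {
    my ($text, $max) = @_;
    return ($text, '') if length($text) <= $max;

    # rindex searches backwards and returns the last space at or before
    # offset $max, or -1 if there is none.
    my $cut = rindex($text, ' ', $max);

    # No space in range: fall back to a hard cut after $max characters.
    return (substr($text, 0, $max), substr($text, $max)) if $cut < 0;

    # Drop the space itself and return the text on either side of it.
    return (substr($text, 0, $cut), substr($text, $cut + 1));
}

my ($head, $tail) =
    split_at_last_space('This is a test string about 46 characters long', 24);
print "[$head] [$tail]\n";
```

Both `rindex` and `substr` make a single pass over at most the first 25 characters, so the cost does not grow with the square of the string length.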

## Re: Is my algorithm wrong?

Looking wrote:

use warnings;
use strict;

my \$original = 'This is a test string about 46 characters long';

my \$end = \$original;
my \$start = substr(\$end,0,25,'');
\$start =~ s/ ([^ ]+)\$//;
\$end = \$1.\$end if defined \$1;

print "\$start --- \$end\n";
print '\$original length - '.length(\$original).' --- \$start length - '.
length(\$start).' --- \$end length - '.length(\$end)."\n";

If there is no space in the first 25 characters, it will just return
the first 25 characters as the start value, otherwise it will return the
longest string smaller than 25 characters of space delimited character
groups.

## Re: Is my algorithm wrong?

Looking wrote:

(edited message; I cancelled a previous one but it may still show up)

use warnings;
use strict;

my \$original = 'This is a test string about 46 characters long';

my \$end = \$original;
my \$start = substr(\$end,0,25,'');
\$start =~ s/\s+(\S+)\$//;
\$end = \$1.\$end if defined \$1;

print "\$start --- \$end\n";
print '\$original length - '.length(\$original).' --- \$start length - '.
length(\$start).' --- \$end length - '.length(\$end)."\n";

If there is no space in the first 25 characters, it will just return
the first 25 characters as the start value, otherwise it will return the
longest string smaller than 25 characters of space delimited character
groups. It allows for multiple spaces between words. Check the length of
the \$start value to determine if there was a space or not. 24 or less
means there WAS a space. 25 means there wasn't.
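The length check described above can be tried out by wrapping the same steps in a small sub. This is a sketch only: the name `split_25` is hypothetical, and it tests the return value of the substitution instead of `defined \$1`, which is a small variation on the posted code rather than part of it.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substr-plus-substitution approach.
sub split_25 {
    my ($end) = @_;

    # 4-argument substr removes the first 25 characters from $end
    # and returns them.
    my $start = substr($end, 0, 25, '');

    # If the chunk contains whitespace, move the trailing partial word
    # back to the front of the second part.
    if ($start =~ s/\s+(\S+)$//) {
        $end = $1 . $end;
    }
    return ($start, $end);
}

for my $text ('This is a test string about 46 characters long', 'x' x 30) {
    my ($start, $end) = split_25($text);
    my $had_space = length($start) < 25 ? 'yes' : 'no';
    print "space found: $had_space, start is [$start]\n";
}
```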

## Re: Is my algorithm wrong?


Here is my new code. Please tell me if there are better ways:

\$s=qq("sadf content= "this is what i' want " asd " sdf " adfa  " sdf');

\$sub= substr (\$s, 0, 25);
\$offset=0;
while (\$sub =~ m/ /g) { \$offset = pos(\$sub) - 1 } # find the offset of the last space
print \$offset."\n";
\$sub= substr (\$s, 0, \$offset); # from start to offset
print "\$sub\n";
\$sub= substr (\$s, \$offset+1); # from the offset to end
print "\$sub\n";

I still think I did something wrong at
while (\$sub=~ m/ /g) ;
There has got to be a better way to run the loop.
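One loop-free possibility, offered as a sketch and not as the definitive answer, is a single greedy match followed by a look at `$+[0]`, the built-in offset of the end of the last successful match. The sample string below is made up for illustration.

```perl
use strict;
use warnings;

my $s   = 'this is what i want to split somewhere';
my $sub = substr($s, 0, 25);

# Greedy .* runs to the end of $sub and backtracks to the last
# whitespace, so the match ends right after that character.
# $+[0] is that end offset; subtracting 1 gives the whitespace index.
my $offset = $sub =~ /.*\s/s ? $+[0] - 1 : -1;

if ($offset >= 0) {
    print substr($s, 0, $offset), "\n";     # text before the last space
    print substr($s, $offset + 1), "\n";    # text after it
}
```

An offset of -1 signals that the first 25 characters contain no whitespace at all, which the caller has to handle separately.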

## Re: Is my algorithm wrong?

\$sub=~ s/\s//g;
does the same thing, faster without loop;

## Re: Is my algorithm wrong?

[Not a very descriptive Subject.]

\W means a non-word character, not whitespace. Maybe that's what you
wanted, but it doesn't match the description above.
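The difference is easy to see on a string that contains punctuation. A small illustration with a made-up sample string:

```perl
use strict;
use warnings;

my $text = "don't split mid-word";

my @by_nonword = split /\W/,  $text;   # breaks at ' and - as well as spaces
my @by_space   = split /\s+/, $text;   # breaks only at runs of whitespace

print join('|', @by_nonword), "\n";    # don|t|split|mid|word
print join('|', @by_space),   "\n";    # don't|split|mid-word
```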

[snip rest of program]

Quick and dirty:

my \$string = 'something at least twenty-five characters long';
my (\$part1, \$part2) = \$string =~ /^(.{0,23}\S)\s(.*)/s;

Ideally you should check to see if the match succeeded.
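A minimal sketch of such a check follows. The sub name `split_checked` is hypothetical, and the pattern assumes an explicit bound of at most 24 characters for the first part. A list assignment from a failed match yields an empty list, which is false in a condition.

```perl
use strict;
use warnings;

# Hypothetical helper: returns ($part1, $part2), or an empty list when
# no whitespace follows a first part of at most 24 characters.
sub split_checked {
    my ($string) = @_;
    my @parts = $string =~ /^(.{0,23}\S)\s(.*)/s;
    return @parts;
}

for my $string ('something at least twenty-five characters long',
                'nospacesanywhereinthisratherlongstring') {
    if (my ($part1, $part2) = split_checked($string)) {
        print "[$part1] [$part2]\n";
    }
    else {
        print "no usable split point in: $string\n";
    }
}
```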

I do hope this isn't homework.

## Re: Is my algorithm wrong?

It is not homework. It is a feature I am adding to a site.

Yours works except in the case that there is no space in the first 25.
My code has problems too.

thundergnat's solution works perfectly. This is his:
#my \$original = 'This is a test string about 46 characters long';
my \$original = 'ThisSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS is a test

my \$end = \$original;
my \$start = substr(\$end,0,25,''); #cut them into 2
\$start =~ s/\s+(\S+)\$//; # remove the trailing non-whitespace run that follows at least 1 whitespace
\$end = \$1.\$end if defined \$1; # add it to the second part

## Re: Is my algorithm wrong?

OK. No offense meant.

I didn't think of that. (Obviously.)

Yeah, I like it. Looks a bit like the code in Text::Wrap, but I'd
guess he/she came up with it independently.

Just to redeem myself a little, here's a slightly altered version of
the single regex solution. (Although I suspect there's a more
succinct way to express it.)

my \$string = 'thisistheoriginalstringblahblahblah';

my (\$part1, \$part2) = grep defined,
\$string =~ /^
(?:  (.{0,24}\S) \s  (.*)  )
|
(?:  (.{0,24}\S) \s? (.*)  )
/sx;

Oh, and I changed it to grab at most 25 characters, not 24; at first
I just saw the text that said "less than 25 characters", that is,
< 25, not <= 25.  <shrug>

## Re: Is my algorithm wrong?

[...]

> Just to redeem myself a little, here's a slightly altered version of
> the single regex solution. (Although I suspect there's a more
> succinct way to express it.)
>
>
> my \$string = 'thisistheoriginalstringblahblahblah';
>
> my (\$part1, \$part2) = grep defined,
>     \$string =~ /^
>                 (?:  (.{0,24}\S) \s  (.*)  )
>                 |
>                 (?:  (.{0,24}\S) \s? (.*)  )
>                /sx;

pos( \$string) = 24;
my (\$part1, \$part2) = \$string =~ /(.*\s|.*)(\S*\G.*)/;

That's more compact, but not necessarily better than other solutions.

Anno

## Re: Is my algorithm wrong?

On Thu, 16 Sep 2004 15:43:20 GMT, in comp.lang.perl.misc you wrote:

I'm not sure if I understand what you mean, but couldn't something
along these lines suit your needs?

#!/usr/bin/perl -l

use strict;
use warnings;

\$_='foo bar baz ' x 5;

my \$pos;
while (/ /g) {
last if pos > 25;
\$pos=pos;
}

print for unpack 'A' . \$pos . 'A*', \$_;

__END__
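For readers unfamiliar with `unpack`, here is a sketch of the same idea with named variables. The names `$text`, `$cut`, `$head` and `$tail` are made up for illustration. In an `unpack` template, `A` followed by a count takes that many characters and strips trailing spaces, and `A*` takes whatever remains.

```perl
use strict;
use warnings;

my $text = 'foo bar baz ' x 5;

# Remember the offset just past the last space that ends within
# the first 25 characters.
my $cut = 0;
while ($text =~ / /g) {
    last if pos($text) > 25;
    $cut = pos($text);
}

# "A$cut" grabs the first $cut characters minus trailing spaces;
# "A*" grabs the rest, also minus trailing spaces.
my ($head, $tail) = unpack "A$cut A*", $text;
print "[$head]\n[$tail]\n";
```

Starting `$cut` at 0 means a string without any early space produces an empty first part instead of an undefined count in the template.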

Michele
--
->(map substr
((\$a||=join'',map--\$|x\$_,(unpack'w',unpack'u','G^<R<Y]*YB='
..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,\$_,
256),7,249);s/[^\w,]/ /g;\$ \=/^J/?\$/:"\r";print,redo}#JAPH,