Re: Rename File Using String Found in File?

Posted by He Who Greets With Fire on March 4, 2008, 10:21 am



OK, thanks, but the script does not seem to rename the files.
I added some troubleshooting code, most of which I commented out. I
also moved a copy of the personalinjury folder and all its files
inside the C:\Perl directory so it can access it directly.


See below for my additional comments.


#!/bin/perl


#sleep 2;
print "here I am! \n";
#sleep 2;
my $counter =1;

foreach my $file ( glob 'personalinjury/*.htm' ) {

# print "here I am A \n";
# sleep 1;

open my $PI, '<', $file or die "could not open '$file' $!";

# print "here I am! B \n";
# sleep 1;

print $counter;
print "\n";
while ( <$PI> ) {
# print "\n inside whileloop";

I AM getting to this point here.

next unless /Citation: [\d-]+.*([\d.]+)/;

but I never get to this point here--apparently the regex never sees a
match for the "Citation:" etc string.

Here is a screen shot of the typical file, with a red arrow pointing
to the string in this particular file that I want to match.
I do not know why the regex does not see a match, because it looks
like it matches it???

See here:
http://img225.imageshack.us/img225/91/citationue2.jpg

my $newfile = $1;
rename $file, "$newfile.htm" or die "could not mv '$file' $!";
print "\n renamed a file";
sleep 1;
last;
}#end while

$counter++;
print "\n count is ";
print $counter;
print "\n";
#sleep 1;

close $PI;
} #end foreach



I think the script would work OK except that it never sees a match for
the regex pattern inside the file. I can see the script going through
every line of all 821 files, but it never finds a match.


Posted by He Who Greets With Fire on March 4, 2008, 10:37 am
On Tue, 04 Mar 2008 15:21:19 GMT, He Who Greets With Fire wrote:

> next unless /Citation: [\d-]+.*([\d.]+)/;

I think it has something to do with the colon or the whitespace
between the colon and the first of the digits. Is the colon a
special character in Perl? There is one space in the regex, but there
appear to be two spaces in the screenshot of the file I linked
to above.


Posted by Josef Moellers on March 4, 2008, 10:59 am
Please log in for more thread options
He Who Greets With Fire wrote:
> On Tue, 04 Mar 2008 15:21:19 GMT, He Who Greets With Fire
>
>> next unless /Citation: [\d-]+.*([\d.]+)/;
>
> I think it has something to do with the colon or the whitespace
> between the colon and the first of the digits. Is the colon a
> special character in Perl? There is one space in the regex, but there
> appear to be two spaces in the screenshot of the file I linked
> to above.

I usually match any whitespace with "\s+" rather than a literal blank.
That catches TABs *and* blanks, so maybe
next unless /Citation:\s+[\d-]+.*([\d.]+)/;
will do?
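
The suggestion can be checked against a sample line before running it over all the files. The sketch below is only an illustration: the sample string is an assumption (the citation text quoted later in the thread, with two spaces placed after the colon), and the helper name first_capture is hypothetical. It also shows a separate issue with the pattern: because ".*" is greedy, the capture group is left with only the last character of the number, so a non-greedy ".*?" anchored at the end of the line is tried as one possible alternative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line: two spaces after the colon, as the screenshot
# seemed to show. The real file contents may differ.
my $sample = 'Citation:  21-340 Dorsaneo, Texas Litigation Guide § 340.02';

# Return the first capture group, or undef when the pattern does not match.
sub first_capture {
    my ($pattern, $text) = @_;
    return $text =~ $pattern ? $1 : undef;
}

# Original pattern: exactly one literal space after the colon.
my $one_space  = first_capture(qr/Citation: [\d-]+.*([\d.]+)/, $sample);

# Suggested pattern: \s+ accepts any run of blanks or tabs.
my $any_space  = first_capture(qr/Citation:\s+[\d-]+.*([\d.]+)/, $sample);

# Alternative (an assumption, not from the thread): non-greedy .*? and an
# end-of-line anchor, so the capture keeps the whole trailing number.
my $non_greedy = first_capture(qr/Citation:\s+[\d-]+.*?([\d.]+)\s*$/, $sample);

for my $result ($one_space, $any_space, $non_greedy) {
    print defined $result ? "captured '$result'\n" : "no match\n";
}
```

If the real files contain HTML markup between "Citation:" and the number, none of these patterns will match as written.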
--
These are my personal views and not those of Fujitsu Siemens Computers!
Josef Möllers (Pinguinpfleger bei FSC)
        If failure had no penalty success would not be a prize (T. Pratchett)
Company Details: http://www.fujitsu-siemens.com/imprint.html

Posted by Ben Morrow on March 4, 2008, 11:15 am

>
> OK, thanks, but the script does not seem to rename the files.
> I added some troubleshooting code, most of which I commented out. I
> also moved a copy of the personalinjury folder and all its files
> inside the C:\Perl directory so it can access it directly.

Don't do that. You can set the working directory from within your Perl
script using the chdir function. In any case, the working directory may
not be what you expect under Win32.

> See below for my additional comments.
>
> #!/bin/perl

Perl is *never* installed as /bin/perl.

> #sleep 2;
> print "here I am! \n";

Diagnostics like this are better given with warn, which will (a) print
them to STDERR, where they ought to be, and (b) tell you where you are in
the script.
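
As a small illustration of the difference: warn appends the script name and line number when the message does not end in a newline. The __WARN__ handler in this sketch is an addition for demonstration only; it records each message so it can be inspected and then passes it on to STDERR as usual.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Record every warning, then forward it to STDERR unchanged.
# (Demonstration aid only; a plain script would not need this handler.)
my @seen;
$SIG{__WARN__} = sub { push @seen, $_[0]; print STDERR $_[0] };

# No trailing newline: Perl appends " at SCRIPT line N." to the message.
warn "here I am";

# With a trailing newline the message is emitted as-is, without location.
warn "here I am, no location\n";
```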

> #sleep 2;
> my $counter =1;
>
> foreach my $file ( glob 'personalinjury/*.htm' ) {
>
> # print "here I am A \n";
> # sleep 1;
>
> open my $PI, '<', $file or die "could not open '$file' $!";
>
> # print "here I am! B \n";
> # sleep 1;
>
> print $counter;
> print "\n";
> while ( <$PI> ) {
> # print "\n inside whileloop";
>
> I AM getting to this point here.
>
> next unless /Citation: [\d-]+.*([\d.]+)/;
>
> but I never get to this point here--apparently the regex never sees a
> match for the "Citation:" etc string.
>
> Here is a screen shot of the typical file, with a red arrow pointing
> to the string in this particular file that I want to match.
> I do not know why the regex does not see a match, because it looks
> like it matches it???
>
> See here:
> http://img225.imageshack.us/img225/91/citationue2.jpg

*DON'T* do that. Had you done the right thing, and copy-pasted a small
section of the relevant file into your message, you would have found
that the file doesn't in fact contain the string 'Citation: whatever' at
all. It's an HTML file, so there is markup in there as well, and the
string may well be spread across several lines. Get into the habit of
looking at files in a text editor before you try parsing them with Perl.
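
Short of opening the file in an editor, a few lines of Perl can print the raw lines that mention the word, markup included. This is only a sketch: the HTML written to the temporary file is made up, since the thread never shows the real file as text, and the sub name lines_mentioning is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Made-up HTML standing in for one of the real files, which the thread
# never shows as text.
my ($out, $path) = tempfile( SUFFIX => '.htm', UNLINK => 1 );
print {$out} "<html><body>\n<b>Citation:</b>&nbsp;<i>21-340\n</i> Dorsaneo</body></html>\n";
close $out or die "could not close '$path': $!";

# Return "line number: raw text" for every line that mentions the word.
sub lines_mentioning {
    my ($file, $word) = @_;
    open my $in, '<', $file or die "could not open '$file': $!";
    my @hits;
    while ( my $line = <$in> ) {
        chomp $line;
        # index() does a plain substring search, no regex involved.
        push @hits, "$.: $line" if index($line, $word) >= 0;
    }
    close $in;
    return @hits;
}

print "$_\n" for lines_mentioning($path, 'Citation');
```

In this made-up sample the word is followed by a closing tag and an entity rather than a space and digits, which is the kind of thing a screenshot of the rendered page hides.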

> my $newfile = $1;
> rename $file, "$newfile.htm" or die "could not mv '$file' $!";
> print "\n renamed a file";
> sleep 1;
> last;
> }#end while

If you had used proper indentation, you would be able to see that
comments like this are completely useless.

Ben


Posted by He Who Greets With Fire on March 4, 2008, 12:19 pm
On Tue, 04 Mar 2008 11:13:44 -0600, He Who Greets With Fire wrote:

>Here is another snippet that looks much more promising. The TITLE of
>the html page. This is not the instance of "citation....etc" that I
>was looking for, but now that I see it, it looks like a good candidate
>for use as a filename:
>
><title>Get a Document - by Citation - 21-340 Dorsaneo, Texas
>Litigation Guide § 340.02</title>
>
>Are the angle brackets special characters in perl so that they have to
>be backslashed inside the regex?
>
>I wonder if this regex would work?
>next unless /\<title\>Get a Document - by Citation -
>[\d-]+.*([\d.]+)\<\/title\>/;



Well, I modified it by adding backslashes in front of the dashes, like
so:
next unless /\<title\>Get a Document \- by Citation \-
[\d-]+.*([\d.]+)\<\/title\>/;

But it still does not work. Again, it does seem to cycle through all
the files, but nothing matches.
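
For what it is worth, angle brackets are not special in a Perl regex, and a dash is only special inside a character class, so the backslashes are harmless but do not change anything. One possible cause, assuming the title really is split across two lines in the file as in the snippet quoted above, is that a loop reading one line at a time never sees the whole title in a single string. The sketch below reads the whole text at once and allows the match to cross the line break; the sub name section_from_title is hypothetical and the in-memory handle stands in for the real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample taken from the title quoted above, including its line break.
my $html = "<html><head><title>Get a Document - by Citation - 21-340 Dorsaneo, Texas\n"
         . "Litigation Guide § 340.02</title></head></html>\n";

# Slurp everything at once (here from an in-memory handle standing in for
# the real file) so a title split across lines is seen as one string.
open my $in, '<', \$html or die "could not open in-memory handle: $!";
my $content = do { local $/; <$in> };
close $in;

# Return the trailing section number from the <title>, or nothing.
sub section_from_title {
    my ($text) = @_;
    # /s lets . match newlines; .*? is non-greedy so the capture keeps
    # the whole number instead of only its last character.
    if ( $text =~ m{<title>Get a Document - by Citation -\s+[\d-]+.*?([\d.]+)\s*</title>}s ) {
        return $1;
    }
    return;
}

my $section = section_from_title($content);
print defined $section ? "new name: $section.htm\n" : "no match\n";
```

Whether the section number alone is unique enough to serve as a file name across all the files is a separate question that the thread does not settle.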



