removing paragraphs from text files

i have a specific paragraph in a bunch of configuration files that i
want to remove.  the lines are as follows

define service{
        use                             linux-service
        host_name                       ninjasrv
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        action_url /nagios/pnp/index.php?host=$HOSTNAME$&srv=

the 'use' and 'host_name' directives are different in each file.  the
unique string is 'PING'.

i was just wondering if it is possible to do such a thing in Perl?


Re: removing paragraphs from text files

    perl -p0777 -i -e 's/define service\{[^}]*PING[^}]*\}\s+//g' *.cf

you might want to give -i a backup extension (for example "-i.bak")
instead of a bare "-i"...

Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher0cmdat/"

Re: removing paragraphs from text files

that was so amazing, all done in a single shot.  could you please also
help on what exactly is -p0777 and how did this substitution work
's/define service\{[^}]*PING[^}]*\}\s+//g'.  i have never seen/read such
a regex.

thanks again.

Re: removing paragraphs from text files

i just found
    the separator between records is 777 in octal; this is not a real
ASCII char so the whole file is slurped in as a single record;
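
For readers who want to see slurp mode outside a one-liner, here is a minimal sketch. It assumes that -0777 behaves like leaving $/ (the input record separator) undefined, uses an in-memory file instead of a real *.cf file, and spells the pattern the way it is described in this thread (define service, a {, non-} characters, PING, non-} characters, a }, trailing whitespace). The sample text and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample config, not taken from a real Nagios installation.
my $text = <<'END';
define service{
        use                     linux-service
        host_name               ninjasrv
        service_description     PING
}

define host{
        host_name               ninjasrv
}
END

# Default line mode: $/ is "\n", so each read returns a single line.
open my $by_line, '<', \$text or die "cannot open in-memory file: $!";
my @lines = <$by_line>;
close $by_line;

# Slurp mode: with $/ undefined, one read returns the whole file.
open my $whole, '<', \$text or die "cannot open in-memory file: $!";
my $slurped = do { local $/; <$whole> };
close $whole;

# The substitution can now see the block as one multi-line string.
(my $cleaned = $slurped) =~ s/define service\{[^}]*PING[^}]*\}\s+//g;

printf "line mode:  %d records\n", scalar @lines;
printf "slurp mode: 1 record of %d characters\n", length $slurped;
print $cleaned;
```

In line mode the pattern could never match, because no single line contains both "define service{" and the closing "}".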

now my confusion is the regex match.
it goes like, search for
define service followed by a { then any characters but not } then PING
then any characters but not } then a } then at least one whitespace
character and replace with nothing.  i am just wondering what exactly
is this [^}]* doing.  i tried it with .* like

s/define service\{.*PING.*\}\s+//g
but it would not replace.

my understanding is that it should work because [^}]* (any character
but not }) is same as .* in this case since I know there is no }
before PING string.

what am i missing?

Re: removing paragraphs from text files

/./ is not "any character" but "any character except newline" unless you
use the /s modifier. So your substitution would only work if the whole
section was on a single line.

s/define service\{.*PING.*\}\s+//sg

OTOH would match anything from the first "define service{" to the last
"}" in the file (provided there's a PING somewhere between them) so it
would probably remove a lot more than you want. The /[^}]*/ in Tad's
regex is there to keep the match within a single brace-delimited block
(and it's a bit simple-minded: It won't work if you have a } inside a
comment, for example, but you probably don't, so that doesn't matter).
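
The three behaviours discussed above can be compared side by side. This is only a sketch on made-up sample data (three hypothetical service blocks, with the PING one in the middle); the patterns are written out the way they are described in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample: three blocks, only the middle one is the PING check.
my $config = <<'END';
define service{
        use                     linux-service
        host_name               web01
        service_description     HTTP
}

define service{
        use                     linux-service
        host_name               web01
        service_description     PING
}

define service{
        use                     linux-service
        host_name               web01
        service_description     SSH
}
END

# 1) '.' without /s never crosses a newline, so nothing matches.
(my $dot_plain = $config) =~ s/define service\{.*PING.*\}\s+//g;

# 2) '.' with /s is greedy: first "define service{" to the last "}".
(my $dot_s = $config) =~ s/define service\{.*PING.*\}\s+//sg;

# 3) [^}]* matches newlines too, but can never cross a closing brace.
(my $class = $config) =~ s/define service\{[^}]*PING[^}]*\}\s+//g;

my $blocks_left = () = $class =~ /define service\{/g;

print "plain dot changed the text: ", ($dot_plain eq $config ? "no" : "yes"), "\n";
print "dot with /s left ", length($dot_s), " characters\n";
print "character class left $blocks_left blocks\n";
```

Only the third form removes the PING block and nothing else; the second one wipes out the HTTP and SSH blocks as well.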


Re: removing paragraphs from text files

*skipping alfonsobaldaserra since he skipped Tad anyway*

Then stricter


and stricter

    qr/\}(?:\h*\n)+/ # needs 5.10

and stricter


Which leads us to

    perldoc -q nesting

and applying regexes to HTML.
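
As an illustration of why the \h variant above is stricter than a plain \s+ tail, here is a small sketch. It assumes Perl 5.10 or later (where \h, horizontal whitespace, is available) and uses a made-up two-block string; combining the tail with the define service pattern from this thread is my assumption, not something the post spells out.

```perl
use strict;
use warnings;
use 5.010;   # \h (horizontal whitespace) needs Perl 5.10 or later

# Hypothetical input: trailing blanks after the first "}" and an
# indented block right after the empty line.
my $text = "define service{\n\tservice_description PING\n}  \n\n"
         . "    define host{\n\thost_name ninjasrv\n}\n";

# \s+ swallows everything up to the next non-blank character,
# including the indentation that belongs to the following block.
(my $with_s = $text) =~ s/define service\{[^}]*PING[^}]*\}\s+//g;

# (?:\h*\n)+ only swallows whole line ends: optional trailing
# blanks followed by a newline, repeated for empty lines.
(my $with_h = $text) =~ s/define service\{[^}]*PING[^}]*\}(?:\h*\n)+//g;

print "with \\s+:        [", (split /\n/, $with_s)[0], "]\n";
print "with (?:\\h*\\n)+: [", (split /\n/, $with_h)[0], "]\n";
```

The second form leaves the indentation of the next block untouched.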

Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom

Re: removing paragraphs from text files

5.10 is great, a lot of new stuff in the engine.
New nesting, etc.

When you can write a regex without the need for the
  # needs 5.10
maybe it might be useful.

Btw, I don't think anybody skipped Tad, who never skips


Re: removing paragraphs from text files

On Mon, 13 Jul 2009 01:52:14 -0700 (PDT), alfonsobaldaserra

If you have never read such a regex, you don't know regex. This is very simple.
You should visit this group/site more often.

Assuming a slurped-in file and your test, s/define service\{.*PING.*\}\s+//g:
as Holzer said, without the /s modifier '.' matches every character except
the '\n' newline, so .* can never get past the end of the first line and the
pattern won't match anything.
Try   's/define service\{.*PING.*\}\s+//sg'.

Also, using greedy quantifiers with '.' is a tricky prospect. They have their
place, though. Most beginners just throw '.*' in the middle of their regex, when in
fact it should only be put in when the regex can already be described without it,
if at all.

The reason is that there is no guarantee of the shape of text when it is written
to a file, none! For this reason, regexes should be molded with at least a certain
amount of built-in error checking (qualification). And while not 100%, 90-95% will do as
a minimal QA check.

Thus, Tad used the '[^}]*' character class to describe all characters but one.
Specifically NOT '}', which would signify the end of a block. Which leads to the
question:

How do you know the syntax that the actual parser uses to extract information
from that file? Even if the form the writer uses is simple, even custom, there may
be anomalies introduced by the file system, or the writer may change its form.
Surely you would want a little robustness of QA built into the regex.

Tad gave you what you wanted from your simple problem statement. Indeed it was
stated in such simple terms that it would not be acceptable in a production environment.

A lot of times (most of them) here on this group/site, that is the case.
It just amazes me sometimes that people come back with, 'but it doesn't work if
I have this condition', when that condition was never stated.

Tad's regex could have been written (untested) like this:


and still work, and that may give some of the variability that normal parsers allow.
But you didn't state information on where it came from or how it is parsed.
Whether 'use' or 'service_description' or any other var type is there,
in what order, whether required, etc...
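
To make the point about built-in qualification concrete, here is one possible stricter pattern. It is a hypothetical sketch, not the regex from this post: it assumes that the block to delete is the one whose service_description line is exactly PING, and that no "}" appears inside a block. The sample data, including the host name PINGBOX, is invented.

```perl
use strict;
use warnings;

# Hypothetical sample: the first block merely mentions PING in its
# host name, only the second block is the PING service.
my $config = <<'END';
define service{
        use                     linux-service
        host_name               PINGBOX
        service_description     HTTP
}

define service{
        use                     generic-service
        host_name               ninjasrv
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
}
END

# Loose pattern from the thread: PING anywhere inside the braces.
(my $loose = $config) =~ s/define service\{[^}]*PING[^}]*\}\s+//g;

# Stricter sketch: PING must be the value of service_description.
my $ping_block = qr/
    ^ [ \t]* define \s+ service \s* \{      # block opener, flexible spacing
    [^}]*?                                  # directives before the one we want
    ^ [ \t]* service_description [ \t]+ PING [ \t]* $
    [^}]*                                   # remaining directives
    \}                                      # closing brace
    [ \t]* \n?                              # rest of the closing line
/mx;
(my $strict = $config) =~ s/$ping_block//g;

print "loose pattern left ", length($loose), " characters\n";
print "strict pattern left:\n$strict";
```

On this sample the loose pattern deletes both blocks, because PINGBOX also contains the string PING, while the stricter one removes only the block whose description is PING.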

No, you stated PING, the only constant, is in this form:
  'define service'

Not a lot to go on, but don't expect this to be a real parser unless you
state the RULES.

Good luck.


Re: removing paragraphs from text files

that was an excellent explanation.  thank you very much guys, i have
understood it now.
