M.L. (me@privacy.invalid) wrote on VCCLXX September MCMXCIII in
{}  Hi. Using the Xref header field of newsgroup messages, I need a regexp
{}  that will allow me to drop all newsgroups in that list containing the
{}  word "general". I need one that works, and to find why the one I created
{}  isn't working.
{}  Example Xref:
{}  Xref: general.soc:449517 chi.general:641065
{}  soc.general.sci:329682 francom.chatting.generale:152591
{}  My regexp attempt passes general.soc (good) but not soc.general (bad) or
{}  chi.general (good):
{}  .*[^(chi\.)(microsoft\.public\.windowsxp\.)]general.*
{}  In short, I want the following newsgroup examples to pass the Xref
{}  filter so they can be dropped:
{}  general.soc
{}  soc.general
{}  soc.general.sci
{}  francom.chatting.generale
{}  I want the following two to fail so they will be retained:
{}  chi.general
{}  microsoft.public.windowsxp.general
{}  Any assistance would be greatly appreciated. Thanks in advance.



Of course, it would be much simpler if you had the newsgroups in a list;
then I'd do:

  @groups = grep {$_ eq 'chi.general'                        ||
                  $_ eq 'microsoft.public.windowsxp.general' ||
                 !/general/} @groups;

BEGIN {print "Just "   }
INIT  {print "Perl "   }
CHECK {print "another "}
END   {print "Hacker\n"}

As Abigail does, forget about the exceptions in the regex. Pass those
through first, then look at the rest. Note that I added \b to the regex
so you don't filter out alt.generalissimo.franco. :)


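my $string = " general.soc:449517 chi.general:641065
soc.general.sci:329682 francom.chatting.generale:152591";

# Groups to keep even though their names contain "general".
my %Exceptions = map { $_ => 1 } qw( chi.general
microsoft.public.windowsxp.general );

# Keep an entry if its group is an exception or lacks the word "general(e)".
print join( " ",
grep {
   my( $group, $number ) = split /:/;
   exists $Exceptions{$group} or $group !~ /\bgenerale?\b/
} split ' ', $string   # split ' ' skips the leading whitespace
), "\n";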

Sorry to be replying so late. Thanks for the advice, I'll try to follow
what I can. I need to mention that the regexp is not going to be used
within a Perl script. It's just an entry to a Windows program that uses
Perl regexps for filtering. So grep and map are out. I'll check to see
if Abigail's will work though. Thanks again.
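
Since only a bare regexp can be entered, here is a rough sketch of a single-pattern approach. It is not taken from any reply above, and it assumes the Windows program supports Perl-style negative lookahead, which I have not verified. The original attempt fails because [^(chi\.)(microsoft\.public\.windowsxp\.)] is a character class, not a negated group: it matches any one character that is not among ( ) . c h i m r o s f t p u b l w n d x. The "." before "general" in soc.general is in that set, so the match fails, and chi.general is rejected only by the same accident. The names $attempt, $pattern, $drop and @dropped below are made up for illustration.

```perl
use strict;
use warnings;

# The original attempt: the bracket expression is one character class.
my $attempt = qr/.*[^(chi\.)(microsoft\.public\.windowsxp\.)]general.*/;

# Hypothetical replacement, written on one line so it can be pasted into a
# filter field. An entry must start at the beginning or after whitespace,
# must not be one of the two exceptions, and must contain "general".
my $pattern = '(?:^|\s)(?!(?:chi\.general|microsoft\.public\.windowsxp\.general):)([^\s:]*general[^\s:]*):';
my $drop    = qr/$pattern/;

my $xref = "Xref: general.soc:449517 chi.general:641065 "
         . "soc.general.sci:329682 francom.chatting.generale:152591";

# Collect the captured group name of every entry the pattern would drop.
my @dropped = $xref =~ /$drop/g;
print "dropped: @dropped\n";

# The character class rejects soc.general because "." is in the class.
print "attempt misses soc.general\n" if " soc.general:1" !~ $attempt;
```

This pattern has no \b, so it would also catch alt.generalissimo.franco; if that matters, \bgenerale?\b could replace the bare "general", as suggested above.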

