Quick and dirty article filter

As I've been spending more time on Usenet, I've become rather
annoyed at the "features" of Google Groups. Yes, it sucks, it makes
things a mess, and so on. But! Instead of just whining about the
problem, I've put together a small, simple Perl program to filter Google
Groups (and other) articles into something resembling sanity. I wrote
this for slrn, and will include the macro I wrote for slrn as well.

A quick precaution about this code: yes, it uses Email::Simple. For the
most part, email messages and news articles are compatible--but that's
not the part that summons this precaution. The code uses Email::Simple
to stick its fingers in its ears and pretend that encodings other than
US ASCII do not exist. For Usenet, this is still mostly okay (at least
for the parts of it I read, i.e. the Big 8 and a few groups in alt.*).
It very likely will break on multibyte messages, but I tend not to read
or encounter those. You have been warned.
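
One cheap way to act on that warning is to check for non-ASCII bytes and pass such articles through untouched. This is only a minimal sketch and not part of the filter as posted; the helper name is_plain_ascii is hypothetical, and the check looks at raw bytes only, with no knowledge of MIME or declared charsets.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: true if the text contains only 7-bit (US-ASCII) bytes.
sub is_plain_ascii {
    my ($raw) = @_;
    return $raw !~ /[^\x00-\x7F]/;
}

# Two sample bodies: plain ASCII, and one with a UTF-8 encoded e-acute.
for my $sample ("plain text\n", "caf\xC3\xA9\n") {
    print is_plain_ascii($sample)
        ? "ascii: safe to reformat\n"
        : "non-ASCII: pass through untouched\n";
}
```

A filter could run this check on the raw article first and simply print the input unchanged when it fails.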

This requires a new-ish version of libslang (2.2.4 works) due to a
recently-fixed bug in process.sl that I stumbled over while developing
the slang macro. It was fixed in git at the time I found it, but not in
the released version of libslang.

Patches are, of course, welcome. If someone wants to do proper encoding
handling, that'd be pretty awesome. If you really care about licenses,
I'll release it under the same terms as Perl 5.16.0 or any later version.
Provided as-is, no warranty, yadda yadda.


=== cut ===
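#!/usr/bin/env perl
use strict;
use warnings;

use IO::All;
use Email::Simple;
use Text::Autoformat;

# Slurp the whole article from standard input.
my $raw = io('-')->all;
my $article = Email::Simple->new($raw);
my $body = $article->body;
# Collapse spaced-out quote markers ("> > >" becomes ">>>"), keeping depth.
$body =~ s/^(>(?: >)+)/$1 =~ tr{ }{}dr/mge;
# Put a space between the quote markers and the quoted text.
$body =~ s/^(>+)(\w)/$1 $2/mg;
# Re-wrap every paragraph; by default autoformat only touches the first.
$body = autoformat $body, { all => 1 };
# Store the cleaned body back, otherwise the original is printed unchanged.
$article->body_set($body);
print $article->as_string;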
=== cut ===
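
To see what the quote-marker cleanup does without installing any CPAN modules, here is a small standalone sketch. The helper name normalize_quotes and the sample text are made up for illustration, and the collapsing step is a depth-preserving take on the idea (it needs Perl 5.14 or later for tr///r), so treat it as one possible approach rather than the definitive one.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: tidy the quote markers at the start of each line.
sub normalize_quotes {
    my ($text) = @_;
    # Collapse spaced-out markers ("> > >") into a solid run (">>>").
    $text =~ s/^(>(?: >)+)/$1 =~ tr{ }{}dr/mge;
    # Put one space between the markers and the quoted word.
    $text =~ s/^(>+)(\w)/$1 $2/mg;
    return $text;
}

my $sample = "> > >deep quote\n>shallow quote\nplain text\n";
print normalize_quotes($sample);
# Prints:
# >>> deep quote
# > shallow quote
# plain text
```

Unquoted lines are left alone, and re-wrapping is a separate step handled by Text::Autoformat in the real filter.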


=== cut ===
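require("process");
define fix_article_stupidity ()
{
  variable a = article_as_string();
  % The child reads fd 0 and writes fd 1. The value after "write=" was
  % missing; 1 is assumed because the output is read from p.fp1 below.
  variable p = new_process(["/path/to/fix_article.pl"]; write=1, read=0);
  () = fputs(a, p.fp0);
  () = fclose(p.fp0);        % send EOF so the filter can finish
  variable r = fgetslines(p.fp1);
  variable n = strjoin(r, "");
  () = p.wait();
  replace_article(n);        % assumed final step; it was cut off in the post
}
!if(register_hook("read_article_hook", "fix_article_stupidity"))
  message("Warning: Could not register fix_article_stupidity" +
    " for read_article_hook");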
=== cut ===

Thanks and best regards,
Chris Nehren
