Possible bug in 5.8.6: accept/fork/wait/exit ?

Any multi-process/networking gurus here?

I'm trying to write a simple forking HTTP proxy but am having a problem
when running it with Perl 5.8.6.  The program spontaneously exits after
one or a few connections.  The problem does not happen when I use Perl
5.6.1.  I'm running this on Linux with the kernel
(SuSE 9.3).

Here's a short program that demonstrates it, between the bars below.  To
see the problem:  First, run this program from one shell prompt.  Then,
from another shell prompt, run "telnet localhost 8001".  The connection is
made then closed, as expected.  You can see the output in the first shell
window.  Now, repeat the telnet command over and over.  After 1-20
telnets, the script exits with no message.

#!/usr/bin/perl -w

use strict ;
use Socket ;

use vars qw($LISTEN_PORT  $paddr  $pid  $conn_id) ;

$LISTEN_PORT= shift || 8001 ;

# clean up zombies
$SIG{CHLD}= sub { wait } ;

print "Starting $0, listening on port $LISTEN_PORT.\n" ;

# Set up the listening socket
socket(S_LISTEN, PF_INET, SOCK_STREAM, getprotobyname('tcp')) or die ;
setsockopt(S_LISTEN, SOL_SOCKET, SO_REUSEADDR, pack("l", 1)) or die ;
bind(S_LISTEN, sockaddr_in($LISTEN_PORT, INADDR_ANY)) or die $! ;
listen(S_LISTEN,SOMAXCONN) or die ;

# Accept one connection at a time, and fork a new process to handle it.
for ( ; $paddr= accept(S_CLIENT, S_LISTEN) ; close S_CLIENT) {
     $conn_id++ ;
     select((select(S_CLIENT), $|=1)[0]) ;      # unbuffer the socket
     if ($pid= fork) {    # parent process
         print "starting parent, conn=$conn_id, pid=$$\n" ;
         next ;
     } else {             # child process
         print "starting child,  conn=$conn_id, pid=$$\n" ;
         exit ;
     }
}

The problem goes away if I comment out either the exit statement, or the
$SIG{CHLD} statement.  Of course, neither one is acceptable, because all
the extra processes would keep hanging around (either as processes or
zombies).  If the for() loop is replaced with a simple "while (1)" loop,
the problem goes away, so the accept() seems to be part of the problem.

Does anyone have any ideas of why this is happening, or suggested
workarounds?  Do you see something I'm missing?

Thanks a lot for any help!

   James Marshall      james@jmarshall.com       Berkeley, CA      @}-'-,--
                         "Teach people what you know."

Re: Possible bug in 5.8.6: accept/fork/wait/exit ?

james@jmarshall.com wrote:

I also don't see the problem on 5.8.0 or 5.8.3.

That is not surprising, since you don't ask for a message.

for ( ; $paddr= accept(S_CLIENT, S_LISTEN) or die $!; close S_CLIENT) {


Re: Possible bug in 5.8.6: accept/fork/wait/exit ?

(My news server is down, so I'm posting from Google Groups.)

Thanks Xho for the suggestion.  It helped lead me to a place in the
Camel book where this very problem is discussed.

The apparent problem was that the accept() system call was getting
interrupted by the CHLD signal, which caused accept() to return undef,
which caused the loop (and thus the program) to exit.  According to the
Camel book, this happens on systems without restartable system calls,
which explains why it was a problem on some systems and not others.
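
To make the failure mode concrete, here is a minimal sketch of a helper
that treats an interrupted accept() as retryable instead of as the end of
the loop.  The name accept_retrying is hypothetical (it is not in the
original program or the Camel book), and the sketch assumes the CHLD
handler leaves $! untouched, for example by doing "local $!" inside the
handler.

```perl
use strict;
use warnings;
use Socket;

# Hypothetical helper (not from the original program): accept one
# connection on the listening handle, retrying if a signal such as
# CHLD interrupts the call.  Returns (client handle, packed peer address).
sub accept_retrying {
    my ($listen) = @_;
    while (1) {
        my $paddr = accept(my $client, $listen);
        return ($client, $paddr) if $paddr;
        next if $!{EINTR};          # interrupted by a signal: try again
        die "accept failed: $!";    # any other error is fatal
    }
}
```

With something like this, the main loop could call
accept_retrying(\*S_LISTEN) each time around and would only stop on a
real error, which also gives the missing error message.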

The workaround is also discussed in the Camel book.  It entails
rewriting the $SIG{CHLD} handler as

    sub {$waitedpid= wait}

... and rewriting the for() loop as

   for ($waitedpid= 0 ;
         ($paddr= accept(S_CLIENT, S_LISTEN)) or $waitedpid ;
         $waitedpid= 0, close(S_CLIENT) )
   {
        next if $waitedpid ;
        # ... rest of the loop body as before
   }
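
One caveat with a single wait in the handler: if several children exit
close together, their CHLD signals can be merged into one delivery, and
one wait per signal would leave zombies behind.  A common pattern
(perlipc describes it) is to loop over non-blocking waitpid() calls.
Below is a minimal sketch of how that might be combined with the
$waitedpid flag; the name reap_children is hypothetical, and this is an
assumption about how the pieces fit together, not the Camel book's code.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

use vars qw($waitedpid);
$waitedpid = 0;

# Hypothetical replacement for "sub { $waitedpid = wait }": reap every
# child that has already exited, without blocking if none are left.
sub reap_children {
    local ($!, $?);     # keep the interrupted call's error code intact
    my $reaped = 0;
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $waitedpid = $pid;      # remember that a child was reaped
        $reaped++;
    }
    return $reaped;             # number of children reaped in this call
}

$SIG{CHLD} = \&reap_children;
```

The for() loop stays the same, since it only checks whether $waitedpid
is non-zero after accept() returns.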
