Re: 2.2.16 crashes on ES40 (with spinlock messages...)

From: Martin Frey (frey@scs.ch)
Date: Mon Jul 24 2000 - 12:36:01 EDT


    Hi,

    On our ES40 running 2.2.14, with an NFS-mounted directory, I get the following in syslog:
    /.nfs0000000021364c6b0000000f, error=-13
    NFS: can't silly-delete mqueue/.nfs0000000021358c8700000015, error=-13
    NFS: can't silly-delete tmp/.nfs0000000021364c6f0000001a, error=-13
    NFS: can't silly-delete tmp/.nfs0000000021364c710000001c, error=-13
    NFS: can't silly-delete tmp/.nfs0000000021364c730000001e, error=-13
    NFS: can't silly-delete mqueue/.nfs0000000021358c8900000028, error=-13
    NFS: can't silly-delete tmp/.nfs0000000021364c700000001b, error=-13
    NFS: can't silly-delete tmp/.nfs0000000021364c720000001d, error=-13
    NFS: can't silly-delete tmp/.nfs0000000021364c740000001f, error=-13
    NFS: can't silly-delete mqueue/.nfs0000000021358c8c0000002b, error=-13
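
    For reference, error=-13 in the lines above is a negated errno value
    (13 is EACCES, "Permission denied", on Linux). A minimal sketch for
    decoding such lines, assuming Linux errno numbering; the helper name
    decode_nfs_error is made up for illustration and is not part of any
    NFS tooling:

```python
import errno
import os
import re

# Matches the "error=-13" suffix of the syslog lines quoted above.
ERROR_RE = re.compile(r"error=(-?\d+)")


def decode_nfs_error(line):
    """Return (errno_name, message) for a log line, or None if it has no error code.

    decode_nfs_error is a hypothetical helper, not an existing tool.
    """
    match = ERROR_RE.search(line)
    if match is None:
        return None
    # The kernel logs failures as negative errno values, so drop the sign.
    code = abs(int(match.group(1)))
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    sample = "NFS: can't silly-delete tmp/.nfs0000000021364c6f0000001a, error=-13"
    print(decode_nfs_error(sample))
```

    The same lookup covers the "Invalid argument" reported by shmget
    further down, which corresponds to EINVAL.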

    The test4_out.1 file says:
     
    The e-mail address is frey
    The hostname of this machine is es0.scs.ch
    The architecture of this machine is ALPHA
    The process id is 873
    The executable is szin_w-11-2.ALPHA
    The reader is read-szin-11-15
     
    Time = map_shmem: error in shmget: Invalid argument
    test4_a : Error in szin_w-11-2.ALPHA : program crashed, status=1

    Exactly the same thing (including the kernel messages) happens when
    running on a local SCSI disk with an ext2 filesystem. What's
    wrong?

    Andrew Pochinsky wrote:
    >
    > Hi,
    >
    > Some time ago I posted a message to this list about mysterious crashes
    > on an Alpha ES40. Peter Rival, Pat O'Rourke and Michal Jaegermann made
    > some interesting suggestions for a possible cause. Unfortunately, I
    > was not able to fix the problem. This time, however, our user built
    > code that reliably crashes the system after running for a few seconds.
    > Each crash is accompanied by a 'spinlock ... stuck' message repeated
    > for every processor in the system. Once all the processors are stuck,
    > the system goes catatonic. To check that the problem is not related to
    > some flaky hardware, I rebooted the same kernel with the nosmp flag. The
    > problem is gone (of course, the machine now runs four times slower ;(
    >
    > --andrew
    >
    > P.S. The tarball of the executable can be found at
    > <ftp://ftp.lns.mit.edu/pub/avp/smp-crash.tar.gz>. Simply start runme
    > and wait.
    >
    > -
    > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    > the body of a message to majordomo@vger.rutgers.edu
    > Please read the FAQ at http://www.tux.org/lkml/

    -- 
    Supercomputing Systems AG        email: frey@scs.ch
    Martin Frey                      www:   http://www.scs.ch/~frey
    Technoparkstrasse 1              phone: +41 (0)1 445 16 00
    CH-8005 Zurich                   fax:   +41 (0)1 445 16 10
    



