Re: OOPS in nfsd, affects all 2.2 and 2.4 kernels

From: Neil Brown (neilb@cse.unsw.edu.au)
Date: Sat Oct 28 2000 - 01:41:33 EDT

  • Next message: H. Peter Anvin: "Re: IA-32 (was Re: [PATCH] cpu detection fixes for test10-pre4)"

    On Friday October 27, ajlill@ajlc.waterloo.on.ca wrote:
    > This was first reported in 2.2.12, according to Deja. Solaris clients,
    > on rare occaisons, will send some command to a linux server which
    > causes a null resp->fh.fh_dentry to be passed to routines in
    > /usr/src/linux/fs/nfsd/nfsxdr.c. This causes an oops, and then the nfs
    > server subsystem stop functioning. A fix is to check that this is not
    > null before de-referencing it in the following three routines. I
    > looked and this bug is present in the latest 2.2 and 2.4 kernels.
    >
    > Whatever condition causes this is very rare. We had a linux server
    > supporting 100 Solaris and HP-UX boxes running flawlessly for 8
    > months, then one day something triggered this bug, and it wouldn't go
    > away until I implemented this fix. There were no apparent side effects
    > to doing this, although you may want to print some informative message
    > to try and track down the real culprit.
    >
    > THis patch is against 2.2.16, but the code looks unchanged in
    > 2.4.0.

    Thanks for sending me this.

    This problem that you are addressing is caused when solaris sends a
    zero length write (I assume to implement the "access" system call, but
    I haven't checked).
    If you look at the code at the top of nfsd_write in nfsd/vfs.c, you
    will see that if cnt (the number of bytes to write) is zero, then it
    jumps straight out without setting an error or initialising fhp (by
    calling nfsd_open).
    As there is no error, nfsd tried to return status information (hence
    the call the nfssvc_encode_attrstat) but doesn't have a valid file
    handle. So it Oopses.

    The correct fix, which is in 2.4 and would have recently gone into the
    2.2.18 pre-patches (but I haven't actually looked at them) is to move
    the test for (!cnt) to after the call to nfsd_open.

    NeilBrown
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Sat Oct 28 2000 - 01:44:00 EDT