NFS stops dead in 2.3.99pre3

From: Russell King (rmk@arm.linux.org.uk)
Date: Sat Mar 25 2000 - 09:13:50 EST

  • Next message: Chris Horry: "Re: Max File descriptors in Linux 2.2.14"

    Hi,

    I've been reporting strange NFS oddities in all the 2.3.99 kernels thus
    far to this list, but I think this has to be the most serious one yet.

    While using xaudio to play mp3s on a machine (tasslehoff) from my NFS
    server (flint), the sound glitches (the old problem, reported earlier).
    I tracked this down to the client pausing between sending requests to
    the server for some reason (I never got a response from l-k).

    However, with 2.3.99pre3, it eventually stopped, never to resume, and
    I see the following on the network:

      root@tasslehoff:[/root]:<1003> tcpdump -i eth0 port 2049 -s 2048
      tcpdump: listening on eth0
      14:04:03.337916 caramon.arm.linux.org.uk.2810905088 > flint.arm.linux.org.uk.nfs: 116 lookup fh Unknown/1 "var"
      14:04:03.337916 caramon.arm.linux.org.uk.2827682304 > flint.arm.linux.org.uk.nfs: 172 write fh Unknown/1 47 (47) bytes @ 4203018 (4203018)
      14:04:03.337916 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2810905088: reply ok 128 lookup fh Unknown/1
      14:04:03.337916 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2827682304: reply ok 96 write
      14:04:03.337916 caramon.arm.linux.org.uk.2844459520 > flint.arm.linux.org.uk.nfs: 116 lookup fh Unknown/1 "lock"
      14:04:03.337916 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2844459520: reply ok 128 lookup fh Unknown/1
      14:04:04.147944 tasslehoff.arm.linux.org.uk.3188470476 > flint.arm.linux.org.uk.nfs: 124 read fh Unknown/1 4096 bytes @ 3092480 (DF)
      14:04:04.217946 tasslehoff.arm.linux.org.uk.3456905932 > flint.arm.linux.org.uk.nfs: 124 read fh Unknown/1 4096 bytes @ 3158016 (DF)
      14:04:13.338212 caramon.arm.linux.org.uk.2861236736 > flint.arm.linux.org.uk.nfs: 108 getattr fh Unknown/1
      14:04:13.338212 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2861236736: reply ok 96 getattr REG 100644 ids 0/0 sz 4203065
      14:04:13.338212 caramon.arm.linux.org.uk.2878013952 > flint.arm.linux.org.uk.nfs: 172 write fh Unknown/1 47 (47) bytes @ 4203065 (4203065)
      14:04:13.338212 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2878013952: reply ok 96 write
      14:04:22.948417 raistlin.arm.linux.org.uk.2459697152 > flint.arm.linux.org.uk.nfs: 40 null
      14:04:22.948417 flint.arm.linux.org.uk.nfs > raistlin.arm.linux.org.uk.2459697152: reply ok 24 null
      14:04:23.338424 caramon.arm.linux.org.uk.2894791168 > flint.arm.linux.org.uk.nfs: 108 getattr fh Unknown/1
      14:04:23.338424 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2894791168: reply ok 96 getattr REG 100644 ids 0/0 sz 4203112
      14:04:23.338424 caramon.arm.linux.org.uk.2911568384 > flint.arm.linux.org.uk.nfs: 172 write fh Unknown/1 47 (47) bytes @ 4203112 (4203112)
      14:04:23.338424 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2911568384: reply ok 96 write
      14:04:23.428425 raistlin.arm.linux.org.uk.3910229159 > flint.arm.linux.org.uk.nfs: 116 getattr fh Unknown/1
      14:04:23.428425 flint.arm.linux.org.uk.nfs > raistlin.arm.linux.org.uk.3910229159: reply ok 96 getattr REG 100600 ids 501/12 sz 1062606
      20 packets received by filter
      0 packets dropped by kernel

    Note that the NFS server is still alive and servicing requests from
    "caramon", but not "tasslehoff".

    There are no messages in the logs on "flint" to indicate why it's not
    responding to requests. Tasslehoff contains the following kernel
    messages:

      root@tasslehoff:[/root]:<1008> dmesg | tail -n 17
      VFS: Mounted root (ext2 filesystem) readonly.
      Freeing unused kernel memory: 120k init
      Adding Swap: 64508k swap-space (priority -1)
      nfs: server flint not responding, still trying
      nfs: server flint not responding, still trying
      tcpdump uses obsolete (PF_INET,SOCK_PACKET)
      device eth0 entered promiscuous mode
      device eth0 left promiscuous mode
      device eth0 entered promiscuous mode
      device eth0 left promiscuous mode
      device eth0 entered promiscuous mode
      device eth0 left promiscuous mode
      device eth0 entered promiscuous mode
      device eth0 left promiscuous mode
      device eth0 entered promiscuous mode
      device eth0 left promiscuous mode
      nfs: task 3420 can't get a request slot

    This behaviour started with the 2.3.99 kernel series (it used to work
    100% in 2.3.51, if not 2.3.50 iirc).

    Kernel versions:
    caramon kernel 2.2.14
    flint kernel 2.2.14 knfsd nfs-utils 0.1.5
    tasslehoff kernel 2.3.99-pre3
       _____
      |_____| ------------------------------------------------- ---+---+-
      | | Russell King rmk@arm.linux.org.uk --- ---
      | | | | http://www.arm.linux.org.uk/~rmk/aboutme.html / / |
      | +-+-+ --- -+-
      / | THE developer of ARM Linux |+| /|\
     / | | | --- |
        +-+-+ ------------------------------------------------- /\\\ |

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.rutgers.edu
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Sat Mar 25 2000 - 09:20:28 EST