NFS client stuck in nfs_free_dentries (2.2.15)

From: Miquel van Smoorenburg (miquels@cistron.nl)
Date: Fri May 19 2000 - 10:16:16 EDT


    We have a shell server that mounts the user home directories, the
    mail directories and the stuff from our public FTP server using
    NFS from other Linux (unfsd) servers.

    We tried to run the 2.2.13 kernel on it, but it crashed every few
    hours. Last week I decided to retry it with 2.2.15, and it got an
    uptime of 9 days until it crashed again today.

    Magic sysrq revealed that the program counter was always in the same
    area:

    EIP: 0010:[c01466eb]
    EIP: 0010:[c0146702]
    EIP: 0010:[c014673c]
    EIP: 0010:[c0146777]
    EIP: 0010:[c0146771]

    Excerpt from System.map:

    c01466c4 t nfs_free_dentries
    c014678c t nfs_zap_caches

    In other words, it's stuck in nfs_free_dentries(). But how can it
    get stuck in while ((tmp = tmp->next) != head) { }?
    Perhaps when dentry->d_count == 0? Is that a valid possibility?

    I've applied the following patch to see if I can catch it in the act.
    Seeing that it might take a week to trigger the bug, I thought I'd
    post this here first for the real nfs hackers to look at.

    --- fs/nfs/inode.c.orig Thu May 4 02:16:46 2000
    +++ fs/nfs/inode.c Fri May 19 16:12:27 2000
    @@ -411,6 +411,7 @@
     {
             struct list_head *tmp, *head = &inode->i_dentry;
             int unhashed;
    +        int cnt = 0;
     
     restart:
             tmp = head;
    @@ -422,6 +423,13 @@
                     dprintk("nfs_free_dentries: found %s/%s, d_count=%d, hashed=%d\n",
                             dentry->d_parent->d_name.name, dentry->d_name.name,
                             dentry->d_count, !list_empty(&dentry->d_hash));
    +                if (cnt++ > 20000) {
    +                        printk("nfs_free_dentries: got stuck - debug:\n");
    +                        printk(" found %s/%s, d_count=%d, hashed=%d\n",
    +                                dentry->d_parent->d_name.name, dentry->d_name.name,
    +                                dentry->d_count, !list_empty(&dentry->d_hash));
    +                        break;
    +                }
                     if (!dentry->d_count) {
                             dget(dentry);
                             d_drop(dentry);

    Mike.

    -- 
    Denial. It's not just a river in Egypt.
    



