Re: (shm WARNING) shm still deadlocks

From: Christoph Rohland (hans-christoph.rohland@sap.com)
Date: Mon Mar 20 2000 - 13:05:42 EST

  • Next message: Alan Cox: "Re: new IRQ scalability changes in 2.3.48"

    "Manfred Spraul" <manfreds@colorfullife.com> writes:

    > From: "Christoph Rohland" <hans-christoph.rohland@sap.com>
    > >
    > > I tried to reproduce this on 2.3.99-pre3-2. I did run 60 recursive
    > > wget over 100MBIT against my apache and no lockup at all. This is on
    > > SMP and UP, under memory pressure and without.
    > >
    > > Do you use any special apache modules?
    >
    > I don't have enough time for a full review of the recent changes, but at
    > least the comment in IPC_RMID is dangerous:
    >
    > up(&shm_ids.sem);
    > /* The kernel lock prevents new attaches being happening. [...]
    > */
    > shm_remove_name(id)

    I think the comment is wrong and the lock_kernel is superfluous. See
    below.

    > Sounds very dangerous. The scheduler automatically releases the big kernel
    > lock, and if VFS sleeps on a semaphore, or if shm_unlink sleeps on
    > down(&shm_ids.sem), then we get a crash.
    > Unless VFS protects us, the following could happen:
    >
    > CPU0:
    > sys_shmctl(IPC_RMID);
    > up(&shm_ids.sem);
    > do_unlink();
    > CPU1:
    > sys_shmget();
    > down(&shm_ids.sem);
    > CPU0:
    > shm_unlink() : calls down(&shm_ids.sem); <<<< sleeps. releases the
    > kernel lock.
    > CPU1:
    > sys_shmget() returns with success.
    > sys_shmat(). attaches.
    >
    > and now the scheduler decides that the shm_unlink operation should continue.
    > crash.

    This should not lead to a crash:
    1) sys_shmget does not give you access to the resource but only the
       identifier.
    2) shmat is using vfs to get access to the object.
    3) sys_shmctl(IPC_RMID) calls vfs to do the unlink
    So we should end up having an unlinked shm segment with attachee. This
    is somewhat unexpected but it should not be fatal.

    But you are right that the big kernel lock does not prevent from that
    since it is released by down(&shm_ids.sem). So we can also just remove
    the lock and put it directly around do_unlink

    Greetings
                    Christoph

    -- 
    

    - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Mon Mar 20 2000 - 15:19:30 EST