Re: thread group comments

From: Theodore Y. Ts'o (tytso@MIT.EDU)
Date: Fri Sep 01 2000 - 20:48:33 EDT

  • Next message: Ulrich Drepper: "Re: thread group comments"

       From: Ulrich Drepper <drepper@redhat.com>
       Date: 01 Sep 2000 14:52:28 -0700

       1st Problem: One signal handler process process-wide

       What is handled correctly now is sending signals to the group. Also
       that every thread has its mask. But there must be exactly one signal
       handler installed. I.e., a sigaction() call to set a new handler has
       consequences process-wide. Since this muse be atomic I think the
       information should be kept in the thread group leader's data
       structures and the other threads should simply use this information
       all the time. Yeah, I know, one more indirection.

    Unfortunately, I think you're right. This will mean an extra
    indirection and locking, but signals aren't performance critical.

       2nd Problem: Fatal signal handling

       kernel/signal.c contains:

        * Send a thread-group-wide signal.
        *
        * Rule: SIGSTOP and SIGKILL get delivered to _everybody_.

       That's OK. Except that is a signal whose default action is to
       terminate the process is not caught be the application, this signal is
       also handled process-wide. E.g., if there is no SIGSEGV handler the
       whole process is terminated.

    True, but this can be handled by having the master thread process catch
    SIGSEGV and redistributing the signal to all of its child-threads.

    (The assumption I'm making here is that the master thread doesn't do
    anything except spawn all threads for the process and monitors its child
    processes for death. This is the n+1 model.)

       This will have to go hand in hand with an extension of the core file
       format to include information about all threads but for the time being
       it's enough if only the offending thread is dumped and the rest simply
       killed.

    True, although my bias has always been that if you wanted debuggable
    programs, you wouldn't have been using threads in the first place. :-)
    (That was a joke, for the humor-impaired!)

       3rd Problem: one uid/gid process-wide

       All the ID (uid/guid/euid/egid/...) must be process wide. The problem
       is similar to the signal handler. I think one should again keep the
       information exclusively in the master thread and have all others refer
       to this information.

    This will probably have to be emulated in user-mode via IPC's. UID
    changes don't happen often, and they're **NOT** performance critical.
    The aternative is to put locking around all of the uid/gid checks in the
    mainline kernel code, and that's been considered unreasonable.

       4th Problem: thread termination

       In general, thread termination is not of much interest for the rest of
       the system. It is in the moment but if the fatal signal handling is
       done this will change.

       If a thread gets a fatal signal, the whole process is killed. No
       cleanup necessary. Signal handlers can be installed if necessary.

       If a thread terminates naturally and can perform the cleanup itself.

       In any case, the death signal should be ignored. Except for the last
       thread, of course, which has to notify the process starting the MT
       application.

    I assume here that when you say "death signal" you mean SIGCHLD? If so,
    if all threads are children of the master thread, then the master thread
    will get all of the SIGCHLD's, and it can deal with them appropriately.
    (In fact, the master thread can get the notification that one of the
    threads died due to an unhandled SEGV, and it can use that information
    to kill off all over the other threads.)

       5th Problem: suspended starting

       Related to the last problem a good old friend pops up. Depending on
       the solution of the last problem it might be necessary to add
       suspended starting of threads. The problem is that sometimes the
       starter has to modify parameters (e.g., scheduler) of the newly
       started thread before it can actually start working. If this fails,
       the new thread must be terminated immediately. But who will get the
       termination signal? The data structures for the new thread must be
       removed as well and this after the new thread is guaranteed to be
       vanished.

    Umm.... why can't this be done in user space? We can do it in kernel
    space, but it means adding a new flag to clone() which tells it to
    suspend the child immediately and let the parent return. Why can't it
    be done in userpsace by having the library which calls clone() as part
    of starting a new thread suspend itself if it's the child? I must be
    missing something.....

                                                    - Ted
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Fri Sep 01 2000 - 20:49:57 EDT