Re: SCO: "thread creation is about a thousand times faster than on

From: Alexander Viro (viro@math.psu.edu)
Date: Tue Aug 29 2000 - 00:04:40 EDT

  • Next message: Albert D. Cahalan: "Re: SCO: "thread creation is about a thousand times faster than on"

    On Mon, 28 Aug 2000, Albert D. Cahalan wrote:

    > > what, in the name of Cthulhu, makes us unclone() everything, no matter
    > > whether we need it or not?
    >
    > Setuid is a third special case. What happened to good taste? :-)

    Erm?
            * setuid is a fscking wart - THE mistake of dmr and/or ken.
    There were better ways to do that and credit where credit is due, they
    _did_ go for better ways later. Unfortunately, we are stuck with the crap.
    setuid program is _never_ a good thing and all w*nking around credentials
    only spreads the mess thinner, but doesn't clean it up.
            * close-on-exec originated as a covenience hack mostly to make
    life with setuid sh*t somewhat easier. It is both redundant and
    insufficiently strong - e.g. dup() doesn't copy c-o-e bit. Just look at
    the interface - fcntl() and ioctl(), aka. dirty hack instead of clean
    design.
            * pseudo-roots are _WAY_ past the fscking wart level. Terminal
    stage of syphilis would be more like that - almost braindead, truely
    grisly sight, all innards are rotten, the thing barely limps around and
    just plain stinks. Albert, I've dealt with that code quite recently and
    trust me, it's a living horror. It's a bastard kludge from hell.

    > > E.g. would you really want to get an
    > > independent namespace upon exec()? Goodby mount(8)...
    >
    > The namespace bit is a backwards bit, so it can be kept.
    > This is a wart that must appear everywhere CLONE_* does.
    > The pre-existing UNIX cruft is something you have to live with.
    > You didn't make fork() unshare namespaces.

    fork() should _NOT_ unshare them. Otherwise - goodby mount. Same reasons
    why cd is a built-in.

    > BTW, mount(8) is going to need some serious hackery already,
    > just because admins will demand global (all namespace) mounting.

    Umm.. I'm afraid that you do not realize what you are talking about. Just
    think what mount(8) would do if it got a private namespace. Right, modify
    it. Then what? Right, exit(2). And what kind of impact would it have on
    the namespace of shell and friends? Exactly, zero. mount(8) may need
    hackery, but it's not going to be in that area. The whole point of that
    utility is to modify the namespace of parent (and everyone who shares it
    with the parent). I.e. it must share the namespace with parent.

    > Under plan c, an exec should make the world look pretty much as
    > if the last clone was a fork(). This is great for the CGI example.
    >
    > Imagine documenting partial unclone. "... except that a CLONE_FILES
    > relationship is broken when there are close-on-exec file descriptors,
    > a CLONE_FS relationship is broken if loading the new executable
    > causes a personality change ... error may be occur iff a task tries
    > to execute privileged code (setuid, setgid, set-capability-bit, or
    > unreadable executable) while party to a thread construct involving
    > CLONE_FS, CLONE_FILES, CLONE_THREAD, CLONE_PTRACE, CLONE_PARENT, or
    > certain yet-to-be-defined future clone() flags ..."

            Nope. Much simpler. exec() replaces the VM with new one, built
    according to the binary you are trying to load. That's what it is for.
    Consequently, it unshares replaces the signal context (it is bound to VM).
    Use of some warts may force exec() to unshare other resources. Precisely,
            * personalities with pseudo-roots - CLONE_FS
            * close-on-exec - CLONE_FILES
            * suid - CLONE_FILES|CLONE_FS
            [whatever else is needed]
    Notice that regulare way to get something unshared is explicit call of
    unshare() before the exec(). Cases above are unfortunate results of the
    features that are inherently unsuitable for the clone()/rfork() model.

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Tue Aug 29 2000 - 00:06:16 EDT