Re: Avoiding OOM on overcommit...?

From: James Sutherland (jas88@cam.ac.uk)
Date: Fri Mar 24 2000 - 09:47:01 EST

  • Next message: James Sutherland: "Re: Avoiding OOM on overcommit...?"

    On Tue, 21 Mar 2000 13:20:05 -0600 (CST), you wrote:
    >James Sutherland <jas88@cam.ac.uk>:
    >> On Mon, 20 Mar 2000 20:50:08 -0600, you wrote:
    >> >On Mon, 20 Mar 2000, Horst von Brand wrote:
    >> >>Jesse Pollard <pollard@tomcat.admin.navo.hpc.mil> said:
    >> >>
    >> >>[...]
    >> >>
    >> >>> Without overcommit:
    >> >>> You can support 100 simultaneous connections, with full saturation of
    >> >>> each server, with no failures.
    >> >>>
    >> >>> Result: happy customers, happy management, maybe even a raise.
    >> >>
    >> >>Management nagging about supporting more pages, worried by waste of several
    >> >>Gb of disk that has never, ever been touched. System is sluggish, needs
    >> >>constant tweaking of "resource allocation quotas" as applications crash,
    >> >>even there are resources available. Seriously consider firing inept
    >> >>sysadmin.
    >> >
    >> >Disk space is cheap.
    >>
    >> Still not cheap enough to throw away, unless you are Boeing.
    >>
    >> >System is no more sluggish than currently, and is much more reliable.
    >>
    >> It is slower, because of the extra overhead you have imposed on every
    >> memory allocation. And it is no more reliable, except in OOM
    >> situations - which your draconian resource quotas make MUCH more
    >> frequent for the users.
    >>
    >> >First, "constant tweaking" means the system was/is undersized, and the
    >> >accounting records justify the additional business support for expansion.
    >>
    >> Point to a system which has never passed 50% of VM in use, and your
    >> request will be laughed out of purchasing.
    >
    >Cray YMP-8 - reached 100% several years ago. Replace by C90, C90 replaced
    >by dual SV1 system. SV1 system at 80%, expanding to 4 nodes.
    >SGI Power Challenge array - reached 90% usage in 1 year. Replaced by Origin
    >2000. Origin 2000 (128 node) updated when load reached 80% (with several
    >OOM crashes due to managements choice to overcommit)

    What do you mean by "overcommit" in this case - the total online user
    quota exceeding total VM, or the Linux (memory allocated on demand)
    sense?

    And are you seriously telling me that if I take an Irix box, disable
    user VM quotas and do a malloc() loop, the system crashes?! (Mental
    note: Replace all Irix boxes with NT)

    > Added another 128 nodes,
    >and a TB of disk (30-40GB used for swap.. still overcommitted, but not as
    >severly). I have more.
    >
    >>
    >> >Second, If it is determined that overcommitment is necessary, then it can
    >> >still be done - along with the documentation on what happens when the system
    >> >crashes.
    >>
    >> Overcommit does not cause system crashes; nor is it related in any way
    >> to user resource quotas.
    >
    >Of course it is. If too many users are on one system trying to do too much
    >at one time the system fails.

    "The system fails"?? Either you have one broken system, or you have a
    strange definition of "failure". The users should just get errors that
    they don't have the resources available.

    > Resource quotas force the users to schedule
    >their work, and to cooperate with the system to accomplish the goals of
    >business management.
    >
    >> >I've never been fired for guaranteeing system uptime.
    >>
    >> You should be, if you do so by disabling legitimate use of the system.
    >
    >I didn't - legitimate use was guaranteed. Crashing the system wasn't one
    >of the legitimate uses. Nor was causing the failure of other users processes.

    Crashing the system should not be possible under any circumstances,
    regardless of resource quotas. If a user-mode program can crash the
    system using only unprivileged syscalls, you have found a nasty bug.

    >> >Obviously you don't work where uptime is counted in the thousands of dollars
    >> >per hour.
    >>
    >> If users are unable to perform legitimate tasks due to a "lack of
    >> resources", when the machine has ample resources to spare, I would
    >> consider the system down. I would then fire the sysadmin who
    >> configured it that inefficiently.
    >
    >Thats just it - the system was saturated.

    Your quota settings would cause the system to begin denying resource
    allocation BEFORE the system is saturated.

    >> However, if you REALLY insist on setting the system up so people can
    >> login and display all sorts of nice OOM error messages on an idle
    >> system, go ahead. I'm not stopping you.
    >
    >If it were idle there would be no OOM error messages :)

    Yes there would - you run out of YOUR memory, even when the system has
    plenty to spare.

    >> >>> If you overcommit:
    >> >>> You might support 100 simultaneous connections, but not full saturation
    >> >>> of each server, without aborting some connections or crashing the
    >> >>> server/system.
    >> >>>
    >> >>> Result: some unhappy customers, not so happy management, difficulty
    >> >>> in identifying problems, and definitely no raise.
    >> >>
    >> >>Other customers are unhappy because of long download times, NS (or
    >> >>whatever) crashing, not getting to the site because of (local or remote)
    >> >>network congestion. The "catastrophe" isn't even a blip on the sysadmin's
    >> >>screen.
    >> >
    >> >Yup - long download times since the system keeps crashing, forcing the
    >> >customer to a different site and or different company.
    >>
    >> No, the system never gets as far as crashing, because the inept
    >> sysadmin has already disabled it.
    >
    >not worth answering.

    Your approach, with a typical workload, would waste a large percentage
    of the system's resources. Users rarely max out their quotas - but you
    set the system to ensure that a user with a 128Mb quota has 128Mb
    exclusively reserved for their use alone anyway?

    >> >The catastrophe is loss of business, loss of reputation, going out of
    >> >business because you can't keep the customers.
    >>
    >> The customers have already gone somewhere they can actually run
    >> software without OOM errors.
    >
    >not your site - it keeps crashing.

    Why? If I have the same level of resources as you, so the same
    workload will succeed without any problems.

    You still don't appear to understand what overcommit means in this
    context.

    James.

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.rutgers.edu
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Fri Mar 24 2000 - 10:33:29 EST