Process Aggregates: module based support for jobs

From: Sam Watters (watters@sgi.com)
Date: Fri Jun 16 2000 - 14:43:51 EDT


    Los Alamos National Laboratory (LANL) and SGI are collaborating to
    provide an accounting solution called Comprehensive System
    Accounting (CSA). CSA is for demanding Linux users who require the
    ability to track system resource utilization and charge the cost of
    those resources back to the users who consume them. To accomplish
    this, CSA performs job-level accounting, as opposed to the more
    familiar process-level accounting (for more information on CSA, see
    http://oss.sgi.com/projects/csa). CSA requires that a job container
    for processes be made available on Linux.

    A job is defined as a group of related processes, all descended from a
    point-of-entry process and identified by a unique job ID. A job can
    contain multiple process groups, sessions, and processes. The job acts
    as a process containment mechanism and a process is not allowed to
    escape from the job container.
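
    As a rough user-space illustration of the containment rule described
    above (not the actual job module), the sketch below models a job that
    is inherited at fork and survives a change of session or process
    group. All names here (struct job, struct proc, job_create, job_fork,
    proc_setsid, job_exit) are hypothetical and exist only for this
    example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical job container: a unique ID plus a member count. */
struct job {
    unsigned long id;
    int nprocs;
};

/* Hypothetical, heavily simplified process descriptor. */
struct proc {
    int pid;
    int pgid;           /* process group ID */
    int sid;            /* session ID */
    struct job *job;    /* owning job, or NULL if not in a job */
};

static unsigned long next_job_id = 1;

/* Start a new job with 'entry' as its point-of-entry process. */
void job_create(struct job *j, struct proc *entry)
{
    assert(j != NULL && entry != NULL);
    j->id = next_job_id++;
    j->nprocs = 1;
    entry->job = j;
}

/* Model fork: the child inherits the parent's job along with its
 * process group and session. */
void job_fork(const struct proc *parent, struct proc *child, int child_pid)
{
    child->pid = child_pid;
    child->pgid = parent->pgid;
    child->sid = parent->sid;
    child->job = parent->job;
    if (child->job != NULL)
        child->job->nprocs++;
}

/* Model setsid: the process gets a new session and process group,
 * but its job pointer is deliberately left untouched, so it cannot
 * escape the job this way. */
void proc_setsid(struct proc *p)
{
    p->sid = p->pid;
    p->pgid = p->pid;
}

/* Model exit: detach from the job. Returns 1 if the job is now empty. */
int job_exit(struct proc *p)
{
    struct job *j = p->job;

    if (j == NULL)
        return 0;
    p->job = NULL;
    return --j->nprocs == 0;
}
```

    In this model a single job can span several sessions and process
    groups, and it ends only when its last member exits.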

    To provide a job container on Linux, we are proposing a generalized
    mechanism for providing process containers. We call this mechanism
    Process Aggregates, or PAGGs. PAGG will allow job containers to be
    provided as a Linux kernel module, greatly lessening the impact of
    providing jobs on the base Linux kernel. Other developers can also
    use PAGG to provide additional process container types. Beyond the
    job module, we expect to provide, in the near future, a PAGG module
    that further assists with managing parallel applications such as
    MPI programs.

    PAGG consists of a set of kernel changes that provide functions for
    modules to register and unregister as providers of process aggregate
    containers. The registration functions operate in much the same manner
    as those currently provided for filesystems, block and character
    devices, symbol tables, and execution domains.
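
    To make the registration idea concrete, here is a minimal user-space
    sketch modeled loosely on the register/unregister pattern used for
    filesystems. It is not the proposed PAGG interface: struct pagg_hook,
    pagg_register, pagg_unregister, and the callback signatures are all
    assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor that a container module would fill in. */
struct pagg_hook {
    const char *name;                               /* e.g. "job" */
    int (*attach)(int parent_pid, int child_pid);   /* run at fork */
    void (*detach)(int pid);                        /* run at exit */
    struct pagg_hook *next;                         /* list linkage */
};

/* Singly linked list of registered providers. */
static struct pagg_hook *pagg_hooks;

/* Register a provider. Returns 0 on success, -EINVAL for a bad
 * descriptor, or -EBUSY if the name is already registered. */
int pagg_register(struct pagg_hook *hook)
{
    struct pagg_hook **p;

    if (hook == NULL || hook->name == NULL)
        return -EINVAL;
    for (p = &pagg_hooks; *p != NULL; p = &(*p)->next)
        if (strcmp((*p)->name, hook->name) == 0)
            return -EBUSY;
    hook->next = NULL;
    *p = hook;          /* append at the tail */
    return 0;
}

/* Unregister a provider. Returns 0 on success, or -EINVAL if the
 * descriptor was never registered. */
int pagg_unregister(struct pagg_hook *hook)
{
    struct pagg_hook **p;

    assert(hook != NULL);
    for (p = &pagg_hooks; *p != NULL; p = &(*p)->next) {
        if (*p == hook) {
            *p = hook->next;
            hook->next = NULL;
            return 0;
        }
    }
    return -EINVAL;
}
```

    A real kernel implementation would also need locking around the list
    and module reference counting, both omitted here.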

    The changes to the kernel consist of about 90 lines of code. Of those
    90 lines, about 10 modify existing kernel functions, and the balance
    consists of new procedures. The code changes are organized so that
    compiling them into the kernel is optional. When PAGG support is
    compiled into the kernel but no PAGG modules are in use, the added
    burden on the kernel is the execution of one additional if statement
    at process fork and another at process exit.
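
    The cost described above can be pictured with a small user-space
    sketch. It assumes the check is a simple test of whether any provider
    is active; the real condition tested by the patch may differ. The
    names struct task, struct pagg_ops, pagg_active, pagg_fork_hook,
    pagg_exit_hook, and the toy counting module are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified task descriptor. */
struct task {
    int pid;
    void *pagg;     /* opaque pointer to the task's container */
};

/* Callbacks a container module would supply (assumed shape). */
struct pagg_ops {
    void (*on_fork)(const struct task *parent, struct task *child);
    void (*on_exit)(struct task *t);
};

/* NULL while no PAGG module is in use. */
struct pagg_ops *pagg_active = NULL;

/* Called from the fork path: a single if when nothing is loaded. */
void pagg_fork_hook(const struct task *parent, struct task *child)
{
    if (pagg_active != NULL)
        pagg_active->on_fork(parent, child);
}

/* Called from the exit path: again a single if when nothing is loaded. */
void pagg_exit_hook(struct task *t)
{
    if (pagg_active != NULL)
        pagg_active->on_exit(t);
}

/* Toy module: children inherit the parent's container, and the module
 * counts how many tasks it has attached via fork. */
int pagg_members = 0;

static void count_fork(const struct task *parent, struct task *child)
{
    assert(parent != NULL && child != NULL);
    child->pagg = parent->pagg;
    if (child->pagg != NULL)
        pagg_members++;
}

static void count_exit(struct task *t)
{
    if (t->pagg != NULL) {
        t->pagg = NULL;
        pagg_members--;
    }
}

struct pagg_ops counting_ops = { count_fork, count_exit };
```

    With pagg_active left NULL, both hooks return after one comparison,
    which is the "additional if statement" overhead in this model.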

    In addition to the registration functions, the PAGG changes provide
    hooks for updating process aggregate containers when processes fork and
    exit. A new paggctl system call is also proposed to provide the
    following types of services:

      1) create a new pagg container
      2) signal all processes that are attached to the pagg container
      3) wait for the completion of all processes in the pagg container
      4) support future resource limit capabilities based upon the pagg
         container

    Each pagg module would handle its own paggctl requests.
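
    One way to picture per-module handling of paggctl requests is a
    dispatcher that looks up the target module and forwards the request
    to that module's own handler. The sketch below is a user-space model
    only: the paggctl signature, the request codes, struct pagg_module,
    pagg_add_module, and the toy job handler are assumptions, not the
    proposed system call interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request codes matching the first three services. */
enum pagg_request {
    PAGG_CREATE = 1,    /* create a new container */
    PAGG_SIGNAL = 2,    /* signal all attached processes */
    PAGG_WAIT   = 3     /* wait for all processes to complete */
};

/* Hypothetical module descriptor with its own request handler. */
struct pagg_module {
    const char *name;
    int (*ctl)(int request, void *arg);
};

#define PAGG_MAX_MODULES 8
static struct pagg_module *pagg_modules[PAGG_MAX_MODULES];

/* Add a module to the dispatch table. Returns 0, or -ENOMEM if full. */
int pagg_add_module(struct pagg_module *m)
{
    assert(m != NULL && m->name != NULL && m->ctl != NULL);
    for (int i = 0; i < PAGG_MAX_MODULES; i++) {
        if (pagg_modules[i] == NULL) {
            pagg_modules[i] = m;
            return 0;
        }
    }
    return -ENOMEM;
}

/* Forward a request to the named module. Returns -ENOENT if no module
 * with that name is present, otherwise whatever the handler returns. */
int paggctl(const char *module, int request, void *arg)
{
    for (int i = 0; i < PAGG_MAX_MODULES; i++) {
        struct pagg_module *m = pagg_modules[i];

        if (m != NULL && strcmp(m->name, module) == 0)
            return m->ctl(request, arg);
    }
    return -ENOENT;
}

/* Toy job module: only container creation is modeled. */
static int job_next_id = 1;

int job_ctl(int request, void *arg)
{
    switch (request) {
    case PAGG_CREATE:
        if (arg == NULL)
            return -EINVAL;
        *(int *)arg = job_next_id++;    /* hand back the new job ID */
        return 0;
    case PAGG_SIGNAL:
    case PAGG_WAIT:
        return -ENOSYS;                 /* not modeled in this sketch */
    default:
        return -EINVAL;
    }
}

struct pagg_module job_module = { "job", job_ctl };
```

    Keeping the request semantics inside each module means the core only
    routes requests, so a new container type can define its own services
    without further changes to the dispatcher.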

    Please see http://oss.sgi.com/projects/pagg for information on the
    proposed kernel changes and further description of what PAGG is and why
    we are proposing this work.

    The initial prototype kernel code is done and an initial implementation
    of a job container module has been written to test the kernel code. A
    description of these kernel changes is provided at the PAGG project
    home page (http://oss.sgi.com/projects/pagg) or you may access it directly
    at http://oss.sgi.com/projects/pagg/pagg-lkd.txt. We would appreciate any
    comments and guidance concerning the PAGG work.

    The code for PAGG will be posted to the PAGG home page around June
    23rd, as I need to make sure I have it updated for the latest 2.3
    release. If you think I should post it as a patch to this list
    (linux-kernel) please let me know.

    Thanks!
      - Sam

    -- 
    ----------------------------------------
    Sam Watters
    SGI
    watters@sgi.com
    (651) 683-5647
    ----------------------------------------
    



