Linux job accounting (CSA)

From: Marlys Kohnke (kohnke@sgi.com)
Date: Fri Jun 16 2000 - 14:46:14 EDT

         Los Alamos National Laboratory (LANL) and SGI are working
    together to provide a job accounting package on Linux. This
    accounting solution, called Comprehensive System Accounting (CSA),
    provides the ability to track system resource utilization per
    job and charge back the cost of those resources to users.
    Please see http://oss.sgi.com/projects/csa for information on
    the proposed kernel changes and a CSA overview.

         CSA is a set of kernel changes, C programs and shell scripts
    that provide methods for collecting per-task resource usage data,
    monitoring disk usage, and charging fees to specific login
    accounts. CSA takes this per-task accounting information and
    combines it outside of the kernel by job identifier (jid) within
    system boot uptime periods. Another project, Process
    Aggregates (PAGG), is providing the kernel job infrastructure
    needed by CSA (http://oss.sgi.com/projects/pagg).
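
         As a rough illustration of combining per-task data by jid
    outside the kernel, the sketch below folds task records into
    per-job totals. The struct layouts, field names, and the
    aggregate_by_jid() helper are hypothetical and are not the CSA
    record format; the sketch also ignores boot uptime periods.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task record; field names are illustrative only. */
struct task_rec {
    uint64_t jid;       /* job identifier the task belonged to */
    uint64_t cpu_usec;  /* CPU time used by the task */
    uint64_t chars_io;  /* characters read plus written */
};

/* Hypothetical per-job total built from task records. */
struct job_total {
    uint64_t jid;
    uint64_t cpu_usec;
    uint64_t chars_io;
    size_t   ntasks;    /* number of task records folded in */
};

/*
 * Fold nrecs task records into per-job totals, in order of first
 * appearance. Returns the number of distinct jobs found, or
 * (size_t)-1 if more than max_jobs distinct jids are seen.
 */
size_t aggregate_by_jid(const struct task_rec *recs, size_t nrecs,
                        struct job_total *jobs, size_t max_jobs)
{
    size_t njobs = 0;

    for (size_t i = 0; i < nrecs; i++) {
        size_t j = 0;

        /* Linear search for an existing total with this jid. */
        while (j < njobs && jobs[j].jid != recs[i].jid)
            j++;

        if (j == njobs) {
            if (njobs == max_jobs)
                return (size_t)-1;
            jobs[njobs].jid = recs[i].jid;
            jobs[njobs].cpu_usec = 0;
            jobs[njobs].chars_io = 0;
            jobs[njobs].ntasks = 0;
            njobs++;
        }

        jobs[j].cpu_usec += recs[i].cpu_usec;
        jobs[j].chars_io += recs[i].chars_io;
        jobs[j].ntasks++;
    }
    return njobs;
}
```

    A real implementation would presumably also key on the boot
    period, so that a jid reused after a reboot is not merged with
    an earlier job.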

         Job accounting is important to production sites. As these sites
    install large Linux systems, they need the enterprise-style accounting
    provided by CSA. Since numerous other Linux sites may not be
    interested in job accounting, we're proposing that most of the
    kernel code for CSA be contained in a loadable kernel module.
    The new resource usage counters can also be used by performance
    tools like sar and Performance Co-Pilot (PCP). These counters have
    value outside of CSA and should be available regardless of
    whether CSA is in use.

         Additional task resource usage counters are being proposed for
    the number of characters read/written, blocks read/written, block
    I/O wait time, number of read/write syscalls, physical and virtual
    memory high-water marks, and physical and virtual memory integrals.
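
         To make a few of these counters concrete, here is a minimal
    sketch of how character counts, syscall counts, a memory
    high-water mark, and a memory integral could be maintained. The
    struct, the helper names, and the units (pages and clock ticks)
    are assumptions for illustration, not the proposed kernel code.

```c
#include <stdint.h>

/* Hypothetical per-task counters; names and units are assumptions. */
struct usage_counters {
    uint64_t chars_read;
    uint64_t chars_written;
    uint64_t read_calls;
    uint64_t write_calls;
    uint64_t mem_hiwater;   /* largest memory size seen, in pages */
    uint64_t mem_integral;  /* running sum of pages * ticks */
};

/* Charge one completed read of nbytes to the task. */
static inline void account_read(struct usage_counters *c, uint64_t nbytes)
{
    c->chars_read += nbytes;
    c->read_calls++;
}

/* Charge one completed write of nbytes to the task. */
static inline void account_write(struct usage_counters *c, uint64_t nbytes)
{
    c->chars_written += nbytes;
    c->write_calls++;
}

/* Record that the task held `pages` pages for `ticks` clock ticks. */
static inline void account_mem(struct usage_counters *c,
                               uint64_t pages, uint64_t ticks)
{
    if (pages > c->mem_hiwater)
        c->mem_hiwater = pages;
    c->mem_integral += pages * ticks;
}
```

    Dividing the integral by the elapsed ticks gives the task's
    average memory use, which is what makes an integral useful for
    chargeback alongside the high-water mark.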

         These new counters plus a couple of inline procedures add about
    60 lines of kernel code. That number doesn't include the new CSA
    structures added to the existing Linux acct.h file or the new loadable
    kernel module. The CSA source code will be available as soon as the
    LANL and SGI lawyers reach final agreement on how and when
    we will open source the code.

         A new acctctl syscall is needed to allow the kernel to
    provide the following services related to CSA:

    1) enable, disable, and report the status of processing for daemon
            and record accounting types
    2) provide system accounting file name to kernel; allow switching
            to a new file (monitoring of file size is done outside
            of the kernel)
    3) set memory and cpu threshold values (end-of-process accounting
            records written only if usage exceeds these values)
    4) start and stop user job accounting (ja command is used to write
            accounting records for the current job to a user
            specified file in addition to the system accounting file)
    5) provide daemon accounting records from system daemons like tape and
            workload management to the kernel to write to the system
            accounting file
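
         The threshold test in item 3 can be sketched as below. The
    struct names, the units, and the rule that exceeding either
    threshold is enough to write a record are all assumptions; the
    list above does not say whether one or both values must be
    exceeded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold settings; field names and units are assumed. */
struct acct_thresholds {
    uint64_t cpu_usec;  /* CPU time threshold, in microseconds */
    uint64_t mem_kb;    /* memory high-water threshold, in KB */
};

/* Hypothetical usage summary for a task at exit. */
struct task_usage {
    uint64_t cpu_usec;
    uint64_t mem_hiwater_kb;
};

/*
 * Decide whether an end-of-process record should be written.
 * ASSUMPTION: exceeding either threshold is sufficient.
 */
bool should_write_record(const struct task_usage *u,
                         const struct acct_thresholds *t)
{
    return u->cpu_usec > t->cpu_usec || u->mem_hiwater_kb > t->mem_kb;
}
```

    With thresholds of 1000 microseconds and 512 KB, a task that
    used 10 microseconds and 100 KB would be skipped under this
    rule, while a task exceeding either value would be recorded.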

         CSA will also use the resource usage counters that are already
    available in the kernel and used by the GNU process accounting
    package. There will be no intermingling of accounting records between
    these two packages. Each will write records into its own accounting
    file. Each package will have its own set of user and administrator
    commands to process its own accounting records and generate reports.
    CSA will be a superset of GNU process accounting, but a site
    could choose to run both concurrently during a transition period.

         The initial prototype kernel code is done and accounting records
    are being written. The commands haven't been ported yet. There's
    still plenty of work to do, so please let me know if you're
    interested in helping provide Linux job accounting.

         I'd appreciate any comments on the kernel counters and
    guidance on whether the project should use a loadable kernel module
    or follow the GNU accounting model in kernel/acct.c and use configuration
    #ifdefs to manage compilation and execution of the kernel code.

    ----
    Marlys Kohnke			Silicon Graphics Inc.
    kohnke@sgi.com			655F Lone Oak Drive
    (651)683-5324			Eagan, MN 55121
    
