Actual environment size comparison of CML1 and CML2

From: Eric S. Raymond (esr@thyrsus.com)
Date: Sat May 27 2000 - 02:41:58 EDT

    david parsons <orc@pell.portland.or.us>:
    > The whoops-time-to-fork-the-linux-kernel showstopper for me is that
    > the reference implementation of this new configuration language is
    > written in Python, and, given the fluidity of linux kernel
    > development and the impossibility of getting patches to Linus unless
    > you're a member of the Core Team, this would probably mean that the
    > Python implementation would be the only implementation that would
    > ever work.

    I hate to ruin a nice juicy flamewar by introducing such dull things as
    facts into it, but...

    On Red Hat, a Python 1.5.2 RPM installation looks as though it
    requires about 5M. This is probably more than a bare-bones
    install on other distributions; it includes the Tix widget set
    and a bunch of other goodies. I shall bend over backwards to
    the Python-dislikers and ignore that detail. 5M sure sounds like a
    lot, doesn't it?

    Sure does. Until you start thinking about the actual numbers attached
    to various possible alternatives. Here are some byte sizes I
    collected from my Red Hat 6.2 system and the 2.3.99pre9 kernel tree:

     4,971,072 Python 1.5.2
    16,290,796 Perl-5.00503
     2,001,475 Tcl/Tk

       251,538 CML1 config files
       156,183 CML2 rulebase

     1,976,362 CML1 tools (with generated tk files needed to run)
       177,143 CML2 tools (with generated pyc files needed to run)

       165,254 bison-1.28
       309,583 flex-2.5.4a

    One thing we see right away is that moving from CML1 to CML2 shaves 1,894,574
    bytes out of the kernel tree itself. That's not the measure that seems
    to exercise people, however.
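
    The 1,894,574-byte figure can be reproduced from the table above. This is
    a minimal sketch, assuming the in-tree footprint of each system is its
    config/rulebase files plus its tools; the constant names are illustrative
    and not taken from CML2 itself.

```python
# Byte sizes copied from the table in this message.
CML1_CONFIG = 251_538     # CML1 config files
CML1_TOOLS = 1_976_362    # CML1 tools (with generated tk files)
CML2_RULEBASE = 156_183   # CML2 rulebase
CML2_TOOLS = 177_143      # CML2 tools (with generated pyc files)

# Assumption: "in the kernel tree" means rules/config files plus tools.
cml1_in_tree = CML1_CONFIG + CML1_TOOLS
cml2_in_tree = CML2_RULEBASE + CML2_TOOLS

savings = cml1_in_tree - cml2_in_tree
print(f"CML1 in tree: {cml1_in_tree:,} bytes")
print(f"CML2 in tree: {cml2_in_tree:,} bytes")
print(f"Savings:      {savings:,} bytes")
```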

    Another thing we see is that anybody who'd take Perl over Python
    on size-economy grounds is smoking serious drugs and should be taken
    somewhere to calm down. Perl has its uses but if what we're after
    is a minimalist build environment this is not one of them.

    So let's compare the size of minimum environments needed for a kernel
    build under a couple of different, more realistic scenarios. We'll agree
    not to count stuff like sh, make, and gcc that the kernel needs
    anyway.

    CML1:
       sizeof(CML1 tools) + sizeof(CML1 rulebase) + sizeof(Tcl/Tk) = 4,229,375

    CML2-in-Python:
       sizeof(CML2 tools) + sizeof(CML2 rulebase) + sizeof(Tcl/Tk)
       + sizeof(Python) = 5,304,418

    CML2-in-C:
       sizeof(CML2 tools) + sizeof(CML2 rulebase) + sizeof(Tcl/Tk)
       + sizeof(Bison) + sizeof(Flex)

    It's interesting to notice exactly where CML1 is porking up. It
    turns out that the generated tk files make a lot of the difference --
    kconfig.tk alone is 1,567,874 bytes, over 1.5M. CML2 makes all that go
    away.

    Now let's consider the minimum build-environment size for a
    hypothetical pure-C implementation of CML2. Let's start with the
    parts we can total up:

        sizeof(CML2 rulebase) + sizeof(Tcl/Tk) + sizeof(Bison) + sizeof(Flex)

    Why am I including Tcl/Tk? Because there is no other toolkit for the
    GUI mode that is (a) anywhere near as stable, or (b) at all likely to
    *be* in a minimum distribution. GTK ain't stable enough yet, nor
    deployed enough. So the minimum size for CML2-in-C would be 2,632,495
    bytes.
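
    As a check on the totals, here is a short Python sketch that sums the
    table entries for the CML1 environment and for the CML2-in-C baseline,
    using the bison-1.28 and flex-2.5.4a sizes listed above. The constant
    names are illustrative only.

```python
# Byte sizes copied from the table in this message.
TCL_TK = 2_001_475
CML1_CONFIG = 251_538
CML1_TOOLS = 1_976_362
CML2_RULEBASE = 156_183
BISON = 165_254
FLEX = 309_583

# CML1: tools + config files + Tcl/Tk
cml1_env = CML1_TOOLS + CML1_CONFIG + TCL_TK

# CML2-in-C baseline: everything except the (hypothetical) C object code.
cml2_c_base = CML2_RULEBASE + TCL_TK + BISON + FLEX

print(f"CML1 environment:   {cml1_env:,} bytes")
print(f"CML2-in-C baseline: {cml2_c_base:,} bytes")
```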

    Let's look at those three numbers:

    Case 1: CML1 = 4,229,375
    Case 2: CML2-in-Python = 5,304,418
    Case 3: CML2-in-C = 2,632,495 (without the CML2 object code itself)

    That's kind of interesting. Call me a calculatin' fool, but I only
    see 1,075,043 bytes' difference between case 1 and case 2. A hair
    over 1M. So I have to ask you: David, is 1M of disk space really a
    "whoops-time-to-fork-the-linux-kernel showstopper"? Really?

    Another interesting question is whether we can get better space economy
    in case 3. Basically this comes down to the question of whether the
    object code of CML2-in-C can be made to fit in less than 1,596,880 bytes.

    Assuming that CML2-in-C has to do what CML2-in-Python does and doing a
    bit of long division, we find that CML2-in-C can have a smaller
    minimum environment than CML1 only if the size ratio of the
    CML2-in-C code to the CML2-in-Python code is less than 9.

    If we assume that CML2-in-C is allowed to have merely a smaller footprint
    than CML2-in-Python, the ratio changes to 15 to 1.

    That is, to believe that CML2-in-C is a good idea on
    total-size-of-environment grounds, we absolutely need to believe that
    we can express every line of Python in CML2-in-Python in fifteen or
    fewer lines of C. I think it shouldn't take anyone more than ten
    minutes of reading the Python code to dispel *that* illusion.
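
    The long division behind the two ratios can be sketched in a few lines of
    Python. It takes the three case totals exactly as quoted in this message
    and divides the leftover byte budget by the size of the CML2 tools; the
    constant names are illustrative only.

```python
# Totals as quoted in this message.
CML1_ENV = 4_229_375        # Case 1
CML2_PYTHON_ENV = 5_304_418 # Case 2
CML2_C_BASE = 2_632_495     # Case 3, without the CML2 object code
CML2_TOOLS = 177_143        # CML2 tools (with generated pyc files)

# Bytes the C object code may occupy before case 3 exceeds each target.
budget_vs_cml1 = CML1_ENV - CML2_C_BASE
budget_vs_python = CML2_PYTHON_ENV - CML2_C_BASE

# Allowed size growth of the C code relative to the Python tools.
ratio_vs_cml1 = budget_vs_cml1 / CML2_TOOLS
ratio_vs_python = budget_vs_python / CML2_TOOLS

print(f"Budget vs CML1:   {budget_vs_cml1:,} bytes, ratio {ratio_vs_cml1:.2f}")
print(f"Budget vs Python: {budget_vs_python:,} bytes, ratio {ratio_vs_python:.2f}")
```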

    These numbers and ratios are not very sensitive to the largest changes
    in CML2's size that I can imagine at this point. I've calculated and
    checked; even if CML2-in-Python doubles in size (wildly unlikely at
    this point) the resulting space penalty with respect to CML1 would
    still be less than 1.1M.

    Conclusion: Those of you who are obsessing about Python bloating the
    minimum build environment should take a chill pill. Or three. The
    generated Tk files are bad enough space hogs that you end up fussing
    over a *single megabyte*, fer chrissakes! And it doesn't even have to
    live on the target machine...

    Now let's get back to the *real* problems, shall we?

    -- 
    		Eric S. Raymond <http://www.tuxedo.org/~esr>
    

    It would be thought a hard government that should tax its people one tenth part. -- Benjamin Franklin
