Re: missing mxcsr initialization

From: Gabriel Paubert (paubert@iram.es)
Date: Fri Oct 20 2000 - 13:32:18 EDT

  • Next message: Juri Haberland: "Re: Quota fixes and a few questions"

    On Fri, 20 Oct 2000, Andrea Arcangeli wrote:

    > Many thanks to Doug and Gabriel for very useful explanations about this FPU
    > stuff. I suggest Gabriel to submit his way faster and more correct tag word
    > conversion function to Linus for 2.4.x.

    Here it a first shot, twd_i387_to_fxsr is guaranteed to not add any
    new bug (since there are only 65536 cases, I wrote a short test program
    to check that the results are the same).

    For the other way around, I only corrected the most glaring error: that
    the sign never affects the tag. Some obscure corner cases may still be
    wrong, but at least now negative zero and positive infinity and NaNs are
    handled correctly. Performance could also be improved by handling
    separately the common case of an empty FP stack.

    Now the question: is it worth bloating the code to exactly return the tag
    values of a 387 (I actually find that the Intel doc is confusing, what
    is the state of the tag word after an fnsave when MMX is in use, 0xaaaa
    or 0x0000, not counting the bugs in this area) ?

    My feeling is that the only important information in the tag field is
    empty/not empty. Does any code (debuggers, etc...) actually rely on the
    tags for anything else ?

    Patch relative to 2.4.0-test9.

            Gabriel

    ===== arch/i386/kernel/i387.c 1.1 vs edited =====
    --- 1.1/arch/i386/kernel/i387.c Tue Jun 27 20:14:20 2000
    +++ edited/arch/i386/kernel/i387.c Fri Oct 20 17:56:49 2000
    @@ -79,16 +79,16 @@
     
     static inline unsigned short twd_i387_to_fxsr( unsigned short twd )
     {
    - unsigned short ret = 0;
    - int i;
    -
    - for ( i = 0 ; i < 8 ; i++ ) {
    - if ( (twd & 0x3) != 0x3 ) {
    - ret |= (1 << i);
    - }
    - twd = twd >> 2;
    - }
    - return ret;
    + unsigned int tmp; /* to avoid 16 bit prefixes in the code */
    +
    + /* Transform each pair of bits into 01 (valid) or 00 (empty) */
    + tmp = ~twd;
    + tmp = (tmp | (tmp>>1)) & 0x5555; /* 0V0V0V0V0V0V0V0V */
    + /* and move the valid bits to the lower byte. */
    + tmp = (tmp | (tmp >> 1)) & 0x3333; /* 00VV00VV00VV00VV */
    + tmp = (tmp | (tmp >> 2)) & 0x0f0f; /* 0000VVVV0000VVVV */
    + tmp = (tmp | (tmp >> 4)) & 0x00ff; /* 00000000VVVVVVVV */
    + return tmp;
     }
     
     static inline unsigned long twd_fxsr_to_i387( struct i387_fxsave_struct *fxsave )
    @@ -105,8 +105,8 @@
                     if ( twd & 0x1 ) {
                             st = (struct _fpxreg *) FPREG_ADDR( fxsave, i );
     
    - switch ( st->exponent ) {
    - case 0xffff:
    + switch ( st->exponent & 0x7fff ) {
    + case 0x7fff:
                                     tag = 2; /* Special */
                                     break;
                             case 0x0000:

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Fri Oct 20 2000 - 13:36:54 EDT