Re: [PATCH] Updates to FXSR/SSE support in 2.4.0-test1

From: Mark Kettenis (kettenis@wins.uva.nl)
Date: Sat Jun 10 2000 - 20:26:15 EDT

       Date: Sat, 10 Jun 2000 12:51:46 -0600
       From: Gareth Hughes <gareth@precisioninsight.com>

       Here are the patches against 2.4.0-test1-ac12 for the fixes to the PIII
       SSE/FXSR support.

    Sorry, but I don't think this is right:

     #define save_fpu(tsk) do { \
    - asm volatile("fxsave %0 ; fwait" \
    - : "=m" (tsk->thread.i387.hard)); \
    + asm volatile("fnstenv %0 ; fxsave %1 ; fwait" \
    + : "=m" (tsk->thread.i387.hard), \
    + "=m" (tsk->thread.i387.hard.fxsr_space[0])); \

    If I'm not mistaken, this means that you've added extra overhead to
    *every* task switch between FPU-using tasks. Instead, you should
    convert to the old FSAVE format only when that's actually needed:

     1. When setting up a signal context for invoking a signal handler.

     2. For the PTRACE_GETFPREGS request.
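
One part of that on-demand conversion is expanding the tag word, since FXSAVE keeps a single in-use bit per register while FSAVE keeps a two-bit class. The following is a minimal sketch of that step only, not code from the patch under discussion; the function names and the raw-byte interface are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read an n-byte little-endian value (n <= 8). */
static uint64_t load_le(const uint8_t *p, unsigned n)
{
    uint64_t v = 0;
    for (unsigned i = n; i-- > 0;)
        v = (v << 8) | p[i];
    return v;
}

/*
 * Hypothetical helper: expand the abridged FXSAVE tag byte into the
 * 16-bit FSAVE tag word (00 valid, 01 zero, 10 special, 11 empty).
 * st_area points at the eight 16-byte ST(i) slots of the FXSAVE image.
 */
static uint16_t fxsr_tag_to_fsave_tag(uint8_t abridged, uint16_t status_word,
                                      const uint8_t *st_area)
{
    unsigned top = (status_word >> 11) & 7;  /* TOP field of the status word */
    uint16_t full = 0;

    assert(st_area != 0);
    for (unsigned phys = 0; phys < 8; phys++) {
        unsigned tag = 3;                    /* empty unless marked in use */

        if (abridged & (1u << phys)) {
            /* Tags are indexed by physical register, slots by stack position. */
            const uint8_t *st = st_area + 16 * ((phys - top) & 7);
            uint64_t mant = load_le(st, 8);
            unsigned exp = (unsigned)load_le(st + 8, 2) & 0x7fff;

            if (exp == 0x7fff)
                tag = 2;                     /* NaN or infinity */
            else if (exp == 0)
                tag = mant ? 2 : 1;          /* denormal, or zero */
            else
                tag = (mant >> 63) ? 0 : 2;  /* normal, or unnormal */
        }
        full |= (uint16_t)(tag << (2 * phys));
    }
    return full;
}
```

Because this loop inspects every register, it is the kind of work that belongs in the signal-frame and PTRACE_GETFPREGS paths rather than in the task-switch path.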

    I also don't understand why you expose the strange user_xfpregs_struct
    format via the PTRACE_{GET,SET}XFPREGS requests. Why don't you just
    provide the 512-byte FXSAVE area?
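
For reference, the 512-byte FXSAVE area in 32-bit mode can be described roughly as below. This is a sketch under stated assumptions: the struct, its field names, and the copy helper are hypothetical and are not the kernel's definitions; only the offsets follow the documented FXSAVE layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative layout only; field names are hypothetical. */
struct fxsave_area {
    uint16_t fcw;            /* x87 control word */
    uint16_t fsw;            /* x87 status word */
    uint8_t  ftw;            /* abridged tag word, one bit per register */
    uint8_t  reserved0;
    uint16_t fop;            /* last x87 opcode */
    uint32_t fip;            /* instruction pointer offset */
    uint16_t fcs;            /* instruction pointer selector */
    uint16_t reserved1;
    uint32_t fdp;            /* operand pointer offset */
    uint16_t fds;            /* operand pointer selector */
    uint16_t reserved2;
    uint32_t mxcsr;          /* SSE control and status register */
    uint32_t mxcsr_mask;     /* reserved on early implementations */
    uint8_t  st[8][16];      /* ST(0)..ST(7), 10 bytes used per slot */
    uint8_t  xmm[8][16];     /* XMM0..XMM7 */
    uint8_t  reserved3[224];
};

_Static_assert(sizeof(struct fxsave_area) == 512, "FXSAVE area is 512 bytes");

/*
 * Hypothetical helper: hand the area to a debugger unchanged.
 * Returns the number of bytes copied. Note that the FXSAVE instruction
 * itself needs a 16-byte aligned operand; a plain copy does not.
 */
static size_t copy_fxsave_out(void *dst, const struct fxsave_area *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(*src));
    return sizeof(*src);
}
```

Passing the area through unchanged keeps the ptrace path to a single copy and leaves any interpretation of the fields to the debugger.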

    All in all, I think that the original code in 2.4.0-test1 is much
    closer to what the end result should look like. Just add the
    PTRACE_{GET,SET}XFPREGS requests that return and write the FXSAVE data
    (should be *very* simple), add the necessary bits for setting up the
    signal frame (more or less what's in your patch, but this is the place
    where the FXSAVE -> FSAVE conversion should be done), and extend the
    core dumping code to emit an extra note with the FXSAVE area.

    Mark