Re: Athlon possible fixes

From: Seth Goldberg (bergsoft@home.com)
Date: Sat May 05 2001 - 18:44:31 EDT

  • Next message: dean gaudet: "Re: [PATCH] arp_filter patch for 2.4.4 kernel."

    >
    > As all this is trying to avoid bus turnarounds (i.e. switching from
    > reading to writing), wouldn't it be fastest to just trust that the CPU
    > has at least 4k worth of cache? (and hope for the best that we don't
    > get interrupted in the meanwhile).
    >
    > void copy_page (char *dest, char *source)
    > {
    > long *dst = (long *)dest,
    > *src=(long *)source,
    > *end= (long *)(source+PAGE_SIZE);
    > #if 1
    > register int i;
    > long t=0;
    > static long tt;
    >
    > for (i=0;i<PAGE_SIZE/sizeof (long);i += cache_line_size()/sizeof(long))
    > /* Actually the innards of this loop should be:
    > (void) from[i];
    > however, the compiler will probably optimize that away. */
    > t += src[i];
    >
    > tt = t;
    > #endif
    > while (src < end)
    > *dst++ = *src++;
    >
    > }
    >
    > So, this is 15 lines of C, and it'd be interesting to benchmark this
    > against the assembly.
    >

      Well you asked for it :) :

    clear_page by 'normal_clear_page' took 12196 cycles (318.1 MB/s)
    clear_page by 'slow_zero_page' took 12207 cycles (317.9 MB/s)
    clear_page by 'fast_clear_page' took 29272 cycles (132.6 MB/s)
    clear_page by 'faster_clear_page' took 4831 cycles (803.1 MB/s)
     
    copy_page by 'normal_copy_page' took 12607 cycles (307.8 MB/s)
    copy_page by 'slow_copy_page' took 13617 cycles (285.0 MB/s)
    copy_page by 'fast_copy_page' took 9531 cycles (407.1 MB/s)
    copy_page by 'faster_copy' took 5585 cycles (694.7 MB/s)
    copy_page by 'even_faster' took 5621 cycles (690.3 MB/s)
    copy_page by 'even_faster_nopre' took 5837 cycles (664.8 MB/s)
    copy_page by 'c_source' took 17296 cycles (224.3
    MB/s)

     The last one is yours :). I'd assume this is because the compiler is
    not
    using mmx instructions for this. (the nopre is a routine I added to
    check
    the speed with only a single prefetch instruction. When I tried adding
    the routing with the single prefetch instruction to mmx.c and
    recompiling
    and rebooted, the system stayed up a lot longer, but it still crashed (I
    was in Xwindows and the crash was partially written to the log file)
    after around 3 minutes of work in X.
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Sat May 05 2001 - 18:49:07 EDT