lowish-lat for 2.4.0-test10-pre3

From: Andrew Morton (andrewm@uow.edu.au)
Date: Sat Oct 21 2000 - 09:58:00 EDT

    Patch is at

    http://www.uow.edu.au/~andrewm/linux/2.4.0-test10-pre3-low-latency.patch

    Changes:

    - Moved the set_current_state()/schedule() code out of line.
      Saves a couple of hundred bytes.
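
      A rough userland sketch of why moving that code out of line
      saves space: every rescheduling point keeps only a test and a
      branch, and the state change plus the schedule() call exist once.
      All names below (resched_point, resched_slow_path,
      need_resched_flag, the STATE_* values) are hypothetical stand-ins
      for illustration, not identifiers from the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel state; not the patch's identifiers. */
enum { STATE_RUNNING = 0, STATE_SLEEPING = 1 };

static volatile int need_resched_flag;    /* models current->need_resched */
static int current_state = STATE_SLEEPING;
static int schedule_calls;                /* counts simulated schedule() calls */

/* Out-of-line slow path: a single copy in the binary. */
__attribute__((noinline)) static void resched_slow_path(void)
{
    current_state = STATE_RUNNING;  /* models set_current_state(TASK_RUNNING) */
    need_resched_flag = 0;          /* models the effect of schedule() */
    schedule_calls++;
}

/* Inline fast path: only this test and branch is repeated per call site. */
static inline void resched_point(void)
{
    if (need_resched_flag)
        resched_slow_path();
}
```

      Calling resched_point() with the flag clear costs one load and
      one untaken branch; the function call only happens when a
      reschedule is actually wanted.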

    - Simplified some code in dcache.c - the performance benefit
      wasn't worth the ugliness.

    - Put a couple of rescheduling points in the /proc files. This
      is because reading /proc/meminfo is part of Benno's test suite :)

    - There were some complaints about the decision to disable this
      patch for SMP, so it can now be re-enabled via CONFIG_LOLAT_SMP
      (Kernel Hacking menu).

      But no promises here at all. It may seem to work, but if you
      start getting heavy spinlock contention in the kernel it
      goes bad.

    - Fixed (I believe) the reschedule and signal race in
      arch/i386/kernel/entry.S. This is the only x86-specific
      part of this patch.

      I simply stuck a `cli' in there:

    --- linux-2.4.0-test10-pre3/arch/i386/kernel/entry.S Sat Oct 14 17:02:03 2000
    +++ linux-akpm/arch/i386/kernel/entry.S Sat Oct 21 22:15:47 2000
    @@ -215,21 +215,27 @@
             jne handle_softirq
             
     ret_with_reschedule:
    -        cmpl $0,need_resched(%ebx)
    -        jne reschedule
    -        cmpl $0,sigpending(%ebx)
    -        jne signal_return
    +        cli
    +        movl need_resched(%ebx),%eax
    +        orl sigpending(%ebx),%eax
    +        jne signal_or_resched
     restore_all:
             RESTORE_ALL
     
             ALIGN
    -signal_return:
    +signal_or_resched:
    +        cmpl $0,need_resched(%ebx)
    +        jne reschedule
    +        # Must be a pending signal
             sti # we can get here from an interrupt handler
             testl $(VM_MASK),EFLAGS(%esp)
             movl %esp,%eax
             jne v86_signal_return
             xorl %edx,%edx
             call SYMBOL_NAME(do_signal)
    +        cli
    +        cmpl $0,need_resched(%ebx)
    +        jne reschedule
             jmp restore_all
     
             ALIGN
    @@ -285,6 +291,7 @@
             
             ALIGN
     reschedule:
    +        sti
             call SYMBOL_NAME(schedule) # test
             jmp ret_from_sys_call
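
      For readers who find the control flow easier to follow in C, here
      is a small userland model of the patched return path as I read
      the diff. It is a sketch only: the struct and function names are
      made up, the v86_signal_return branch and the softirq check are
      left out, and the irq hook merely simulates an interrupt that
      sets need_resched while do_signal() runs with interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; the field names mirror the offsets used in entry.S. */
struct task_model {
    int need_resched;
    int sigpending;
};

struct trace {
    int  schedule_calls;
    int  signal_calls;
    bool irqs_enabled;
};

/* Simulates an interrupt that fires while do_signal() is running. */
typedef void (*irq_hook)(struct task_model *);

static void irq_sets_need_resched(struct task_model *t)
{
    t->need_resched = 1;
}

static void model_schedule(struct task_model *t, struct trace *tr)
{
    tr->irqs_enabled = true;          /* `sti' at reschedule: */
    t->need_resched = 0;
    tr->schedule_calls++;
}

static void model_do_signal(struct task_model *t, struct trace *tr,
                            irq_hook during_signal)
{
    tr->irqs_enabled = true;          /* `sti' before do_signal */
    t->sigpending = 0;
    tr->signal_calls++;
    if (during_signal)
        during_signal(t);             /* interrupt arrives here */
}

static void model_return_path(struct task_model *t, struct trace *tr,
                              irq_hook during_signal)
{
    for (;;) {
        tr->irqs_enabled = false;     /* `cli' at ret_with_reschedule */
        if ((t->need_resched | t->sigpending) == 0)
            return;                   /* restore_all; iret reloads EFLAGS */
        if (t->need_resched) {
            model_schedule(t, tr);    /* then back via ret_from_sys_call */
            continue;
        }
        model_do_signal(t, tr, during_signal);  /* must be a signal */
        tr->irqs_enabled = false;     /* second `cli' */
        if (t->need_resched) {
            model_schedule(t, tr);
            continue;
        }
        return;                       /* jmp restore_all */
    }
}
```

      The point of the model is that both tests of need_resched happen
      with interrupts off, so a reschedule request raised during signal
      delivery is seen before the return to user space.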

      On the Mendocino the `cli' adds ~15 cycles to system calls. The
      cunning removal of a conditional jump from the syscall and interrupt
      fastpath reduces this to 13 cycles.

      So `getpid()' now takes 1.005x as long as it used to. So shoot me.
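
      Anyone who wants to check the syscall cost on their own machine
      could use something like the sketch below. It is not the harness
      used for the numbers above; the function names are made up, it
      assumes a Linux system with clock_gettime(), and it times the raw
      syscall so that any libc-level caching of the pid is bypassed.
      The cycle conversion assumes a fixed clock rate (13 cycles at
      566 MHz is about 23 ns).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Average wall-clock nanoseconds per raw getpid system call. */
static double getpid_ns_per_call(long iters)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        (void)syscall(SYS_getpid);    /* always enters the kernel */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iters;
}

/* Convert nanoseconds to CPU cycles, assuming a constant clock in MHz. */
static double ns_to_cycles(double ns, double cpu_mhz)
{
    return ns * cpu_mhz / 1000.0;
}
```

      Run it on the unpatched and patched kernels with a large
      iteration count and compare the two per-call figures; the loop
      overhead is the same in both runs, so it cancels in the ratio.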

    - Some testing results on 2.4.0-test9.

      These tests were run with the separate tcp_minisocks patch applied,
      because the reaping of timed-wait sockets otherwise interferes
      with lmbench.
      See http://www.uow.edu.au/~andrewm/linux/schedlat.html#ddt

      Machine: 566 MHz UP Celeron, 256M RAM, UDMA66
      Workload: 2 instances of bonnie++
                2 instances of lmbench
                1 instance of mmap001, mmap002, mmap001, mmap002, ...
                1 instance of netperf
                1 SCHED_FIFO process handling a 1024 Hz signal stream

      After 20 hours the latency histogram was:

      Latency (ms)   Count          Code path
      0-1            ~78,000,000
      1-2            3
      5-8            8              kmem_cache_reap->kmem_slab_destroy->avl_remove
      9-10           3              ide_intr

      The kmem_cache_reap thing is due to the insane number of
      inodes and dentries which bonnie++ leaves around. It doesn't
      happen normally.

      Not sure why the IDE ISR had those glitches.
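
      A histogram like the one above can be accumulated with very
      little code. The sketch below is an assumption about how one
      might do it, not the measurement tool actually used: it takes a
      latency in microseconds, uses uniform 1 ms buckets, and lumps
      everything of 10 ms or more into a final overflow bucket. The
      names lat_hist, bucket_for_us and hist_add are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 11   /* 0-1 ms ... 9-10 ms, plus one overflow bucket */

struct lat_hist {
    unsigned long count[NBUCKETS];
};

/* Map a latency in microseconds to its 1 ms bucket index. */
static size_t bucket_for_us(unsigned long latency_us)
{
    size_t b = latency_us / 1000;

    return b < NBUCKETS - 1 ? b : NBUCKETS - 1;
}

/* Record one latency sample. */
static void hist_add(struct lat_hist *h, unsigned long latency_us)
{
    h->count[bucket_for_us(latency_us)]++;
}
```

      The SCHED_FIFO process would call hist_add() once per signal with
      the measured delay between the expected and the actual wakeup.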