Re: 2.2.14 SMP 3com905: transmit timed out: Odd lost irq and ip-stack lockup

From: Andrew Morton (andrewm@uow.edu.au)
Date: Thu Oct 12 2000 - 11:16:57 EDT

  • Next message: Oliver Xymoron: "Re: Updated 2.4 TODO List"

    "Maciej W. Rozycki" wrote:
    >
    > On Fri, 13 Oct 2000, Andrew Morton wrote:
    >
    > > > Oct 9 17:29:02 fwintern kernel: eth0: Interrupt posted but not
    > > > delivered -- IRQ blocked by another device?
    > >
    > > This is the infamous APIC bug. I have about ten reports of this over a
    > > four-month period. Mark Hemment mentioned it just yesterday.
    > >
    > > This is not a 3c59x problem. It is due to the APIC forgetting how to
    > > generate interrupts for a particular IRQ. It happens mostly for NICs
    > > because they generate a lot of interrupts. I've had it happen just
    > > once. In that case, _nothing_ would make the interrupt come back
    > > (including a driver unload/reload).
    > >
    > > This gets reported a lot by 3c59x users because this driver specifically
    > > detects and reports on the problem.
    >
    > Hmm, that's interesting. It would be worthwhile to see a dump of APICs'
    > state when this happens -- maybe an EOI message gets lost for some reason
    > or an erratum is biting us. There are functions for such kind of
    > diagnostics already available; they are print_IO_APIC() and
    > print_all_local_APICs() and may be called on demand by a tiny module, for
    > example.

    Thanks!

    Michael, would you be able to:

    - go back to 2.4.0
    - make sure you have 'Magic SYSRQ' enabled
    - apply the below patch
    - type ALT-SYSRQ-A when the machine is behaving
    - make the machine misbehave
    - type ALT-SYSRQ-A and
    - send the resulting log output?

    --- linux-2.4.0-test9/arch/i386/kernel/io_apic.c Wed Oct 4 21:27:20 2000
    +++ linux-akpm/arch/i386/kernel/io_apic.c Fri Oct 13 02:08:28 2000
    @@ -692,7 +692,7 @@
             printk(KERN_WARNING " to linux-smp@vger.kernel.org\n");
     }
     
    -void __init print_IO_APIC(void)
    +void print_IO_APIC(void)
     {
             int apic, i;
             struct IO_APIC_reg_00 reg_00;
    --- linux-2.4.0-test9/drivers/char/sysrq.c Fri Aug 11 19:06:11 2000
    +++ linux-akpm/drivers/char/sysrq.c Fri Oct 13 02:09:19 2000
    @@ -72,6 +72,12 @@
             console_loglevel = 7;
             printk(KERN_INFO "SysRq: ");
             switch (key) {
    + case 'a':
    + printk("print_IO_APIC()\n");
    + print_IO_APIC();
    + printk("print_all_local_APICs()\n");
    + print_all_local_APICs();
    + break;
             case 'r': /* R -- Reset raw mode */
                     if (kbd) {
                             kbd->kbdmode = VC_XLATE;
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Thu Oct 12 2000 - 11:21:29 EDT