Re: Oops in IO_APIC with linux-2.2.15pre17 on dual-pentium-III

From: Winfried Magerl (winfried.magerl@mch.sbs.de)
Date: Fri May 19 2000 - 11:50:19 EDT


    On Fri, May 19, 2000 at 05:20:43PM +0200, Manfred Spraul wrote:
    > AFAICS the sequence of assembler instructions is impossible:
    > eax: 9
    > ebx 801fe8cc

    Maybe it's a typo. I had to write down the screen output by hand.
    From another crash:

    eax: 0000000b ebx: 801fe8e4 ecx: 0000c926 edx: 801ea000
    esi: ffffe010 edi: ffffe000 ebp: 801ebfa0 esp: 801ebf78

    > the faulting instruction is
    >
    > > 0: 8b 13 movl (%ebx),%edx <=====
    > i.e. load an integer from (%ebx)
    >
    > and the CPU failed with:
    >
    > Unable to handle kernel NULL pointer dereference at virtual address 00000009
    >
    > but that's eax!
    >
    > I doubt that you are overclocking your computer, but could you check your
    > power supply, cabling, cooling,...?

    The problem is bigger than that. We have three such servers which crash
    randomly. Same hardware, no overclocking. Unfortunately I only have
    the output from this server.
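
The consistency check Manfred describes above (a faulting `movl (%ebx),%edx` should report the address held in ebx, yet the reported address matches eax) can be sketched as a small script. This is only an illustration, assuming the hand-copied values are accurate; the helper names `parse_registers` and `registers_matching` are hypothetical and not part of any kernel tool.

```python
import re


def parse_registers(dump):
    """Parse text such as 'eax: 0000000b ebx: 801fe8e4' into {name: int}.

    The colon after the register name is optional, because the first
    hand-copied dump in this thread omits it for ebx.
    """
    pairs = re.findall(r"\b(e[a-z]{2}):?\s+([0-9a-fA-F]+)\b", dump)
    return {name: int(value, 16) for name, value in pairs}


def registers_matching(fault_addr, regs):
    """Return the names of all registers whose value equals fault_addr."""
    return sorted(name for name, value in regs.items() if value == fault_addr)


# Values quoted from the first crash report in this thread.
first_crash = "eax: 9 ebx 801fe8cc"
fault_addr = int("00000009", 16)  # "... at virtual address 00000009"

regs = parse_registers(first_crash)
matches = registers_matching(fault_addr, regs)

# movl (%ebx),%edx reads memory at the address in ebx, so a fault on
# that instruction would be expected at the value of ebx.
print("registers holding the fault address:", matches)
print("fault address equals ebx:", regs["ebx"] == fault_addr)
```

Running the same helpers over the register dump from the second crash would only be meaningful together with the fault address from that Oops, which is not quoted here.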

    > And check your kernel log: oops always on CPU1, or on both cpus?
    > What about these messages, is that common with eepro cards?

    For the two Oopses I wrote down: CPU1.
    I will connect serial consoles to the other two servers to see if the
    reason for the crash is the same.

    > >>>>>
    > kernel: eepro100.c:v1.09l 8/7/99 Donald Becker
    > http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
    > kernel: Uhhuh. NMI received. Dazed and confused, but trying to continue
    > kernel: You probably have a hardware problem with your RAM chips
    > kernel: eth0: OEM i82557/i82558 10/100 Ethernet at 0xb8895000,
    > 00:90:27:9B:D2:A1, IRQ 9.
    > <<<<<<<<<

    I see this message "Uhhuh. NMI received." on another server as well. It
    appears randomly when loading the eepro100 module. I gave 2.2.16-pre3 a
    try, but unfortunately the eepro100 fails with:

    messages:May 17 18:07:09 proxy8 kernel: eth0: card reports no RX buffers.
    messages:May 17 18:07:09 proxy8 kernel: eth0: card reports no resources.
    messages:May 17 18:07:09 proxy8 kernel: eth0: card reports no RX buffers.
    messages:May 17 18:07:09 proxy8 kernel: eth0: card reports no resources.
    messages:May 17 18:07:09 proxy8 kernel: eth0: card reports no RX buffers.
    [....]

    At the moment we want to give 3com a chance. But I have no idea whether
    it is a driver problem or a hardware problem (given random crashes on
    several servers with the same hardware).

    regards

            winfried

    -- 
    Winfried Magerl - Internet Administration
    Siemens Business Services, 81739 Munich, Germany
    Internet-Mail: Winfried.Magerl@mch.sbs.de
    



