ksymoops from diskless crash

From: Fred Feirtag (ffeirtag@integrityns.com)
Date: Fri Aug 11 2000 - 17:09:39 EDT


    Greetings--

    There appears to be a bug that crashes my Etherboot diskless
    workstation during boot. I see it on only about 5% of boots. Here's
    the ksymoops output:

    [root@linux82 /boot]$ ksymoops -m /usr/src/linux-diskless/System.map
    ksymoops 2.3.4 on i686 2.4.0-test6. Options used
         -V (default)
         -k /proc/ksyms (default)
         -l /proc/modules (default)
         -o /lib/modules/2.4.0-test6/ (default)
         -m /usr/src/linux-diskless/System.map (specified)

    No modules in ksyms, skipping objects
    Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file?
    Reading Oops report from the terminal
    se "unset watch" printing eip:
    .
    c010da24
    *pde = 00000000
    Oops: 0002
    CPU: 0
    EIP: 0010:[<c010da24>]
    Using defaults from ksymoops -t elf32-i386 -a i386
    eax: 00000000 ebx: 00000000 ecx: 0000002b edx: 00000202
    esi: bfff7020 edi: bfff72f4 ebp: bfff7078 esp: c7a75e80
    ds: 0018 es: 0018 ss: 0018
    Process linuxrc (pid: 5, stackpage=c7a75000)
    Stack: c010858a bfff7078 bfff7018 c7a75fc4 c7a75fb0 c7a74000 c7a74000
    00000286
           c010869f bfff7020 bfff7078 c7a75fc4 00010002 c7a73144 00000011
    c7a75fb0
           c7a75fb0 c0108a69 00000011 c7a73144 c7a75fb0 c7a75fc4 00000011
    c7a74000
    Call Trace: [<c010858a>] [<c010869f>] [<c0108a69>] [<c0108ce2>]
    [<c0114c0f>] [<c0108074>] [<c0108e2b>]
    Code: 00 01 00 d0 59 05 91 d3 c0 a8 0f 71 00 00 00 00 00 00 c0 a8

    >>EIP; c010da24 <save_i387+0/6c> <=====
    Trace; c010858a <setup_sigcontext+ea/130>
    Trace; c010869f <setup_frame+cf/1a4>
    Trace; c0108a69 <handle_signal+75/e8>
    Trace; c0108ce2 <do_signal+206/26c>
    Trace; c0114c0f <schedule+2c3/4b4>
    Trace; c0108074 <sys_rt_sigsuspend+e0/f4>
    Trace; c0108e2b <system_call+33/38>
    Code; c010da24 <save_i387+0/6c>
    00000000 <_EIP>:
    Code; c010da24 <save_i387+0/6c> <=====
       0: 00 01 add %al,(%ecx) <=====
    Code; c010da26 <save_i387+2/6c>
       2: 00 d0 add %dl,%al
    Code; c010da28 <save_i387+4/6c>
       4: 59 pop %ecx
    Code; c010da29 <save_i387+5/6c>
       5: 05 91 d3 c0 a8 add $0xa8c0d391,%eax
    Code; c010da2e <save_i387+a/6c>
       a: 0f 71 (bad)
    Code; c010da30 <save_i387+c/6c>
       c: 00 00 add %al,(%eax)
    Code; c010da32 <save_i387+e/6c>
       e: 00 00 add %al,(%eax)
    Code; c010da34 <save_i387+10/6c>
      10: 00 00 add %al,(%eax)
    Code; c010da36 <save_i387+12/6c>
      12: c0 a8 00 00 00 00 00 shrb $0x0,0x0(%eax)

    1 warning issued. Results may not be reliable.
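
The `Trace;` lines above come from mapping each raw address to the nearest preceding symbol in System.map. A minimal sketch of that lookup follows. The symbol start addresses are derived from the decoded output (address minus reported offset); the `t`/`T` type letters and the helper names `load_map` and `resolve` are assumptions for illustration, and this is not how ksymoops itself is implemented.

```python
import bisect

# Hypothetical System.map excerpt. Start addresses are computed from the
# decoded trace (e.g. c010858a - 0xea = c01084a0); type letters are guesses.
SYSTEM_MAP = """\
c01084a0 t setup_sigcontext
c01085d0 t setup_frame
c01089f4 t handle_signal
c0108adc T do_signal
c010da24 T save_i387
"""

def load_map(text):
    """Parse 'address type name' lines into a sorted list of (addr, name)."""
    entries = []
    for line in text.splitlines():
        addr, _sym_type, name = line.split()
        entries.append((int(addr, 16), name))
    entries.sort()
    return entries

def resolve(entries, addr):
    """Return 'symbol+offset' for the last symbol starting at or below addr."""
    starts = [start for start, _ in entries]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    start, name = entries[i]
    return f"{name}+{addr - start:x}"

entries = load_map(SYSTEM_MAP)
for addr in (0xC010DA24, 0xC010858A, 0xC010869F, 0xC0108A69, 0xC0108CE2):
    print(f"{addr:08x} <{resolve(entries, addr)}>")
```

Because this excerpt carries no symbol sizes, an address that falls in a gap between listed symbols would be attributed to the wrong function; a full System.map avoids most of that.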

    Trond Myklebust wrote:

    > >>>>> " " == Fred Feirtag <ffeirtag@integrityns.com> writes:
    >
    > > Trond--
    >
    > > Helpful? Let me know if more is needed.
    >
    > Yup: Helpful in that it exonerates NFS. Unfortunately it's in a part
    > of the code where I'm not too familiar, namely the i387 (math
    > coprocessor) save/restore code.
    >
    > I'd suggest trying to send the decoded Oops to the l-k list.
    >
    > Cheers,
    > Trond

    Linux version 2.4.0-test6 (root@ffeirtag) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #5 Fri Aug 11 09:45:47 MDT 2000
    BIOS-provided physical RAM map:
     e820: 000000000009fc00 @ 0000000000000000 (usable)
     e820: 0000000000000400 @ 000000000009fc00 (reserved)
     e820: 0000000000010000 @ 00000000000f0000 (reserved)
     e820: 0000000000010000 @ 00000000ffff0000 (reserved)
     e820: 0000000007ef0000 @ 0000000000100000 (usable)
     e820: 000000000000d000 @ 0000000007ff3000 (ACPI data)
     e820: 0000000000003000 @ 0000000007ff0000 (ACPI NVS)
    Scan SMP from c0000000 for 1024 bytes.
    Scan SMP from c009fc00 for 1024 bytes.
    Scan SMP from c00f0000 for 65536 bytes.
    Scan SMP from c009fc00 for 4096 bytes.
    On node 0 totalpages: 32752
    zone(0): 4096 pages.
    zone(1): 28656 pages.
    zone(2): 0 pages.
    mapped APIC to ffffe000 (01202000)
    Kernel command line: auto rw root=/dev/nfs nfsroot=/diskless/%s single console=ttyS0,9600 ramdisk_size=6144 ip=192.168.15.82:192.168.15.33:192.168.15.1:255.255.255.0:
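
For readers unfamiliar with the `ip=` boot parameter, the sketch below splits the value from the command line above into the colon-separated fields documented for kernel NFS-root setups (client, server, gateway, netmask, hostname, device, autoconf). The function name `parse_ip_param` and the field labels are illustrative choices, not kernel identifiers.

```python
# Field order as documented for the kernel's ip= parameter.
FIELDS = ["client_ip", "server_ip", "gateway_ip", "netmask",
          "hostname", "device", "autoconf"]

def parse_ip_param(value):
    """Split an ip= value on ':' and label the fields; missing ones are ''."""
    parts = value.split(":")
    parts += [""] * (len(FIELDS) - len(parts))  # pad omitted trailing fields
    return dict(zip(FIELDS, parts))

# Value copied from the kernel command line in this boot log.
cfg = parse_ip_param("192.168.15.82:192.168.15.33:192.168.15.1:255.255.255.0:")
for name, val in cfg.items():
    print(f"{name:11s} {val or '(not set)'}")
```

The server address parsed here, 192.168.15.33, is the same host the later "Looking up port of RPC" lines contact.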
    Initializing CPU#0
    Detected 601073024 Hz processor.
    Console: colour VGA+ 80x25
    Calibrating delay loop... 1199.31 BogoMIPS
    Memory: 124248k/131008k available (1606k kernel code, 6372k reserved, 116k data, 216k init, 0k highmem)
    Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes)
    Buffer-cache hash table entries: 4096 (order: 2, 16384 bytes)
    Page-cache hash table entries: 32768 (order: 5, 131072 bytes)
    Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
    CPU: L1 I Cache: 64K L1 D Cache: 64K (64 bytes/line)
    CPU: L2 Cache: 1K
    CPU: AMD Duron(tm) Processor stepping 00
    Checking 386/387 coupling... OK, FPU using exception 16 error reporting.

    Checking 'hlt' instruction... OK.
    POSIX conformance testing by UNIFIX
    PCI: PCI BIOS revision 2.10 entry at 0xfb260, last bus=1
    PCI: Using configuration type 1
    PCI: Probing PCI hardware
    PCI: Using IRQ router VIA [1106/0596] at 00:07.0
    Linux NET4.0 for Linux 2.4
    Based upon Swansea University Computer Society NET3.039
    NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
    NET4: Linux TCP/IP 1.0 for NET4.0
    IP Protocols: ICMP, UDP, TCP
    IP: routing cache hash table of 512 buckets, 4Kbytes
    TCP: Hash tables configured (established 8192 bind 8192)
    ACPI: "VIA694" found at 0x000f7040
    Starting kswapd v1.7
    pty: 256 Unix98 ptys configured
    RAMDISK driver initialized: 16 RAM disks of 6144K size 1024 blocksize
    Uniform Multi-Platform E-IDE driver Revision: 6.31
    ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
    VP_IDE: IDE controller on PCI bus 00 dev 39
    VP_IDE: chipset revision 16
    VP_IDE: not 100% native mode: will probe irqs later
    hda: Conner Technology CT204, ATA DISK drive
    hdc: TOSHIBA CD-ROM XM-6702B, ATAPI CDROM drive
    ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
    ide1 at 0x170-0x177,0x376 on irq 15
    hdc: ATAPI 48X CD-ROM drive, 128kB Cache
    Uniform CD-ROM driver Revision: 3.11
    Floppy drive(s): fd0 is 1.44M
    FDC 0 is a post-1991 82077
    scsi : 0 hosts.
    scsi : detected total.
    RAMDISK: Compressed image found at block 0
    Freeing initrd memory: 2156k freed
    Serial driver version 5.01 (2000-05-29) with MANY_PORTS SHARE_IRQ SERIAL_PCI enabled
    ttyS00 at 0x03f8 (irq = 4) is a 16550A
    ttyS01 at 0x02f8 (irq = 3) is a 16550A
    eepro100.c:v1.09j-t 9/29/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
    eepro100.c: $Revision: 1.33 $ 2000/05/24 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others
    eth0: Intel Corporation 82557 [Ethernet Pro 100], 00:D0:B7:27:93:82, IRQ 10.
      Receiver lock-up bug exists -- enabling work-around.
      Board assembly 721383-008, Physical connectors present: RJ45
      Primary interface chip i82555 PHY #1.
      General self-test: passed.
      Serial sub-system self-test: passed.
      Internal registers self-test: passed.
      ROM checksum self-test: passed (0x04f4518b).
    [drm] Initialized tdfx 1.0.0 20000719 on minor 63
    Via 686a audio driver 1.1.8
    ac97_codec: AC97 audio codec, id: 0x4943:0x4511 (Unknown)
    via82cxxx: board #1 at 0xDC00, IRQ 11
    Creative EMU10K1 PCI Audio Driver, version 0.5, 17:05:16 Aug 10 2000
    kmem_create: Forcing size word alignment - nfs_fh
    VFS: Mounted root (ext2 filesystem).
    cannot stat /var/run/utmp. Please "unset watch".
    Unable to handle kernel NULL pointer dereference at virtual address 0000002b
     printing eip:
    c010da24
    *pde = 00000000
    Oops: 0002
    CPU: 0
    EIP: 0010:[<c010da24>]
    eax: 00000000 ebx: 00000000 ecx: 0000002b edx: 00000202
    esi: bfff7020 edi: bfff72f4 ebp: bfff7078 esp: c7a75e80
    ds: 0018 es: 0018 ss: 0018
    Process linuxrc (pid: 5, stackpage=c7a75000)
    Stack: c010858a bfff7078 bfff7018 c7a75fc4 c7a75fb0 c7a74000 c7a74000
    00000286
           c010869f bfff7020 bfff7078 c7a75fc4 00010002 c7a73144 00000011
    c7a75fb0
           c7a75fb0 c0108a69 00000011 c7a73144 c7a75fb0 c7a75fc4 00000011
    c7a74000
    Call Trace: [<c010858a>] [<c010869f>] [<c0108a69>] [<c0108ce2>]
    [<c0114c0f>] [<c0108074>] [<c0108e2b>]
    Code: 00 01 00 d0 59 05 91 d3 c0 a8 0f 71 00 00 00 00 00 00 c0 a8
    Looking up port of RPC 100003/2 on 192.168.15.33
    Looking up port of RPC 100005/2 on 192.168.15.33
    VFS: Mounted root (nfs filesystem).
    change_root: old root has d_count=4
    Trying to unmount old root ... <3>error -16
    Change root to /initrd: error -2
    Freeing unused kernel memory: 216k freed
    INIT: version 2.78 booting
    Unable to handle kernel NULL pointer dereference at virtual address 0000002b
    ...
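
The `Code:` bytes at the faulting EIP do not disassemble into anything resembling a function entry (ksymoops reports `(bad)` partway through), so it may be worth reading them as data instead. The sketch below is only one possible reading, and an assumption rather than a diagnosis: it slices the 20 bytes in the field order of an ARP message body (opcode, sender MAC, sender IP, target MAC). The function name `arp_like_fields` is made up for illustration, and the /24 network is taken from the netmask on the kernel command line.

```python
import ipaddress

# Bytes copied verbatim from the "Code:" line of the Oops.
CODE = bytes.fromhex(
    "00 01 00 d0 59 05 91 d3 c0 a8 0f 71 00 00 00 00 00 00 c0 a8")

def mac(buf):
    """Format six bytes as a colon-separated MAC address string."""
    return ":".join(f"{b:02x}" for b in buf)

def arp_like_fields(buf):
    """Slice buf as opcode, sender MAC, sender IP, target MAC (assumed layout)."""
    opcode = int.from_bytes(buf[0:2], "big")
    sender_mac = mac(buf[2:8])
    sender_ip = ipaddress.IPv4Address(buf[8:12])
    target_mac = mac(buf[12:18])
    return opcode, sender_mac, sender_ip, target_mac

opcode, sender_mac, sender_ip, target_mac = arp_like_fields(CODE)
lan = ipaddress.ip_network("192.168.15.0/24")  # from ip= netmask 255.255.255.0

print("opcode     ", opcode)
print("sender MAC ", sender_mac)
print("sender IP  ", sender_ip, "(on boot LAN)" if sender_ip in lan else "")
print("target MAC ", target_mac)
```

Under that assumed layout, the third field falls inside the same 192.168.15.0/24 network the workstation boots from, which would be consistent with packet data having been written over kernel text. The Oops alone does not prove this.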

    Let me know if I can provide more useful info.

    Fred





