2.3.99pre2-5 console problems

From: Russell King (rmk@arm.linux.org.uk)
Date: Sat Mar 18 2000 - 19:21:41 EST


    Original message:
    > From: Russell King <rmk@arm.linux.org.uk>
    > Subject: 2.3.99pre1 console and NFS problems?
    > To: linux-kernel@vger.rutgers.edu
    > Date: Thu, 16 Mar 2000 22:43:29 +0000 (GMT)
    >
    > I'm using fbcon, and with 2.3.99pre1 I see the following problems:
    >
    > 2. If I ssh to one of my other machines from 2.3.99pre1, and do a
    > vdir /usr/bin, then the local ssh process will hang in an interruptible
    > wait in write_chan (just where the schedule() call is at the bottom of
    > the loop).

    The same is true with 2.3.99pre2-5. It hangs in the same place, and
    the problem is very reproducible. All I need to do is use ssh to
    log into a remote machine over a 10base link, and then a simple "vdir /usr/bin"
    stops ssh in its tracks.

    One way to recover from this situation is to log in as root and strace the
    ssh process. That kicks it forward by a couple of pagefuls; you then
    have to stop stracing it and re-strace it, repeatedly, until it reaches
    the end of the listing.

    The reason seems to be that ssh writes to the console in large blocks,
    which causes the console (via tty->driver.write, line 1170 in n_tty.c)
    to return before all the data has been processed. The task is then
    placed into TASK_INTERRUPTIBLE and hits schedule() at the bottom of the
    loop, where there is now no "automatic" way to reawaken the process.
    Unfortunately, the console will never say "I can accept more characters"
    via the tty->write_wait wait queue.
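
    The stall described above can be sketched in user space. This is only a
    model under stated assumptions, not the kernel code: mock_tty,
    mock_driver_write and model_write_chan are hypothetical names, the
    "driver" simply accepts at most a fixed number of bytes per call, and a
    sleep with no waker is reported as a stall instead of actually blocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a console tty; not the kernel's tty_struct. */
struct mock_tty {
    size_t chunk;        /* most bytes the driver accepts per call */
    bool   wakes_writer; /* does the driver ever signal write_wait? */
};

struct write_result {
    size_t written;      /* bytes handed to the driver before returning */
    bool   stalled;      /* true if the writer would sleep with no waker */
};

/* Models tty->driver.write: accepts at most tty->chunk bytes. */
static size_t mock_driver_write(const struct mock_tty *tty, size_t nr)
{
    return nr < tty->chunk ? nr : tty->chunk;
}

/* Models the write loop: write, and if data remains, sleep until woken. */
static struct write_result model_write_chan(const struct mock_tty *tty,
                                            size_t nr)
{
    struct write_result r = { 0, false };

    assert(tty->chunk > 0); /* a zero-size chunk could never make progress */
    while (nr > 0) {
        size_t c = mock_driver_write(tty, nr);

        r.written += c;
        nr -= c;
        if (nr == 0)
            break;
        /* schedule(): only a wakeup on write_wait gets the task past here. */
        if (!tty->wakes_writer) {
            r.stalled = true;
            break;
        }
    }
    return r;
}
```

    In this model a write that fits in one chunk completes, while a larger
    write stalls after the first chunk unless the driver signals the wait
    queue, which matches the symptom of large ssh writes hanging.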

    The console code does not appear to be setting the task to TASK_RUNNING,
    so I can only presume that it worked with previous kernels because it was
    relying on the page fault from copy_from_user to set the task back to the
    running state.

    Thinking about it a little more, isn't there a race condition there? The
    task is set to "RUNNING" and tty->driver.write is called. It gobbles up,
    say, 50% of the data and returns. Meanwhile, the device has processed
    that data and signals it via the write_wait queue. The task is set to
    "INTERRUPTIBLE" and hits schedule, to sleep indefinitely.
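
    The ordering question can be made concrete with a small deterministic
    model. It is a sketch only: model_task, model_wake_up and
    writer_sleeps_forever are hypothetical names, and the model assumes that
    a wakeup simply marks the task runnable and that schedule() sleeps only
    when the task is marked interruptible. It makes no claim about which
    ordering the 2.3.99 code actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE };

/* Hypothetical task: only the scheduling state matters here. */
struct model_task {
    enum model_state state;
};

/* Models a wakeup on write_wait: the waiting task becomes runnable. */
static void model_wake_up(struct model_task *t)
{
    assert(t != NULL);
    t->state = MODEL_RUNNING;
}

/* Models schedule(): the task sleeps only if it is marked interruptible. */
static bool model_schedule_would_sleep(const struct model_task *t)
{
    return t->state == MODEL_INTERRUPTIBLE;
}

/*
 * Replays the scenario from the text: the driver accepts part of the data,
 * and the device signals write_wait before the writer reaches schedule().
 * That signal is the only wakeup the writer will ever get.
 */
static bool writer_sleeps_forever(bool mark_sleeping_before_write)
{
    struct model_task t = { MODEL_RUNNING };

    if (mark_sleeping_before_write)
        t.state = MODEL_INTERRUPTIBLE;

    /* Partial driver write happens here; the device drains it at once. */
    model_wake_up(&t);

    if (!mark_sleeping_before_write)
        t.state = MODEL_INTERRUPTIBLE; /* overwrites the early wakeup */

    return model_schedule_would_sleep(&t);
}
```

    With the state change made after the driver call, the early wakeup is
    overwritten and the writer sleeps with nothing left to wake it. With the
    state change made before the driver call, the wakeup leaves the task
    runnable and schedule() returns straight away.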

    Any comments?
    Russell King  rmk@arm.linux.org.uk
    http://www.arm.linux.org.uk/~rmk/aboutme.html
    THE developer of ARM Linux



