Re: The INN/mmap bug

From: Alexander Viro (viro@math.psu.edu)
Date: Tue Sep 19 2000 - 10:11:51 EDT

  • Next message: David Howells: "Re: GCC proposal for "@" asm constraint"

    On Tue, 19 Sep 2000, Daniel Phillips wrote:

    > The more I think about it the less clear and ambiguous I find it.
    > When you add the dirty bit into the pot you get:
    >
    > Mapped, Uptodate, Dirty: not possible

    Sure, it is possible - that's how the write happens

    > !Mapped, Uptodate, Dirty: not possible
    > Mapped, !Uptodate, Dirty: pending write

    s/pending write/obvious bug/, damnit. Daniel, just think for a second: we
    have the buffer read to be picked by bdflush and written to disk, while
    the _contents_ _is_ _not_ _uptodate_. Just what can you expect when write
    succeeds? Junk on disk, right? You know, GIGO queue - garbage in, garbage
    out...

    > !Mapped, !Uptodate, Dirty: pending map and write

    Wrong. It's an instant BUG at line 711 in ll_rw_blk.c - remember these
    reports?

    > Mapped, Uptodate, !Dirty: regular block
    > !Mapped, Uptodate, !Dirty: hole of zeroes
    > Mapped, !Uptodate, !Dirty: unread
    > !Mapped, !Uptodate, !Dirty: pending map
     
    > Those two 'not possible' states bother me, they show that the three
    > bits are not orthogonal. We can make arbitrary assignments of meaning
    > to them as you did with !Mapped, !Uptodate (pending map) but in that
    > case why not just enumerate all the states we need and lose the bit
    > assignments? I think the deep problem here is trying to look at this
    > as a bit-flipping problem instead of a state-transition problem.

    Sure, the dirty bit is not orthogonal to the rest. You don't need to do
    any complex analysis - it's as simple as
            * if I don't know _where_ to write the data - I'ld better not feed
    the request to ll_rw_block(), or it may get PO'd
            * if I know that data is junk - I don't want it hitting the disk.

    Dirty bit == request fed into the funnel and can be on disk any moment now.
    Locked == already in IO subsystem.

    That's it - completely independent from the rest, except that you don't
    want the whole write mechanism applied to non-uptodate or non-mapped
    pieces.

    Think about IO as a memory bus - cache controller deals with writing the
    cache line to RAM, but you don't want it to try that on lines that don't
    have PA already calculated or have invalid contents. "mark dirty" == point
    the write-behind mechanism to it and let it decide when the thing must be
    written. Think how to make the cache indexed by virtual address (not by
    physical, as in case of x86) work correctly. That's what pagecache is -
    software MMU with cache-by-VA architecture.

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Tue Sep 19 2000 - 10:13:07 EDT