Re: [RFC] generic IO write clustering

From: Rik van Riel (riel@conectiva.com.br)
Date: Fri Jan 19 2001 - 21:58:38 EST

  • Next message: Arnaldo Carvalho de Melo: "Re: -ac10 compile error"

    On Fri, 19 Jan 2001, Marcelo Tosatti wrote:

    > The write clustering issue has already been discussed (mainly at Miami)
    > and the agreement, AFAIK, was to implement the write clustering at the
    > per-address-space writepage() operation.
    >
    > IMO there are some problems if we implement the write clustering in this
    > level:
    >
    > - The filesystem does not have information (and should not have) about
    > limiting cluster size depending on memory shortage.

    Is there ever a reason NOT to do the best possible IO
    clustering at write time ?

    Remember that disk writes do not cost memory and have
    no influence on the resident set ... completely unlike
    read clustering, which does need to be limited.

    > - By doing the write clustering at a higher level, we avoid a ton of
    > filesystems duplicating the code.
    >
    > So what I suggest is to add a "cluster" operation to struct address_space
    > which can be used by the VM code to know the optimal IO transfer unit in
    > the storage device. Something like this (maybe we need an async flag but
    > thats a minor detail now):
    >
    > int (*cluster)(struct page *, unsigned long *boffset,
    > unsigned long *poffset);

    Makes sense, except that I don't see how (or why) the _VM_
    should "know the optimal IO transfer unit". This sounds more
    like a job for the IO subsystem and/or the filesystem, IMHO.

    > "page" is from where the filesystem code should start its search
    > for contiguous pages. boffset and poffset are passed by the VM
    > code to know the logical "backwards offset" (number of
    > contiguous pages going backwards from "page") and "forward
    > offset" (cont pages going forward from "page") in the inode.

    Yes, this makes a LOT of sense. I really like a pagecache
    helper function so the filesystems can build their writeout
    clusters easier.

    regards,

    Rik

    --
    Virtual memory is like a game you can't win;
    However, without VM there's truly nothing to lose...
    

    http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com.br/

    - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Fri Jan 19 2001 - 21:59:35 EST