NWFS sb->s_blocksize issues

From: Jeff V. Merkey (jmerkey@timpanogas.com)
Date: Thu Aug 31 2000 - 12:22:55 EDT

    Why was it necessary to tie the blocksize a file system reports in
    its superblock to both the page cache and the ll_rw_block()
    interface? I've gotten to the bottom of the bmap() problem with using a
    1024/2048/4096 blocksize, and the problem is related to the way the page
    cache asks for pages. NWFS will suballocate files smaller than the
    volume blocksize (4K, 8K, 16K, 32K, 64K). These suballocated file runs
    are not always block aligned, which results in the page cache getting
    partial or scattered runs.
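
    To illustrate the alignment problem described above, here is a minimal
    userspace sketch. It is not NWFS or kernel code: struct run and both
    helper functions are hypothetical names introduced only for this
    example, and a run is assumed to be a plain byte offset plus a length.

```c
#include <stdbool.h>

/* Hypothetical description of a suballocated run: a byte offset on the
 * volume plus a length. This is not an NWFS structure. */
struct run {
    unsigned long long start;  /* byte offset on the volume */
    unsigned long long len;    /* length in bytes */
};

/* True if the run both starts and ends on a blocksize boundary. */
static bool run_is_block_aligned(struct run r, unsigned blocksize)
{
    return (r.start % blocksize == 0) &&
           ((r.start + r.len) % blocksize == 0);
}

/* Number of blocksize-sized blocks the run touches, counting any
 * partially covered block at either end. */
static unsigned long long blocks_touched(struct run r, unsigned blocksize)
{
    if (r.len == 0)
        return 0;

    unsigned long long first = r.start / blocksize;
    unsigned long long last = (r.start + r.len - 1) / blocksize;

    return last - first + 1;
}
```

    For example, a 1024-byte run at byte offset 1536 is aligned at a 512
    blocksize, but at a 1024 blocksize it straddles two blocks and covers
    only half of each.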

    The page cache always seems to assume that the blocks for a page will
    conform to the sb->s_blocksize value (which is OK). However, what's
    stilted about this model is that the underlying logic in ll_rw_block()
    will enforce buffer head reads to this blocksize value. My benchmarking
    shows best performance on Linux at 1024 blocksize settings (drivers
    seem optimized to this case). I have found that if I define
    sb->s_blocksize to a certain value, buffer heads get rejected if I
    attempt other values, because set_blocksize() fails if the superblock
    value does not match.

    I can completely get around this problem by setting everything to a
    blocksize of 512 and using buffer heads set to this blocksize. However,
    this eats up more memory and creates more overhead in the ll_rw_block()
    code since it has to merge all these requests, plus there's the overhead
    of allocating tons of buffer heads for all the 512 blocks. It also
    increases (oink) the iterative bmap() calls needed to map a file, since
    smaller blocksizes result in more calls through the page cache
    interface. I guess what I am asking is why there needs to be a
    correlation between the superblock blocksize the page cache uses and an
    artificial enforcement underneath in ll_rw_block() that forces block I/O
    operations to be the same size.
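
    The overhead described above can be put into rough numbers. The sketch
    below is plain arithmetic, not kernel code. It assumes a 4096-byte page
    (PAGE_SIZE_ASSUMED is a name made up for this example) and assumes one
    buffer head and one bmap() call per block, which is how the paragraph
    above describes the cost.

```c
/* Assumed page size for this example; the real value is per-architecture. */
#define PAGE_SIZE_ASSUMED 4096u

/* Buffer heads needed to cover one page, assuming one per block. */
static unsigned buffers_per_page(unsigned blocksize)
{
    return PAGE_SIZE_ASSUMED / blocksize;
}

/* bmap() calls needed to map a whole file, assuming one call per block
 * and rounding a trailing partial block up to a full one. */
static unsigned long long bmap_calls_for_file(unsigned long long file_size,
                                              unsigned blocksize)
{
    return (file_size + blocksize - 1) / blocksize;
}
```

    Under these assumptions a page needs 8 buffer heads at a 512 blocksize
    against 4 at 1024, and mapping a 1 MB file takes 2048 calls at 512
    against 1024 calls at 1024.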

    One would think that ideally, I/O requests would allow variable length
    sector operations up to any size, and a config option to allow the page
    cache to bmap() sizes independently of I/O requests to the disks.
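
    As a rough picture of what variable-length sector operations could look
    like, the userspace sketch below collapses an ascending list of
    512-byte sector numbers into (start, count) extents. It is only an
    illustration: struct extent and coalesce_sectors() are hypothetical
    names, and nothing here reflects the actual ll_rw_block() code.

```c
#include <stddef.h>

/* Hypothetical variable-length request: 'count' consecutive sectors
 * beginning at sector 'start'. */
struct extent {
    unsigned long long start;
    unsigned long long count;
};

/* Collapse an ascending array of sector numbers into contiguous extents.
 * Writes at most out_cap extents to 'out' and returns how many were
 * written; input beyond that capacity is silently dropped. */
static size_t coalesce_sectors(const unsigned long long *sectors, size_t n,
                               struct extent *out, size_t out_cap)
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (used > 0 &&
            out[used - 1].start + out[used - 1].count == sectors[i]) {
            /* Sector directly follows the current extent: extend it. */
            out[used - 1].count++;
        } else {
            if (used == out_cap)
                break;
            /* Gap in the sector list: start a new extent. */
            out[used].start = sectors[i];
            out[used].count = 1;
            used++;
        }
    }
    return used;
}
```

    With this sketch, sectors 8, 9, 10, 11, 20, 21, 40 become three
    requests instead of seven: (8, 4), (20, 2) and (40, 1).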

    I can get around what's there, but it's wasteful of memory and runs
    slower. This is probably a Linus issue.

    Please Advise,

    Jeff


