Re: hfs support for blocksize != 512

From: Alexander Viro (viro@math.psu.edu)
Date: Wed Aug 30 2000 - 00:04:40 EDT

  • Next message: root: "2.4.0-test5 problems with Promise ATA100"

    On Wed, 30 Aug 2000, Roman Zippel wrote:

    > > > hfs. For example reading from a file might require a read from a btree
    > > > file (extent file), with what another file write can be busy with (e.g.
    > > > reordering the btree nodes).
    > >
    > > And?
    >
    > The point is: the thing I like about Linux is its simple interfaces, it's
    > the basic idea of unix - keep it simple. That is true for most parts - the
    > basic idea is simple and the real complexity is hidden behind it. But
    > that's currently not true for vfs interface, a fs maintainer has to fight
    > right now with fscking complex vfs interface and with a possible fscking

    Yes? And it will become simpler if you will put each and every locking
    scheme into the API?

    Look: we have Hans with his trees-all-over-the-place + journal. He has a
    very legitimate need to protect the internal data structures of Reiserfs
    and do it without changing the VFS<->reiserfs interaction whenever he
    decides to change purely internal structures.

    We have ext2 with indirect blocks, inode bitmaps and block bitmaps, one
    per cylinder group + counters in each cylinder group. Should VFS know
    about the internal locking rules? Should it be aware of the fact that
    inodes foo and bar belong to the same cylinder group and if we remove them
    we will need to protect the bitmap for a while?

    We have FAT32 where we've got a nasty allocation data with rather
    interesting locking rules. Should it be protected by VFS? If it should -
    well, I have bad news for you: write() on a file will lock the whole
    filesystem until write() completes. Don't like it for every fs? Tough, it
    will mean that VFS will not protect the thing and fs will have to do it
    itself.

    We have AFFS with totally fscked directory structures. Do you propose to
    make unlink() block all directory operations on the whole fs? No? Too
    bad, because only AFFS knows enough to protect its data structures without
    _that_ locking. Sorry, the only rule that would not require the knowledge
    of layout and would be strong enough to protect is "no directory access
    while unlink() is in progress". Yup, on the whole fs. Hardly acceptable
    even for one filesystem, but try to impose that on everyone and see how
    long you will survive. JPEGs of the murder scene would be appreciated,
    BTW.

    We have HFS with the data structures of its own. You want locking in VFS
    that would protect the things VFS doesn't know about and has no business
    to meddle with? Fine, post the locking rules.

    It's insane - protection of purely internal data structures belongs to the
    module that knows about them. Generic stuff can, should be and _is_
    protected. Private one _can't_ be protected without either horribly
    crippled system (see above) or putting the knowledge of each data
    structure into the generic layer. And the latter will be on the author of
    filesystem anyway, because only he knows what rules he need.

    Please, propose your magical locking scheme that will protect everything
    on every fs. And let maintainers of filesystems tell you whether it is
    sufficient. Then check what's left after that locking - e.g. can two
    processes access the same fs at the same time or not?

    If you are complaining about the fact that maintaining complex data
    structures in multithreaded program (which kernel is) may be, well,
    complex - welcome to reality. It had been that way since the very
    beginning on _all_ MT projects, Linux included. You have complex private
    data - you may be in for pain protecting yourself from races. Protection
    of the public structures is there, so life became easier than it used to
    be back in 2.0/2.1/2.2 days.

    Making VFS single-threaded will not fly. If you can show simpler MT one -
    do it and a lot of people will be extremely grateful. 4.4BSD and SunOS
    ones are more complex and make the life harder for filesystem writers.
    Check yourself. OSF/1 is _much_ more complex. Hell knows what NT has, but
    filesystem glue there looks absolutely horrible - compared to them we are
    angels in that respect. v7 was simpler, sure enough. Without mmap(),
    rename() and truncate() _and_ with only one fs type - why not? Too bad
    that it was racey as hell... Plan 9 is nice and easy. Without mmap(),
    without link(), without truncate(), without cross-directory rename() and
    without support of crazy abortions from hell a-la AFFS. 2.0 and 2.2 are
    _way_ more complex, just compare filesystem code size in 2.4 with them and
    you will see. And yes, races in question are not new. I can reproduce them
    on 2.0.9 box. Single-processor one - nothing fancy and SMP-related.

    If you have a way to simplify VFS and/or filesystems - by all means,
    post it on fsdevel/l-k. Just tell what locking warranties you provide.
    Current ones are documented in the tree, so it will be very easy to
    compare. I'm not saying that they are ideal (check the documentation in
    question - I'm saying the opposite in quite a few cases). They _can_ be
    made better. But if you are saying that you know how to protect purely
    internal data structures without losing MT VFS - sorry, I will believe it
    when I see it. Go ahead, prove me wrong.

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Wed Aug 30 2000 - 00:06:31 EDT