Re: hfs support for blocksize != 512

From: Jeff V. Merkey (jmerkey@timpanogas.com)
Date: Wed Aug 30 2000 - 00:05:48 EDT

    I concur with this appraisal from Al Viro. Single-threading the VFS is
    going backwards -- not a good idea.

    :-)

    Jeff

    Alexander Viro wrote:
    >
    > On Wed, 30 Aug 2000, Roman Zippel wrote:
    >
    > > > > hfs. For example, reading from a file might require a read from a btree
    > > > > file (the extent file), which a write to another file may be busy with
    > > > > (e.g. reordering the btree nodes).
    > > >
    > > > And?
    > >
    > > The point is: the thing I like about Linux is its simple interfaces; it's
    > > the basic idea of Unix - keep it simple. That is true for most parts - the
    > > basic idea is simple and the real complexity is hidden behind it. But
    > > that's currently not true for the vfs interface; a fs maintainer has to fight
    > > right now with a fscking complex vfs interface and with a possible fscking
    >
    > Yes? And it will become simpler if you put each and every locking
    > scheme into the API?
    >
    > Look: we have Hans with his trees-all-over-the-place + journal. He has a
    > very legitimate need to protect the internal data structures of Reiserfs
    > and do it without changing the VFS<->reiserfs interaction whenever he
    > decides to change purely internal structures.
    >
    > We have ext2 with indirect blocks, inode bitmaps and block bitmaps, one
    > per cylinder group + counters in each cylinder group. Should VFS know
    > about the internal locking rules? Should it be aware of the fact that
    > inodes foo and bar belong to the same cylinder group and if we remove them
    > we will need to protect the bitmap for a while?
    >
    > We have FAT32 where we've got nasty allocation data with rather
    > interesting locking rules. Should it be protected by VFS? If it should -
    > well, I have bad news for you: write() on a file will lock the whole
    > filesystem until write() completes. Don't like it for every fs? Tough, it
    > will mean that VFS will not protect the thing and fs will have to do it
    > itself.
    >
    > We have AFFS with totally fscked directory structures. Do you propose to
    > make unlink() block all directory operations on the whole fs? No? Too
    > bad, because only AFFS knows enough to protect its data structures without
    > _that_ locking. Sorry, the only rule that would not require the knowledge
    > of layout and would be strong enough to protect is "no directory access
    > while unlink() is in progress". Yup, on the whole fs. Hardly acceptable
    > even for one filesystem, but try to impose that on everyone and see how
    > long you will survive. JPEGs of the murder scene would be appreciated,
    > BTW.
    >
    > We have HFS with data structures of its own. You want locking in VFS
    > that would protect the things VFS doesn't know about and has no business
    > to meddle with? Fine, post the locking rules.
    >
    > It's insane - protection of purely internal data structures belongs to the
    > module that knows about them. Generic stuff can, should be and _is_
    > protected. Private stuff _can't_ be protected without either a horribly
    > crippled system (see above) or putting the knowledge of each data
    > structure into the generic layer. And the latter will be on the author of
    > the filesystem anyway, because only he knows what rules he needs.
    >
    > Please, propose your magical locking scheme that will protect everything
    > on every fs. And let maintainers of filesystems tell you whether it is
    > sufficient. Then check what's left after that locking - e.g. can two
    > processes access the same fs at the same time or not?
    >
    > If you are complaining about the fact that maintaining complex data
    > structures in a multithreaded program (which the kernel is) may be, well,
    > complex - welcome to reality. It has been that way since the very
    > beginning on _all_ MT projects, Linux included. You have complex private
    > data - you may be in for pain protecting yourself from races. Protection
    > of the public structures is there, so life became easier than it used to
    > be back in 2.0/2.1/2.2 days.
    >
    > Making VFS single-threaded will not fly. If you can show a simpler MT one -
    > do it and a lot of people will be extremely grateful. The 4.4BSD and SunOS
    > ones are more complex and make life harder for filesystem writers.
    > Check for yourself. OSF/1 is _much_ more complex. Hell knows what NT has, but
    > the filesystem glue there looks absolutely horrible - compared to them we are
    > angels in that respect. v7 was simpler, sure enough. Without mmap(),
    > rename() and truncate() _and_ with only one fs type - why not? Too bad
    > that it was racy as hell... Plan 9 is nice and easy. Without mmap(),
    > without link(), without truncate(), without cross-directory rename() and
    > without support of crazy abortions from hell a la AFFS. 2.0 and 2.2 are
    > _way_ more complex, just compare filesystem code size in 2.4 with them and
    > you will see. And yes, the races in question are not new. I can reproduce them
    > on a 2.0.9 box. A single-processor one - nothing fancy or SMP-related.
    >
    > If you have a way to simplify VFS and/or filesystems - by all means,
    > post it on fsdevel/l-k. Just say what locking guarantees you provide.
    > Current ones are documented in the tree, so it will be very easy to
    > compare. I'm not saying that they are ideal (check the documentation in
    > question - I'm saying the opposite in quite a few cases). They _can_ be
    > made better. But if you are saying that you know how to protect purely
    > internal data structures without losing MT VFS - sorry, I will believe it
    > when I see it. Go ahead, prove me wrong.
    >
    > -
    > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    > the body of a message to majordomo@vger.kernel.org
    > Please read the FAQ at http://www.tux.org/lkml/
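
    The ext2 example in the quoted message (bitmaps and counters kept per
    cylinder group) can be sketched in userspace C. This is only an
    illustration under stated assumptions: the toyfs_* names, NGROUPS and
    INODES_PER_GROUP are hypothetical, and POSIX mutexes stand in for
    whatever primitives a real kernel filesystem would use. It is not ext2
    code. The point it tries to show is that the lock lives next to the
    private data it protects, so the generic layer needs no knowledge of it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NGROUPS 4            /* hypothetical number of cylinder groups */
#define INODES_PER_GROUP 32  /* hypothetical; one bitmap bit per inode */

/* Private to the toy filesystem: the generic layer never sees this. */
struct toyfs_group {
    pthread_mutex_t lock;    /* protects bitmap and free_count */
    uint32_t bitmap;         /* bit set = inode in use */
    unsigned free_count;
};

static struct toyfs_group groups[NGROUPS];

void toyfs_init(void)
{
    for (int g = 0; g < NGROUPS; g++) {
        pthread_mutex_init(&groups[g].lock, NULL);
        groups[g].bitmap = 0;
        groups[g].free_count = INODES_PER_GROUP;
    }
}

/* Allocate the lowest free inode in group g; return its number, or -1 if full. */
int toyfs_alloc_inode(unsigned g)
{
    int ino = -1;

    assert(g < NGROUPS);
    pthread_mutex_lock(&groups[g].lock);       /* only this group is locked */
    for (unsigned bit = 0; bit < INODES_PER_GROUP; bit++) {
        if (!(groups[g].bitmap & (UINT32_C(1) << bit))) {
            groups[g].bitmap |= UINT32_C(1) << bit;
            groups[g].free_count--;
            ino = (int)(g * INODES_PER_GROUP + bit);
            break;
        }
    }
    pthread_mutex_unlock(&groups[g].lock);
    return ino;
}

/* Release inode ino; freeing an inode that is not in use is a no-op. */
void toyfs_free_inode(unsigned ino)
{
    unsigned g = ino / INODES_PER_GROUP;
    unsigned bit = ino % INODES_PER_GROUP;

    assert(g < NGROUPS);
    pthread_mutex_lock(&groups[g].lock);
    if (groups[g].bitmap & (UINT32_C(1) << bit)) {
        groups[g].bitmap &= ~(UINT32_C(1) << bit);
        groups[g].free_count++;
    }
    pthread_mutex_unlock(&groups[g].lock);
}

/* Read the free counter of group g under that group's lock. */
unsigned toyfs_free_count(unsigned g)
{
    unsigned n;

    assert(g < NGROUPS);
    pthread_mutex_lock(&groups[g].lock);
    n = groups[g].free_count;
    pthread_mutex_unlock(&groups[g].lock);
    return n;
}
```

    In this sketch two callers working in different groups never contend.
    A single filesystem-wide lock imposed from outside would serialize
    them, which is the "horribly crippled system" alternative described in
    the quoted message.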



    This archive was generated by hypermail 2b29 : Wed Aug 30 2000 - 00:12:03 EDT