[PATCH] 0-byte read()/write() behaviour

From: Philipp Rumpf (prumpf@parcelfarce.linux.theplanet.co.uk)
Date: Fri Oct 20 2000 - 13:30:49 EDT

  • Next message: kuznet@ms2.inr.ac.ru: "Re: TCP: peer x.x.x.x:y/z shrinks window a:b:c..."

    Single Unix specifies that 0-byte reads, as well as 0-byte writes, should
    "return 0 and have no other results". Our current implementation violates
    the first requirement and makes it very easy to violate the second one.

     read(page_cache_fd, invalid_ptr, 0) returns -EFAULT; IMHO, this is a
    clear bug - suppose you have managed to acquire the page covering virtual
    addresses up to and including PAGE_OFFSET-1 (on x86).
    read(fd, PAGE_OFFSET-n, n) now is successful for PAGE_SIZE >= n > 0, but
    fails with -EFAULT for n = 0.

     looking through drivers/char/*.c, many drivers seem to behave unexpectedly
    for a 0-byte read; this includes code that overwrites the byte read()'s
    second argument points to, endless loops, and several different error
    returns.

     generic_file_read and generice_file_write, which I suppose are the most
    critical functions for read()/write() performance, both have special cases
    to handle the count == 0 case.

    So I think modifying sys_read and sys_write to take care of the count == 0
    case and only calling the file-specific function when it's actually
    necessary makes most sense; as generic_file_(read|write) get simplified
    accordingly, there should be no performance impact.

    Linus ?

    patch against test10-pre4

    diff -ur linux/fs/read_write.c linux-prumpf/fs/read_write.c
    --- linux/fs/read_write.c Fri Oct 20 04:52:58 2000
    +++ linux-prumpf/fs/read_write.c Fri Oct 20 06:03:23 2000
    @@ -120,6 +120,10 @@
             ssize_t ret;
             struct file * file;
     
    + /* Required for SU compatibility. It also simplifies generic_file_read */
    + if (!count)
    + return 0;
    +
             ret = -EBADF;
             file = fget(fd);
             if (file) {
    @@ -145,6 +149,10 @@
     {
             ssize_t ret;
             struct file * file;
    +
    + /* Required for SU compatibility. */
    + if (!count)
    + return 0;
     
             ret = -EBADF;
             file = fget(fd);
    diff -ur linux/mm/filemap.c linux-prumpf/mm/filemap.c
    --- linux/mm/filemap.c Fri Oct 20 05:05:04 2000
    +++ linux-prumpf/mm/filemap.c Fri Oct 20 06:08:41 2000
    @@ -1124,27 +1124,18 @@
      */
     ssize_t generic_file_read(struct file * filp, char * buf, size_t count, loff_t *ppos)
     {
    - ssize_t retval;
    + read_descriptor_t desc;
     
    - retval = -EFAULT;
    - if (access_ok(VERIFY_WRITE, buf, count)) {
    - retval = 0;
    -
    - if (count) {
    - read_descriptor_t desc;
    -
    - desc.written = 0;
    - desc.count = count;
    - desc.buf = buf;
    - desc.error = 0;
    - do_generic_file_read(filp, ppos, &desc, file_read_actor);
    -
    - retval = desc.written;
    - if (!retval)
    - retval = desc.error;
    - }
    - }
    - return retval;
    + if (!access_ok(VERIFY_WRITE, buf, count))
    + return -EFAULT;
    +
    + desc.written = 0;
    + desc.count = count;
    + desc.buf = buf;
    + desc.error = 0;
    + do_generic_file_read(filp, ppos, &desc, file_read_actor);
    +
    + return desc.written ? : desc.error;
     }
     
     static int file_send_actor(read_descriptor_t * desc, struct page *page, unsigned long offset , unsigned long size)
    @@ -2424,13 +2415,12 @@
             }
     
             status = 0;
    - if (count) {
    - remove_suid(inode);
    - inode->i_ctime = inode->i_mtime = CURRENT_TIME;
    - mark_inode_dirty(inode);
    - }
     
    - while (count) {
    + remove_suid(inode);
    + inode->i_ctime = inode->i_mtime = CURRENT_TIME;
    + mark_inode_dirty(inode);
    +
    + do {
                     unsigned long bytes, index, offset;
                     char *kaddr;
     
    @@ -2480,7 +2470,7 @@
     
                     if (status < 0)
                             break;
    - }
    + } while (count);
             *ppos = pos;
     
             if (cached_page)
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    Please read the FAQ at http://www.tux.org/lkml/



    This archive was generated by hypermail 2b29 : Fri Oct 20 2000 - 13:35:24 EDT