<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/fs/block_dev.c, branch linux-4.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2017-05-17T19:08:24Z</updated>
<entry>
<title>fs/block_dev: always invalidate cleancache in invalidate_bdev()</title>
<updated>2017-05-17T19:08:24Z</updated>
<author>
<name>Andrey Ryabinin</name>
<email>aryabinin@virtuozzo.com</email>
</author>
<published>2017-05-03T21:56:02Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=ae5175c8dba4ce0c8cd7da54cccf98295ab149f9'/>
<id>urn:sha1:ae5175c8dba4ce0c8cd7da54cccf98295ab149f9</id>
<content type='text'>
[ Upstream commit a5f6a6a9c72eac38a7fadd1a038532bc8516337c ]

invalidate_bdev() calls cleancache_invalidate_inode() iff -&gt;nrpages != 0,
which doesn't make any sense.

Make sure that invalidate_bdev() always calls cleancache_invalidate_inode()
regardless of mapping-&gt;nrpages value.
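
A minimal sketch of the control-flow change, as a toy Python model rather
than the kernel code. The Mapping class, its cleancache_pages field and
both function names are hypothetical; the sketch assumes cleancache can
still hold pages for a mapping whose nrpages is zero.

```python
class Mapping:
    """Toy stand-in for an address_space (illustrative only)."""

    def __init__(self, nrpages, cleancache_pages):
        self.nrpages = nrpages                    # pages in the page cache
        self.cleancache_pages = cleancache_pages  # pages held by cleancache


def invalidate_bdev_old(mapping):
    """Old behaviour: returns early, so cleancache is skipped at nrpages == 0."""
    if mapping.nrpages == 0:
        return
    mapping.nrpages = 0
    mapping.cleancache_pages = 0


def invalidate_bdev_new(mapping):
    """New behaviour: cleancache is invalidated unconditionally."""
    if mapping.nrpages:
        mapping.nrpages = 0
    mapping.cleancache_pages = 0


# Assumed scenario: page cache already empty, cleancache still populated.
stale_old = Mapping(nrpages=0, cleancache_pages=3)
stale_new = Mapping(nrpages=0, cleancache_pages=3)
invalidate_bdev_old(stale_old)
invalidate_bdev_new(stale_new)
print(stale_old.cleancache_pages, stale_new.cleancache_pages)
```

In this model only the new variant leaves cleancache empty.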

Fixes: c515e1fd361c ("mm/fs: add hooks to support cleancache")
Link: http://lkml.kernel.org/r/20170424164135.22350-3-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Acked-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Alexey Kuznetsov &lt;kuznet@virtuozzo.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Nikolay Borisov &lt;n.borisov.lkml@gmail.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
</content>
</entry>
<entry>
<title>block_dev: don't test bdev-&gt;bd_contains when it is not stable</title>
<updated>2017-01-13T01:56:57Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2016-12-12T15:21:51Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5eba61298b748b9f948a1de1055e84e620299d81'/>
<id>urn:sha1:5eba61298b748b9f948a1de1055e84e620299d81</id>
<content type='text'>
[ Upstream commit bcc7f5b4bee8e327689a4d994022765855c807ff ]

bdev-&gt;bd_contains is not stable before calling __blkdev_get().
When __blkdev_get() is called on a partition with -&gt;bd_openers == 0
it sets
  bdev-&gt;bd_contains = bdev;
which is not correct for a partition.
After a call to __blkdev_get() succeeds, -&gt;bd_openers will be &gt; 0
and then -&gt;bd_contains is stable.

When FMODE_EXCL is used, blkdev_get() calls
   bd_start_claiming() -&gt;  bd_prepare_to_claim() -&gt; bd_may_claim()

This call happens before __blkdev_get() is called, so -&gt;bd_contains
is not stable.  So bd_may_claim() cannot safely use -&gt;bd_contains.
It currently tries to use it, and this can lead to a BUG_ON().

This happens when a whole device is already open with a bd_holder (in
use by dm in my particular example) and two threads race to open a
partition of that device for the first time, one opening with O_EXCL and
one without.

The thread that doesn't use O_EXCL gets through blkdev_get() to
__blkdev_get(), gains the -&gt;bd_mutex, and sets bdev-&gt;bd_contains = bdev;

Immediately thereafter the other thread, using FMODE_EXCL, calls
bd_start_claiming() from blkdev_get().  This should fail because the
whole device has a holder, but because bdev-&gt;bd_contains == bdev
bd_may_claim() incorrectly reports success.
This thread continues and blocks on bd_mutex.

The first thread then sets bdev-&gt;bd_contains correctly and drops the mutex.
The thread using FMODE_EXCL then continues and when it calls bd_may_claim()
again in:
			BUG_ON(!bd_may_claim(bdev, whole, holder));
The BUG_ON fires.

Fix this by removing the dependency on -&gt;bd_contains in
bd_may_claim().  As bd_may_claim() has direct access to the whole
device, it can simply test if the target bdev is the whole device.
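
The difference can be sketched with a toy Python model. This is not the
kernel code: the BlockDevice class, the device names and the two function
names are hypothetical, and only the branches relevant to this race are
modelled.

```python
class BlockDevice:
    """Toy stand-in for struct block_device (illustrative, not kernel code)."""

    def __init__(self, name):
        self.name = name
        self.bd_holder = None    # current exclusive holder, if any
        self.bd_contains = None  # only trustworthy once the first open is done


def bd_may_claim_old(bdev, whole, holder):
    """Pre-fix logic: trusts bdev.bd_contains to detect a whole device."""
    if bdev.bd_holder is holder:
        return True                 # already held by this holder
    if bdev.bd_holder is not None:
        return False                # held by someone else
    if bdev.bd_contains is bdev:
        return True                 # looks like an unheld whole device
    return whole.bd_holder is None  # partition: fine only if whole is unheld


def bd_may_claim_new(bdev, whole, holder):
    """Post-fix logic: compares against the whole device passed in."""
    if bdev.bd_holder is holder:
        return True
    if bdev.bd_holder is not None:
        return False
    if bdev is whole:
        return True                 # really is the unheld whole device
    return whole.bd_holder is None


dm_holder, fs_holder = object(), object()
whole = BlockDevice("sda")
whole.bd_contains = whole
whole.bd_holder = dm_holder  # whole device already claimed, e.g. by dm
part = BlockDevice("sda1")
part.bd_contains = part      # transient value set by a racing first open

old_result = bd_may_claim_old(part, whole, fs_holder)
new_result = bd_may_claim_new(part, whole, fs_holder)
print(old_result, new_result)  # prints: True False
```

The old check wrongly allows the claim while bd_contains holds its
transient value; the new check refuses it regardless of that field.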

Fixes: 6b4517a7913a ("block: implement bd_claiming and claiming block")
Cc: stable@vger.kernel.org (v2.6.35+)
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
</content>
</entry>
<entry>
<title>block: protect iterate_bdevs() against concurrent close</title>
<updated>2017-01-13T01:56:55Z</updated>
<author>
<name>Rabin Vincent</name>
<email>rabinv@axis.com</email>
</author>
<published>2016-12-01T08:18:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=73e7d7aef06c4f590c539844782d3720c7958efa'/>
<id>urn:sha1:73e7d7aef06c4f590c539844782d3720c7958efa</id>
<content type='text'>
[ Upstream commit af309226db916e2c6e08d3eba3fa5c34225200c4 ]

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev-&gt;bd_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [&lt;ffffffff81314790&gt;] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ #400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[&lt;ffffffff81314790&gt;]  [&lt;ffffffff81314790&gt;] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
  [&lt;ffffffff8112e7f5&gt;] __filemap_fdatawrite_range+0x85/0x90
  [&lt;ffffffff8112e81f&gt;] filemap_fdatawrite+0x1f/0x30
  [&lt;ffffffff811b25d6&gt;] fdatawrite_one_bdev+0x16/0x20
  [&lt;ffffffff811bc402&gt;] iterate_bdevs+0xf2/0x130
  [&lt;ffffffff811b2763&gt;] sys_sync+0x63/0x90
  [&lt;ffffffff815d4272&gt;] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 &lt;48&gt; 8b 80 08 05 00 00 5d
 RIP  [&lt;ffffffff81314790&gt;] blk_get_backing_dev_info+0x10/0x20
  RSP &lt;ffff880009f5fe68&gt;
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_bdevs():

 while :; do head -c1 /dev/nullb0; done &gt; /dev/null &amp; while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.
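
A rough Python analogue of that locking pattern, using threading.Lock in
place of the kernel mutex. The BlockDevice class, the device names and the
callback body are hypothetical and exist only for illustration.

```python
import threading


class BlockDevice:
    """Toy stand-in for struct block_device (illustrative only)."""

    def __init__(self, name, openers=0):
        self.name = name
        self.bd_mutex = threading.Lock()
        self.bd_openers = openers
        # In this model bd_disk is only valid while the device is open.
        self.bd_disk = object() if openers else None


def iterate_bdevs(bdevs, func, arg):
    """Call func on each open device while holding that device's mutex."""
    for bdev in bdevs:
        with bdev.bd_mutex:      # a concurrent close has to wait for func()
            if bdev.bd_openers:  # skip devices that are not open
                func(bdev, arg)


def fdatawrite_one_bdev(bdev, visited):
    """Callback that would crash on a closed device without the check."""
    assert bdev.bd_disk is not None
    visited.append(bdev.name)


devices = [BlockDevice("nullb0", openers=1), BlockDevice("nullb1", openers=0)]
visited = []
iterate_bdevs(devices, fdatawrite_one_bdev, visited)
print(visited)  # prints: ['nullb0']
```

Because the callback runs under the mutex, a close of the same device
cannot clear bd_disk halfway through func() in this model.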

Cc: stable@vger.kernel.org
Fixes: 5c0d6b60a0ba ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang &lt;fangwei1@huawei.com&gt;
Signed-off-by: Rabin Vincent &lt;rabinv@axis.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
</content>
</entry>
<entry>
<title>blockdev: don't set S_DAX for misaligned partitions</title>
<updated>2015-10-22T21:43:13Z</updated>
<author>
<name>Jeff Moyer</name>
<email>jmoyer@redhat.com</email>
</author>
<published>2015-08-14T20:15:32Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=eb218995a9031ed04fb5e0566fac5ab21bfe4b4e'/>
<id>urn:sha1:eb218995a9031ed04fb5e0566fac5ab21bfe4b4e</id>
<content type='text'>
commit f0b2e563bc419df7c1b3d2f494574c25125f6aed upstream.

The dax code doesn't currently support misaligned partitions,
so disable O_DIRECT via dax until such time as that support
materializes.

Suggested-by: Boaz Harrosh &lt;boaz@plexistor.com&gt;
Signed-off-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>direct-io: only inc/dec inode-&gt;i_dio_count for file systems</title>
<updated>2015-04-24T19:45:28Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-04-15T23:05:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=fe0f07d08ee35fb13d2cb048970072fe4f71ad14'/>
<id>urn:sha1:fe0f07d08ee35fb13d2cb048970072fe4f71ad14</id>
<content type='text'>
do_blockdev_direct_IO() increments and decrements the inode
-&gt;i_dio_count for each IO operation. It does this to protect against
truncate of a file. Block devices don't need this sort of protection.

For a capable multiqueue setup, this atomic int is the only shared
state between applications accessing the device for O_DIRECT, and it
presents a scaling wall for that. In my testing, as much as 30% of
system time is spent incrementing and decrementing this value. A mixed
read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with
better latencies too. Before:

clat percentiles (usec):
 |  1.00th=[   33],  5.00th=[   34], 10.00th=[   34], 20.00th=[   34],
 | 30.00th=[   34], 40.00th=[   34], 50.00th=[   35], 60.00th=[   35],
 | 70.00th=[   35], 80.00th=[   35], 90.00th=[   37], 95.00th=[   80],
 | 99.00th=[   98], 99.50th=[  151], 99.90th=[  155], 99.95th=[  155],
 | 99.99th=[  165]

After:

clat percentiles (usec):
 |  1.00th=[   95],  5.00th=[  108], 10.00th=[  129], 20.00th=[  149],
 | 30.00th=[  155], 40.00th=[  161], 50.00th=[  167], 60.00th=[  171],
 | 70.00th=[  177], 80.00th=[  185], 90.00th=[  201], 95.00th=[  270],
 | 99.00th=[  390], 99.50th=[  398], 99.90th=[  418], 99.95th=[  422],
 | 99.99th=[  438]

In other setups, Robert Elliott reported seeing good performance
improvements:

https://lkml.org/lkml/2015/4/3/557

The more applications accessing the device, the worse it gets.

Add a new direct-io flag, DIO_SKIP_DIO_COUNT, which tells
do_blockdev_direct_IO() that it need not worry about incrementing
or decrementing the inode i_dio_count for this caller.
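
The effect of the flag can be sketched with a toy Python model. The
DioFlags enum (with arbitrary values), the Inode class and do_direct_io()
are hypothetical stand-ins, not the kernel interfaces; counter_updates
merely tallies how often the shared counter would be touched.

```python
import enum


class DioFlags(enum.Flag):
    """Illustrative flag set; the values are arbitrary, not the kernel's."""
    NONE = 0
    LOCKING = enum.auto()
    SKIP_DIO_COUNT = enum.auto()


class Inode:
    """Toy inode with an in-flight direct I/O counter."""

    def __init__(self):
        self.i_dio_count = 0      # direct I/Os currently in flight
        self.counter_updates = 0  # how often the shared counter was written


def do_direct_io(inode, flags):
    """Model one direct I/O, touching i_dio_count only when asked to."""
    count_io = DioFlags.SKIP_DIO_COUNT not in flags
    if count_io:
        inode.i_dio_count += 1
        inode.counter_updates += 1
    # ... the I/O itself would be submitted and completed here ...
    if count_io:
        inode.i_dio_count -= 1
        inode.counter_updates += 1


file_inode, bdev_inode = Inode(), Inode()
for _ in range(1000):
    do_direct_io(file_inode, DioFlags.LOCKING)         # file system caller
    do_direct_io(bdev_inode, DioFlags.SKIP_DIO_COUNT)  # block device caller
print(file_inode.counter_updates, bdev_inode.counter_updates)
```

In this model the block device caller never writes the shared counter,
which is the contention the flag is meant to avoid.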

Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: Elliott, Robert (Server Storage) &lt;elliott@hp.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>VFS: assorted d_backing_inode() annotations</title>
<updated>2015-04-15T19:06:59Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2015-03-17T22:26:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=bb668734c4c960c8f61f017585b323b97e5f47b5'/>
<id>urn:sha1:bb668734c4c960c8f61f017585b323b97e5f47b5</id>
<content type='text'>
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>blkdev_write_iter: expand generic_file_checks() call in there</title>
<updated>2015-04-12T02:29:48Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2015-04-07T15:35:14Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=7ec7b94a3339756dfbb88234e3e45a428e8c08fb'/>
<id>urn:sha1:7ec7b94a3339756dfbb88234e3e45a428e8c08fb</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>lift generic_write_checks() into callers of __generic_file_write_iter()</title>
<updated>2015-04-12T02:29:47Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2015-04-07T15:28:12Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5f380c7fa7e01f15ca0816bd241ece9a64a73192'/>
<id>urn:sha1:5f380c7fa7e01f15ca0816bd241ece9a64a73192</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>direct_IO: remove rw from a_ops-&gt;direct_IO()</title>
<updated>2015-04-12T02:29:45Z</updated>
<author>
<name>Omar Sandoval</name>
<email>osandov@osandov.com</email>
</author>
<published>2015-03-16T11:33:53Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=22c6186ecea0be9eff1c399298ad36e94a59995f'/>
<id>urn:sha1:22c6186ecea0be9eff1c399298ad36e94a59995f</id>
<content type='text'>
Now that no one is using rw, remove it completely.

Signed-off-by: Omar Sandoval &lt;osandov@osandov.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>Remove rw from {,__,do_}blockdev_direct_IO()</title>
<updated>2015-04-12T02:29:44Z</updated>
<author>
<name>Omar Sandoval</name>
<email>osandov@osandov.com</email>
</author>
<published>2015-03-16T11:33:50Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=17f8c842d24ac054e4212c82b5bd6ae455a334f3'/>
<id>urn:sha1:17f8c842d24ac054e4212c82b5bd6ae455a334f3</id>
<content type='text'>
Most filesystems call through to these at some point, so we'll start
here.

Signed-off-by: Omar Sandoval &lt;osandov@osandov.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
</feed>
