<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/block, branch linux-5.19.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.19.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.19.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2022-10-24T07:58:30Z</updated>
<entry>
<title>blk-wbt: fix that 'rwb-&gt;wc' is always set to 1 in wbt_init()</title>
<updated>2022-10-24T07:58:30Z</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2022-10-09T10:10:38Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=7a5dc0f4bc45ea8633d1248e1fffa4ce92c155b1'/>
<id>urn:sha1:7a5dc0f4bc45ea8633d1248e1fffa4ce92c155b1</id>
<content type='text'>
commit 285febabac4a16655372d23ff43e89ff6f216691 upstream.

commit 8c5035dfbb94 ("blk-wbt: call rq_qos_add() after wb_normal is
initialized") moves wbt_set_write_cache() before rq_qos_add(), which
is wrong because wbt_rq_qos() is still NULL.

Fix the problem by removing wbt_set_write_cache() and setting 'rwb-&gt;wc'
directly. Note that this patch also removes the redundant setting of
'rwb-&gt;wc'.

Fixes: 8c5035dfbb94 ("blk-wbt: call rq_qos_add() after wb_normal is initialized")
Reported-by: kernel test robot &lt;yujie.liu@intel.com&gt;
Link: https://lore.kernel.org/r/202210081045.77ddf59b-yujie.liu@intel.com
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/r/20221009101038.1692875-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>blk-mq: use quiesced elevator switch when reinitializing queues</title>
<updated>2022-10-24T07:58:28Z</updated>
<author>
<name>Keith Busch</name>
<email>kbusch@kernel.org</email>
</author>
<published>2022-09-27T15:56:52Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=63a681bcc32a43528ce0f690569f7f48e59c3963'/>
<id>urn:sha1:63a681bcc32a43528ce0f690569f7f48e59c3963</id>
<content type='text'>
[ Upstream commit 8237c01f1696bc53c470493bf1fe092a107648a6 ]

The hctx's run_work may be racing with the elevator switch when
reinitializing hardware queues. The queue is merely frozen in this
context, but that only prevents requests from allocating and doesn't
stop the hctx work from running. The work may get an elevator pointer
that's being torn down, and can result in use-after-free errors and
kernel panics (example below). Use the quiesced elevator switch instead,
and make the previous one static since it is now only used locally.

  nvme nvme0: resetting controller
  nvme nvme0: 32/0/0 default/read/poll queues
  BUG: kernel NULL pointer dereference, address: 0000000000000008
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 80000020c8861067 P4D 80000020c8861067 PUD 250f8c8067 PMD 0
  Oops: 0000 [#1] SMP PTI
  Workqueue: kblockd blk_mq_run_work_fn
  RIP: 0010:kyber_has_work+0x29/0x70

...

  Call Trace:
   __blk_mq_do_dispatch_sched+0x83/0x2b0
   __blk_mq_sched_dispatch_requests+0x12e/0x170
   blk_mq_sched_dispatch_requests+0x30/0x60
   __blk_mq_run_hw_queue+0x2b/0x50
   process_one_work+0x1ef/0x380
   worker_thread+0x2d/0x3e0

Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20220927155652.3260724-1-kbusch@fb.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>blk-throttle: prevent overflow while calculating wait time</title>
<updated>2022-10-24T07:58:25Z</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2022-08-29T02:22:38Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=cc6f0855bf8d9b729df28ff443ced7350c380dbd'/>
<id>urn:sha1:cc6f0855bf8d9b729df28ff443ced7350c380dbd</id>
<content type='text'>
[ Upstream commit 8d6bbaada2e0a65f9012ac4c2506460160e7237a ]

Code review found that 'bps_limit * jiffy_elapsed_rnd' in
tg_with_in_bps_limit() might overflow. Fix the problem by calling
mul_u64_u64_div_u64() instead.
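
As a rough illustration of the overflow (a Python sketch, not kernel code),
the model below emulates 64-bit wraparound for the naive product and compares
it with a multiply-then-divide that keeps the full-width intermediate, which
is the behaviour mul_u64_u64_div_u64() provides. The helper names, the HZ
value and the sample inputs are assumptions chosen for the example.

```python
U64_MOD = 2**64  # unsigned 64-bit arithmetic wraps modulo 2**64
HZ = 1000        # assumed tick rate, for illustration only


def bytes_allowed_naive(bps_limit, jiffy_elapsed_rnd):
    """Emulate a u64 product that wraps before the division."""
    product = (bps_limit * jiffy_elapsed_rnd) % U64_MOD
    return product // HZ


def bytes_allowed_wide(bps_limit, jiffy_elapsed_rnd):
    """Keep the full-width product, then divide (models mul_u64_u64_div_u64)."""
    return (bps_limit * jiffy_elapsed_rnd) // HZ


# Hypothetical inputs: a very large limit and a 100-jiffy window.
limit = 2**60
elapsed = 100
print(bytes_allowed_naive(limit, elapsed))  # wrapped, far too small
print(bytes_allowed_wide(limit, elapsed))   # mathematically correct quotient
```

For small inputs the two helpers agree; they only diverge once the product
no longer fits in 64 bits.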

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Link: https://lore.kernel.org/r/20220829022240.3348319-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>blk-wbt: call rq_qos_add() after wb_normal is initialized</title>
<updated>2022-10-24T07:56:59Z</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2022-09-13T10:57:49Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=dd54b94e72d0cb307dfec677518ac33f6a8ea483'/>
<id>urn:sha1:dd54b94e72d0cb307dfec677518ac33f6a8ea483</id>
<content type='text'>
commit 8c5035dfbb9475b67c82b3fdb7351236525bf52b upstream.

Our testing found that the wbt inflight counter can become negative, which
causes an I/O hang (note that this problem doesn't exist in mainline):

t1: device create	t2: issue io
add_disk
 blk_register_queue
  wbt_enable_default
   wbt_init
    rq_qos_add
    // wb_normal is still 0
			/*
			 * in mainline, disk can't be opened before
			 * bdev_add(), however, in old kernels, disk
			 * can be opened before blk_register_queue().
			 */
			blkdev_issue_flush
                        // disk size is 0, however, it's not checked
                         submit_bio_wait
                          submit_bio
                           blk_mq_submit_bio
                            rq_qos_throttle
                             wbt_wait
			      bio_to_wbt_flags
                               rwb_enabled
			       // wb_normal is 0, inflight is not increased

    wbt_queue_depth_changed(&amp;rwb-&gt;rqos);
     wbt_update_limits
     // wb_normal is initialized
                            rq_qos_track
                             wbt_track
                              rq-&gt;wbt_flags |= bio_to_wbt_flags(rwb, bio);
			      // wb_normal is not 0, wbt_flags will be set
t3: io completion
blk_mq_free_request
 rq_qos_done
  wbt_done
   wbt_is_tracked
   // return true
   __wbt_done
    wbt_rqw_done
     atomic_dec_return(&amp;rqw-&gt;inflight);
     // inflight is decreased

commit 8235b5c1e8c1 ("block: call bdev_add later in device_add_disk") can
avoid this problem; however, it's better to fix this problem in wbt:

1) Older kernels can't backport that commit because of extensive refactoring.
2) The root cause is that wbt calls rq_qos_add() before wb_normal is
initialized.
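
The interleaving above can be sketched with a toy model (Python, not kernel
code). The class, its method names and the value assigned to wb_normal are
assumptions made for illustration; only the ordering of the steps follows
the trace.

```python
class ToyWbt:
    """Toy model of the inflight accounting; not the kernel data structure."""

    def __init__(self):
        self.wb_normal = 0  # not initialized yet
        self.inflight = 0

    def enabled(self):
        return self.wb_normal != 0

    def wait(self):
        # Models wbt_wait(): inflight only grows when wbt is enabled.
        if self.enabled():
            self.inflight += 1

    def track(self):
        # Models wbt_track(): the request is marked tracked when enabled.
        return self.enabled()

    def done(self, tracked):
        # Models wbt_done(): a tracked request decrements inflight.
        if tracked:
            self.inflight -= 1


def run(init_before_io):
    rwb = ToyWbt()
    if init_before_io:
        rwb.wb_normal = 1      # limits are set before any I/O can see rwb
    rwb.wait()                 # t2: rq_qos_throttle
    rwb.wb_normal = 1          # t1: wbt_update_limits runs in between
    tracked = rwb.track()      # t2: rq_qos_track
    rwb.done(tracked)          # t3: completion
    return rwb.inflight


print(run(init_before_io=False))  # prints -1: decrement without increment
print(run(init_before_io=True))   # prints 0: accounting stays balanced
```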

Fixes: e34cbd307477 ("blk-wbt: add general throttling mechanism")
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20220913105749.3086243-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>blk-throttle: fix that io throttle can only work for single bio</title>
<updated>2022-10-24T07:56:59Z</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2022-08-29T02:22:37Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8b1f9fde48aa8cc3f1c73356be40f9b15fe6d1ee'/>
<id>urn:sha1:8b1f9fde48aa8cc3f1c73356be40f9b15fe6d1ee</id>
<content type='text'>
commit 320fb0f91e55ba248d4bad106b408e59099cfa89 upstream.

Test scripts:
cd /sys/fs/cgroup/blkio/
echo "8:0 1024" &gt; blkio.throttle.write_bps_device
echo $$ &gt; cgroup.procs
dd if=/dev/zero of=/dev/sda bs=10k count=1 oflag=direct &amp;
dd if=/dev/zero of=/dev/sda bs=10k count=1 oflag=direct &amp;

Test result:
10240 bytes (10 kB, 10 KiB) copied, 10.0134 s, 1.0 kB/s
10240 bytes (10 kB, 10 KiB) copied, 10.0135 s, 1.0 kB/s

The problem is that the second bio is finished after 10s instead of 20s.

Root cause:
1) second bio will be flagged:

__blk_throtl_bio
 while (true) {
  ...
  if (sq-&gt;nr_queued[rw]) -&gt; some bio is throttled already
   break
 };
 bio_set_flag(bio, BIO_THROTTLED); -&gt; flag the bio

2) flagged bio will be dispatched without waiting:

throtl_dispatch_tg
 tg_may_dispatch
  tg_with_in_bps_limit
   if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED))
    *wait = 0; -&gt; wait time is zero
    return true;

commit 9f5ede3c01f9 ("block: throttle split bio in case of iops limit")
added support for counting split bios against the iops limit, so it added
a flagged-bio check in tg_with_in_bps_limit() so that split bios are only
counted once for the bps limit. However, it introduced a new problem: io
throttle doesn't work if multiple bios are throttled.
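
The test result can be reproduced with a toy model (a Python sketch, not the
kernel logic). It assumes the first bio is throttled normally and every later
bio is flagged on arrival; the function and parameter names are hypothetical.

```python
def completion_times(bio_sizes, bps_limit, skip_flagged_bios):
    """Return the dispatch time in seconds of each queued bio.

    Toy model: the first bio is throttled normally, and every later bio
    is flagged because a bio is already queued when it arrives.
    """
    now = 0.0
    times = []
    for index, size in enumerate(bio_sizes):
        flagged = index != 0
        if flagged and skip_flagged_bios:
            wait = 0.0  # buggy path: a flagged bio is treated as within limit
        else:
            wait = size / bps_limit  # seconds needed at the byte-rate limit
        now += wait
        times.append(now)
    return times


sizes = [10240, 10240]  # two 10k direct writes, as in the test script
print(completion_times(sizes, 1024, skip_flagged_bios=True))
print(completion_times(sizes, 1024, skip_flagged_bios=False))
```

With the flagged-bio shortcut the model prints [10.0, 10.0], matching the
test result; without it the second bio completes at 20.0 seconds.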

In order to fix the problem, handle iops/bps limit in different ways:

1) for iops limit, there is no flag to record if the bio is throttled,
   and iops is always applied.
2) for bps limit, original bio will be flagged with BIO_BPS_THROTTLED,
   and io throttle will ignore bio with the flag.

Note that this patch also removes the code that sets the flag in
__bio_clone(). That code was introduced in commit 111be8839817
("block-throttle: avoid double charge"), whose author assumed that a split
bio can be resubmitted and throttled again, which is wrong because a split
bio continues to dispatch from the caller.

Fixes: 9f5ede3c01f9 ("block: throttle split bio in case of iops limit")
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Link: https://lore.kernel.org/r/20220829022240.3348319-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "block: freeze the queue earlier in del_gendisk"</title>
<updated>2022-09-28T09:32:28Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2022-09-19T14:40:49Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=48a12961e800aa404fefa845aa95098d18a30f19'/>
<id>urn:sha1:48a12961e800aa404fefa845aa95098d18a30f19</id>
<content type='text'>
commit 4c66a326b5ab784cddd72de07ac5b6210e9e1b06 upstream.

This reverts commit a09b314005f3a0956ebf56e01b3b80339df577cc.

Dusty Mabe reported a consistent hang during CoreOS shutdown with an MD
RAID1 setup.  Although apparently similar hangs happened before, and this
patch is most likely not the root cause, it made the problem much more
severe.  Revert it until we can figure out what is going on with the md
driver.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20220919144049.978907-1-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: Do not call blk_put_queue() if gendisk allocation fails</title>
<updated>2022-09-28T09:32:23Z</updated>
<author>
<name>Rafael Mendonca</name>
<email>rafaelmendsr@gmail.com</email>
</author>
<published>2022-08-11T23:23:37Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=98756ca2584e64dc53d3d9170af6257f860adec2'/>
<id>urn:sha1:98756ca2584e64dc53d3d9170af6257f860adec2</id>
<content type='text'>
commit aa0c680c3aa96a5f9f160d90dd95402ad578e2b0 upstream.

Commit 6f8191fdf41d ("block: simplify disk shutdown") removed the call
to blk_get_queue() during gendisk allocation but did not remove the
corresponding blk_put_queue() cleanup call. Thus, if the gendisk
allocation fails, the request_queue refcount gets decremented and
reaches 0, causing blk_mq_release() to be called with a hctx still
alive. That triggers a WARNING report, as found by syzkaller:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 23016 at block/blk-mq.c:3881
blk_mq_release+0xf8/0x3e0 block/blk-mq.c:3881
[...] stripped
RIP: 0010:blk_mq_release+0xf8/0x3e0 block/blk-mq.c:3881
[...] stripped
Call Trace:
 &lt;TASK&gt;
 blk_release_queue+0x153/0x270 block/blk-sysfs.c:780
 kobject_cleanup lib/kobject.c:673 [inline]
 kobject_release lib/kobject.c:704 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x1c8/0x540 lib/kobject.c:721
 __alloc_disk_node+0x4f7/0x610 block/genhd.c:1388
 __blk_mq_alloc_disk+0x13b/0x1f0 block/blk-mq.c:3961
 loop_add+0x3e2/0xaf0 drivers/block/loop.c:1978
 loop_control_ioctl+0x133/0x620 drivers/block/loop.c:2150
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:870 [inline]
 __se_sys_ioctl fs/ioctl.c:856 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:856
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
[...] stripped

Fixes: 6f8191fdf41d ("block: simplify disk shutdown")
Reported-by: syzbot+31c9594f6e43b9289b25@syzkaller.appspotmail.com
Suggested-by: Hillf Danton &lt;hdanton@sina.com&gt;
Signed-off-by: Rafael Mendonca &lt;rafaelmendsr@gmail.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20220811232338.254673-1-rafaelmendsr@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: call blk_mq_exit_queue from disk_release for never added disks</title>
<updated>2022-09-28T09:32:23Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2022-07-20T13:05:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2f092fd2ce24eebb6f56f4b558654ba2fb4e5fe8'/>
<id>urn:sha1:2f092fd2ce24eebb6f56f4b558654ba2fb4e5fe8</id>
<content type='text'>
commit c5db2cfc6274692d821d33b59acb6ff615e350c1 upstream.

To undo all the initialization from blk_mq_init_allocated_queue in case
of a probe failure where add_disk is never called, we have to call
blk_mq_exit_queue from put_disk.

This relies on the fact that drivers always call blk_mq_free_tag_set
after calling put_disk in the probe error path if they have a gendisk
at all.

We should be doing this in general, but can't do it for the normal
teardown case (yet) as the tagset can be gone by the time the disk is
released once it was added.  I hope to sort this out properly eventually
but for now this isolated hack will do it.

Fixes: 6f8191fdf41d ("block: simplify disk shutdown")
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/r/20220720130541.1323531-2-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix error handling in __blk_mq_alloc_disk</title>
<updated>2022-09-28T09:32:23Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2022-07-20T13:05:40Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=47f57236ba4027840eb146117b5080fece4ec358'/>
<id>urn:sha1:47f57236ba4027840eb146117b5080fece4ec358</id>
<content type='text'>
commit 0a3e5cc7bbfcd571a2e53779ef7d7aa3c57d5432 upstream.

To fully clean up the queue if the disk allocation fails we need to
call blk_mq_destroy_queue and not just blk_put_queue.

Fixes: 6f8191fdf41d ("block: simplify disk shutdown")
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/r/20220720130541.1323531-1-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: simplify disk shutdown</title>
<updated>2022-09-28T09:32:01Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2022-06-19T06:05:51Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d27b66257db183fe11c10f31246ae965adb005d3'/>
<id>urn:sha1:d27b66257db183fe11c10f31246ae965adb005d3</id>
<content type='text'>
[ Upstream commit 6f8191fdf41d3a53cc1d63fe2234e812c55a0092 ]

Set the queue dying flag and call blk_mq_exit_queue from del_gendisk for
all disks that do not have separately allocated queues, and thus remove
the need to call blk_cleanup_queue for them.

Rename blk_cleanup_queue to blk_mq_destroy_queue to make it clear that
this function is intended only for separately allocated blk-mq queues.

This saves an extra queue freeze for devices without a separately
allocated queue.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Link: https://lore.kernel.org/r/20220619060552.1850436-6-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Stable-dep-of: 8fe4ce5836e9 ("scsi: core: Fix a use-after-free")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
