<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/net/ceph/osd_client.c, branch linux-rolling-lts</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-rolling-lts</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-rolling-lts'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2026-01-17T15:35:15Z</updated>
<entry>
<title>libceph: make calc_target() set t-&gt;paused, not just clear it</title>
<updated>2026-01-17T15:35:15Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2026-01-05T18:23:19Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5647d42c47b535573b63e073e91164d6a5bb058c'/>
<id>urn:sha1:5647d42c47b535573b63e073e91164d6a5bb058c</id>
<content type='text'>
commit c0fe2994f9a9d0a2ec9e42441ea5ba74b6a16176 upstream.

Currently calc_target() clears t-&gt;paused if the request shouldn't be
paused anymore, but doesn't ever set t-&gt;paused even though it's able to
determine when the request should be paused.  Setting t-&gt;paused is left
to __submit_request() which is fine for regular requests but doesn't
work for linger requests -- since __submit_request() doesn't operate
on linger requests, there is nowhere for lreq-&gt;t.paused to be set.
One consequence of this is that watches don't get reestablished on
paused -&gt; unpaused transitions in cases where requests have been paused
long enough for the (paused) unwatch request to time out and for the
subsequent (re)watch request to enter the paused state.  On top of the
watch not getting reestablished, rbd_reregister_watch() gets stuck with
rbd_dev-&gt;watch_mutex held:

  rbd_register_watch
    __rbd_register_watch
      ceph_osdc_watch
        linger_reg_commit_wait

It's waiting for lreq-&gt;reg_commit_wait to be completed, but for that to
happen the respective request needs to end up on need_resend_linger list
and be kicked when requests are unpaused.  There is no chance for that
if the request in question is never marked paused in the first place.

The fact that rbd_dev-&gt;watch_mutex remains taken out forever then
prevents the image from getting unmapped -- "rbd unmap" would inevitably
hang in D state on an attempt to grab the mutex.

Cc: stable@vger.kernel.org
Reported-by: Raphael Zimmer &lt;raphael.zimmer@tu-ilmenau.de&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Viacheslav Dubeyko &lt;Slava.Dubeyko@ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>libceph: reset sparse-read state in osd_fault()</title>
<updated>2026-01-17T15:35:15Z</updated>
<author>
<name>Sam Edwards</name>
<email>cfsworks@gmail.com</email>
</author>
<published>2025-12-31T04:05:06Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=10b7c72810364226f7b27916ea3e2a4f870bc04b'/>
<id>urn:sha1:10b7c72810364226f7b27916ea3e2a4f870bc04b</id>
<content type='text'>
commit 11194b416ef95012c2cfe5f546d71af07b639e93 upstream.

When a fault occurs, the connection is abandoned, reestablished, and any
pending operations are retried. The OSD client tracks the progress of a
sparse-read reply using a separate state machine, largely independent of
the messenger's state.

If a connection is lost mid-payload or the sparse-read state machine
returns an error, the sparse-read state is not reset. The OSD client
will then interpret the beginning of a new reply as the continuation of
the old one. If this makes the sparse-read machinery enter a failure
state, it may never recover, producing loops like:

  libceph:  [0] got 0 extents
  libceph: data len 142248331 != extent len 0
  libceph: osd0 (1)...:6801 socket error on read
  libceph: data len 142248331 != extent len 0
  libceph: osd0 (1)...:6801 socket error on read

Therefore, reset the sparse-read state in osd_fault(), ensuring retries
start from a clean state.

Cc: stable@vger.kernel.org
Fixes: f628d7999727 ("libceph: add sparse read support to OSD client")
Signed-off-by: Sam Edwards &lt;CFSworks@gmail.com&gt;
Reviewed-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ceph: Remove osd_client deadcode</title>
<updated>2025-04-03T19:35:32Z</updated>
<author>
<name>Dr. David Alan Gilbert</name>
<email>linux@treblig.org</email>
</author>
<published>2025-02-22T01:35:30Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=aed06d36ba4e7fe90f2d4a85835cee7c80ea72a7'/>
<id>urn:sha1:aed06d36ba4e7fe90f2d4a85835cee7c80ea72a7</id>
<content type='text'>
osd_req_op_extent_osd_data_pagelist() was added in 2013 as part of
commit a4ce40a9a7c1 ("libceph: combine initializing and setting osd data")
but never used.

The last use of osd_req_op_cls_request_data_pagelist() was removed in
2017's commit ecd4a68a26a2 ("rbd: switch rbd_obj_method_sync() to
ceph_osdc_call()")

Remove them.

Signed-off-by: Dr. David Alan Gilbert &lt;linux@treblig.org&gt;
Reviewed-by: Viacheslav Dubeyko &lt;Slava.Dubeyko@ibm.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>ceph: allocate sparse_ext map only for sparse reads</title>
<updated>2024-12-16T22:25:44Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2024-12-07T16:33:25Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=18d44c5d062b97b97bb0162d9742440518958dc1'/>
<id>urn:sha1:18d44c5d062b97b97bb0162d9742440518958dc1</id>
<content type='text'>
If mounted with sparseread option, ceph_direct_read_write() ends up
making an unnecessarily allocation for O_DIRECT writes.

Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads")
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Alex Markuze &lt;amarkuze@redhat.com&gt;
</content>
</entry>
<entry>
<title>libceph: Remove unused ceph_osdc_watch_check</title>
<updated>2024-11-18T16:34:35Z</updated>
<author>
<name>Dr. David Alan Gilbert</name>
<email>linux@treblig.org</email>
</author>
<published>2024-10-06T01:19:54Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=32844fd72b879d02f1f6b4394025349d31a09fd3'/>
<id>urn:sha1:32844fd72b879d02f1f6b4394025349d31a09fd3</id>
<content type='text'>
ceph_osdc_watch_check() has been unused since it was added in commit
b07d3c4bd727 ("libceph: support for checking on status of watch")

Remove it.

Signed-off-by: Dr. David Alan Gilbert &lt;linux@treblig.org&gt;
Reviewed-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: just wait for more data to be available on the socket</title>
<updated>2024-02-07T13:43:29Z</updated>
<author>
<name>Xiubo Li</name>
<email>xiubli@redhat.com</email>
</author>
<published>2023-12-14T08:01:03Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8e46a2d068c92a905d01cbb018b00d66991585ab'/>
<id>urn:sha1:8e46a2d068c92a905d01cbb018b00d66991585ab</id>
<content type='text'>
A short read may occur while reading the message footer from the
socket.  Later, when the socket is ready for another read, the
messenger invokes all read_partial_*() handlers, including
read_partial_sparse_msg_data().  The expectation is that
read_partial_sparse_msg_data() would bail, allowing the messenger to
invoke read_partial() for the footer and pick up where it left off.

However read_partial_sparse_msg_data() violates that and ends up
calling into the state machine in the OSD client.  The sparse-read
state machine assumes that it's a new op and interprets some piece of
the footer as the sparse-read header and returns bogus extents/data
length, etc.

To determine whether read_partial_sparse_msg_data() should bail, let's
reuse cursor-&gt;total_resid.  Because once it reaches to zero that means
all the extents and data have been successfully received in last read,
else it could break out when partially reading any of the extents and
data.  And then osd_sparse_read() could continue where it left off.

[ idryomov: changelog ]

Link: https://tracker.ceph.com/issues/63586
Fixes: d396f89db39a ("libceph: add sparse read support to msgr1")
Signed-off-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: fail sparse-read if the data length doesn't match</title>
<updated>2024-02-07T13:43:29Z</updated>
<author>
<name>Xiubo Li</name>
<email>xiubli@redhat.com</email>
</author>
<published>2023-10-13T05:55:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=cd7d469c25704d414d71bf3644f163fb74e7996b'/>
<id>urn:sha1:cd7d469c25704d414d71bf3644f163fb74e7996b</id>
<content type='text'>
Once this happens that means there have bugs.

Signed-off-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: remove MAX_EXTENTS check for sparse reads</title>
<updated>2024-01-15T14:40:50Z</updated>
<author>
<name>Xiubo Li</name>
<email>xiubli@redhat.com</email>
</author>
<published>2023-11-07T01:37:47Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b79e4a0aa902322756ced7361a2c637d462c3c1c'/>
<id>urn:sha1:b79e4a0aa902322756ced7361a2c637d462c3c1c</id>
<content type='text'>
There is no any limit for the extent array size and it's possible
that when reading with a large size contents the total number of
extents will exceed 4096. Then the messager will fail by reseting
the connection and keeps resending the inflight IOs infinitely.

[ idryomov: adjust error message ]

Link: https://tracker.ceph.com/issues/62081
Signed-off-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Reviewed-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: allow ceph_osdc_new_request to accept a multi-op read</title>
<updated>2023-08-24T09:24:35Z</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2022-08-25T13:31:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4e8c4c235578b4d44bd6676df3a01dce98d0f7dd'/>
<id>urn:sha1:4e8c4c235578b4d44bd6676df3a01dce98d0f7dd</id>
<content type='text'>
Currently we have some special-casing for multi-op writes, but in the
case of a read, we can't really handle it. All of the current multi-op
callers call it with CEPH_OSD_FLAG_WRITE set.

Have ceph_osdc_new_request check for CEPH_OSD_FLAG_READ and if it's set,
allocate multiple reply ops instead of multiple request ops. If neither
flag is set, return -EINVAL.

Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Reviewed-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Reviewed-and-tested-by: Luís Henriques &lt;lhenriques@suse.de&gt;
Reviewed-by: Milind Changire &lt;mchangir@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>libceph: add CEPH_OSD_OP_ASSERT_VER support</title>
<updated>2023-08-24T09:24:35Z</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2022-08-25T13:31:05Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=69dd3b3930f96b624228000921f417fb0919a6ab'/>
<id>urn:sha1:69dd3b3930f96b624228000921f417fb0919a6ab</id>
<content type='text'>
...and record the user_version in the reply in a new field in
ceph_osd_request, so we can populate the assert_ver appropriately.
Shuffle the fields a bit too so that the new field fits in an
existing hole on x86_64.

Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Reviewed-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Reviewed-and-tested-by: Luís Henriques &lt;lhenriques@suse.de&gt;
Reviewed-by: Milind Changire &lt;mchangir@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
</feed>
