<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/include/linux/sunrpc/svc_rdma.h, branch linux-6.17.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.17.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.17.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2025-05-15T20:16:26Z</updated>
<entry>
<title>svcrdma: Adjust the number of entries in svc_rdma_send_ctxt::sc_pages</title>
<updated>2025-05-15T20:16:26Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2025-04-28T19:36:57Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=56ab43f50d708e155f013ef60cc1b4fe39bed42e'/>
<id>urn:sha1:56ab43f50d708e155f013ef60cc1b4fe39bed42e</id>
<content type='text'>
Allow allocation of more entries in the sc_pages[] array when the
maximum size of an RPC message is increased.

Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Reviewed-by: NeilBrown &lt;neil@brown.name&gt;
Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Adjust the number of entries in svc_rdma_recv_ctxt::rc_pages</title>
<updated>2025-05-15T20:16:26Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2025-04-28T19:36:56Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=81381d1a90dea186ce4585d9bd3cdc40b50e9c48'/>
<id>urn:sha1:81381d1a90dea186ce4585d9bd3cdc40b50e9c48</id>
<content type='text'>
Allow allocation of more entries in the rc_pages[] array when the
maximum size of an RPC message is increased.

Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Reviewed-by: NeilBrown &lt;neil@brown.name&gt;
Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Handle device removal outside of the CM event handler</title>
<updated>2024-09-20T23:31:03Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2024-07-29T20:52:32Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c4de97f7c45434985e5dbf2d6ccc9eca676e37fe'/>
<id>urn:sha1:c4de97f7c45434985e5dbf2d6ccc9eca676e37fe</id>
<content type='text'>
Synchronously wait for all disconnects to complete to ensure the
transports have divested all hardware resources before the
underlying RDMA device can safely be removed.

Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>Revert "svcrdma: Add Write chunk WRs to the RPC's Send WR chain"</title>
<updated>2024-04-20T15:20:41Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2024-04-19T15:51:12Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=32cf5a4eda464d76d553ee3f1b06c4d33d796c52'/>
<id>urn:sha1:32cf5a4eda464d76d553ee3f1b06c4d33d796c52</id>
<content type='text'>
Performance regression reported with NFS/RDMA using Omnipath,
bisected to commit e084ee673c77 ("svcrdma: Add Write chunk WRs to
the RPC's Send WR chain").

Tracing on the server reports:

  nfsd-7771  [060]  1758.891809: svcrdma_sq_post_err:
	cq.id=205 cid=226 sc_sq_avail=13643/851 status=-12

sq_post_err reports ENOMEM, and the rdma-&gt;sc_sq_avail (13643) is
larger than rdma-&gt;sc_sq_depth (851). The number of available Send
Queue entries is always supposed to be smaller than the Send Queue
depth. That seems like a Send Queue accounting bug in svcrdma.

As it's getting to be late in the 6.9-rc cycle, revert this commit.
It can be revisited in a subsequent kernel release.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=218743
Fixes: e084ee673c77 ("svcrdma: Add Write chunk WRs to the RPC's Send WR chain")
Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Add Write chunk WRs to the RPC's Send WR chain</title>
<updated>2024-03-01T14:12:30Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2024-02-04T23:17:47Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e084ee673c77cade06ab4c2e36b5624c82608b8c'/>
<id>urn:sha1:e084ee673c77cade06ab4c2e36b5624c82608b8c</id>
<content type='text'>
Chain RDMA Writes that convey Write chunks onto the local Send
chain. This means all WRs for an RPC Reply are now posted with a
single ib_post_send() call, and there is a single Send completion
when all of these are done. That reduces both the per-transport
doorbell rate and completion rate.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Post WRs for Write chunks in svc_rdma_sendto()</title>
<updated>2024-03-01T14:12:29Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2024-02-04T23:17:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d2727cefff0204c3074a10e3ad2e1b5c9dfb986c'/>
<id>urn:sha1:d2727cefff0204c3074a10e3ad2e1b5c9dfb986c</id>
<content type='text'>
Refactor to eventually enable svcrdma to post the Write WRs for each
RPC response using the same ib_post_send() as the Send WR (ie, as a
single WR chain).

svc_rdma_result_payload (originally svc_rdma_read_payload) was added
so that the upper layer XDR encoder could identify a range of bytes
to be possibly conveyed by RDMA (if a Write chunk was provided by
the client).

The purpose of commit f6ad77590a5d ("svcrdma: Post RDMA Writes while
XDR encoding replies") was to post as much of the result payload
outside of svc_rdma_sendto() as possible because svc_rdma_sendto()
used to be called with the xpt_mutex held.

However, since commit ca4faf543a33 ("SUNRPC: Move xpt_mutex into
socket xpo_sendto methods"), the xpt_mutex is no longer held when
calling svc_rdma_sendto(). Thus, that benefit is no longer an issue.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Post the Reply chunk and Send WR together</title>
<updated>2024-03-01T14:12:29Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2024-02-04T23:17:34Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=10e6fc1054d900a205e5233f186ebce9c50e1d1d'/>
<id>urn:sha1:10e6fc1054d900a205e5233f186ebce9c50e1d1d</id>
<content type='text'>
Reduce the doorbell and Send completion rates when sending RPC/RDMA
replies that have Reply chunks. NFS READDIR procedures typically
return their result in a Reply chunk, for example.

Instead of calling ib_post_send() to post the Write WRs for the
Reply chunk, and then calling it again to post the Send WR that
conveys the transport header, chain the Write WRs to the Send WR
and call ib_post_send() only once.

Thanks to the Send Queue completion ordering rules, when the Send
WR completes, that guarantees that Write WRs posted before it have
also completed successfully. Thus all Write WRs for the Reply chunk
can remain unsignaled. Instead of handling a Write completion and
then a Send completion, only the Send completion is seen, and it
handles clean up for both the Writes and the Send.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Move write_info for Reply chunks into struct svc_rdma_send_ctxt</title>
<updated>2024-03-01T14:12:28Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2024-02-04T23:17:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a1f5788a0c250c87d3007d59d11a00ab98e66f01'/>
<id>urn:sha1:a1f5788a0c250c87d3007d59d11a00ab98e66f01</id>
<content type='text'>
Since the RPC transaction's svc_rdma_send_ctxt will stay around for
the duration of the RDMA Write operation, the write_info structure
for the Reply chunk can reside in the request's svc_rdma_send_ctxt
instead of being allocated separately.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Post Send WR chain</title>
<updated>2024-03-01T14:12:28Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2024-02-04T23:17:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=71b43531ee0be6dcaa406132ebd540022dcb12ea'/>
<id>urn:sha1:71b43531ee0be6dcaa406132ebd540022dcb12ea</id>
<content type='text'>
Eventually I'd like the server to post the reply's Send WR along
with any Write WRs using only a single call to ib_post_send(), in
order to reduce the NIC's doorbell rate.

To do this, add an anchor for a WR chain to svc_rdma_send_ctxt, and
refactor svc_rdma_send() to post this WR chain to the Send Queue. For
the moment, the posted chain will continue to contain a single Send
WR.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Implement multi-stage Read completion again</title>
<updated>2024-01-07T22:54:33Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2023-12-18T22:32:07Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d3dba534100d4e9eb7a5204be97cd6f9ada2066e'/>
<id>urn:sha1:d3dba534100d4e9eb7a5204be97cd6f9ada2066e</id>
<content type='text'>
Having an nfsd thread waiting for an RDMA Read completion is
problematic if the Read responder (ie, the client) stops responding.
We need to go back to handling RDMA Reads by getting the svc scheduler
to call svc_rdma_recvfrom() a second time to finish building an RPC
message after a Read completion.

This is the final patch, and makes several changes that have to
happen concurrently:

1. svc_rdma_process_read_list no longer waits for a completion, but
   simply builds and posts the Read WRs.

2. svc_rdma_read_done() now queues a completed Read on
   sc_read_complete_q for later processing rather than calling
   complete().

3. The completed RPC message is no longer built in the
   svc_rdma_process_read_list() path. Finishing the message is now
   done in svc_rdma_recvfrom() when it notices work on the
   sc_read_complete_q. The "finish building this RPC message" code
   is removed from the svc_rdma_process_read_list() path.

This arrangement avoids the need for an nfsd thread to wait for an
RDMA Read non-interruptibly without a timeout. It's basically the
same code structure that Tom Tucker used for Read chunks along with
some clean-up and modernization.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
</feed>
