<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/include/linux/sunrpc/svc_rdma.h, branch linux-6.1.y</title>
<subtitle>Hosts the 0x221E Linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2023-05-24T16:32:45Z</updated>
<entry>
<title>SUNRPC: always free ctxt when freeing deferred request</title>
<updated>2023-05-24T16:32:45Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2023-05-08T23:42:47Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=47adb84916ee7a66235d179b98e7702cbd73b54f'/>
<id>urn:sha1:47adb84916ee7a66235d179b98e7702cbd73b54f</id>
<content type='text'>
[ Upstream commit 948f072ada23e0a504c5e4d7d71d4c83bd0785ec ]

Since the -&gt;xprt_ctxt pointer was added to svc_deferred_req, it has not
been sufficient to use kfree() to free a deferred request.  We may need
to free the ctxt as well.

As freeing the ctxt is all that -&gt;xpo_release_rqst() does, we repurpose
it to explicitly do that even when the ctxt is not stored in an rqst.
So we now have -&gt;xpo_release_ctxt() which is given an xprt and a ctxt,
which may have been taken either from an rqst or from a dreq.  The
caller is now responsible for clearing that pointer after the call to
-&gt;xpo_release_ctxt.

We also clear dr-&gt;xprt_ctxt when the ctxt is moved into a new rqst when
revisiting a deferred request.  This ensures there is only one pointer
to the ctxt, so the risk of double freeing in future is reduced.  The
new code in svc_xprt_release which releases both the ctxt and any
rq_deferred depends on this.
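
The ownership rule above can be sketched in miniature (a hypothetical Python
model, not the kernel C; every name here is invented for illustration):

```python
# Toy model of the single-owner rule for -&gt;xprt_ctxt described above.

class Ctxt:
    def __init__(self):
        self.freed = False

def release_ctxt(ctxt):
    # Models xpo_release_ctxt(): frees the ctxt exactly once.
    assert not ctxt.freed, "double free of ctxt"
    ctxt.freed = True

class DeferredReq:
    def __init__(self, ctxt):
        self.xprt_ctxt = ctxt

class Rqst:
    def __init__(self):
        self.xprt_ctxt = None

def revisit(dreq, rqst):
    # Move the ctxt into the new rqst and clear dreq.xprt_ctxt,
    # so only one pointer to the ctxt remains.
    rqst.xprt_ctxt = dreq.xprt_ctxt
    dreq.xprt_ctxt = None

def free_deferred(dreq):
    # Freeing a deferred request must also release its ctxt, if any,
    # and the caller-side pointer is cleared after the release.
    if dreq.xprt_ctxt is not None:
        release_ctxt(dreq.xprt_ctxt)
        dreq.xprt_ctxt = None
```

Because revisit() clears dreq.xprt_ctxt, a later free of the deferred
request cannot release the same ctxt twice.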

Fixes: 773f91b2cf3f ("SUNRPC: Fix NFSD's request deferral on RDMA transports")
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>svcrdma: Convert rdma-&gt;sc_rw_ctxts to llist</title>
<updated>2021-08-17T15:47:53Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2021-02-08T20:33:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=07a92d009f0b1557d3d58905ce18821a483be2e1'/>
<id>urn:sha1:07a92d009f0b1557d3d58905ce18821a483be2e1</id>
<content type='text'>
Relieve contention on sc_rw_ctxt_lock by converting rdma-&gt;sc_rw_ctxts
to an llist.

The goal is to reduce the average overhead of Send completions,
because a transport's completion handlers are single-threaded on
one CPU core. This change reduces CPU utilization of each Send
completion by 2-3% on my server.
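
The data structure change can be sketched as follows (an illustrative,
single-threaded Python model; the kernel's llist_add() and llist_del_all()
are lock-free via atomic cmpxchg, which this sketch does not reproduce):

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class LList:
    # Models the kernel's llist: producers push single nodes onto the
    # head, and the consumer detaches the whole chain in one step.
    def __init__(self):
        self.head = None

    def add(self, node):
        # llist_add(): push onto the head. In the kernel this is a
        # single atomic cmpxchg, so no lock is taken.
        node.next = self.head
        self.head = node

    def del_all(self):
        # llist_del_all(): take the entire chain at once, again a
        # single atomic exchange in the kernel.
        chain, self.head = self.head, None
        out = []
        while chain is not None:
            out.append(chain.value)
            chain = chain.next
        return out
```

Producers return items with a single push and the consumer drains the
whole chain in one operation, so no lock serializes the two sides.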

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Reviewed-by: Tom Talpey &lt;tom@talpey.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Relieve contention on sc_send_lock.</title>
<updated>2021-08-17T15:47:53Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2021-02-09T15:32:20Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b6c2bfea096ba22583f1071c10ce0745804b9b95'/>
<id>urn:sha1:b6c2bfea096ba22583f1071c10ce0745804b9b95</id>
<content type='text'>
/proc/lock_stat indicates that the sc_send_lock is heavily
contended when the server is under load from a single client.

To address this, convert the send_ctxt free list to an llist.
Returning an item to the send_ctxt cache is now waitless, which
reduces the instruction path length in the single-threaded Send
handler (svc_rdma_wc_send).

The goal is to enable the ib_comp_wq worker to handle a higher
RPC/RDMA Send completion rate given the same CPU resources. This
change reduces CPU utilization of Send completion by 2-3% on my
server.
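
The free-list pattern can be modeled like this (a simplified Python
stand-in; the real cache is a lock-free llist manipulated from C):

```python
class SendCtxt:
    pass

class SendCtxtCache:
    # Stand-in for the send_ctxt free list. Returning a ctxt (put)
    # models the waitless llist push done in Send completion; get
    # pops a cached ctxt or allocates a fresh one.
    def __init__(self):
        self.free = []
        self.allocated = 0

    def put(self, ctxt):
        # In svc_rdma_wc_send this is a single llist_add(), so the
        # completion handler never waits on a lock to return a ctxt.
        self.free.append(ctxt)

    def get(self):
        if self.free:
            return self.free.pop()
        self.allocated += 1
        return SendCtxt()
```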

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Reviewed-by: Tom Talpey &lt;tom@talpey.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Fewer calls to wake_up() in Send completion handler</title>
<updated>2021-08-17T15:47:53Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2021-07-07T18:57:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6c8c84f525100a1cade5698320b4abe43062e159'/>
<id>urn:sha1:6c8c84f525100a1cade5698320b4abe43062e159</id>
<content type='text'>
Because wake_up() takes an IRQ-safe lock, it can be expensive,
especially to call inside of a single-threaded completion handler.
What's more, the Send wait queue almost never has waiters, so
most of the time, this is an expensive no-op.
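
One common way to avoid that expense, checking whether any waiter exists
before taking the lock, can be sketched as (illustrative Python; not
necessarily the exact approach this patch takes, and the kernel's
waitqueue_active() pairs the unlocked check with memory barriers that
this sketch omits):

```python
import threading

class Waitq:
    # Models a wait queue whose wake side skips the locked notify
    # path when nobody is waiting, which is the common case here.
    # The wait side would increment self.waiters under self.cond.
    def __init__(self):
        self.cond = threading.Condition()
        self.waiters = 0
        self.skipped = 0

    def wake_up(self):
        # Cheap unlocked check first, like waitqueue_active(); only
        # take the IRQ-safe lock and notify when a waiter exists.
        if self.waiters == 0:
            self.skipped += 1
            return
        with self.cond:
            self.cond.notify()
```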

As always, the goal is to reduce the average overhead of each
completion, because a transport's completion handlers are single-
threaded on one CPU core. This change reduces CPU utilization of
the Send completion thread by 2-3% on my server.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Reviewed-by: Tom Talpey &lt;tom@talpey.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Remove svc_rdma_recv_ctxt::rc_pages and ::rc_arg</title>
<updated>2021-03-31T19:57:48Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2021-01-13T14:31:50Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5533c4f4b996b7fc36d16b5e0807ebbc08c93af4'/>
<id>urn:sha1:5533c4f4b996b7fc36d16b5e0807ebbc08c93af4</id>
<content type='text'>
These fields are no longer used.

The size of struct svc_rdma_recv_ctxt is now less than 300 bytes on
x86_64, down from 2440 bytes.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Remove sc_read_complete_q</title>
<updated>2021-03-31T19:57:48Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2020-12-30T17:43:34Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9af723be863904c746a6a6bf4f3686087b16b9ff'/>
<id>urn:sha1:9af723be863904c746a6a6bf4f3686087b16b9ff</id>
<content type='text'>
Now that svc_rdma_recvfrom() waits for Read completion,
sc_read_complete_q is no longer used.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Remove unused sc_pages field</title>
<updated>2021-03-22T17:22:13Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2021-01-28T21:47:56Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=579900670ac770a547ff607a60c02c56a7d27bd7'/>
<id>urn:sha1:579900670ac770a547ff607a60c02c56a7d27bd7</id>
<content type='text'>
Clean up. This significantly reduces the size of struct
svc_rdma_send_ctxt.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;

</content>
</entry>
<entry>
<title>svcrdma: Normalize Send page handling</title>
<updated>2021-03-22T17:22:13Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2021-01-13T18:57:18Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2a1e4f21d84184f7ff5768ee3d3d0c30b1135867'/>
<id>urn:sha1:2a1e4f21d84184f7ff5768ee3d3d0c30b1135867</id>
<content type='text'>
Currently svc_rdma_sendto() migrates xdr_buf pages into a separate
page list and NULLs out a bunch of entries in rq_pages while the
pages are under I/O. The Send completion handler then frees those
pages later.

Instead, let's wait for the Send completion, then handle page
releasing in the nfsd thread. I'd like to avoid the cost of 250+
put_page() calls in the Send completion handler, which is single-
threaded.
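
That flow can be sketched as (a hypothetical Python model; post_send and
put_page here are invented stand-ins, not kernel interfaces):

```python
import threading

def sendto(pages, post_send, put_page):
    # New flow: the sending thread posts the Send, waits for its
    # completion, then releases the pages itself, keeping the page
    # puts out of the single-threaded completion handler.
    done = threading.Event()
    post_send(pages, on_complete=done.set)
    done.wait()
    for page in pages:
        put_page(page)
```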

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Maintain a Receive water mark</title>
<updated>2021-03-22T17:22:13Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2021-03-11T23:32:30Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c558d47596867ff1082fd7475b63670f63f7f5cf'/>
<id>urn:sha1:c558d47596867ff1082fd7475b63670f63f7f5cf</id>
<content type='text'>
Post more Receives when the number of pending Receives drops below
a water mark. The batch mechanism is disabled if the underlying
device cannot support a reasonably-sized Receive Queue.
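
The replenish rule can be sketched as (illustrative Python; the names
are invented, not the kernel's):

```python
def replenish_receives(pending, water_mark, batch, post_recv):
    # Post a batch of Receives only when the number still pending
    # has dropped below the water mark; returns the new pending
    # count. A single doorbell covers the whole batch.
    if water_mark > pending:      # i.e. pending is below the mark
        for _ in range(batch):
            post_recv()
        pending += batch
    return pending
```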

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>svcrdma: Revert "svcrdma: Reduce Receive doorbell rate"</title>
<updated>2021-03-11T20:26:07Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2021-03-11T18:25:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=bade4be69a6ea6f38c5894468ede10ee60b6f7a0'/>
<id>urn:sha1:bade4be69a6ea6f38c5894468ede10ee60b6f7a0</id>
<content type='text'>
I tested commit 43042b90cae1 ("svcrdma: Reduce Receive doorbell
rate") with mlx4 (IB) and software iWARP and didn't find any
issues. However, I recently got my hardware iWARP setup back on
line (FastLinQ) and it's crashing hard on this commit (confirmed
via bisect).

The failure mode is complex.
 - After a connection is established, the first Receive completes
   normally.
 - But the second and third Receives have garbage in their Receive
   buffers. The server responds with ERR_VERS as a result.
 - When the client tears down the connection to retry, a couple
   of posted Receives flush twice, and that corrupts the recv_ctxt
   free list.
 - __svc_rdma_free then faults or loops infinitely while destroying
   the xprt's recv_ctxts.

Since 43042b90cae1 ("svcrdma: Reduce Receive doorbell rate") does
not fix a bug but is a scalability enhancement, it's safe and
appropriate to revert it while working on a replacement.

Fixes: 43042b90cae1 ("svcrdma: Reduce Receive doorbell rate")
Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
</feed>
