<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/drivers/infiniband/sw, branch linux-6.5.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.5.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.5.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2023-10-10T20:03:05Z</updated>
<entry>
<title>RDMA/siw: Fix connection failure handling</title>
<updated>2023-10-10T20:03:05Z</updated>
<author>
<name>Bernard Metzler</name>
<email>bmt@zurich.ibm.com</email>
</author>
<published>2023-09-05T14:58:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=eeafc50a77f6a783c2c44e7ec3674a7b693e06f8'/>
<id>urn:sha1:eeafc50a77f6a783c2c44e7ec3674a7b693e06f8</id>
<content type='text'>
commit 53a3f777049771496f791504e7dc8ef017cba590 upstream.

In case immediate MPA request processing fails, the newly
created endpoint unlinks the listening endpoint and is
ready to be dropped. This special case was not handled
correctly by the code handling the later TCP socket close,
causing a NULL dereference crash in siw_cm_work_handler()
when dereferencing a NULL listener. We now also cancel
the useless MPA timeout, if immediate MPA request
processing fails.

This patch furthermore simplifies MPA processing in general:
Scheduling a useless TCP socket read in sk_data_ready() upcall
is now surpressed, if the socket is already moved out of
TCP_ESTABLISHED state.

Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Signed-off-by: Bernard Metzler &lt;bmt@zurich.ibm.com&gt;
Link: https://lore.kernel.org/r/20230905145822.446263-1-bmt@zurich.ibm.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>RDMA/siw: Correct wrong debug message</title>
<updated>2023-09-13T07:53:44Z</updated>
<author>
<name>Guoqing Jiang</name>
<email>guoqing.jiang@linux.dev</email>
</author>
<published>2023-08-21T13:32:54Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=782f3b65f3ade065f12fe778d3bbda304f6686cb'/>
<id>urn:sha1:782f3b65f3ade065f12fe778d3bbda304f6686cb</id>
<content type='text'>
[ Upstream commit bee024d20451e4ce04ea30099cad09f7f75d288b ]

We need to print num_sle first then pbl-&gt;max_buf per the condition.
Also replace mem-&gt;pbl with pbl while at it.

Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
Signed-off-by: Guoqing Jiang &lt;guoqing.jiang@linux.dev&gt;
Link: https://lore.kernel.org/r/20230821133255.31111-3-guoqing.jiang@linux.dev
Acked-by: Bernard Metzler &lt;bmt@zurich.ibm.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/siw: Balance the reference of cep-&gt;kref in the error path</title>
<updated>2023-09-13T07:53:44Z</updated>
<author>
<name>Guoqing Jiang</name>
<email>guoqing.jiang@linux.dev</email>
</author>
<published>2023-08-21T13:32:53Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f7517711ae39cccf994ffc76085f429b4bebdfd5'/>
<id>urn:sha1:f7517711ae39cccf994ffc76085f429b4bebdfd5</id>
<content type='text'>
[ Upstream commit b056327bee09e6b86683d3f709a438ccd6031d72 ]

The siw_connect can go to err in below after cep is allocated successfully:

1. If siw_cm_alloc_work returns failure. In this case socket is not
assoicated with cep so siw_cep_put can't be called by siw_socket_disassoc.
We need to call siw_cep_put twice since cep-&gt;kref is increased once after
it was initialized.

2. If siw_cm_queue_work can't find a work, which means siw_cep_get is not
called in siw_cm_queue_work, so cep-&gt;kref is increased twice by siw_cep_get
and when associate socket with cep after it was initialized. So we need to
call siw_cep_put three times (one in siw_socket_disassoc).

3. siw_send_mpareqrep returns error, this scenario is similar as 2.

So we need to remove one siw_cep_put in the error path.

Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Signed-off-by: Guoqing Jiang &lt;guoqing.jiang@linux.dev&gt;
Link: https://lore.kernel.org/r/20230821133255.31111-2-guoqing.jiang@linux.dev
Acked-by: Bernard Metzler &lt;bmt@zurich.ibm.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/rxe: Fix incomplete state save in rxe_requester</title>
<updated>2023-09-13T07:53:37Z</updated>
<author>
<name>Bob Pearson</name>
<email>rpearsonhpe@gmail.com</email>
</author>
<published>2023-07-21T20:07:49Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=255c0e60e1d16874fc151358d94bc8df661600dd'/>
<id>urn:sha1:255c0e60e1d16874fc151358d94bc8df661600dd</id>
<content type='text'>
[ Upstream commit 5d122db2ff80cd2aed4dcd630befb56b51ddf947 ]

If a send packet is dropped by the IP layer in rxe_requester()
the call to rxe_xmit_packet() can fail with err == -EAGAIN.
To recover, the state of the wqe is restored to the state before
the packet was sent so it can be resent. However, the routines
that save and restore the state miss a significnt part of the
variable state in the wqe, the dma struct which is used to process
through the sge table. And, the state is not saved before the packet
is built which modifies the dma struct.

Under heavy stress testing with many QPs on a fast node sending
large messages to a slow node dropped packets are observed and
the resent packets are corrupted because the dma struct was not
restored. This patch fixes this behavior and allows the test cases
to succeed.

Fixes: 3050b9985024 ("IB/rxe: Fix race condition between requester and completer")
Link: https://lore.kernel.org/r/20230721200748.4604-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/rxe: Fix rxe_modify_srq</title>
<updated>2023-09-13T07:53:37Z</updated>
<author>
<name>Bob Pearson</name>
<email>rpearsonhpe@gmail.com</email>
</author>
<published>2023-06-20T14:01:43Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c40dafb951a39f987131a3093b4f67c2231ca056'/>
<id>urn:sha1:c40dafb951a39f987131a3093b4f67c2231ca056</id>
<content type='text'>
[ Upstream commit cc28f351155def8db209647f2e20a59a7080825b ]

This patch corrects an error in rxe_modify_srq where if the
caller changes the srq size the actual new value is not returned
to the caller since it may be larger than what is requested.
Additionally it open codes the subroutine rcv_wqe_size() which
adds very little value, and makes some whitespace changes.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20230620140142.9452-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/rxe: Fix unsafe drain work queue code</title>
<updated>2023-09-13T07:53:37Z</updated>
<author>
<name>Bob Pearson</name>
<email>rpearsonhpe@gmail.com</email>
</author>
<published>2023-06-20T13:55:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d366642b3099bd322375f5b71ba84ab1d586cd6d'/>
<id>urn:sha1:d366642b3099bd322375f5b71ba84ab1d586cd6d</id>
<content type='text'>
[ Upstream commit 5993b75d0bc71cd2b441d174b028fc36180f032c ]

If create_qp does not fully succeed it is possible for qp cleanup
code to attempt to drain the send or recv work queues before the
queues have been created causing a seg fault. This patch checks
to see if the queues exist before attempting to drain them.

Link: https://lore.kernel.org/r/20230620135519.9365-3-rpearsonhpe@gmail.com
Reported-by: syzbot+2da1965168e7dbcba136@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-rdma/00000000000012d89205fe7cfe00@google.com/raw
Fixes: 49dc9c1f0c7e ("RDMA/rxe: Cleanup reset state handling in rxe_resp.c")
Fixes: fbdeb828a21f ("RDMA/rxe: Cleanup error state handling in rxe_comp.c")
Signed-off-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/rxe: Move work queue code to subroutines</title>
<updated>2023-09-13T07:53:37Z</updated>
<author>
<name>Bob Pearson</name>
<email>rpearsonhpe@gmail.com</email>
</author>
<published>2023-06-20T13:55:19Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=cbd45bbda937ec7be589c896980c2c3949e46b55'/>
<id>urn:sha1:cbd45bbda937ec7be589c896980c2c3949e46b55</id>
<content type='text'>
[ Upstream commit e0ba8ff46704fc924e2ef0451ba196cbdc0d68f2 ]

This patch:
	- Moves code to initialize a qp send work queue to a
	  subroutine named rxe_init_sq.
	- Moves code to initialize a qp recv work queue to a
	  subroutine named rxe_init_rq.
	- Moves initialization of qp request and response packet
	  queues ahead of work queue initialization so that cleanup
	  of a qp if it is not fully completed can successfully
	  attempt to drain the packet queues without a seg fault.
	- Makes minor whitespace cleanups.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20230620135519.9365-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Acked-by: Zhu Yanjun &lt;zyjzyj2000@gmail.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/siw: Fabricate a GID on tun and loopback devices</title>
<updated>2023-09-13T07:53:35Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2023-07-17T15:12:12Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=95cf458f1652cf7c40cda243c36ed9120c6d84aa'/>
<id>urn:sha1:95cf458f1652cf7c40cda243c36ed9120c6d84aa</id>
<content type='text'>
[ Upstream commit bad5b6e34ffbaacc77ad28a0f482e33b3929e635 ]

LOOPBACK and NONE (tunnel) devices have all-zero MAC addresses.
Currently, siw_device_create() falls back to copying the IB device's
name in those cases, because an all-zero MAC address breaks the RDMA
core address resolution mechanism.

However, at the point when siw_device_create() constructs a GID, the
ib_device::name field is uninitialized, leaving the MAC address to
remain in an all-zero state.

Fabricate a random artificial GID for such devices, and ensure this
artificial GID is returned for all device query operations.

Link: https://lore.kernel.org/r/168960673260.3007.12378736853793339110.stgit@manet.1015granger.net
Reported-by: Tom Talpey &lt;tom@talpey.com&gt;
Fixes: a2d36b02c15d ("RDMA/siw: Enable siw on tunnel devices")
Reviewed-by: Bernard Metzler &lt;bmt@zurich.ibm.com&gt;
Reviewed-by: Tom Talpey &lt;tom@talpey.com&gt;
Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/rxe: Fix an error handling path in rxe_bind_mw()</title>
<updated>2023-07-18T12:22:43Z</updated>
<author>
<name>Christophe JAILLET</name>
<email>christophe.jaillet@wanadoo.fr</email>
</author>
<published>2023-07-17T19:55:56Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5c719d7aef298e9b727f39b45e88528a96df3620'/>
<id>urn:sha1:5c719d7aef298e9b727f39b45e88528a96df3620</id>
<content type='text'>
All errors go to the error handling path, except this one. Be consistent
and also branch to it.

Fixes: 02ed253770fb ("RDMA/rxe: Introduce rxe access supported flags")
Signed-off-by: Christophe JAILLET &lt;christophe.jaillet@wanadoo.fr&gt;
Link: https://lore.kernel.org/r/43698d8a3ed4e720899eadac887427f73d7ec2eb.1689623735.git.christophe.jaillet@wanadoo.fr
Reviewed-by: Bob Pearson &lt;rpearsonhpe@gmail.com&gt;
Acked-by: Zhu Yanjun &lt;zyjzyj2000@gmail.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma</title>
<updated>2023-06-30T04:01:17Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-06-30T04:01:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=7ede5f78a0d74b574791c7eb0e2ca6e54b80c93c'/>
<id>urn:sha1:7ede5f78a0d74b574791c7eb0e2ca6e54b80c93c</id>
<content type='text'>
Pull rdma updates from Jason Gunthorpe:
 "This cycle saw a focus on rxe and bnxt_re drivers:

   - Code cleanups for irdma, rxe, rtrs, hns, vmw_pvrdma

   - rxe uses workqueues instead of tasklets

   - rxe has better compliance around access checks for MRs and rereg_mr

   - mana supportst he 'v2' FW interface for RX coalescing

   - hfi1 bug fix for stale cache entries in its MR cache

   - mlx5 buf fix to handle FW failures when destroying QPs

   - erdma HW has a new doorbell allocation mechanism for uverbs that is
     secure

   - Lots of small cleanups and rework in bnxt_re:
       - Use the common mmap functions
       - Support disassociation
       - Improve FW command flow
       - support for 'low latency push'"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (71 commits)
  RDMA/bnxt_re: Fix an IS_ERR() vs NULL check
  RDMA/bnxt_re: Fix spelling mistake "priviledged" -&gt; "privileged"
  RDMA/bnxt_re: Remove duplicated include in bnxt_re/main.c
  RDMA/bnxt_re: Refactor code around bnxt_qplib_map_rc()
  RDMA/bnxt_re: Remove incorrect return check from slow path
  RDMA/bnxt_re: Enable low latency push
  RDMA/bnxt_re: Reorg the bar mapping
  RDMA/bnxt_re: Move the interface version to chip context structure
  RDMA/bnxt_re: Query function capabilities from firmware
  RDMA/bnxt_re: Optimize the bnxt_re_init_hwrm_hdr usage
  RDMA/bnxt_re: Add disassociate ucontext support
  RDMA/bnxt_re: Use the common mmap helper functions
  RDMA/bnxt_re: Initialize opcode while sending message
  RDMA/cma: Remove NULL check before dev_{put, hold}
  RDMA/rxe: Simplify cq-&gt;notify code
  RDMA/rxe: Fixes mr access supported list
  RDMA/bnxt_re: optimize the parameters passed to helper functions
  RDMA/bnxt_re: remove redundant cmdq_bitmap
  RDMA/bnxt_re: use firmware provided max request timeout
  RDMA/bnxt_re: cancel all control path command waiters upon error
  ...
</content>
</entry>
</feed>
