<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/include/rdma/rdmavt_qp.h, branch linux-5.2.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.2.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.2.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2019-09-16T06:23:21Z</updated>
<entry>
<title>IB/hfi1: Unreserve a flushed OPFN request</title>
<updated>2019-09-16T06:23:21Z</updated>
<author>
<name>Kaike Wan</name>
<email>kaike.wan@intel.com</email>
</author>
<published>2019-07-15T16:45:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c31f3dc4e07536c7cf0034684be6a847f26b8197'/>
<id>urn:sha1:c31f3dc4e07536c7cf0034684be6a847f26b8197</id>
<content type='text'>
When an OPFN request is flushed, the request is completed without
being unreserved from the send queue. Subsequently, when a new
request is posted, the following warning is triggered:

WARNING: CPU: 4 PID: 8130 at rdmavt/qp.c:1761 rvt_post_send+0x72a/0x880 [rdmavt]
Call Trace:
[&lt;ffffffffbbb61e41&gt;] dump_stack+0x19/0x1b
[&lt;ffffffffbb497688&gt;] __warn+0xd8/0x100
[&lt;ffffffffbb4977cd&gt;] warn_slowpath_null+0x1d/0x20
[&lt;ffffffffc01c941a&gt;] rvt_post_send+0x72a/0x880 [rdmavt]
[&lt;ffffffffbb4dcabe&gt;] ? account_entity_dequeue+0xae/0xd0
[&lt;ffffffffbb61d645&gt;] ? __kmalloc+0x55/0x230
[&lt;ffffffffc04e1a4c&gt;] ib_uverbs_post_send+0x37c/0x5d0 [ib_uverbs]
[&lt;ffffffffc04e5e36&gt;] ? rdma_lookup_put_uobject+0x26/0x60 [ib_uverbs]
[&lt;ffffffffc04dbce6&gt;] ib_uverbs_write+0x286/0x460 [ib_uverbs]
[&lt;ffffffffbb6f9457&gt;] ? security_file_permission+0x27/0xa0
[&lt;ffffffffbb641650&gt;] vfs_write+0xc0/0x1f0
[&lt;ffffffffbb64246f&gt;] SyS_write+0x7f/0xf0
[&lt;ffffffffbbb74ddb&gt;] system_call_fastpath+0x22/0x27

This patch fixes the problem by moving rvt_qp_wqe_unreserve() into
rvt_qp_complete_swqe() to simplify the code and make it less
error-prone.
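
The resulting inline, as a condensed sketch (the helper names follow
the rdmavt convention, but this is illustrative rather than the exact
upstream code; completion generation from opcode/status is elided):

        static inline u32 rvt_qp_complete_swqe(struct rvt_qp *qp,
                                               struct rvt_swqe *wqe,
                                               enum ib_wc_opcode opcode,
                                               enum ib_wc_status status)
        {
                u32 last;
                int flags = wqe-&gt;wr.send_flags;

                /* Unreserve unconditionally here, so a flushed OPFN
                 * request can no longer skip the step.
                 */
                rvt_qp_wqe_unreserve(qp, flags);
                rvt_put_qp_swqe(qp, wqe);

                /* ... read the SWQE fields needed for the completion,
                 * then publish the new index (see the completion API
                 * entries below) ...
                 */
                last = (qp-&gt;s_last + 1) % qp-&gt;s_size;
                smp_store_release(&amp;qp-&gt;s_last, last);
                return last;
        }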

Fixes: ca95f802ef51 ("IB/hfi1: Unreserve a reserved request when it is completed")
Link: https://lore.kernel.org/r/20190715164528.74174.31364.stgit@awfm-01.aw.intel.com
Cc: &lt;stable@vger.kernel.org&gt;
Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>IB/{rdmavt, qib, hfi1}: Convert to new completion API</title>
<updated>2019-09-16T06:23:21Z</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-06-13T12:30:52Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=bef755d188f5482221c5253ba3a515e8e93bc944'/>
<id>urn:sha1:bef755d188f5482221c5253ba3a515e8e93bc944</id>
<content type='text'>
Convert all completions to use the new completion routine, which
fixes a race between post send and completion where fields from
a SWQE can be read after the SWQE has been freed.

This patch also addresses issues reported in
https://marc.info/?l=linux-kernel&amp;m=155656897409107&amp;w=2.

The reserved operation path has no need for any barrier.

The ordering requirement for the other path is addressed by the
smp_load_acquire() barrier.
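
For reference, a minimal sketch of that acquire side, modeled on
rvt_qp_is_avail() (the function name and the availability math are
illustrative):

        static u32 sq_free_entries(struct rvt_qp *qp)
        {
                u32 head = qp-&gt;s_head;
                /* Pairs with the smp_store_release() of s_last on the
                 * completion side: every SWQE read made by the
                 * completer is ordered before the new index becomes
                 * visible here.
                 */
                u32 slast = smp_load_acquire(&amp;qp-&gt;s_last);

                return (head &gt;= slast) ? qp-&gt;s_size - (head - slast)
                                       : slast - head;
        }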

Cc: Andrea Parri &lt;andrea.parri@amarulasolutions.com&gt;
Reviewed-by: Michael J. Ruhl &lt;michael.j.ruhl@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
</entry>
<entry>
<title>IB/rdmavt: Add new completion inline</title>
<updated>2019-09-16T06:23:21Z</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-06-13T12:30:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9d3d11319bd20c7d8935f9368dc5566867b5e203'/>
<id>urn:sha1:9d3d11319bd20c7d8935f9368dc5566867b5e203</id>
<content type='text'>
There is open-coded send completion logic spread across all of
the drivers.

Convert to this routine to enforce the required ordering for
completions.  The routine fixes an ordering issue where the read of
the SWQE fields necessary for creating the completion can race with
a post send if the post send catches the send queue at the edge of
being full.  It is possible in that situation to read SWQE fields
that are still being written.

This new routine ensures that the SWQE fields are read prior to
advancing the index that post send uses to determine queue fullness.
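
The heart of the ordering, as a hedged sketch (the function is
hypothetical; the real inline also builds and posts the CQ entry):

        static inline u32 complete_swqe_core(struct rvt_qp *qp,
                                             struct rvt_swqe *wqe,
                                             u64 *wr_id, u32 *byte_len)
        {
                u32 last = (qp-&gt;s_last + 1) % qp-&gt;s_size;

                /* Read everything the completion needs from the SWQE
                 * first ...
                 */
                *wr_id = wqe-&gt;wr.wr_id;
                *byte_len = wqe-&gt;length;

                /* ... and only then advance s_last.  The release
                 * guarantees a post send that acquires the new index
                 * cannot see these SWQE fields mid-update.
                 */
                smp_store_release(&amp;qp-&gt;s_last, last);
                return last;
        }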

Reviewed-by: Michael J. Ruhl &lt;michael.j.ruhl@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
</entry>
<entry>
<title>IB/{rdmavt, qib, hfi1}: Use new routine to release reference counts</title>
<updated>2019-04-24T14:31:49Z</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-04-12T13:41:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d40f69c9b9dff3e47d9647943db267b5792ae215'/>
<id>urn:sha1:d40f69c9b9dff3e47d9647943db267b5792ae215</id>
<content type='text'>
The reference count adjustments made when a request completes are
open coded throughout.

Add a routine that performs all of the reference count adjustments
and use it.
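
A hedged sketch of such a helper (the UD address-handle branch is
illustrative and may differ in detail from the upstream routine):

        static inline void rvt_put_qp_swqe(struct rvt_qp *qp,
                                           struct rvt_swqe *wqe)
        {
                rvt_put_swqe(wqe);      /* drop the SWQE's MR refs */
                if (qp-&gt;allowed_ops == IB_OPCODE_UD)
                        /* UD requests also hold an AH reference */
                        atomic_dec(&amp;ibah_to_rvtah(wqe-&gt;ud_wr.wr.ah)-&gt;refcount);
        }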

Reviewed-by: Michael J. Ruhl &lt;michael.j.ruhl@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>IB/rdmavt: Fix ab/ba include issues</title>
<updated>2019-04-24T14:31:49Z</updated>
<author>
<name>Mike Marciniszyn</name>
<email>mike.marciniszyn@intel.com</email>
</author>
<published>2019-04-11T14:16:11Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=715ab1a862c85b08a9881851c7b1fba84b0dc26b'/>
<id>urn:sha1:715ab1a862c85b08a9881851c7b1fba84b0dc26b</id>
<content type='text'>
The current include file ordering for the rdmavt headers has an
ab/ba include issue that precludes using inlines from rdma_vt.h
in rdmavt_qp.h.

At the heart of the issue is that rdma_vt.h includes rdmavt_qp.h.

Fix the ordering issue by adjusting rdma_vt.h to not require rdmavt_qp.h
and move qp related inlines to rdmavt_qp.h.

Additionally, promote rvt_mmap_info to rdma_vt.h since it is shared
by rdmavt_cq.h and rdmavt_qp.h.
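
The shape of the problem, reduced to two hypothetical headers, a.h
standing in for rdma_vt.h and b.h for rdmavt_qp.h:

        /* a.h -- stands in for rdma_vt.h */
        #ifndef A_H
        #define A_H
        #include "b.h"  /* a.h pulls b.h in first ... */
        static inline int a_helper(void) { return 0; }
        #endif

        /* b.h -- stands in for rdmavt_qp.h */
        #ifndef B_H
        #define B_H
        /* ... so nothing here can call a_helper(): it is not yet
         * declared at this point, and including "a.h" from b.h would
         * be an a/b-a cycle that the include guards turn into an
         * empty include.  The fix is the same as in this patch: stop
         * including b.h from a.h and move the inlines that b.h needs
         * into b.h itself.
         */
        #endif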

Reviewed-by: Michael J. Ruhl &lt;michael.j.ruhl@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>IB/{rdmavt, hfi1}: Miscellaneous comment fixes</title>
<updated>2019-04-24T14:31:48Z</updated>
<author>
<name>Kaike Wan</name>
<email>kaike.wan@intel.com</email>
</author>
<published>2019-04-11T14:15:49Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=ea752bc5e50a03e337dfa5c8940d357c62300f8a'/>
<id>urn:sha1:ea752bc5e50a03e337dfa5c8940d357c62300f8a</id>
<content type='text'>
This patch fixes miscellaneous comment errors.

Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>IB/hfi1: Add TID RDMA WRITE functionality into RDMA verbs</title>
<updated>2019-02-05T23:07:44Z</updated>
<author>
<name>Kaike Wan</name>
<email>kaike.wan@intel.com</email>
</author>
<published>2019-01-24T05:51:39Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=3c6cb20a0d17d7a75778fb0935d6fa427c8177af'/>
<id>urn:sha1:3c6cb20a0d17d7a75778fb0935d6fa427c8177af</id>
<content type='text'>
This patch integrates the TID RDMA WRITE protocol into the normal
RDMA verbs framework. The TID RDMA WRITE protocol is an end-to-end
protocol between the hfi1 drivers on two OPA nodes that converts a
qualified RDMA WRITE request into a TID RDMA WRITE request to avoid
data copying on the responder side.

Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Mitko Haralanov &lt;mitko.haralanov@intel.com&gt;
Signed-off-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
</entry>
<entry>
<title>IB/hfi1: Add an s_acked_ack_queue pointer</title>
<updated>2019-02-05T23:07:43Z</updated>
<author>
<name>Kaike Wan</name>
<email>kaike.wan@intel.com</email>
</author>
<published>2019-01-24T05:48:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4f9264d156dc6c154a8a6cfae780730bad45c6f8'/>
<id>urn:sha1:4f9264d156dc6c154a8a6cfae780730bad45c6f8</id>
<content type='text'>
The s_ack_queue is managed by two pointers into the ring:
r_head_ack_queue and s_tail_ack_queue. r_head_ack_queue is the index of
where the next received request is going to be placed and s_tail_ack_queue
is the entry of the request currently being processed. This works
perfectly fine for normal Verbs as the requests are processed one at a
time and the s_tail_ack_queue is not moved until the request that it
points to is fully completed.

In this fashion, s_tail_ack_queue constantly chases r_head_ack_queue and
the two pointers can easily be used to determine "queue full" and "queue
empty" conditions.

The detection of these two conditions is important in determining
when an old entry can safely be overwritten with a newly received
request and when the resources associated with the old request can
safely be released.

When pipelined TID RDMA WRITE is introduced into this mix, things look
very different. r_head_ack_queue is still the point at which a newly
received request will be inserted, s_tail_ack_queue is still the
currently processed request. However, with pipelined TID RDMA WRITE
requests, s_tail_ack_queue moves to the next request once all TID RDMA
WRITE responses for that request have been sent. The rest of the protocol
for a particular request is managed by other pointers specific to TID RDMA
- r_tid_tail and r_tid_ack - which point to the entries for which the next
TID RDMA DATA packets are going to arrive and the request for which
the next TID RDMA ACK packets are to be generated, respectively.

What this means is that entries in the ring which are "behind"
s_tail_ack_queue (entries that s_tail_ack_queue has gone past) can no
longer be assumed complete. This is where the problem lies: a newly
received request could potentially overwrite a still active TID RDMA
WRITE request.

The reason why the TID RDMA pointers trail s_tail_ack_queue is that the
normal Verbs send engine uses s_tail_ack_queue as the pointer for the next
response. Since TID RDMA WRITE responses are processed by the normal Verbs
send engine, s_tail_ack_queue had to be moved to the next entry once all
TID RDMA WRITE response packets were sent to get the desired pipelining
between requests. Doing otherwise would mean that the normal Verbs send
engine would not be able to send the TID RDMA WRITE responses for the next
TID RDMA request until the current one is fully completed.

This patch introduces the s_acked_ack_queue index to point to the next
request to complete on the responder side. For requests other than TID
RDMA WRITE, s_acked_ack_queue should always be kept in sync with
s_tail_ack_queue. For TID RDMA WRITE request, it may fall behind
s_tail_ack_queue.
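
As a sketch of the resulting invariant (the function is hypothetical;
the two indices are the rvt_qp fields named above):

        static void advance_ack_queue_tail(struct rvt_qp *qp,
                                           bool tid_rdma_write, u32 n)
        {
                qp-&gt;s_tail_ack_queue = (qp-&gt;s_tail_ack_queue + 1) % n;
                if (!tid_rdma_write)
                        /* everything else completes in order, so keep
                         * the acked index in lock step with the tail */
                        qp-&gt;s_acked_ack_queue = qp-&gt;s_tail_ack_queue;
        }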

Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Mitko Haralanov &lt;mitko.haralanov@intel.com&gt;
Signed-off-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
</entry>
<entry>
<title>IB/hfi1: Increment the retry timeout value for TID RDMA READ request</title>
<updated>2019-02-05T22:53:55Z</updated>
<author>
<name>Kaike Wan</name>
<email>kaike.wan@intel.com</email>
</author>
<published>2019-01-24T03:31:57Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=039cd3daf19b9acbf080054d765cbceac842b6a0'/>
<id>urn:sha1:039cd3daf19b9acbf080054d765cbceac842b6a0</id>
<content type='text'>
The RC retry timeout value is based on the estimated time for the
response packet to come back. However, for a TID RDMA READ request,
due to the use of header suppression, the driver is normally not
notified of each incoming response packet, only of the last TID RDMA
READ response packet.
cover the transaction time for the entire length of a segment (default
256K) instead of that for a single packet. This patch addresses the
issue by introducing new retry timer functions to account for multiple
packets and wrapper functions for backward compatibility.
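
A back-of-the-envelope sketch of the scaling (the function name is
hypothetical; the actual computation lives in the new timer
functions):

        #include &lt;linux/kernel.h&gt;       /* DIV_ROUND_UP() */

        /* Wait for the whole segment, not a single response packet,
         * before declaring a TID RDMA READ response lost.
         */
        static u32 tid_rdma_read_timeout_packets(u32 segment_len,
                                                 u32 mtu)
        {
                return DIV_ROUND_UP(segment_len, mtu); /* 256K/4K = 64 */
        }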

Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
</entry>
<entry>
<title>IB/hfi1: TID RDMA RcvArray programming and TID allocation</title>
<updated>2019-02-05T22:53:55Z</updated>
<author>
<name>Kaike Wan</name>
<email>kaike.wan@intel.com</email>
</author>
<published>2019-01-24T03:30:07Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=838b6fd2d9ca29998869e4d1ecf4566efe807666'/>
<id>urn:sha1:838b6fd2d9ca29998869e4d1ecf4566efe807666</id>
<content type='text'>
TID entries are used by hfi1 hardware to receive data payload from
incoming packets directly into a user buffer and thus avoid data copying
by software. This patch implements the functions for TID allocation,
freeing, and programming TID RcvArray entries in hardware for kernel
clients. TID entries are managed via lists of TID groups similar to PSM.
Furthermore, to track TID resource allocation for each request,
software flows are also allocated and freed as needed. Since software
flows consume a large amount of memory for tracking TID allocation and
freeing, it is generally desirable to allocate them dynamically in the
send queue, and only for TID RDMA requests, but to pre-allocate them
for the receive queue, because the send queue could have thousands of
entries while the receive queue has only a limited number of entries.
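
The allocation policy above, as a hedged sketch (the struct and
function names are hypothetical):

        /* hypothetical per-request TID tracking state */
        struct tid_flow {
                u32 npagesets;
                u32 tidcnt;
        };

        /* Send queue: thousands of entries, so allocate on demand
         * and only when the request is actually TID RDMA. */
        static struct tid_flow *sq_flow_alloc(bool is_tid_rdma)
        {
                return is_tid_rdma ?
                        kzalloc(sizeof(struct tid_flow), GFP_KERNEL) :
                        NULL;
        }

        /* Receive queue: small and bounded, so pre-allocate a flow
         * for every ring entry up front. */
        static int rq_flows_prealloc(struct tid_flow **flows,
                                     u32 rq_size)
        {
                u32 i;

                for (i = 0; i &lt; rq_size; i++) {
                        flows[i] = kzalloc(sizeof(**flows), GFP_KERNEL);
                        if (!flows[i])
                                return -ENOMEM;
                }
                return 0;
        }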

Signed-off-by: Mitko Haralanov &lt;mitko.haralanov@intel.com&gt;
Signed-off-by: Ashutosh Dixit &lt;ashutosh.dixit@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
</entry>
</feed>
