<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/drivers/scsi/lpfc/lpfc_debugfs.c, branch linux-5.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2019-05-11T05:49:55Z</updated>
<entry>
<title>scsi: lpfc: change snprintf to scnprintf for possible overflow</title>
<updated>2019-05-11T05:49:55Z</updated>
<author>
<name>Silvio Cesare</name>
<email>silvio.cesare@gmail.com</email>
</author>
<published>2019-03-21T16:44:32Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=1119e98d0eb093f2ac8c230ef31afeb3be522520'/>
<id>urn:sha1:1119e98d0eb093f2ac8c230ef31afeb3be522520</id>
<content type='text'>
commit e7f7b6f38a44697428f5a2e7c606de028df2b0e3 upstream.

Change snprintf to scnprintf. There are generally two cases where using
snprintf causes problems.

1) Uses of size += snprintf(buf, SIZE - size, fmt, ...)
In this case, if snprintf would have written more characters than what the
buffer size (SIZE) is, then size will end up larger than SIZE. In later
uses of snprintf, SIZE - size will result in a negative number, leading
to problems. Note that size might already be too large by using
size = snprintf before the code reaches a case of size += snprintf.

2) If size is ultimately used as a length parameter for a copy back to user
space, then it will potentially allow for a buffer overflow and information
disclosure when size is greater than SIZE. When the size is used to index
the buffer directly, we can have memory corruption. This also means when
size = snprintf... is used, it may also cause problems since size may become
large.  Copying to userspace is mitigated by the HARDENED_USERCOPY kernel
configuration.

The solution to these issues is to use scnprintf which returns the number of
characters actually written to the buffer, so the size variable will never
exceed SIZE.

Signed-off-by: Silvio Cesare &lt;silvio.cesare@gmail.com&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: James Smart &lt;james.smart@broadcom.com&gt;
Cc: Dick Kennedy &lt;dick.kennedy@broadcom.com&gt;
Cc: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Greg KH &lt;greg@kroah.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>scsi: lpfc: fix a handful of indentation issues</title>
<updated>2019-02-14T03:15:42Z</updated>
<author>
<name>Colin Ian King</name>
<email>colin.king@canonical.com</email>
</author>
<published>2019-02-12T15:29:45Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=258f84fae3aced90613b44e69b123b87c693ccaf'/>
<id>urn:sha1:258f84fae3aced90613b44e69b123b87c693ccaf</id>
<content type='text'>
There are a handful of statements that are indented incorrectly. Fix these.

Signed-off-by: Colin Ian King &lt;colin.king@canonical.com&gt;
Acked-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
</entry>
<entry>
<title>scsi: lpfc: Update 12.2.0.0 file copyrights to 2019</title>
<updated>2019-02-06T03:29:50Z</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2019-01-28T19:14:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0d041215f0b4420bf193f3b6e13a1887ffc8320c'/>
<id>urn:sha1:0d041215f0b4420bf193f3b6e13a1887ffc8320c</id>
<content type='text'>
For files modified as part of 12.2.0.0 patches, update copyright to 2019

Signed-off-by: Dick Kennedy &lt;dick.kennedy@broadcom.com&gt;
Signed-off-by: James Smart &lt;jsmart2021@gmail.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
</entry>
<entry>
<title>scsi: lpfc: Fix default driver parameter collision for allowing NPIV support</title>
<updated>2019-02-06T03:29:50Z</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2019-01-28T19:14:38Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f6e84790520ac7a14abd921db5a2a1e790e363f9'/>
<id>urn:sha1:f6e84790520ac7a14abd921db5a2a1e790e363f9</id>
<content type='text'>
The conversion to enable SCSI and NVME fc4 support ran into an issue with
NPIV support. With NVME, NPIV is not currently supported, but with SCSI it
was. The driver reverted to its lowest setting meaning NPIV with SCSI was
not allowed.

Convert the NPIV checks and implementation so that SCSI can continue to
allow NPIV support.

Signed-off-by: Dick Kennedy &lt;dick.kennedy@broadcom.com&gt;
Signed-off-by: James Smart &lt;jsmart2021@gmail.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
</entry>
<entry>
<title>scsi: lpfc: Rework EQ/CQ processing to address interrupt coalescing</title>
<updated>2019-02-06T03:29:49Z</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2019-01-28T19:14:33Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=32517fc0975bf8dd3967e43c2a6350f038a3af28'/>
<id>urn:sha1:32517fc0975bf8dd3967e43c2a6350f038a3af28</id>
<content type='text'>
When driving high iop counts, auto_imax coalescing kicks in and drives the
performance to extremely small iops levels.

There are two issues:

 1) auto_imax is enabled by default. The auto algorithm, when iops gets
    high, divides the iops by the hdwq count and uses that value to
    calculate EQ_Delay. The EQ_Delay is set uniformly on all EQs whether
    they have load or not. The EQ_delay is only manipulated every 5s (a
    long time). Thus there were large 5s swings of no interrupt delay
    followed by large/maximum delay, before repeating.

 2) When processing a CQ, the driver got mixed up on the rate of when
    to ring the doorbell to keep the chip appraised of the eqe or cqe
    consumption as well as how how long to sit in the thread and
    process queue entries. Currently, the driver capped its work at
    64 entries (very small) and exited/rearmed the CQ.  Thus, on heavy
    loads, additional overheads were taken to exit and re-enter the
    interrupt handler. Worse, if in the large/maximum coalescing
    windows,k it could be a while before getting back to servicing.

The issues are corrected by the following:

 - A change in defaults. Auto_imax is turned OFF and fcp_imax is set
   to 0. Thus all interrupts are immediate.

 - Cleanup of field names and their meanings. Existing names were
   non-intuitive or used for duplicate things.

 - Added max_proc_limit field, to control the length of time the
   handlers would service completions.

 - Reworked EQ handling:
    Added common routine that walks eq, applying notify interval and max
      processing limits. Use queue_claimed to claim ownership of the queue
      while processing. Always rearm the queue whenever the common routine
      is called.
    Rework queue element processing, namely to eliminate hba_index vs
      host_index. Only one index is necessary. The queue entry can be
      marked invalid and the host_index updated immediately after eqe
      processing.
    After rework, xx_release routines are now DB write functions. Renamed
      the routines as such.
    Moved lpfc_sli4_eq_flush(), which does similar action, to same area.
    Replaced the 2 individual loops that walk an eq with a call to the
      common routine.
    Slightly revised lpfc_sli4_hba_handle_eqe() calling syntax.
    Added per-cpu counters to detect interrupt rates and scale
      interrupt coalescing values.

 - Reworked CQ handling:
    Added common routine that walks cq, applying notify interval and max
      processing limits. Use queue_claimed to claim ownership of the queue
      while processing. Always rearm the queue whenever the common routine
      is called.
    Rework queue element processing, namely to eliminate hba_index vs
      host_index. Only one index is necessary. The queue entry can be
      marked invalid and the host_index updated immediately after cqe
      processing.
    After rework, xx_release routines are now DB write functions.  Renamed
      the routines as such.
    Replaced the 3 individual loops that walk a cq with a call to the
      common routine.
    Redefined lpfc_sli4_sp_handle_mcqe() to commong handler definition with
      queue reference. Add increment for mbox completion to handler.

 - Added a new module/sysfs attribute: lpfc_cq_max_proc_limit To allow
   dynamic changing of the CQ max_proc_limit value being used.

Although this leaves an EQ as an immediate interrupt, that interrupt will
only occur if a CQ bound to it is in an armed state and has cqe's to
process.  By staying in the cq processing routine longer, high loads will
avoid generating more interrupts as they will only rearm as the processing
thread exits. The immediately interrupt is also beneficial to idle or
lower-processing CQ's as they get serviced immediately without being
penalized by sharing an EQ with a more loaded CQ.

Signed-off-by: Dick Kennedy &lt;dick.kennedy@broadcom.com&gt;
Signed-off-by: James Smart &lt;jsmart2021@gmail.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
</entry>
<entry>
<title>scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues</title>
<updated>2019-02-06T03:29:49Z</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2019-01-28T19:14:31Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6a828b0f6192b4930894925d1c1d0dc1f1d99e6e'/>
<id>urn:sha1:6a828b0f6192b4930894925d1c1d0dc1f1d99e6e</id>
<content type='text'>
So far MSIX vector allocation assumed it would be 1:1 with hardware
queues. However, there are several reasons why fewer MSIX vectors may be
allocated than hardware queues such as the platform being out of vectors or
adapter limits being less than cpu count.

This patch reworks the MSIX/EQ relationships with the per-cpu hardware
queues so they can function independently. MSIX vectors will be equitably
split been cpu sockets/cores and then the per-cpu hardware queues will be
mapped to the vectors most efficient for them.

Signed-off-by: Dick Kennedy &lt;dick.kennedy@broadcom.com&gt;
Signed-off-by: James Smart &lt;jsmart2021@gmail.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
</entry>
<entry>
<title>scsi: lpfc: Adapt partitioned XRI lists to efficient sharing</title>
<updated>2019-02-06T03:29:09Z</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2019-01-28T19:14:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c490850a094794e7515737a6939146966c826577'/>
<id>urn:sha1:c490850a094794e7515737a6939146966c826577</id>
<content type='text'>
The XRI get/put lists were partitioned per hardware queue. However, the
adapter rarely had sufficient resources to give a large number of resources
per queue. As such, it became common for a cpu to encounter a lack of XRI
resource and request the upper io stack to retry after returning a BUSY
condition. This occurred even though other cpus were idle and not using
their resources.

Create as efficient a scheme as possible to move resources to the cpus that
need them. Each cpu maintains a small private pool which it allocates from
for io. There is a watermark that the cpu attempts to keep in the private
pool.  The private pool, when empty, pulls from a global pool from the
cpu. When the cpu's global pool is empty it will pull from other cpu's
global pool. As there many cpu global pools (1 per cpu or hardware queue
count) and as each cpu selects what cpu to pull from at different rates and
at different times, it creates a radomizing effect that minimizes the
number of cpu's that will contend with each other when the steal XRI's from
another cpu's global pool.

On io completion, a cpu will push the XRI back on to its private pool.  A
watermark level is maintained for the private pool such that when it is
exceeded it will move XRI's to the CPU global pool so that other cpu's may
allocate them.

On NVME, as heartbeat commands are critical to get placed on the wire, a
single expedite pool is maintained. When a heartbeat is to be sent, it will
allocate an XRI from the expedite pool rather than the normal cpu
private/global pools. On any io completion, if a reduction in the expedite
pools is seen, it will be replenished before the XRI is placed on the cpu
private pool.

Statistics are added to aid understanding the XRI levels on each cpu and
their behaviors.

Signed-off-by: Dick Kennedy &lt;dick.kennedy@broadcom.com&gt;
Signed-off-by: James Smart &lt;jsmart2021@gmail.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
</entry>
<entry>
<title>scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures</title>
<updated>2019-02-06T03:29:08Z</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2019-01-28T19:14:25Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4c47efc140fa926f00aa59c248458d95bd7b5eab'/>
<id>urn:sha1:4c47efc140fa926f00aa59c248458d95bd7b5eab</id>
<content type='text'>
Many io statistics were being sampled and saved using adapter-based data
structures. This was creating a lot of contention and cache thrashing in
the I/O path.

Move the statistics to the hardware queue data structures.  Given the
per-queue data structures, use of atomic types is lessened.

Add new sysfs and debugfs stat routines to collate the per hardware queue
values and report at an adapter level.

Signed-off-by: Dick Kennedy &lt;dick.kennedy@broadcom.com&gt;
Signed-off-by: James Smart &lt;jsmart2021@gmail.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
</entry>
<entry>
<title>scsi: lpfc: Adapt cpucheck debugfs logic to Hardware Queues</title>
<updated>2019-02-06T03:28:11Z</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2019-01-28T19:14:24Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=63df6d637e3358e64b43e7a774939f8f963926cb'/>
<id>urn:sha1:63df6d637e3358e64b43e7a774939f8f963926cb</id>
<content type='text'>
Similar to the io execution path that reports cpu context information, the
debugfs routines for cpu information needs to be aligned with new hardware
queue implementation.

Convert debugfs cnd nvme cpucheck statistics to report information per
Hardware Queue.

Signed-off-by: Dick Kennedy &lt;dick.kennedy@broadcom.com&gt;
Signed-off-by: James Smart &lt;jsmart2021@gmail.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
</entry>
<entry>
<title>scsi: lpfc: Partition XRI buffer list across Hardware Queues</title>
<updated>2019-02-06T03:24:22Z</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2019-01-28T19:14:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5e5b511d8bfaf765cb92a695cda336c936cb86dc'/>
<id>urn:sha1:5e5b511d8bfaf765cb92a695cda336c936cb86dc</id>
<content type='text'>
Once the IO buff allocations were made shared, there was a single XRI
buffer list shared by all hardware queues.  A single list isn't great for
performance when shared across the per-cpu hardware queues.

Create a separate XRI IO buffer get/put list for each Hardware Queue.  As
SGLs and associated IO buffers get allocated/posted to the firmware; round
robin their assignment across all available hardware Queues so that there
is an equitable assignment.

Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic
for XRI allocation.

Add a debugfs interface to display hardware queue statistics

Added new empty_io_bufs counter to track if a cpu runs out of XRIs.

Replace common_ variables/names with io_ to make meanings clearer.

Signed-off-by: Dick Kennedy &lt;dick.kennedy@broadcom.com&gt;
Signed-off-by: James Smart &lt;jsmart2021@gmail.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
</entry>
</feed>
