<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/drivers/pci/hotplug/pciehp.h, branch linux-6.5.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.5.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.5.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2022-01-12T12:42:51Z</updated>
<entry>
<title>PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors</title>
<updated>2022-01-12T12:42:51Z</updated>
<author>
<name>Hans de Goede</name>
<email>hdegoede@redhat.com</email>
</author>
<published>2021-12-17T14:17:09Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=085a9f43433f30cbe8a1ade62d9d7827c3217f4d'/>
<id>urn:sha1:085a9f43433f30cbe8a1ade62d9d7827c3217f4d</id>
<content type='text'>
Use down_read_nested() and down_write_nested() when taking the
ctrl-&gt;reset_lock rw-sem, passing the number of PCIe hotplug controllers in
the path to the PCI root bus as lock subclass parameter.

This fixes the following false-positive lockdep report when unplugging a
Lenovo X1C8 from a Lenovo 2nd gen TB3 dock:

  pcieport 0000:06:01.0: pciehp: Slot(1): Link Down
  pcieport 0000:06:01.0: pciehp: Slot(1): Card not present
  ============================================
  WARNING: possible recursive locking detected
  5.16.0-rc2+ #621 Not tainted
  --------------------------------------------
  irq/124-pciehp/86 is trying to acquire lock:
  ffff8e5ac4299ef8 (&amp;ctrl-&gt;reset_lock){.+.+}-{3:3}, at: pciehp_check_presence+0x23/0x80

  but task is already holding lock:
  ffff8e5ac4298af8 (&amp;ctrl-&gt;reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180

   other info that might help us debug this:
   Possible unsafe locking scenario:

	 CPU0
	 ----
    lock(&amp;ctrl-&gt;reset_lock);
    lock(&amp;ctrl-&gt;reset_lock);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  3 locks held by irq/124-pciehp/86:
   #0: ffff8e5ac4298af8 (&amp;ctrl-&gt;reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
   #1: ffffffffa3b024e8 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_unconfigure_device+0x31/0x110
   #2: ffff8e5ac1ee2248 (&amp;dev-&gt;mutex){....}-{3:3}, at: device_release_driver+0x1c/0x40

  stack backtrace:
  CPU: 4 PID: 86 Comm: irq/124-pciehp Not tainted 5.16.0-rc2+ #621
  Hardware name: LENOVO 20U90SIT19/20U90SIT19, BIOS N2WET30W (1.20 ) 08/26/2021
  Call Trace:
   &lt;TASK&gt;
   dump_stack_lvl+0x59/0x73
   __lock_acquire.cold+0xc5/0x2c6
   lock_acquire+0xb5/0x2b0
   down_read+0x3e/0x50
   pciehp_check_presence+0x23/0x80
   pciehp_runtime_resume+0x5c/0xa0
   device_for_each_child+0x45/0x70
   pcie_port_device_runtime_resume+0x20/0x30
   pci_pm_runtime_resume+0xa7/0xc0
   __rpm_callback+0x41/0x110
   rpm_callback+0x59/0x70
   rpm_resume+0x512/0x7b0
   __pm_runtime_resume+0x4a/0x90
   __device_release_driver+0x28/0x240
   device_release_driver+0x26/0x40
   pci_stop_bus_device+0x68/0x90
   pci_stop_bus_device+0x2c/0x90
   pci_stop_and_remove_bus_device+0xe/0x20
   pciehp_unconfigure_device+0x6c/0x110
   pciehp_disable_slot+0x5b/0xe0
   pciehp_handle_presence_or_link_change+0xc3/0x2f0
   pciehp_ist+0x179/0x180

This lockdep warning is triggered because with Thunderbolt, hotplug ports
are nested. When removing multiple devices in a daisy-chain, each hotplug
port's reset_lock may be acquired recursively. It's never the same lock, so
the lockdep splat is a false positive.

Because locks at the same hierarchy level are never acquired recursively, a
per-level lockdep class is sufficient to fix the lockdep warning.

The choice to use one lockdep subclass per pcie-hotplug controller in the
path to the root-bus was made to conserve class keys because their number
is limited and the complexity grows quadratically with number of keys
according to Documentation/locking/lockdep-design.rst.

Link: https://lore.kernel.org/linux-pci/20190402021933.GA2966@mit.edu/
Link: https://lore.kernel.org/linux-pci/de684a28-9038-8fc6-27ca-3f6f2f6400d7@redhat.com/
Link: https://lore.kernel.org/r/20211217141709.379663-1-hdegoede@redhat.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208855
Reported-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Hans de Goede &lt;hdegoede@redhat.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>PCI: pciehp: Ignore Link Down/Up caused by error-induced Hot Reset</title>
<updated>2021-10-15T19:23:46Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2021-07-31T12:39:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=ea401499e943c307e6d44af6c2b4e068643e7884'/>
<id>urn:sha1:ea401499e943c307e6d44af6c2b4e068643e7884</id>
<content type='text'>
Stuart Hayes reports that an error handled by DPC at a Root Port results
in pciehp gratuitously bringing down a subordinate hotplug port:

  RP -- UP -- DP -- UP -- DP (hotplug) -- EP

pciehp brings the slot down because the Link to the Endpoint goes down.
That is caused by a Hot Reset being propagated as a result of DPC.
Per PCIe Base Spec 5.0, section 6.6.1 "Conventional Reset":

  For a Switch, the following must cause a hot reset to be sent on all
  Downstream Ports: [...]

  * The Data Link Layer of the Upstream Port reporting DL_Down status.
    In Switches that support Link speeds greater than 5.0 GT/s, the
    Upstream Port must direct the LTSSM of each Downstream Port to the
    Hot Reset state, but not hold the LTSSMs in that state. This permits
    each Downstream Port to begin Link training immediately after its
    hot reset completes. This behavior is recommended for all Switches.

  * Receiving a hot reset on the Upstream Port.

Once DPC recovers, pcie_do_recovery() walks down the hierarchy and
invokes pcie_portdrv_slot_reset() to restore each port's config space.
At that point, a hotplug interrupt is signaled per PCIe Base Spec r5.0,
section 6.7.3.4 "Software Notification of Hot-Plug Events":

  If the Port is enabled for edge-triggered interrupt signaling using
  MSI or MSI-X, an interrupt message must be sent every time the logical
  AND of the following conditions transitions from FALSE to TRUE: [...]

  * The Hot-Plug Interrupt Enable bit in the Slot Control register is
    set to 1b.

  * At least one hot-plug event status bit in the Slot Status register
    and its associated enable bit in the Slot Control register are both
    set to 1b.

Prevent pciehp from gratuitously bringing down the slot by clearing the
error-induced Data Link Layer State Changed event before restoring
config space.  Afterwards, check whether the link has unexpectedly
failed to retrain and synthesize a DLLSC event if so.

Allow each pcie_port_service_driver (one of them being pciehp) to define
a slot_reset callback and re-use the existing pm_iter() function to
iterate over the callbacks.

Thereby, the Endpoint driver remains bound throughout error recovery and
may restore the device to working state.

Surprise removal during error recovery is detected through a Presence
Detect Changed event.  The hotplug port is expected to not signal that
event as a result of a Hot Reset.

The issue isn't DPC-specific, it also occurs when an error is handled by
AER through aer_root_reset().  So while the issue was noticed only now,
it's been around since 2006 when AER support was first introduced.

[bhelgaas: drop PCI_ERROR_RECOVERY Kconfig, split pm_iter() rename to
preparatory patch]
Link: https://lore.kernel.org/linux-pci/08c046b0-c9f2-3489-eeef-7e7aca435bb9@gmail.com/
Fixes: 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver")
Link: https://lore.kernel.org/r/251f4edcc04c14f873ff1c967bc686169cd07d2d.1627638184.git.lukas@wunner.de
Reported-by: Stuart Hayes &lt;stuart.w.hayes@gmail.com&gt;
Tested-by: Stuart Hayes &lt;stuart.w.hayes@gmail.com&gt;
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: stable@vger.kernel.org # v2.6.19+: ba952824e6c1: PCI/portdrv: Report reset for frozen channel
Cc: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: Change the type of probe argument in reset functions</title>
<updated>2021-08-18T22:32:42Z</updated>
<author>
<name>Amey Narkhede</name>
<email>ameynarkhede03@gmail.com</email>
</author>
<published>2021-08-17T18:05:00Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9bdc81ce440ec6ea899b236879aee470ec388020'/>
<id>urn:sha1:9bdc81ce440ec6ea899b236879aee470ec388020</id>
<content type='text'>
Change the type of probe argument in functions which implement reset
methods from int to bool to make the context and intent clear.

Suggested-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Link: https://lore.kernel.org/r/20210817180500.1253-10-ameynarkhede03@gmail.com
Signed-off-by: Amey Narkhede &lt;ameynarkhede03@gmail.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
</content>
</entry>
<entry>
<title>PCI: Fix kernel-doc formatting</title>
<updated>2021-07-06T15:37:46Z</updated>
<author>
<name>Krzysztof Wilczyński</name>
<email>kw@linux.com</email>
</author>
<published>2021-07-03T15:13:02Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=347269c113f10fbe893f11dd3ae5f44aa15d3111'/>
<id>urn:sha1:347269c113f10fbe893f11dd3ae5f44aa15d3111</id>
<content type='text'>
Fix kernel-doc formatting throughout drivers/pci and related include files.
No change to functionality intended.

Check for warnings:

  $ find include drivers/pci -type f -path "*pci*.[ch]" | xargs scripts/kernel-doc -none

[bhelgaas: squashed to one commit]
Link: https://lore.kernel.org/r/20210509030237.368540-1-kw@linux.com
Link: https://lore.kernel.org/r/20210703151306.1922450-1-kw@linux.com
Link: https://lore.kernel.org/r/20210703151306.1922450-2-kw@linux.com
Link: https://lore.kernel.org/r/20210703151306.1922450-3-kw@linux.com
Link: https://lore.kernel.org/r/20210703151306.1922450-4-kw@linux.com
Link: https://lore.kernel.org/r/20210703151306.1922450-5-kw@linux.com
Signed-off-by: Krzysztof Wilczyński &lt;kw@linux.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Remove unused EMI() and HP_SUPR_RM() macros</title>
<updated>2020-04-23T18:45:35Z</updated>
<author>
<name>Ani Sinha</name>
<email>ani@anisinha.ca</email>
</author>
<published>2020-04-21T03:27:50Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a6cec3fdbd72e9f975f2db2bbfd1847831478195'/>
<id>urn:sha1:a6cec3fdbd72e9f975f2db2bbfd1847831478195</id>
<content type='text'>
EMI() and HP_SUPR_RM() are unused, so remove them.

Link: https://lore.kernel.org/r/1587439673-39652-1-git-send-email-ani@anisinha.ca
Signed-off-by: Ani Sinha &lt;ani@anisinha.ca&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Disable in-band presence detect when possible</title>
<updated>2020-02-21T04:44:30Z</updated>
<author>
<name>Alexandru Gagniuc</name>
<email>mr.nuke.me@gmail.com</email>
</author>
<published>2019-10-25T19:00:45Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=202853595e53f981c86656c49fc1cc1e3620f558'/>
<id>urn:sha1:202853595e53f981c86656c49fc1cc1e3620f558</id>
<content type='text'>
The presence detect state (PDS) is normally a logical OR of in-band and
out-of-band (OOB) presence detect.  As of PCIe 4.0, there is the option to
disable in-band presence so that the PDS bit always reflects the state of
the out-of-band presence.

The recommendation of the PCIe spec is to disable in-band presence whenever
supported (PCIe r5.0, appendix I implementation note):

  Due to architectural issues, the in-band (Physical-Layer-based) portion
  of the PD mechanism is deprecated for use with async hot-plug. One issue
  is that in-band PD as architected does not detect adapter removal during
  certain LTSSM states, notably the L1 and Disabled States.  Another issue
  is that when both in-band and OOB PD are being used together, the
  Presence Detect State bit and its associated interrupt mechanism always
  reflect the logical OR of the inband and OOB PD states, and with some
  hot-plug hardware configurations, it is important for software to detect
  and respond to in-band and OOB PD events independently.  If OOB PD is
  being used and the associated DSP supports In-Band PD Disable, it is
  recommended that the In-Band PD Disable bit be Set, and the Presence
  Detect State bit and its associated interrupt mechanism be used
  exclusively for OOB PD.  As a substitute for in-band PD with async
  hot-plug, the reference model uses either the DPC or the DLL Link Active
  mechanism.

Link: https://lore.kernel.org/r/20191025190047.38130-2-stuart.w.hayes@gmail.com
[bhelgaas: move PCI_EXP_SLTCAP2 read earlier &amp; print PCI_EXP_SLTCAP2_IBPD
value (suggested by Lukas)]
Signed-off-by: Alexandru Gagniuc &lt;mr.nuke.me@gmail.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Andy Shevchenko &lt;andy.shevchenko@gmail.com&gt;
Reviewed-by: Lukas Wunner &lt;lukas@wunner.de&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Prevent deadlock on disconnect</title>
<updated>2019-11-12T23:17:42Z</updated>
<author>
<name>Mika Westerberg</name>
<email>mika.westerberg@linux.intel.com</email>
</author>
<published>2019-10-29T17:00:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=87d0f2a5536fdf5053a6d341880f96135549a644'/>
<id>urn:sha1:87d0f2a5536fdf5053a6d341880f96135549a644</id>
<content type='text'>
This addresses deadlocks in these common cases in hierarchies containing
two switches:

  - All involved ports are runtime suspended and they are unplugged. This
    can happen easily if the drivers involved automatically enable runtime
    PM (xHCI for example does that).

  - System is suspended (e.g., closing the lid on a laptop) with a dock +
    something else connected, and the dock is unplugged while suspended.

These cases lead to the following deadlock:

  INFO: task irq/126-pciehp:198 blocked for more than 120 seconds.
  irq/126-pciehp  D    0   198      2 0x80000000
  Call Trace:
   schedule+0x2c/0x80
   schedule_timeout+0x246/0x350
   wait_for_completion+0xb7/0x140
   kthread_stop+0x49/0x110
   free_irq+0x32/0x70
   pcie_shutdown_notification+0x2f/0x50
   pciehp_remove+0x27/0x50
   pcie_port_remove_service+0x36/0x50
   device_release_driver+0x12/0x20
   bus_remove_device+0xec/0x160
   device_del+0x13b/0x350
   device_unregister+0x1a/0x60
   remove_iter+0x1e/0x30
   device_for_each_child+0x56/0x90
   pcie_port_device_remove+0x22/0x40
   pcie_portdrv_remove+0x20/0x60
   pci_device_remove+0x3e/0xc0
   device_release_driver_internal+0x18c/0x250
   device_release_driver+0x12/0x20
   pci_stop_bus_device+0x6f/0x90
   pci_stop_bus_device+0x31/0x90
   pci_stop_and_remove_bus_device+0x12/0x20
   pciehp_unconfigure_device+0x88/0x140
   pciehp_disable_slot+0x6a/0x110
   pciehp_handle_presence_or_link_change+0x263/0x400
   pciehp_ist+0x1c9/0x1d0
   irq_thread_fn+0x24/0x60
   irq_thread+0xeb/0x190
   kthread+0x120/0x140

  INFO: task irq/190-pciehp:2288 blocked for more than 120 seconds.
  irq/190-pciehp  D    0  2288      2 0x80000000
  Call Trace:
   __schedule+0x2a2/0x880
   schedule+0x2c/0x80
   schedule_preempt_disabled+0xe/0x10
   mutex_lock+0x2c/0x30
   pci_lock_rescan_remove+0x15/0x20
   pciehp_unconfigure_device+0x4d/0x140
   pciehp_disable_slot+0x6a/0x110
   pciehp_handle_presence_or_link_change+0x263/0x400
   pciehp_ist+0x1c9/0x1d0
   irq_thread_fn+0x24/0x60
   irq_thread+0xeb/0x190
   kthread+0x120/0x140

What happens here is that the whole hierarchy is runtime resumed and the
parent PCIe downstream port, which got the hot-remove event, starts
removing devices below it, taking pci_lock_rescan_remove() lock. When the
child PCIe port is runtime resumed it calls pciehp_check_presence() which
ends up calling pciehp_card_present() and pciehp_check_link_active().  Both
of these use pcie_capability_read_word(), which notices that the underlying
device is already gone and returns PCIBIOS_DEVICE_NOT_FOUND with the
capability value set to 0. When pciehp gets this value it thinks that its
child device is also hot-removed and schedules its IRQ thread to handle the
event.

The deadlock happens when the child's IRQ thread runs and tries to acquire
pci_lock_rescan_remove() which is already taken by the parent and the
parent waits for the child's IRQ thread to finish.

Prevent this from happening by checking the return value of
pcie_capability_read_word() and if it is PCIBIOS_DEVICE_NOT_FOUND stop
performing any hot-removal activities.

[bhelgaas: add common scenarios to commit log]
Link: https://lore.kernel.org/r/20191029170022.57528-2-mika.westerberg@linux.intel.com
Tested-by: Kai-Heng Feng &lt;kai.heng.feng@canonical.com&gt;
Signed-off-by: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Avoid returning prematurely from sysfs requests</title>
<updated>2019-10-04T20:39:36Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2019-08-09T10:28:43Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=157c1062fcd86ade3c674503705033051fd3d401'/>
<id>urn:sha1:157c1062fcd86ade3c674503705033051fd3d401</id>
<content type='text'>
A sysfs request to enable or disable a PCIe hotplug slot should not
return before it has been carried out.  That is sought to be achieved by
waiting until the controller's "pending_events" have been cleared.

However the IRQ thread pciehp_ist() clears the "pending_events" before
it acts on them.  If pciehp_sysfs_enable_slot() / _disable_slot() happen
to check the "pending_events" after they have been cleared but while
pciehp_ist() is still running, the functions may return prematurely
with an incorrect return value.

Fix by introducing an "ist_running" flag which must be false before a sysfs
request is allowed to return.

Fixes: 32a8cef274fe ("PCI: pciehp: Enable/disable exclusively from IRQ thread")
Link: https://lore.kernel.org/linux-pci/1562226638-54134-1-git-send-email-wangxiongfeng2@huawei.com
Link: https://lore.kernel.org/r/4174210466e27eb7e2243dd1d801d5f75baaffd8.1565345211.git.lukas@wunner.de
Reported-and-tested-by: Xiongfeng Wang &lt;wangxiongfeng2@huawei.com&gt;
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: stable@vger.kernel.org # v4.19+
</content>
</entry>
<entry>
<title>PCI: pciehp: Refer to "Indicators" instead of "LEDs" in comments</title>
<updated>2019-09-05T20:52:24Z</updated>
<author>
<name>Bjorn Helgaas</name>
<email>bhelgaas@google.com</email>
</author>
<published>2019-09-05T20:52:24Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4a06c2c38235d4a0bc436131591e2927288992fe'/>
<id>urn:sha1:4a06c2c38235d4a0bc436131591e2927288992fe</id>
<content type='text'>
The PCIe spec doesn't mention "green LEDs" or "amber LEDs".  Replace those
terms with "Power Indicator" and "Attention Indicator" so the comments
match the spec language.  Comment changes only.

Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Remove pciehp_green_led_{on,off,blink}()</title>
<updated>2019-09-05T20:45:00Z</updated>
<author>
<name>Denis Efremov</name>
<email>efremov@linux.com</email>
</author>
<published>2019-09-03T11:10:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9194094be418f4f13fbab3a6049ea18acb137178'/>
<id>urn:sha1:9194094be418f4f13fbab3a6049ea18acb137178</id>
<content type='text'>
Remove pciehp_green_led_{on,off,blink}() and use pciehp_set_indicators()
instead, since the code is mostly the same.

[bhelgaas: drop set_power_indicator() wrapper to reduce the number of
interfaces]
Link: https://lore.kernel.org/r/20190903111021.1559-5-efremov@linux.com
Signed-off-by: Denis Efremov &lt;efremov@linux.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Kuppuswamy Sathyanarayanan &lt;sathyanarayanan.kuppuswamy@linux.intel.com&gt;
</content>
</entry>
</feed>
