<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/drivers/pci/hotplug/pciehp_core.c, branch linux-6.5.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.5.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.5.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2022-01-12T12:42:51Z</updated>
<entry>
<title>PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors</title>
<updated>2022-01-12T12:42:51Z</updated>
<author>
<name>Hans de Goede</name>
<email>hdegoede@redhat.com</email>
</author>
<published>2021-12-17T14:17:09Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=085a9f43433f30cbe8a1ade62d9d7827c3217f4d'/>
<id>urn:sha1:085a9f43433f30cbe8a1ade62d9d7827c3217f4d</id>
<content type='text'>
Use down_read_nested() and down_write_nested() when taking the
ctrl-&gt;reset_lock rw-sem, passing the number of PCIe hotplug controllers in
the path to the PCI root bus as lock subclass parameter.

This fixes the following false-positive lockdep report when unplugging a
Lenovo X1C8 from a Lenovo 2nd gen TB3 dock:

  pcieport 0000:06:01.0: pciehp: Slot(1): Link Down
  pcieport 0000:06:01.0: pciehp: Slot(1): Card not present
  ============================================
  WARNING: possible recursive locking detected
  5.16.0-rc2+ #621 Not tainted
  --------------------------------------------
  irq/124-pciehp/86 is trying to acquire lock:
  ffff8e5ac4299ef8 (&amp;ctrl-&gt;reset_lock){.+.+}-{3:3}, at: pciehp_check_presence+0x23/0x80

  but task is already holding lock:
  ffff8e5ac4298af8 (&amp;ctrl-&gt;reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180

   other info that might help us debug this:
   Possible unsafe locking scenario:

	 CPU0
	 ----
    lock(&amp;ctrl-&gt;reset_lock);
    lock(&amp;ctrl-&gt;reset_lock);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  3 locks held by irq/124-pciehp/86:
   #0: ffff8e5ac4298af8 (&amp;ctrl-&gt;reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
   #1: ffffffffa3b024e8 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_unconfigure_device+0x31/0x110
   #2: ffff8e5ac1ee2248 (&amp;dev-&gt;mutex){....}-{3:3}, at: device_release_driver+0x1c/0x40

  stack backtrace:
  CPU: 4 PID: 86 Comm: irq/124-pciehp Not tainted 5.16.0-rc2+ #621
  Hardware name: LENOVO 20U90SIT19/20U90SIT19, BIOS N2WET30W (1.20 ) 08/26/2021
  Call Trace:
   &lt;TASK&gt;
   dump_stack_lvl+0x59/0x73
   __lock_acquire.cold+0xc5/0x2c6
   lock_acquire+0xb5/0x2b0
   down_read+0x3e/0x50
   pciehp_check_presence+0x23/0x80
   pciehp_runtime_resume+0x5c/0xa0
   device_for_each_child+0x45/0x70
   pcie_port_device_runtime_resume+0x20/0x30
   pci_pm_runtime_resume+0xa7/0xc0
   __rpm_callback+0x41/0x110
   rpm_callback+0x59/0x70
   rpm_resume+0x512/0x7b0
   __pm_runtime_resume+0x4a/0x90
   __device_release_driver+0x28/0x240
   device_release_driver+0x26/0x40
   pci_stop_bus_device+0x68/0x90
   pci_stop_bus_device+0x2c/0x90
   pci_stop_and_remove_bus_device+0xe/0x20
   pciehp_unconfigure_device+0x6c/0x110
   pciehp_disable_slot+0x5b/0xe0
   pciehp_handle_presence_or_link_change+0xc3/0x2f0
   pciehp_ist+0x179/0x180

This lockdep warning is triggered because with Thunderbolt, hotplug ports
are nested. When removing multiple devices in a daisy-chain, each hotplug
port's reset_lock may be acquired recursively. It's never the same lock, so
the lockdep splat is a false positive.

Because locks at the same hierarchy level are never acquired recursively, a
per-level lockdep class is sufficient to fix the lockdep warning.

The choice to use one lockdep subclass per pcie-hotplug controller in the
path to the root-bus was made to conserve class keys because their number
is limited and the complexity grows quadratically with number of keys
according to Documentation/locking/lockdep-design.rst.

Link: https://lore.kernel.org/linux-pci/20190402021933.GA2966@mit.edu/
Link: https://lore.kernel.org/linux-pci/de684a28-9038-8fc6-27ca-3f6f2f6400d7@redhat.com/
Link: https://lore.kernel.org/r/20211217141709.379663-1-hdegoede@redhat.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208855
Reported-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Hans de Goede &lt;hdegoede@redhat.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>PCI: pciehp: Ignore Link Down/Up caused by error-induced Hot Reset</title>
<updated>2021-10-15T19:23:46Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2021-07-31T12:39:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=ea401499e943c307e6d44af6c2b4e068643e7884'/>
<id>urn:sha1:ea401499e943c307e6d44af6c2b4e068643e7884</id>
<content type='text'>
Stuart Hayes reports that an error handled by DPC at a Root Port results
in pciehp gratuitously bringing down a subordinate hotplug port:

  RP -- UP -- DP -- UP -- DP (hotplug) -- EP

pciehp brings the slot down because the Link to the Endpoint goes down.
That is caused by a Hot Reset being propagated as a result of DPC.
Per PCIe Base Spec 5.0, section 6.6.1 "Conventional Reset":

  For a Switch, the following must cause a hot reset to be sent on all
  Downstream Ports: [...]

  * The Data Link Layer of the Upstream Port reporting DL_Down status.
    In Switches that support Link speeds greater than 5.0 GT/s, the
    Upstream Port must direct the LTSSM of each Downstream Port to the
    Hot Reset state, but not hold the LTSSMs in that state. This permits
    each Downstream Port to begin Link training immediately after its
    hot reset completes. This behavior is recommended for all Switches.

  * Receiving a hot reset on the Upstream Port.

Once DPC recovers, pcie_do_recovery() walks down the hierarchy and
invokes pcie_portdrv_slot_reset() to restore each port's config space.
At that point, a hotplug interrupt is signaled per PCIe Base Spec r5.0,
section 6.7.3.4 "Software Notification of Hot-Plug Events":

  If the Port is enabled for edge-triggered interrupt signaling using
  MSI or MSI-X, an interrupt message must be sent every time the logical
  AND of the following conditions transitions from FALSE to TRUE: [...]

  * The Hot-Plug Interrupt Enable bit in the Slot Control register is
    set to 1b.

  * At least one hot-plug event status bit in the Slot Status register
    and its associated enable bit in the Slot Control register are both
    set to 1b.

Prevent pciehp from gratuitously bringing down the slot by clearing the
error-induced Data Link Layer State Changed event before restoring
config space.  Afterwards, check whether the link has unexpectedly
failed to retrain and synthesize a DLLSC event if so.

Allow each pcie_port_service_driver (one of them being pciehp) to define
a slot_reset callback and re-use the existing pm_iter() function to
iterate over the callbacks.

Thereby, the Endpoint driver remains bound throughout error recovery and
may restore the device to working state.

Surprise removal during error recovery is detected through a Presence
Detect Changed event.  The hotplug port is expected to not signal that
event as a result of a Hot Reset.

The issue isn't DPC-specific, it also occurs when an error is handled by
AER through aer_root_reset().  So while the issue was noticed only now,
it's been around since 2006 when AER support was first introduced.

[bhelgaas: drop PCI_ERROR_RECOVERY Kconfig, split pm_iter() rename to
preparatory patch]
Link: https://lore.kernel.org/linux-pci/08c046b0-c9f2-3489-eeef-7e7aca435bb9@gmail.com/
Fixes: 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver")
Link: https://lore.kernel.org/r/251f4edcc04c14f873ff1c967bc686169cd07d2d.1627638184.git.lukas@wunner.de
Reported-by: Stuart Hayes &lt;stuart.w.hayes@gmail.com&gt;
Tested-by: Stuart Hayes &lt;stuart.w.hayes@gmail.com&gt;
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: stable@vger.kernel.org # v2.6.19+: ba952824e6c1: PCI/portdrv: Report reset for frozen channel
Cc: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: Fix kerneldoc warnings</title>
<updated>2020-08-05T23:23:14Z</updated>
<author>
<name>Krzysztof Kozlowski</name>
<email>krzk@kernel.org</email>
</author>
<published>2020-07-29T20:12:19Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9b41d19aff40907bdd64ae230aaadbb3de3db710'/>
<id>urn:sha1:9b41d19aff40907bdd64ae230aaadbb3de3db710</id>
<content type='text'>
Fix kerneldoc warnings, e.g.,

  $ make W=1 drivers/pci/
  drivers/pci/ats.c:196: warning: Function parameter or member 'pdev' not described in 'pci_enable_pri'
  drivers/pci/ats.c:196: warning: Function parameter or member 'reqs' not described in 'pci_enable_pri'
  ...

Link: https://lore.kernel.org/r/20200729201224.26799-2-krzk@kernel.org
Link: https://lore.kernel.org/r/20200729201224.26799-3-krzk@kernel.org
Link: https://lore.kernel.org/r/20200729201224.26799-4-krzk@kernel.org
Link: https://lore.kernel.org/r/20200729201224.26799-5-krzk@kernel.org
Link: https://lore.kernel.org/r/20200729201224.26799-6-krzk@kernel.org
Link: https://lore.kernel.org/r/20200729201224.26799-7-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski &lt;krzk@kernel.org&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
</content>
</entry>
<entry>
<title>PM: sleep: core: Rename dev_pm_smart_suspend_and_suspended()</title>
<updated>2020-04-24T19:32:41Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2020-04-18T16:52:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=fa2bfead910322e44e7e0bb74364ac198a2abd32'/>
<id>urn:sha1:fa2bfead910322e44e7e0bb74364ac198a2abd32</id>
<content type='text'>
Because all callers of dev_pm_smart_suspend_and_suspended use it only
for checking whether or not to skip driver suspend callbacks for a
device, rename it to dev_pm_skip_suspend() in analogy with
dev_pm_skip_resume().

No functional impact.

Suggested-by: Alan Stern &lt;stern@rowland.harvard.edu&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Alan Stern &lt;stern@rowland.harvard.edu&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Prevent deadlock on disconnect</title>
<updated>2019-11-12T23:17:42Z</updated>
<author>
<name>Mika Westerberg</name>
<email>mika.westerberg@linux.intel.com</email>
</author>
<published>2019-10-29T17:00:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=87d0f2a5536fdf5053a6d341880f96135549a644'/>
<id>urn:sha1:87d0f2a5536fdf5053a6d341880f96135549a644</id>
<content type='text'>
This addresses deadlocks in these common cases in hierarchies containing
two switches:

  - All involved ports are runtime suspended and they are unplugged. This
    can happen easily if the drivers involved automatically enable runtime
    PM (xHCI for example does that).

  - System is suspended (e.g., closing the lid on a laptop) with a dock +
    something else connected, and the dock is unplugged while suspended.

These cases lead to the following deadlock:

  INFO: task irq/126-pciehp:198 blocked for more than 120 seconds.
  irq/126-pciehp  D    0   198      2 0x80000000
  Call Trace:
   schedule+0x2c/0x80
   schedule_timeout+0x246/0x350
   wait_for_completion+0xb7/0x140
   kthread_stop+0x49/0x110
   free_irq+0x32/0x70
   pcie_shutdown_notification+0x2f/0x50
   pciehp_remove+0x27/0x50
   pcie_port_remove_service+0x36/0x50
   device_release_driver+0x12/0x20
   bus_remove_device+0xec/0x160
   device_del+0x13b/0x350
   device_unregister+0x1a/0x60
   remove_iter+0x1e/0x30
   device_for_each_child+0x56/0x90
   pcie_port_device_remove+0x22/0x40
   pcie_portdrv_remove+0x20/0x60
   pci_device_remove+0x3e/0xc0
   device_release_driver_internal+0x18c/0x250
   device_release_driver+0x12/0x20
   pci_stop_bus_device+0x6f/0x90
   pci_stop_bus_device+0x31/0x90
   pci_stop_and_remove_bus_device+0x12/0x20
   pciehp_unconfigure_device+0x88/0x140
   pciehp_disable_slot+0x6a/0x110
   pciehp_handle_presence_or_link_change+0x263/0x400
   pciehp_ist+0x1c9/0x1d0
   irq_thread_fn+0x24/0x60
   irq_thread+0xeb/0x190
   kthread+0x120/0x140

  INFO: task irq/190-pciehp:2288 blocked for more than 120 seconds.
  irq/190-pciehp  D    0  2288      2 0x80000000
  Call Trace:
   __schedule+0x2a2/0x880
   schedule+0x2c/0x80
   schedule_preempt_disabled+0xe/0x10
   mutex_lock+0x2c/0x30
   pci_lock_rescan_remove+0x15/0x20
   pciehp_unconfigure_device+0x4d/0x140
   pciehp_disable_slot+0x6a/0x110
   pciehp_handle_presence_or_link_change+0x263/0x400
   pciehp_ist+0x1c9/0x1d0
   irq_thread_fn+0x24/0x60
   irq_thread+0xeb/0x190
   kthread+0x120/0x140

What happens here is that the whole hierarchy is runtime resumed and the
parent PCIe downstream port, which got the hot-remove event, starts
removing devices below it, taking pci_lock_rescan_remove() lock. When the
child PCIe port is runtime resumed it calls pciehp_check_presence() which
ends up calling pciehp_card_present() and pciehp_check_link_active().  Both
of these use pcie_capability_read_word(), which notices that the underlying
device is already gone and returns PCIBIOS_DEVICE_NOT_FOUND with the
capability value set to 0. When pciehp gets this value it thinks that its
child device is also hot-removed and schedules its IRQ thread to handle the
event.

The deadlock happens when the child's IRQ thread runs and tries to acquire
pci_lock_rescan_remove() which is already taken by the parent and the
parent waits for the child's IRQ thread to finish.

Prevent this from happening by checking the return value of
pcie_capability_read_word() and if it is PCIBIOS_DEVICE_NOT_FOUND stop
performing any hot-removal activities.

[bhelgaas: add common scenarios to commit log]
Link: https://lore.kernel.org/r/20191029170022.57528-2-mika.westerberg@linux.intel.com
Tested-by: Kai-Heng Feng &lt;kai.heng.feng@canonical.com&gt;
Signed-off-by: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Do not disable interrupt twice on suspend</title>
<updated>2019-11-12T22:27:33Z</updated>
<author>
<name>Mika Westerberg</name>
<email>mika.westerberg@linux.intel.com</email>
</author>
<published>2019-10-29T17:00:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=75fcc0ce72e5cea2e357cdde858216c5bad40442'/>
<id>urn:sha1:75fcc0ce72e5cea2e357cdde858216c5bad40442</id>
<content type='text'>
We try to keep PCIe hotplug ports runtime suspended when entering system
suspend. Because the PCIe portdrv sets the DPM_FLAG_NEVER_SKIP flag, the PM
core always calls system suspend/resume hooks even if the device is left
runtime suspended. Since PCIe hotplug driver re-used the same function for
both runtime suspend and system suspend, it ended up disabling hotplug
interrupt twice and the second time following was printed:

  pciehp 0000:03:01.0:pcie204: pcie_do_write_cmd: no response from device

Prevent this from happening by checking whether the device is already
runtime suspended when the system suspend hook is called.

Fixes: 9c62f0bfb832 ("PCI: pciehp: Implement runtime PM callbacks")
Link: https://lore.kernel.org/r/20191029170022.57528-1-mika.westerberg@linux.intel.com
Reported-by: Kai-Heng Feng &lt;kai.heng.feng@canonical.com&gt;
Tested-by: Kai-Heng Feng &lt;kai.heng.feng@canonical.com&gt;
Signed-off-by: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Remove pciehp_set_attention_status()</title>
<updated>2019-09-05T20:44:08Z</updated>
<author>
<name>Denis Efremov</name>
<email>efremov@linux.com</email>
</author>
<published>2019-09-03T11:10:20Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=106feb2fdced0c240ca8ecb3d5db91b2b64e9a48'/>
<id>urn:sha1:106feb2fdced0c240ca8ecb3d5db91b2b64e9a48</id>
<content type='text'>
Remove pciehp_set_attention_status() and use pciehp_set_indicators()
instead, since the code is mostly the same.

Link: https://lore.kernel.org/r/20190903111021.1559-4-efremov@linux.com
Signed-off-by: Denis Efremov &lt;efremov@linux.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Kuppuswamy Sathyanarayanan &lt;sathyanarayanan.kuppuswamy@linux.intel.com&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition</title>
<updated>2019-05-09T21:45:21Z</updated>
<author>
<name>Bjorn Helgaas</name>
<email>bhelgaas@google.com</email>
</author>
<published>2019-05-08T20:23:39Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b498b6872da1143d67ed588d21d3c3c496a987ee'/>
<id>urn:sha1:b498b6872da1143d67ed588d21d3c3c496a987ee</id>
<content type='text'>
PCIE_MODULE_NAME is only used once and offers no benefit, so remove it.

Link: https://lore.kernel.org/lkml/20190509141456.223614-10-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Andy Shevchenko &lt;andy.shevchenko@gmail.com&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Remove unused dbg/err/info/warn() wrappers</title>
<updated>2019-05-09T21:45:20Z</updated>
<author>
<name>Frederick Lawler</name>
<email>fred@fredlawl.com</email>
</author>
<published>2019-05-07T23:24:53Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=742ee16bc31f42b034a9b3b01731fd6dc20d0b19'/>
<id>urn:sha1:742ee16bc31f42b034a9b3b01731fd6dc20d0b19</id>
<content type='text'>
Replace the last uses of dbg() with the equivalent pr_debug(), then remove
unused dbg(), err(), info(), and warn() wrappers.

Link: https://lore.kernel.org/lkml/20190509141456.223614-9-helgaas@kernel.org
Signed-off-by: Frederick Lawler &lt;fred@fredlawl.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Andy Shevchenko &lt;andy.shevchenko@gmail.com&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Log messages with pci_dev, not pcie_device</title>
<updated>2019-05-09T21:45:20Z</updated>
<author>
<name>Frederick Lawler</name>
<email>fred@fredlawl.com</email>
</author>
<published>2019-05-07T23:24:51Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=94dbc9562edc745d0549f1744ca1bd75e644473e'/>
<id>urn:sha1:94dbc9562edc745d0549f1744ca1bd75e644473e</id>
<content type='text'>
Log messages with pci_dev, not pcie_device.  Factor out common message
prefixes with dev_fmt().

Example output change:

  - pciehp 0000:00:06.0:pcie004: Slot(0) Powering on due to button press
  + pcieport 0000:00:06.0: pciehp: Slot(0) Powering on due to button press

Link: https://lore.kernel.org/lkml/20190509141456.223614-8-helgaas@kernel.org
Signed-off-by: Frederick Lawler &lt;fred@fredlawl.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Andy Shevchenko &lt;andy.shevchenko@gmail.com&gt;
</content>
</entry>
</feed>
