summaryrefslogtreecommitdiff
path: root/drivers/pci
AgeCommit message (Collapse)Author
2026-03-13PCI: Use resource_set_range() that correctly sets ->endIlpo Järvinen
[ Upstream commit 11721c45a8266a9d0c9684153d20e37159465f96 ] __pci_read_base() sets resource start and end addresses when resource is larger than 4G but pci_bus_addr_t or resource_size_t are not capable of representing 64-bit PCI addresses. This creates a problematic resource that has non-zero flags but the start and end addresses do not yield to resource size of 0 but 1. Replace custom resource addresses setup with resource_set_range() that correctly sets end address as -1 which results in resource_size() returning 0. For consistency, also use resource_set_range() in the other branch that does size based resource setup. Fixes: 23b13bc76f35 ("PCI: Fail safely if we can't handle BARs larger than 4GB") Link: https://lore.kernel.org/all/20251207215359.28895-1-ansuelsmth@gmail.com/T/#m990492684913c5a158ff0e5fc90697d8ad95351b Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Cc: stable@vger.kernel.org Cc: Christian Marangi <ansuelsmth@gmail.com> Link: https://patch.msgid.link/20251208145654.5294-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-13Revert "PCI: qcom: Don't wait for link if we can detect Link Up"Niklas Cassel
[ Upstream commit e9ce5b3804436301ab343bc14203a4c14b336d1b ] This reverts commit 36971d6c5a9a134c15760ae9fd13c6d5f9a36abb. While this fake hotplugging was a nice idea, it has shown that this feature does not handle PCIe switches correctly: pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43 pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44 pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45 pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46 pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46 pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41]) pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46 During the initial scan, PCI core doesn't see the switch and since the Root Port is not hot plug capable, the secondary bus number gets assigned as the subordinate bus number. This means, the PCI core assumes that only one bus will appear behind the Root Port since the Root Port is not hot plug capable. This works perfectly fine for PCIe endpoints connected to the Root Port, since they don't extend the bus. However, if a PCIe switch is connected, then there is a problem when the downstream busses starts showing up and the PCI core doesn't extend the subordinate bus number and bridge resources after initial scan during boot. The long term plan is to migrate this driver to the upcoming pwrctrl APIs that are supposed to handle this problem elegantly. Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251222064207.3246632-11-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-13PCI: qcom: Don't wait for link if we can detect Link UpKrishna chaitanya chundru
[ Upstream commit 36971d6c5a9a134c15760ae9fd13c6d5f9a36abb ] If we have a 'global' IRQ for Link Up events, we need not wait for the link to be up during PCI initialization, which reduces startup time. Check for 'global' IRQ, and if present, set 'use_linkup_irq', so dw_pcie_host_init() doesn't wait for the link to come up. Link: https://lore.kernel.org/r/20241123-remove_wait2-v5-2-b5f9e6b794c2@quicinc.com Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Stable-dep-of: e9ce5b380443 ("Revert "PCI: qcom: Don't wait for link if we can detect Link Up"") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-13Revert "PCI: dw-rockchip: Don't wait for link since we can detect Link Up"Niklas Cassel
[ Upstream commit fc6298086bfacaa7003b0bd1da4e4f42b29f7d77 ] This reverts commit ec9fd499b9c60a187ac8d6414c3c343c77d32e42. While this fake hotplugging was a nice idea, it has shown that this feature does not handle PCIe switches correctly: pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43 pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44 pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45 pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46 pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46 pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41]) pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46 During the initial scan, PCI core doesn't see the switch and since the Root Port is not hot plug capable, the secondary bus number gets assigned as the subordinate bus number. This means, the PCI core assumes that only one bus will appear behind the Root Port since the Root Port is not hot plug capable. This works perfectly fine for PCIe endpoints connected to the Root Port, since they don't extend the bus. However, if a PCIe switch is connected, then there is a problem when the downstream busses starts showing up and the PCI core doesn't extend the subordinate bus number and bridge resources after initial scan during boot. The long term plan is to migrate this driver to the upcoming pwrctrl APIs that are supposed to handle this problem elegantly. Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251222064207.3246632-9-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-13PCI: dw-rockchip: Don't wait for link since we can detect Link UpNiklas Cassel
[ Upstream commit ec9fd499b9c60a187ac8d6414c3c343c77d32e42 ] The Root Complex specific device tree binding for pcie-dw-rockchip has the 'sys' interrupt marked as required. The driver requests the 'sys' IRQ unconditionally, and errors out if not provided. Thus, we can unconditionally set 'use_linkup_irq', so dw_pcie_host_init() doesn't wait for the link to come up. This will skip the wait for link up (since the bus will be enumerated once the link up IRQ is triggered), which reduces the bootup time. Link: https://lore.kernel.org/r/20250113-rockchip-no-wait-v1-1-25417f37b92f@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Stable-dep-of: fc6298086bfa ("Revert "PCI: dw-rockchip: Don't wait for link since we can detect Link Up"") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-13PCI: dwc: ep: Flush MSI-X write before unmapping its ATU entryNiklas Cassel
[ Upstream commit c22533c66ccae10511ad6a7afc34bb26c47577e3 ] Endpoint drivers use dw_pcie_ep_raise_msix_irq() to raise an MSI-X interrupt to the host using a writel(), which generates a PCI posted write transaction. There's no completion for posted writes, so the writel() may return before the PCI write completes. dw_pcie_ep_raise_msix_irq() also unmaps the outbound ATU entry used for the PCI write, so the write races with the unmap. If the PCI write loses the race with the ATU unmap, the write may corrupt host memory or cause IOMMU errors, e.g., these when running fio with a larger queue depth against nvmet-pci-epf: arm-smmu-v3 fc900000.iommu: 0x0000010000000010 arm-smmu-v3 fc900000.iommu: 0x0000020000000000 arm-smmu-v3 fc900000.iommu: 0x000000090000f040 arm-smmu-v3 fc900000.iommu: 0x0000000000000000 arm-smmu-v3 fc900000.iommu: event: F_TRANSLATION client: 0000:01:00.0 sid: 0x100 ssid: 0x0 iova: 0x90000f040 ipa: 0x0 arm-smmu-v3 fc900000.iommu: unpriv data write s1 "Input address caused fault" stag: 0x0 Flush the write by performing a readl() of the same address to ensure that the write has reached the destination before the ATU entry is unmapped. The same problem was solved for dw_pcie_ep_raise_msi_irq() in commit 8719c64e76bf ("PCI: dwc: ep: Cache MSI outbound iATU mapping"), but there it was solved by dedicating an outbound iATU only for MSI. We can't do the same for MSI-X because each vector can have a different msg_addr and the msg_addr may be changed while the vector is masked. Fixes: beb4641a787d ("PCI: dwc: Add MSI-X callbacks handler") Signed-off-by: Niklas Cassel <cassel@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260211175540.105677-2-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-13PCI: dwc: ep: Use align addr function for dw_pcie_ep_raise_{msi,msix}_irq()Niklas Cassel
[ Upstream commit 3fafc38b77bebeeea5faa2a588b92353775bb390 ] Use the dw_pcie_ep_align_addr() function to calculate the alignment in dw_pcie_ep_raise_{msi,msix}_irq() instead of open coding the same. Link: https://lore.kernel.org/r/20241017132052.4014605-6-cassel@kernel.org Link: https://lore.kernel.org/r/20241104205144.409236-2-cassel@kernel.org Tested-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> [kwilczynski: squashed patch that fixes memory map sizes] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Stable-dep-of: c22533c66cca ("PCI: dwc: ep: Flush MSI-X write before unmapping its ATU entry") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-13PCI: dwc: endpoint: Implement the pci_epc_ops::align_addr() operationDamien Le Moal
[ Upstream commit e73ea1c2d4d8f7ba5daaf7aa51171f63cf79bcd8 ] The function dw_pcie_prog_outbound_atu() used to program outbound ATU entries for mapping RC PCI addresses to local CPU addresses does not allow PCI addresses that are not aligned to the value of region_align of struct dw_pcie. This value is determined from the iATU hardware registers during probing of the iATU (done by dw_pcie_iatu_detect()). This value is thus valid for all DWC PCIe controllers, and valid regardless of the hardware configuration used when synthesizing the DWC PCIe controller. Implement the ->align_addr() endpoint controller operation to allow this mapping alignment to be transparently handled by endpoint function drivers through the function pci_epc_mem_map(). Link: https://lore.kernel.org/linux-pci/20241012113246.95634-7-dlemoal@kernel.org Link: https://lore.kernel.org/linux-pci/20241015090712.112674-1-dlemoal@kernel.org Link: https://lore.kernel.org/linux-pci/20241017132052.4014605-5-cassel@kernel.org Co-developed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> [mani: squashed the patch that changed phy_addr_t to u64] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> [kwilczynski: squashed patch that updated the pci_size variable] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Stable-dep-of: c22533c66cca ("PCI: dwc: ep: Flush MSI-X write before unmapping its ATU entry") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-13PCI: endpoint: Introduce pci_epc_mem_map()/unmap()Damien Le Moal
[ Upstream commit ce1dfe6d328966b75821c1f043a940eb2569768a ] Some endpoint controllers have requirements on the alignment of the controller physical memory address that must be used to map a RC PCI address region. For instance, the endpoint controller of the RK3399 SoC uses at most the lower 20 bits of a physical memory address region as the lower bits of a RC PCI address region. For mapping a PCI address region of size bytes starting from pci_addr, the exact number of address bits used is the number of address bits changing in the address range [pci_addr..pci_addr + size - 1]. For this example, this creates the following constraints: 1) The offset into the controller physical memory allocated for a mapping depends on the mapping size *and* the starting PCI address for the mapping. 2) A mapping size cannot exceed the controller windows size (1MB) minus the offset needed into the allocated physical memory, which can end up being a smaller size than the desired mapping size. Handling these constraints independently of the controller being used in an endpoint function driver is not possible with the current EPC API as only the ->align field in struct pci_epc_features is provided but used for BAR (inbound ATU mappings) mapping only. A new API is needed for function drivers to discover mapping constraints and handle non-static requirements based on the RC PCI address range to access. Introduce the endpoint controller operation ->align_addr() to allow the EPC core functions to obtain the size and the offset into a controller address region that must be allocated and mapped to access a RC PCI address region. The size of the mapping provided by the align_addr() operation can then be used as the size argument for the function pci_epc_mem_alloc_addr() and the offset into the allocated controller memory provided can be used to correctly handle data transfers. For endpoint controllers that have PCI address alignment constraints, the align_addr() operation may indicate upon return an effective PCI address mapping size that is smaller (but not 0) than the requested PCI address region size. The controller ->align_addr() operation is optional: controllers that do not have any alignment constraints for mapping RC PCI address regions do not need to implement this operation. For such controllers, it is always assumed that the mapping size is equal to the requested size of the PCI region and that the mapping offset is 0. The function pci_epc_mem_map() is introduced to use this new controller operation (if it is defined) to handle controller memory allocation and mapping to a RC PCI address region in endpoint function drivers. This function first uses the ->align_addr() controller operation to determine the controller memory address size (and offset into) needed for mapping an RC PCI address region. The result of this operation is used to allocate a controller physical memory region using pci_epc_mem_alloc_addr() and then to map that memory to the RC PCI address space with pci_epc_map_addr(). Since ->align_addr() () may indicate that not all of a RC PCI address region can be mapped, pci_epc_mem_map() may only partially map the RC PCI address region specified. It is the responsibility of the caller (an endpoint function driver) to handle such smaller mapping by repeatedly using pci_epc_mem_map() over the desried PCI address range. The counterpart of pci_epc_mem_map() to unmap and free a mapped controller memory address region is pci_epc_mem_unmap(). Both functions operate using the new struct pci_epc_map data structure. This new structure represents a mapping PCI address, mapping effective size, the size of the controller memory needed for the mapping as well as the physical and virtual CPU addresses of the mapping (phys_base and virt_base fields). For convenience, the physical and virtual CPU addresses within that mapping to use to access the target RC PCI address region are also provided (phys_addr and virt_addr fields). Endpoint function drivers can use struct pci_epc_map to access the mapped RC PCI address region using the ->virt_addr and ->pci_size fields. Co-developed-by: Rick Wertenbroek <rick.wertenbroek@gmail.com> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241012113246.95634-4-dlemoal@kernel.org [mani: squashed the patch that changed phy_addr_t to u64] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Stable-dep-of: c22533c66cca ("PCI: dwc: ep: Flush MSI-X write before unmapping its ATU entry") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-13PCI: endpoint: Introduce pci_epc_function_is_valid()Damien Le Moal
[ Upstream commit ca3c342fb3c76eee739a1cfc4ff59841722ebee7 ] Introduce the epc core helper function pci_epc_function_is_valid() to verify that an epc pointer, a physical function number and a virtual function number are all valid. This avoids repeating the code pattern: if (IS_ERR_OR_NULL(epc) || func_no >= epc->max_functions) return err; if (vfunc_no > 0 && (!epc->max_vfs || vfunc_no > epc->max_vfs[func_no])) return err; in many functions of the endpoint controller core code. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://lore.kernel.org/r/20241012113246.95634-2-dlemoal@kernel.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Stable-dep-of: c22533c66cca ("PCI: dwc: ep: Flush MSI-X write before unmapping its ATU entry") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Fix pci_slot_trylock() error handlingJinhui Guo
[ Upstream commit 9368d1ee62829b08aa31836b3ca003803caf0b72 ] Commit a4e772898f8b ("PCI: Add missing bridge lock to pci_bus_lock()") delegates the bridge device's pci_dev_trylock() to pci_bus_trylock() in pci_slot_trylock(), but it forgets to remove the corresponding pci_dev_unlock() when pci_bus_trylock() fails. Before a4e772898f8b, the code did: if (!pci_dev_trylock(dev)) /* <- lock bridge device */ goto unlock; if (dev->subordinate) { if (!pci_bus_trylock(dev->subordinate)) { pci_dev_unlock(dev); /* <- unlock bridge device */ goto unlock; } } After a4e772898f8b the bridge-device lock is no longer taken, but the pci_dev_unlock(dev) on the failure path was left in place, leading to the bug. This yields one of two errors: 1. A warning that the lock is being unlocked when no one holds it. 2. An incorrect unlock of a lock that belongs to another thread. Fix it by removing the now-redundant pci_dev_unlock(dev) on the failure path. [Same patch later posted by Keith at https://patch.msgid.link/20260116184150.3013258-1-kbusch@meta.com] Fixes: a4e772898f8b ("PCI: Add missing bridge lock to pci_bus_lock()") Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251212145528.2555-1-guojinhui.liam@bytedance.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: dwc: Fix msg_atu_index assignmentNiklas Cassel
[ Upstream commit 58fbf08935d9c4396417e5887df89a4e681fa7e3 ] When dw_pcie_iatu_setup() configures outbound address translation for both type PCIE_ATU_TYPE_MEM and PCIE_ATU_TYPE_IO, the iATU index to use is incremented before calling dw_pcie_prog_outbound_atu(). However for msg_atu_index, the index is not incremented before use, causing the iATU index to be the same as the last configured iATU index, which means that it will incorrectly use the same iATU index that is already in use, breaking outbound address translation. In total there are three problems with this code: -It assigns msg_atu_index the same index that was used for the last outbound address translation window, rather than incrementing the index before assignment. -The index should only be incremented (and msg_atu_index assigned) if the use_atu_msg feature is actually requested/in use (pp->use_atu_msg is set). -If the use_atu_msg feature is requested/in use, and there are no outbound iATUs available, the code should return an error, as otherwise when this this feature is used, it will use an iATU index that is out of bounds. Fixes: e1a4ec1a9520 ("PCI: dwc: Add generic MSG TLP support for sending PME_Turn_Off when system suspend") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Maciej W. Rozycki <macro@orcam.me.uk> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hans Zhang <zhanghuabing@ecosda.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260127151038.1484881-6-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/IOV: Fix race between SR-IOV enable/disable and hotplugNiklas Schnelle
[ Upstream commit a5338e365c4559d7b4d7356116b0eb95b12e08d5 ] Commit 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV") tried to fix a race between the VF removal inside sriov_del_vfs() and concurrent hot unplug by taking the PCI rescan/remove lock in sriov_del_vfs(). Similarly the PCI rescan/remove lock was also taken in sriov_add_vfs() to protect addition of VFs. This approach however causes deadlock on trying to remove PFs with SR-IOV enabled because PFs disable SR-IOV during removal and this removal happens under the PCI rescan/remove lock. So the original fix had to be reverted. Instead of taking the PCI rescan/remove lock in sriov_add_vfs() and sriov_del_vfs(), fix the race that occurs with SR-IOV enable and disable vs hotplug higher up in the callchain by taking the lock in sriov_numvfs_store() before calling into the driver's sriov_configure() callback. Fixes: 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV") Reported-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251216-revert_sriov_lock-v3-2-dac4925a7621@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04Revert "PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV"Niklas Schnelle
[ Upstream commit 2fa119c0e5e528453ebae9e70740e8d2d8c0ed5a ] This reverts commit 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV"), which causes a deadlock by recursively taking pci_rescan_remove_lock when sriov_del_vfs() is called as part of pci_stop_and_remove_bus_device(). For example with the following sequence of commands: $ echo <NUM> > /sys/bus/pci/devices/<pf>/sriov_numvfs $ echo 1 > /sys/bus/pci/devices/<pf>/remove A trimmed trace of the deadlock on a mlx5 device is as below: zsh/5715 is trying to acquire lock: 000002597926ef50 (pci_rescan_remove_lock){+.+.}-{3:3}, at: sriov_disable+0x34/0x140 but task is already holding lock: 000002597926ef50 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pci_stop_and_remove_bus_device_locked+0x24/0x80 ... Call Trace: [<00000259778c4f90>] dump_stack_lvl+0xc0/0x110 [<00000259779c844e>] print_deadlock_bug+0x31e/0x330 [<00000259779c1908>] __lock_acquire+0x16c8/0x32f0 [<00000259779bffac>] lock_acquire+0x14c/0x350 [<00000259789643a6>] __mutex_lock_common+0xe6/0x1520 [<000002597896413c>] mutex_lock_nested+0x3c/0x50 [<00000259784a07e4>] sriov_disable+0x34/0x140 [<00000258f7d6dd80>] mlx5_sriov_disable+0x50/0x80 [mlx5_core] [<00000258f7d5745e>] remove_one+0x5e/0xf0 [mlx5_core] [<00000259784857fc>] pci_device_remove+0x3c/0xa0 [<000002597851012e>] device_release_driver_internal+0x18e/0x280 [<000002597847ae22>] pci_stop_bus_device+0x82/0xa0 [<000002597847afce>] pci_stop_and_remove_bus_device_locked+0x5e/0x80 [<00000259784972c2>] remove_store+0x72/0x90 [<0000025977e6661a>] kernfs_fop_write_iter+0x15a/0x200 [<0000025977d7241c>] vfs_write+0x24c/0x300 [<0000025977d72696>] ksys_write+0x86/0x110 [<000002597895b61c>] __do_syscall+0x14c/0x400 [<000002597896e0ee>] system_call+0x6e/0x90 This alone is not a complete fix as it restores the issue the cited commit tried to solve. A new fix will be provided as a follow on. Fixes: 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV") Reported-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Acked-by: Gerd Bayer <gbayer@linux.ibm.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251216-revert_sriov_lock-v3-1-dac4925a7621@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: endpoint: Fix swapped parameters in ↵Manikanta Maddireddy
pci_{primary/secondary}_epc_epf_unlink() functions [ Upstream commit 8754dd7639ab0fd68c3ab9d91c7bdecc3e5740a8 ] struct configfs_item_operations callbacks are defined like the following: int (*allow_link)(struct config_item *src, struct config_item *target); void (*drop_link)(struct config_item *src, struct config_item *target); While pci_primary_epc_epf_link() and pci_secondary_epc_epf_link() specify the parameters in the correct order, pci_primary_epc_epf_unlink() and pci_secondary_epc_epf_unlink() specify the parameters in the wrong order, leading to the below kernel crash when using the unlink command in configfs: Unable to handle kernel paging request at virtual address 0000000300000857 Mem abort info: ... pc : string+0x54/0x14c lr : vsnprintf+0x280/0x6e8 ... string+0x54/0x14c vsnprintf+0x280/0x6e8 vprintk_default+0x38/0x4c vprintk+0xc4/0xe0 pci_epf_unbind+0xdc/0x108 configfs_unlink+0xe0/0x208+0x44/0x74 vfs_unlink+0x120/0x29c __arm64_sys_unlinkat+0x3c/0x90 invoke_syscall+0x48/0x134 do_el0_svc+0x1c/0x30prop.0+0xd0/0xf0 Fixes: e85a2d783762 ("PCI: endpoint: Add support in configfs to associate two EPCs with EPF") Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> [mani: cced stable, changed commit message as per https://lore.kernel.org/linux-pci/aV9joi3jF1R6ca02@ryzen] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260108062747.1870669-1-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04Revert "PCI: qcom: Enable MSI interrupts together with Link up if 'Global ↵Niklas Cassel
IRQ' is supported" [ Upstream commit 7ebdefb87942073679e56cfbc5a72e8fc5441bfc ] This reverts commit ba4a2e2317b9faeca9193ed6d3193ddc3cf2aba3. Since the Link up IRQ support is going away, revert the MSI logic that got added for it too. Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> [mani: reworded the description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251222064207.3246632-12-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Mark Nvidia GB10 to avoid bus resetJohnny-CC Chang
[ Upstream commit c81a2ce6b6a844d1a57d2a69833a9d0f00403f00 ] After asserting Secondary Bus Reset to downstream devices via a GB10 Root Port, the link may not retrain correctly, e.g., the link may retrain with a lower lane count or config accesses to downstream devices may fail. Prevent use of Secondary Bus Reset for devices below GB10. Signed-off-by: Johnny-CC Chang <Johnny-CC.Chang@mediatek.com> [bhelgaas: drop pci_ids.h update (only used once), update commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20251113084441.2124737-1-Johnny-CC.Chang@mediatek.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Add ACS quirk for Qualcomm Hamoa & GlymurKrishna Chaitanya Chundru
[ Upstream commit 44d2f70b1fd72c339c72983fcffa181beae3e113 ] The Qualcomm Hamoa & Glymur Root Ports don't advertise an ACS capability, but they do provide ACS-like features to disable peer transactions and validate bus numbers in requests. Add an ACS quirk for Hamoa & Glymur. Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260109-acs_quirk-v1-1-82adf95a89ae@oss.qualcomm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Enable ACS after configuring IOMMU for OF platformsManivannan Sadhasivam
[ Upstream commit c41e2fb67e26b04d919257875fa954aa5f6e392e ] Platform, ACPI, or IOMMU drivers call pci_request_acs(), which sets 'pci_acs_enable' to request that ACS be enabled for any devices enumerated in the future. OF platforms called pci_enable_acs() for the first device before of_iommu_configure() called pci_request_acs(), so ACS was never enabled for that device (typically a Root Port). Call pci_enable_acs() later, from pci_dma_configure(), after of_dma_configure() has had a chance to call pci_request_acs(). Here's the call path, showing the move of pci_enable_acs() from pci_acs_init() to pci_dma_configure(), where it always happens after pci_request_acs(): pci_device_add pci_init_capabilities pci_acs_init - pci_enable_acs - if (pci_acs_enable) <-- previous test - ... device_add bus_notify(BUS_NOTIFY_ADD_DEVICE) iommu_bus_notifier iommu_probe_device iommu_init_device dev->bus->dma_configure pci_dma_configure # pci_bus_type.dma_configure of_dma_configure of_iommu_configure pci_request_acs pci_acs_enable = 1 <-- set + pci_enable_acs + if (pci_acs_enable) <-- new test + ... bus_probe_device device_initial_probe ... really_probe dev->bus->dma_configure pci_dma_configure # pci_bus_type.dma_configure ... pci_enable_acs Note that we will now call pci_enable_acs() twice for every device, first from the iommu_probe_device() path and again from the really_probe() path. Presumably that's not an issue since we also call dev->bus->dma_configure() twice. For the ACPI platforms, pci_request_acs() is called during ACPI initialization time itself, independent of the IOMMU framework. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://patch.msgid.link/20260102-pci_acs-v3-1-72280b94d288@oss.qualcomm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Fix pci_slot_lock () device lockingKeith Busch
[ Upstream commit 1f5e57c622b4dc9b8e7d291d560138d92cfbe5bf ] Like pci_bus_lock(), pci_slot_lock() needs to lock the bridge device to prevent warnings like: pcieport 0000:e2:05.0: unlocked secondary bus reset via: pciehp_reset_slot+0x55/0xa0 Take and release the lock for the bridge providing the slot for the lock/trylock and unlock routines. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260130165953.751063-3-kbusch@meta.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/AER: Clear stale errors on reporting agents upon probeLukas Wunner
[ Upstream commit e242d09b58e869f86071b7889acace4cff215935 ] Correctable and Uncorrectable Error Status Registers on reporting agents are cleared upon PCI device enumeration in pci_aer_init() to flush past events. They're cleared again when an error is handled by the AER driver. If an agent reports a new error after pci_aer_init() and before the AER driver has probed on the corresponding Root Port or Root Complex Event Collector, that error is not handled by the AER driver: It clears the Root Error Status Register on probe, but neglects to re-clear the Correctable and Uncorrectable Error Status Registers on reporting agents. The error will eventually be reported when another error occurs. Which is irritating because to an end user it appears as if the earlier error has just happened. Amend the AER driver to clear stale errors on reporting agents upon probe. Skip reporting agents which have not invoked pci_aer_init() yet to avoid using an uninitialized pdev->aer_cap. They're recognizable by the error bits in the Device Control register still being clear. Reporting agents may execute pci_aer_init() after the AER driver has probed, particularly when devices are hotplugged or removed/rescanned via sysfs. For this reason, it continues to be necessary that pci_aer_init() clears Correctable and Uncorrectable Error Status Registers. Reported-by: Lucas Van <lucas.van@intel.com> # off-list Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Lucas Van <lucas.van@intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/3011c2ed30c11f858e35e29939add754adea7478.1769332702.git.lukas@wunner.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Mark ASM1164 SATA controller to avoid bus resetAlex Williamson
[ Upstream commit beb2f81792a8a619e5122b6b24a374861309c54b ] User forums report issues when assigning ASM1164 SATA controllers to VMs, especially in configurations with multiple controllers. Logs show the device fails to retrain after bus reset. Reports suggest this is an issue across multiple platforms. The device indicates support for PM reset, therefore the device still has a viable function level reset mechanism. The reporting user confirms the device is well behaved in this use case with bus reset disabled. Reported-by: Patrick Bianchi <patrick.w.bianchi@gmail.com> Link: https://forum.proxmox.com/threads/problems-with-pcie-passthrough-with-two-identical-devices.149003/ Signed-off-by: Alex Williamson <alex.williamson@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260109000211.398300-1-alex.williamson@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/MSI: Unmap MSI-X region on errorHaoxiang Li
[ Upstream commit 1a8d4c6ecb4c81261bcdf13556abd4a958eca202 ] msix_capability_init() fails to unmap the MSI-X region if msix_setup_interrupts() fails. Add the missing iounmap() for that error path. [ tglx: Massaged change log ] Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260125144452.2103812-1-lihaoxiang@isrc.iscas.ac.cn Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Add ACS quirk for Pericom PI7C9X2G404 switches [12d8:b404]Nicolas Cavallari
[ Upstream commit 5907a90551e9f7968781f3a6ab8684458959beb3 ] 12d8:b404 is apparently another PCI ID for Pericom PI7C9X2G404 (as identified by the chip silkscreen and lspci). It is also affected by the PI7C9X2G errata (e.g. a network card attached to it fails under load when P2P Redirect Request is enabled), so apply the same quirk to this PCI ID too. PCI bridge [0604]: Pericom Semiconductor PI7C9X2G404 EV/SV PCIe2 4-Port/4-Lane Packet Switch [12d8:b404] (rev 01) Fixes: acd61ffb2f16 ("PCI: Add ACS quirk for Pericom PI7C9X2G switches") Closes: https://lore.kernel.org/all/a1d926f0-4cb5-4877-a4df-617902648d80@green-communications.fr/ Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260119160915.26456-1-nicolas.cavallari@green-communications.fr Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/ACPI: Restrict program_hpx_type2() to AER bitsHåkon Bugge
[ Upstream commit 9abf79c8d7b40db0e5a34aa8c744ea60ff9a3fcf ] Previously program_hpx_type2() applied PCIe settings unconditionally, which could incorrectly change bits like Extended Tag Field Enable and Enable Relaxed Ordering. When _HPX was added to ACPI r3.0, the intent of the PCIe Setting Record (Type 2) in sec 6.2.7.3 was to configure AER registers when the OS does not own the AER Capability: The PCI Express setting record contains ... [the AER] Uncorrectable Error Mask, Uncorrectable Error Severity, Correctable Error Mask ... to be used when configuring registers in the Advanced Error Reporting Extended Capability Structure ... OSPM [1] will only evaluate _HPX with Setting Record – Type 2 if OSPM is not controlling the PCI Express Advanced Error Reporting capability. ACPI r3.0b, sec 6.2.7.3, added more AER registers, including registers in the PCIe Capability with AER-related bits, and the restriction that the OS use this only when it owns PCIe native hotplug: ... when configuring PCI Express registers in the Advanced Error Reporting Extended Capability Structure *or PCI Express Capability Structure* ... An OS that has assumed ownership of native hot plug but does not ... have ownership of the AER register set must use ... the Type 2 record to program the AER registers ... However, since the Type 2 record also includes register bits that have functions other than AER, the OS must ignore values ... that are not applicable. Restrict program_hpx_type2() to only the intended purpose: - Apply settings only when OS owns PCIe native hotplug but not AER, - Only touch the AER-related bits (Error Reporting Enables) in Device Control - Don't touch Link Control at all, since nothing there seems AER-related, but log _HPX settings for debugging purposes Note that Read Completion Boundary is now configured elsewhere, since it is unrelated to _HPX. [1] Operating System-directed configuration and Power Management Fixes: 40abb96c51bb ("[PATCH] pciehp: Fix programming hotplug parameters") Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260129175237.727059-3-haakon.bugge@oracle.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Add defines for bridge window indexingIlpo Järvinen
[ Upstream commit e4934832c588f72bcc139d3ca0acc490c63a821c ] include/linux/pci.h provides PCI_BRIDGE_{IO,MEM,PREF_MEM}_WINDOW defines, however, they're based on the resource array indexing in the pci_dev struct. The struct pci_bus also has pointers to those same resources but they start from zeroth index. Add PCI_BUS_BRIDGE_{IO,MEM,PREF_MEM}_WINDOW defines to get rid of literal indexing. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-12-ilpo.jarvinen@linux.intel.com Stable-dep-of: 9abf79c8d7b4 ("PCI/ACPI: Restrict program_hpx_type2() to AER bits") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Initialize RCB from pci_configure_device()Håkon Bugge
[ Upstream commit 1a6845aaa6de81f95959b380b45de8f10d6a8502 ] Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") worked around a bogus _HPX type 2 record, which caused program_hpx_type2() to set the RCB in an endpoint even though the Root Port did not have the RCB bit set. e42010d8207f fixed that by setting the RCB in the endpoint only when it was set in the Root Port. In retrospect, program_hpx_type2() is intended for AER-related settings, and the RCB should be configured elsewhere so it doesn't depend on the presence or contents of an _HPX record. Explicitly program the RCB from pci_configure_device() so it matches the Root Port's RCB. The Root Port may not be visible to virtualized guests; in that case, leave RCB alone. Fixes: e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260129175237.727059-2-haakon.bugge@oracle.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Check parent for NULL in of_pci_bus_release_domain_nr()Sergey Shtylyov
[ Upstream commit f7245901de8978d829f80b3d8e36ed9a8fd18049 ] of_pci_bus_find_domain_nr() allows its parent parameter to be NULL but of_pci_bus_release_domain_nr() (that undoes its effect) doesn't -- that means it's going to blow up while calling of_get_pci_domain_nr() if the parent parameter indeed happens to be NULL. Add the missing NULL check. Found by Linux Verification Center (linuxtesting.org) with the Svace static analysis tool. Fixes: c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()") Signed-off-by: Sergey Shtylyov <s.shtylyov@auroraos.dev> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260127203944.28588-1-s.shtylyov@auroraos.dev Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Mark 3ware-9650SA Root Port Extended Tags as brokenJörg Wedekind
[ Upstream commit 959ac08a2c2811305be8c2779779e8b0932e5a99 ] Per PCIe r7.0, sec 2.2.6.2.1 and 7.5.3.4, a Requester may not use 8-bit Tags unless its Extended Tag Field Enable is set, but all Receivers/Completers must handle 8-bit Tags correctly regardless of their Extended Tag Field Enable. Some devices do not handle 8-bit Tags as Completers, so add a quirk for them. If we find such a device, we disable Extended Tags for the entire hierarchy to make peer-to-peer DMA possible. The 3ware 9650SA seems to have issues with handling 8-bit tags. Mark it as broken. This fixes PCI Parity Errors like : 3w-9xxx: scsi0: ERROR: (0x06:0x000C): PCI Parity Error: clearing. 3w-9xxx: scsi0: ERROR: (0x06:0x000D): PCI Abort: clearing. 3w-9xxx: scsi0: ERROR: (0x06:0x000E): Controller Queue Error: clearing. 3w-9xxx: scsi0: ERROR: (0x06:0x0010): Microcontroller Error: clearing. Fixes: 60db3a4d8cc9 ("PCI: Enable PCIe Extended Tags if supported") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=202425 Signed-off-by: Jörg Wedekind <joerg@wedekind.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260119143114.21948-1-joerg@wedekind.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/portdrv: Fix potential resource leakUwe Kleine-König
[ Upstream commit 01464a3fdf91c041a381d93a1b6fefbdb819a46f ] pcie_port_probe_service() unconditionally calls get_device() (unless it fails). So drop that reference also unconditionally as it's fine for a PCIe driver to not have a remove callback. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/e1c68c3b3f1af8427e98ca5e2c79f8bf0ebe2ce4.1764688034.git.u.kleine-koenig@baylibre.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Do not attempt to set ExtTag for VFsHåkon Bugge
[ Upstream commit 73711730a1128d91ebca1a6994ceeb18f36cb0cd ] The bit for enabling extended tags is Reserved and Preserved (RsvdP) for VFs, according to PCIe r7.0 section 7.5.3.4 table 7.21. Hence, bail out early from pci_configure_extended_tags() if the device is a VF. Otherwise, we may see incorrect log messages such as: kernel: pci 0000:af:00.2: enabling Extended Tags (af:00.2 is a VF) Fixes: 60db3a4d8cc9 ("PCI: Enable PCIe Extended Tags if supported") Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://patch.msgid.link/20251112095442.1913258-1-haakon.bugge@oracle.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/P2PDMA: Release per-CPU pgmap ref when vm_insert_page() failsHou Tao
[ Upstream commit 6220694c52a5a04102b48109e4f24e958b559bd3 ] When vm_insert_page() fails in p2pmem_alloc_mmap(), p2pmem_alloc_mmap() doesn't invoke percpu_ref_put() to free the per-CPU ref of pgmap acquired after gen_pool_alloc_owner(), and memunmap_pages() will hang forever when trying to remove the PCI device. Fix it by adding the missed percpu_ref_put(). Fixes: 7e9c7ef83d78 ("PCI/P2PDMA: Allow userspace VMA allocations through sysfs") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Link: https://patch.msgid.link/20251220040446.274991-2-houtao@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/PM: Avoid redundant delays on D3hot->D3coldBrian Norris
[ Upstream commit 4d982084507d663df160546c4c48066a8887ed89 ] When transitioning to D3cold, __pci_set_power_state() first transitions to D3hot. If the device was already in D3hot, this adds excess work: (a) read/modify/write PMCSR; and (b) excess delay (pci_dev_d3_sleep()). For (b), we already performed the necessary delay on the previous D3hot entry; this was extra noticeable when evaluating runtime PM transition latency. Check whether we're already in the target state before continuing. Note that __pci_set_power_state() already does this same check for other state transitions, but D3cold is special because __pci_set_power_state() converts it to D3hot for the purposes of PMCSR. This seems to be an oversight in commit 0aacdc957401 ("PCI/PM: Clean up pci_set_low_power_state()"). Fixes: 0aacdc957401 ("PCI/PM: Clean up pci_set_low_power_state()") Signed-off-by: Brian Norris <briannorris@google.com> Signed-off-by: Brian Norris <briannorris@chromium.org> [bhelgaas: reverse test to match other "dev->current_state == state" cases] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251003154008.1.I7a21c240b30062c66471329567a96dceb6274358@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: mediatek: Fix IRQ domain leak when MSI allocation failsHaotian Zhang
[ Upstream commit 7f0cdcddf8bef1c8c18f9be6708073fd3790a20f ] In mtk_pcie_init_irq_domain(), if mtk_pcie_allocate_msi_domains() fails after port->irq_domain has been successfully created via irq_domain_create_linear(), the function returns directly without cleaning up the allocated IRQ domain, resulting in a resource leak. Add irq_domain_remove() call in the error path to properly release the INTx IRQ domain before returning the error. Fixes: 43e6409db64d ("PCI: mediatek: Add MSI support for MT2712 and MT7622") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20251119023308.476-1-vulab@iscas.ac.cn Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-16PCI: endpoint: Avoid creating sub-groups asynchronouslyLiu Song
commit 7c5c7d06bd1f86d2c3ebe62be903a4ba42db4d2c upstream. The asynchronous creation of sub-groups by a delayed work could lead to a NULL pointer dereference when the driver directory is removed before the work completes. The crash can be easily reproduced with the following commands: # cd /sys/kernel/config/pci_ep/functions/pci_epf_test # for i in {1..20}; do mkdir test && rmdir test; done BUG: kernel NULL pointer dereference, address: 0000000000000088 ... Call Trace: configfs_register_group+0x3d/0x190 pci_epf_cfs_work+0x41/0x110 process_one_work+0x18f/0x350 worker_thread+0x25a/0x3a0 Fix this issue by using configfs_add_default_group() API which does not have the deadlock problem as configfs_register_group() and does not require the delayed work handler. Fixes: e85a2d783762 ("PCI: endpoint: Add support in configfs to associate two EPCs with EPF") Signed-off-by: Liu Song <liu.song13@zte.com.cn> [mani: slightly reworded the description and added stable list] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@kernel.org Link: https://patch.msgid.link/20250710143845409gLM6JdlwPhlHG9iX3F6jK@zte.com.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-11PCI: qcom: Remove ASPM L0s support for MSM8996 SoCManivannan Sadhasivam
[ Upstream commit 0cc13256b60510936c34098ee7b929098eed823b ] Though I couldn't confirm ASPM L0s support with the Qcom hardware team, a bug report from Dmitry suggests that L0s is broken on this legacy SoC. Hence, remove L0s support from the Root Port Link Capabilities in this SoC. Since qcom_pcie_clear_aspm_l0s() is now used by more than one SoC config, call it from qcom_pcie_host_init() instead. Reported-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Closes: https://lore.kernel.org/linux-pci/4cp5pzmlkkht2ni7us6p3edidnk25l45xrp6w3fxguqcvhq2id@wjqqrdpkypkf Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://patch.msgid.link/20251126081718.8239-1-mani@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-11PCI/ERR: Ensure error recoverability at all timesLukas Wunner
commit a2f1e22390ac2ca7ac8d77aa0f78c068b6dd2208 upstream. When the PCI core gained power management support in 2002, it introduced pci_save_state() and pci_restore_state() helpers to restore Config Space after a D3hot or D3cold transition, which implies a Soft or Fundamental Reset (PCIe r7.0 sec 5.8): https://git.kernel.org/tglx/history/c/a5287abe398b In 2006, EEH and AER were introduced to recover from errors by performing a reset. Because errors can occur at any time, drivers began calling pci_save_state() on probe to ensure recoverability. In 2009, recoverability was foiled by commit c82f63e411f1 ("PCI: check saved state before restore"): It amended pci_restore_state() to bail out if the "state_saved" flag has been cleared. The flag is cleared by pci_restore_state() itself, hence a saved state is now allowed to be restored only once and is then invalidated. That doesn't seem to make sense because the saved state should be good enough to be reused. Soon after, drivers began to work around this behavior by calling pci_save_state() immediately after pci_restore_state(), see e.g. commit b94f2d775a71 ("igb: call pci_save_state after pci_restore_state"). Hilariously, two drivers even set the "saved_state" flag to true before invoking pci_restore_state(), see ipr_reset_restore_cfg_space() and e1000_io_slot_reset(). Despite these workarounds, recoverability at all times is not guaranteed: E.g. when a PCIe port goes through a runtime suspend and resume cycle, the "saved_state" flag is cleared by: pci_pm_runtime_resume() pci_pm_default_resume_early() pci_restore_state() ... and hence on a subsequent AER event, the port's Config Space cannot be restored. Riana reports a recovery failure of a GPU-integrated PCIe switch and has root-caused it to the behavior of pci_restore_state(). Another workaround would be necessary, namely calling pci_save_state() in pcie_port_device_runtime_resume(). The motivation of commit c82f63e411f1 was to prevent restoring state if pci_save_state() hasn't been called before. But that can be achieved by saving state already on device addition, after Config Space has been initialized. A desirable side effect is that devices become recoverable even if no driver gets bound. This renders the commit unnecessary, so revert it. Reported-by: Riana Tauro <riana.tauro@intel.com> # off-list Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Link: https://patch.msgid.link/9e34ce61c5404e99ffdd29205122c6fb334b38aa.1763483367.git.lukas@wunner.de Cc: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23x86/kaslr: Recognize all ZONE_DEVICE users as physaddr consumersDan Williams
commit 269031b15c1433ff39e30fa7ea3ab8f0be9d6ae2 upstream. Commit 7ffb791423c7 ("x86/kaslr: Reduce KASLR entropy on most x86 systems") is too narrow. The effect being mitigated in that commit is caused by ZONE_DEVICE which PCI_P2PDMA has a dependency. ZONE_DEVICE, in general, lets any physical address be added to the direct-map. I.e. not only ACPI hotplug ranges, CXL Memory Windows, or EFI Specific Purpose Memory, but also any PCI MMIO range for the DEVICE_PRIVATE and PCI_P2PDMA cases. Update the mitigation, limit KASLR entropy, to apply in all ZONE_DEVICE=y cases. Distro kernels typically have PCI_P2PDMA=y, so the practical exposure of this problem is limited to the PCI_P2PDMA=n case. A potential path to recover entropy would be to walk ACPI and determine the limits for hotplug and PCI MMIO before kernel_randomize_memory(). On smaller systems that could yield some KASLR address bits. This needs additional investigation to determine if some limited ACPI table scanning can happen this early without an open coded solution like arch/x86/boot/compressed/acpi.c needs to deploy. Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <kees@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Michal Hocko <mhocko@suse.com> Fixes: 7ffb791423c7 ("x86/kaslr: Reduce KASLR entropy on most x86 systems") Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Balbir Singh <balbirs@nvidia.com> Tested-by: Yasunori Goto <y-goto@fujitsu.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://patch.msgid.link/692e08b2516d4_261c1100a3@dwillia2-mobl4.notmuch Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08PCI: brcmstb: Fix disabling L0s capabilityJim Quinlan
[ Upstream commit 9583f9d22991d2cfb5cc59a2552040c4ae98d998 ] caab002d5069 ("PCI: brcmstb: Disable L0s component of ASPM if requested") set PCI_EXP_LNKCAP_ASPM_L1 and (optionally) PCI_EXP_LNKCAP_ASPM_L0S in PCI_EXP_LNKCAP (aka PCIE_RC_CFG_PRIV1_LINK_CAPABILITY in brcmstb). But instead of using PCI_EXP_LNKCAP_ASPM_L1 and PCI_EXP_LNKCAP_ASPM_L0S directly, it used PCIE_LINK_STATE_L1 and PCIE_LINK_STATE_L0S, which are Linux-created values that only coincidentally matched the PCIe spec. b478e162f227 ("PCI/ASPM: Consolidate link state defines") later changed them so they no longer matched the PCIe spec, so the bits ended up in the wrong place in PCI_EXP_LNKCAP. Use PCI_EXP_LNKCAP_ASPM_L0S to clear L0s support when there's an 'aspm-no-l0s' property. Rely on brcmstb hardware to advertise L0s and/or L1 support otherwise. Fixes: caab002d5069 ("PCI: brcmstb: Disable L0s component of ASPM if requested") Reported-by: Bjorn Helgaas <bhelgaas@google.com> Closes: https://lore.kernel.org/linux-pci/20250925194424.GA2197200@bhelgaas Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> [mani: reworded subject and description, added closes tag and CCed stable] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251003170436.1446030-1-james.quinlan@broadcom.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08PCI: brcmstb: Set MLW based on "num-lanes" DT property if presentJim Quinlan
[ Upstream commit a364d10ffe361fb34c3838d33604da493045de1e ] By default, the driver relies on the default hardware defined value for the Max Link Width (MLW) capability. But if the "num-lanes" DT property is present, assume that the chip's default capability information is incorrect or undesired, and use the specified value instead. Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> [mani: reworded the description and comments] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250530224035.41886-3-james.quinlan@broadcom.com Stable-dep-of: 9583f9d22991 ("PCI: brcmstb: Fix disabling L0s capability") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08PCI: brcmstb: Reuse pcie_cfg_data structureStanimir Varbanov
[ Upstream commit 10dbedad3c8188ce8b68559d43b7aaee7dafba25 ] Instead of copying fields from the pcie_cfg_data structure to brcm_pcie, reference it directly. Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelil <florian.fainelli@broadcom.com> Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com> Tested-by: Ivan T. Ivanov <iivanov@suse.de> Link: https://lore.kernel.org/r/20250224083559.47645-6-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Stable-dep-of: 9583f9d22991 ("PCI: brcmstb: Fix disabling L0s capability") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-08PCI/PM: Reinstate clearing state_saved in legacy and !PM codepathsLukas Wunner
commit 894f475f88e06c0f352c829849560790dbdedbe5 upstream. When a PCI device is suspended, it is normally the PCI core's job to save Config Space and put the device into a low power state. However drivers are allowed to assume these responsibilities. When they do, the PCI core can tell by looking at the state_saved flag in struct pci_dev: The flag is cleared before commencing the suspend sequence and it is set when pci_save_state() is called. If the PCI core finds the flag set late in the suspend sequence, it refrains from calling pci_save_state() itself. But there are two corner cases where the PCI core neglects to clear the flag before commencing the suspend sequence: * If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects to clear the flag. The (stale) flag is subsequently queried by pci_legacy_suspend() itself and pci_legacy_suspend_late(). * If a device has no driver or its driver has no PCI PM callbacks, pci_pm_freeze() neglects to clear the flag. The (stale) flag is subsequently queried by pci_pm_freeze_noirq(). The flag may be set prior to suspend if the device went through error recovery: Drivers commonly invoke pci_restore_state() + pci_save_state() to restore Config Space after reset. The flag may also be set if drivers call pci_save_state() on probe to allow for recovery from subsequent errors. The result is that pci_legacy_suspend_late() and pci_pm_freeze_noirq() don't call pci_save_state() and so the state that will be restored on resume is the one recorded on last error recovery or on probe, not the one that the device had on suspend. If the two states happen to be identical, there's no problem. Reinstate clearing the flag in pci_legacy_suspend() and pci_pm_freeze(). The two functions used to do that until commit 4b77b0a2ba27 ("PCI: Clear saved_state after the state has been restored") deemed it unnecessary because it assumed that it's sufficient to clear the flag on resume in pci_restore_state(). The commit seemingly did not take into account that pci_save_state() and pci_restore_state() are not only used by power management code, but also for error recovery. Devices without driver or whose driver has no PCI PM callbacks may be in runtime suspend when pci_pm_freeze() is called. Their state has already been saved, so don't clear the flag to skip a pointless pci_save_state() in pci_pm_freeze_noirq(). None of the drivers with legacy PCI PM callbacks seem to use runtime PM, so clear the flag unconditionally in their case. Fixes: 4b77b0a2ba27 ("PCI: Clear saved_state after the state has been restored") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Cc: stable@vger.kernel.org # v2.6.32+ Link: https://patch.msgid.link/094f2aad64418710daf0940112abe5a0afdc6bce.1763483367.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-18PCI: dwc: Fix wrong PORT_LOGIC_LTSSM_STATE_MASK definitionShawn Lin
[ Upstream commit bcc9a4a0bca3aee4303fa4a20302e57b24ac8f68 ] As per DesignWare Cores PCI Express Controller Databook, section 5.50, SII: Debug Signals, cxpl_debug_info[63:0]: [5:0] smlh_ltssm_state: LTSSM current state. Encoding is same as the dedicated smlh_ltssm_state output. The mask should be 6 bits, from 0 to 5. Hence, fix the mask definition. Fixes: 23fe5bd4be90 ("PCI: keystone: Cleanup ks_pcie_link_up()") Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> [mani: reworded description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/1763122140-203068-1-git-send-email-shawn.lin@rock-chips.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-18PCI: keystone: Exit ks_pcie_probe() for invalid modeSiddharth Vadapalli
[ Upstream commit 95d9c3f0e4546eaec0977f3b387549a8463cd49f ] Commit under Fixes introduced support for PCIe EP mode on AM654x platforms. When the mode happens to be either "DW_PCIE_RC_TYPE" or "DW_PCIE_EP_TYPE", the PCIe Controller is configured accordingly. However, when the mode is neither of them, an error message is displayed, but the driver probe succeeds. Since this "invalid" mode is not associated with a functional PCIe Controller, the probe should fail. Fix the behavior by exiting "ks_pcie_probe()" with the return value of "-EINVAL" in addition to displaying the existing error message when the mode is invalid. Fixes: 23284ad677a9 ("PCI: keystone: Add support for PCIe EP in AM654x Platforms") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251029080547.1253757-4-s-vadapalli@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-18PCI: rcar-gen2: Drop ARM dependency from PCI_RCAR_GEN2Geert Uytterhoeven
[ Upstream commit d312742f686582e6457070bcfd24bee8acfdf213 ] Since the reliance on ARM-specific struct pci_sys_data was removed, this driver can be compile-tested on other architectures. While at it, make the help text a bit more generic, as some members of the R-Car Gen2 family have a different number of internal PCI controllers. Fixes: 4a957563fe0231e0 ("PCI: rcar-gen2: Convert to use modern host bridge probe functions") Suggested-by: Ilpo Jarvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> [bhelgaas: add rcar-gen2 to subject] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/00f75d6732eacce93f04ffaeedc415d2db714cd6.1759480426.git.geert+renesas@glider.be Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13PCI/PM: Skip resuming to D0 if device is disconnectedMario Limonciello
[ Upstream commit 299fad4133677b845ce962f78c9cf75bded63f61 ] When a device is surprise-removed (e.g., due to a dock unplug), the PCI core unconfigures all downstream devices and sets their error state to pci_channel_io_perm_failure. This marks them as disconnected via pci_dev_is_disconnected(). During device removal, the runtime PM framework may attempt to resume the device to D0 via pm_runtime_get_sync(), which calls into pci_power_up(). Since the device is already disconnected, this resume attempt is unnecessary and results in a predictable errors like this, typically when undocking from a TBT3 or USB4 dock with PCIe tunneling: pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible Avoid powering up disconnected devices by checking their status early in pci_power_up() and returning -EIO. Suggested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> [bhelgaas: add typical message] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://patch.msgid.link/20250909031916.4143121-1-superm1@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13PCI: cadence: Check for the existence of cdns_pcie::ops before using itChen Wang
[ Upstream commit 49a6c160ad4812476f8ae1a8f4ed6d15adfa6c09 ] cdns_pcie::ops might not be populated by all the Cadence glue drivers. This is going to be true for the upcoming Sophgo platform which doesn't set the ops. Hence, add a check to prevent NULL pointer dereference. Signed-off-by: Chen Wang <unicorn_wang@outlook.com> [mani: reworded subject and description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/35182ee1d972dfcd093a964e11205efcebbdc044.1757643388.git.unicorn_wang@outlook.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13PCI: dwc: Verify the single eDMA IRQ in dw_pcie_edma_irq_verify()Niklas Cassel
[ Upstream commit 09fefb24ed5e15f3b112f6c04b21a90ea23eaf8b ] dw_pcie_edma_irq_verify() is supposed to verify the eDMA IRQs in devicetree by fetching them using either 'dma' or 'dmaX' IRQ names. Former is used when the platform uses a single IRQ for all eDMA channels and latter is used when the platform uses separate IRQ per channel. But currently, dw_pcie_edma_irq_verify() bails out early if edma::nr_irqs is 1, i.e., when a single IRQ is used. This gives an impression that the driver could work with any single IRQ in devicetree, not necessarily with name 'dma'. But dw_pcie_edma_irq_vector(), which actually requests the IRQ, does require the single IRQ to be named as 'dma'. So this creates inconsistency between dw_pcie_edma_irq_verify() and dw_pcie_edma_irq_vector(). Thus, to fix this inconsistency, make sure dw_pcie_edma_irq_verify() also verifies the single IRQ name by removing the bail out code. Signed-off-by: Niklas Cassel <cassel@kernel.org> [mani: reworded subject and description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> [bhelgaas: fix typos] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250908165914.547002-3-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13PCI: endpoint: pci-epf-test: Limit PCIe BAR size for fixed BARsMarek Vasut
[ Upstream commit d5f6bd3ee3f5048f272182dc91675c082773999e ] Currently, the test allocates BAR sizes according to fixed table bar_size. This does not work with controllers which have fixed size BARs that are smaller than the requested BAR size. One such controller is Renesas R-Car V4H PCIe controller, which has BAR4 size limited to 256 bytes, which is much less than one of the BAR size, 131072 currently requested by this test. A lot of controllers drivers in-tree have fixed size BARs, and they do work perfectly fine, but it is only because their fixed size is larger than the size requested by pci-epf-test.c Adjust the test such that in case a fixed size BAR is detected, the fixed BAR size is used, as that is the only possible option. This helps with test failures reported as follows: pci_epf_test pci_epf_test.0: requested BAR size is larger than fixed size pci_epf_test pci_epf_test.0: Failed to allocate space for BAR4 Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> [mani: reworded description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20250905184240.144431-1-marek.vasut+renesas@mailbox.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13PCI: imx6: Enable the Vaux supply if availableRichard Zhu
[ Upstream commit c221cbf8dc547eb8489152ac62ef103fede99545 ] When the 3.3Vaux supply is present, fetch it at the probe time and keep it enabled for the entire PCIe controller lifecycle so that the link can enter L2 state and the devices can signal wakeup using either Beacon or WAKE# mechanisms. Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> [mani: reworded the subject, description and error message] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20250820022328.2143374-1-hongxing.zhu@nxp.com Signed-off-by: Sasha Levin <sashal@kernel.org>