path: root/Documentation/networking/devlink/devlink-health.rst
Age | Commit message | Author
2025-08-26 | devlink: Make health reporter burst period configurable | Shahar Shitrit
Enable configuration of the burst period, a time window starting from the first error recovery, during which the reporter allows recovery attempts for each reported error.

This feature is helpful when a single underlying issue causes multiple errors, as it delays the start of the grace period to allow sufficient time for recovering all related errors. For example, if multiple TX queues time out simultaneously, a sufficient burst period could allow all affected TX queues to be recovered within that window. Without this period, only the first TX queue that reports a timeout will undergo recovery, while the remaining TX queues will be blocked once the grace period begins.

Configuration example:

    $ devlink health set pci/0000:00:09.0 reporter tx burst_period 500

Configuration example with ynl:

    ./tools/net/ynl/pyynl/cli.py \
        --spec Documentation/netlink/specs/devlink.yaml \
        --do health-reporter-set --json '{
            "bus-name": "auxiliary",
            "dev-name": "mlx5_core.eth.0",
            "port-index": 65535,
            "health-reporter-name": "tx",
            "health-reporter-burst-period": 500
        }'

Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250824084354.533182-5-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
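The interaction between the burst period and the grace period described in this commit message can be illustrated with a small simulation. This is only a toy model built from the description above, not the kernel implementation: the class `ReporterModel`, its fields, the function `simulate`, and the exact boundary handling are all hypothetical names and assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReporterModel:
    """Toy model of a health reporter (hypothetical, not kernel API)."""
    burst_period_ms: int
    grace_period_ms: int
    # Time of the first recovery in the current window, if any.
    window_start_ms: Optional[int] = None

    def allow_recovery(self, now_ms: int) -> bool:
        # First error ever seen: recover and open the burst window.
        if self.window_start_ms is None:
            self.window_start_ms = now_ms
            return True
        burst_end = self.window_start_ms + self.burst_period_ms
        grace_end = burst_end + self.grace_period_ms
        # Inside the burst window every reported error may be recovered.
        if now_ms < burst_end:
            return True
        # Assumption: the grace period starts when the burst window ends,
        # and recoveries are blocked until it expires.
        if now_ms < grace_end:
            return False
        # Grace period is over: this error opens a new burst window.
        self.window_start_ms = now_ms
        return True


def simulate(burst_period_ms: int, grace_period_ms: int,
             error_times_ms: List[int]) -> List[bool]:
    """Return, per error timestamp, whether recovery would be attempted."""
    reporter = ReporterModel(burst_period_ms, grace_period_ms)
    return [reporter.allow_recovery(t) for t in error_times_ms]
```

Under these assumptions, three TX queue timeouts at 0, 1 and 2 ms with no burst period give `simulate(0, 1000, [0, 1, 2]) == [True, False, False]`, so only the first queue is recovered. With a 500 ms burst period, `simulate(500, 1000, [0, 1, 2])` allows all three.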
2023-02-15 | devlink: Update devlink health documentation | Moshe Shemesh
Update devlink-health.rst file:
- Add devlink formatted message (fmsg) API documentation.
- Add auto-dump as a condition to do dump once error reported.
- Expand OOB to clarify this acronym.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-03-13 | docs: net: add missing devlink health cmd - trigger | Jakub Kicinski
Documentation is missing and it's not very clear what this callback is for - presumably testing the recovery?

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 | docs: net: tweak devlink health documentation | Jakub Kicinski
Minor tweaks and improvement of wording about the diagnose callback.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-10 | devlink: convert devlink-health.txt to rst format | Jacob Keller
Update the devlink-health documentation to use the newer ReStructuredText format.

Note that it's unclear what OOB stood for, and it has been left as-is without a proper first-use expansion of the acronym.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>