diff options
| author | Heng Qi <hengqi@linux.alibaba.com> | 2024-06-21 18:13:51 +0800 |
|---|---|---|
| committer | Jakub Kicinski <kuba@kernel.org> | 2024-06-25 17:15:06 -0700 |
| commit | f750dfe825b904164688adeb147950e0e0c4d262 (patch) | |
| tree | 3ca12a3a1b3b7cc0dc6370a4b2eac33bd1b00859 /include/linux/dim.h | |
| parent | b65e697a7c9e0ae28a6255257d8b9b3271960426 (diff) | |
ethtool: provide customized dim profile management
The NetDIM library, currently leveraged by an array of NICs, delivers
excellent acceleration benefits. Nevertheless, NICs vary significantly
in their dim profile list prerequisites.
Specifically, virtio-net backends may present diverse sw or hw device
implementation, making a one-size-fits-all parameter list impractical.
On Alibaba Cloud, the virtio DPU's performance under the default DIM
profile falls short of expectations, partly due to a mismatch in
parameter configuration.
I also noticed that ice/idpf/ena and other NICs have customized
profilelist or placed some restrictions on dim capabilities.
Motivated by this, I tried adding new params for "ethtool -C" that provides
a per-device control to modify and access a device's interrupt parameters.
Usage
========
The target NIC is named ethx.
Assume that ethx only declares support for rx profile setting
(with DIM_PROFILE_RX flag set in profile_flags) and supports modification
of usec and pkt fields.
1. Query the currently customized list of the device
$ ethtool -c ethx
...
rx-profile:
{.usec = 1, .pkts = 256, .comps = n/a,},
{.usec = 8, .pkts = 256, .comps = n/a,},
{.usec = 64, .pkts = 256, .comps = n/a,},
{.usec = 128, .pkts = 256, .comps = n/a,},
{.usec = 256, .pkts = 256, .comps = n/a,}
tx-profile: n/a
2. Tune
$ ethtool -C ethx rx-profile 1,1,n_2,n,n_3,3,n_4,4,n_n,5,n
"n" means do not modify this field.
$ ethtool -c ethx
...
rx-profile:
{.usec = 1, .pkts = 1, .comps = n/a,},
{.usec = 2, .pkts = 256, .comps = n/a,},
{.usec = 3, .pkts = 3, .comps = n/a,},
{.usec = 4, .pkts = 4, .comps = n/a,},
{.usec = 256, .pkts = 5, .comps = n/a,}
tx-profile: n/a
3. Hint
If the device does not support some type of customized dim profiles,
the corresponding "n/a" will display.
If the "n/a" field is being modified, -EOPNOTSUPP will be reported.
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240621101353.107425-4-hengqi@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'include/linux/dim.h')
| -rw-r--r-- | include/linux/dim.h | 58 |
1 files changed, 58 insertions, 0 deletions
diff --git a/include/linux/dim.h b/include/linux/dim.h index 43398f5eade2..e0f39bd85432 100644 --- a/include/linux/dim.h +++ b/include/linux/dim.h @@ -10,6 +10,8 @@ #include <linux/types.h> #include <linux/workqueue.h> +struct net_device; + /* Number of DIM profiles and period mode. */ #define NET_DIM_PARAMS_NUM_PROFILES 5 #define NET_DIM_DEFAULT_RX_CQ_PKTS_FROM_EQE 256 @@ -45,12 +47,45 @@ * @pkts: CQ packet counter suggestion (by DIM) * @comps: Completion counter * @cq_period_mode: CQ period count mode (from CQE/EQE) + * @rcu: for asynchronous kfree_rcu */ struct dim_cq_moder { u16 usec; u16 pkts; u16 comps; u8 cq_period_mode; + struct rcu_head rcu; +}; + +#define DIM_PROFILE_RX BIT(0) /* support rx profile modification */ +#define DIM_PROFILE_TX BIT(1) /* support tx profile modification */ + +#define DIM_COALESCE_USEC BIT(0) /* support usec field modification */ +#define DIM_COALESCE_PKTS BIT(1) /* support pkts field modification */ +#define DIM_COALESCE_COMPS BIT(2) /* support comps field modification */ + +/** + * struct dim_irq_moder - Structure for irq moderation information. + * Used to collect irq moderation related information. + * + * @profile_flags: DIM_PROFILE_* + * @coal_flags: DIM_COALESCE_* for Rx and Tx + * @dim_rx_mode: Rx DIM period count mode: CQE or EQE + * @dim_tx_mode: Tx DIM period count mode: CQE or EQE + * @rx_profile: DIM profile list for Rx + * @tx_profile: DIM profile list for Tx + * @rx_dim_work: Rx DIM worker scheduled by net_dim() + * @tx_dim_work: Tx DIM worker scheduled by net_dim() + */ +struct dim_irq_moder { + u8 profile_flags; + u8 coal_flags; + u8 dim_rx_mode; + u8 dim_tx_mode; + struct dim_cq_moder __rcu *rx_profile; + struct dim_cq_moder __rcu *tx_profile; + void (*rx_dim_work)(struct work_struct *work); + void (*tx_dim_work)(struct work_struct *work); }; /** @@ -199,6 +234,29 @@ enum dim_step_result { }; /** + * net_dim_init_irq_moder - collect information to initialize irq moderation + * @dev: target network device + * @profile_flags: Rx or Tx profile modification capability + * @coal_flags: irq moderation params flags + * @rx_mode: CQ period mode for Rx + * @tx_mode: CQ period mode for Tx + * @rx_dim_work: Rx worker called after dim decision + * @tx_dim_work: Tx worker called after dim decision + * + * Return: 0 on success or a negative error code. + */ +int net_dim_init_irq_moder(struct net_device *dev, u8 profile_flags, + u8 coal_flags, u8 rx_mode, u8 tx_mode, + void (*rx_dim_work)(struct work_struct *work), + void (*tx_dim_work)(struct work_struct *work)); + +/** + * net_dim_free_irq_moder - free fields for irq moderation + * @dev: target network device + */ +void net_dim_free_irq_moder(struct net_device *dev); + +/** * dim_on_top - check if current state is a good place to stop (top location) * @dim: DIM context * |
