summaryrefslogtreecommitdiff
path: root/include/uapi
AgeCommit message (Collapse)Author
2024-06-26xfrm: support sending NAT keepalives in ESP in UDP statesEyal Birger
Add the ability to send out RFC-3948 NAT keepalives from the xfrm stack. To use, Userspace sets an XFRM_NAT_KEEPALIVE_INTERVAL integer property when creating XFRM outbound states which denotes the number of seconds between keepalive messages. Keepalive messages are sent from a per net delayed work which iterates over the xfrm states. The logic is guarded by the xfrm state spinlock due to the xfrm state walk iterator. Possible future enhancements: - Adding counters to keep track of sent keepalives. - deduplicate NAT keepalives between states sharing the same nat keepalive parameters. - provisioning hardware offloads for devices capable of implementing this. - revise xfrm state list to use an rcu list in order to avoid running this under spinlock. Suggested-by: Paul Wouters <paul.wouters@aiven.io> Tested-by: Paul Wouters <paul.wouters@aiven.io> Tested-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-06-25ethtool: provide customized dim profile managementHeng Qi
The NetDIM library, currently leveraged by an array of NICs, delivers excellent acceleration benefits. Nevertheless, NICs vary significantly in their dim profile list prerequisites. Specifically, virtio-net backends may present diverse sw or hw device implementation, making a one-size-fits-all parameter list impractical. On Alibaba Cloud, the virtio DPU's performance under the default DIM profile falls short of expectations, partly due to a mismatch in parameter configuration. I also noticed that ice/idpf/ena and other NICs have customized profilelist or placed some restrictions on dim capabilities. Motivated by this, I tried adding new params for "ethtool -C" that provides a per-device control to modify and access a device's interrupt parameters. Usage ======== The target NIC is named ethx. Assume that ethx only declares support for rx profile setting (with DIM_PROFILE_RX flag set in profile_flags) and supports modification of usec and pkt fields. 1. Query the currently customized list of the device $ ethtool -c ethx ... rx-profile: {.usec = 1, .pkts = 256, .comps = n/a,}, {.usec = 8, .pkts = 256, .comps = n/a,}, {.usec = 64, .pkts = 256, .comps = n/a,}, {.usec = 128, .pkts = 256, .comps = n/a,}, {.usec = 256, .pkts = 256, .comps = n/a,} tx-profile: n/a 2. Tune $ ethtool -C ethx rx-profile 1,1,n_2,n,n_3,3,n_4,4,n_n,5,n "n" means do not modify this field. $ ethtool -c ethx ... rx-profile: {.usec = 1, .pkts = 1, .comps = n/a,}, {.usec = 2, .pkts = 256, .comps = n/a,}, {.usec = 3, .pkts = 3, .comps = n/a,}, {.usec = 4, .pkts = 4, .comps = n/a,}, {.usec = 256, .pkts = 5, .comps = n/a,} tx-profile: n/a 3. Hint If the device does not support some type of customized dim profiles, the corresponding "n/a" will display. If the "n/a" field is being modified, -EOPNOTSUPP will be reported. Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240621101353.107425-4-hengqi@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-26netfilter: nf_tables: rise cap on SELinux secmark contextPablo Neira Ayuso
secmark context is artificially limited 256 bytes, rise it to 4Kbytes. Fixes: fb961945457f ("netfilter: nf_tables: add SECMARK support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-06-25nsfs: add pid translation ioctlsChristian Brauner
Add ioctl()s to translate pids between pid namespaces. LXCFS is a tiny fuse filesystem used to virtualize various aspects of procfs. LXCFS is run on the host. The files and directories it creates can be bind-mounted by e.g. a container at startup and mounted over the various procfs files the container wishes to have virtualized. When e.g. a read request for uptime is received, LXCFS will receive the pid of the reader. In order to virtualize the corresponding read, LXCFS needs to know the pid of the init process of the reader's pid namespace. In order to do this, LXCFS first needs to fork() two helper processes. The first helper process setns() to the readers pid namespace. The second helper process is needed to create a process that is a proper member of the pid namespace. The second helper process then creates a ucred message with ucred.pid set to 1 and sends it back to LXCFS. The kernel will translate the ucred.pid field to the corresponding pid number in LXCFS's pid namespace. This way LXCFS can learn the init pid number of the reader's pid namespace and can go on to virtualize. Since these two forks() are costly LXCFS maintains an init pid cache that caches a given pid for a fixed amount of time. The cache is pruned during new read requests. However, even with the cache the hit of the two forks() is singificant when a very large number of containers are running. With this simple patch we add an ns ioctl that let's a caller retrieve the init pid nr of a pid namespace through its pid namespace fd. This significantly improves performance with a very simple change. Support translation of pids and tgids. Other concepts can be added but there are no obvious users for this right now. To protect against races pidfds can be used to check whether the process is still valid. If needed, this can also be extended to work on pidfds directly. Link: https://lore.kernel.org/r/20240619-work-ns_ioctl-v1-1-7c0097e6bb6b@kernel.org Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-25syscalls: fix compat_sys_io_pgetevents_time64 usageArnd Bergmann
Using sys_io_pgetevents() as the entry point for compat mode tasks works almost correctly, but misses the sign extension for the min_nr and nr arguments. This was addressed on parisc by switching to compat_sys_io_pgetevents_time64() in commit 6431e92fc827 ("parisc: io_pgetevents_time64() needs compat syscall in 32-bit compat mode"), as well as by using more sophisticated system call wrappers on x86 and s390. However, arm64, mips, powerpc, sparc and riscv still have the same bug. Change all of them over to use compat_sys_io_pgetevents_time64() like parisc already does. This was clearly the intention when the function was originally added, but it got hooked up incorrectly in the tables. Cc: stable@vger.kernel.org Fixes: 48166e6ea47d ("y2038: add 64-bit time_t syscalls to all 32-bit architectures") Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-21drm/msm: Add MSM_PARAM_RAYTRACING uapiConnor Abbott
Expose the value of the software fuse to userspace. Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/592044/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-06-21Merge tag 'drm-misc-next-2024-06-06' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.10: UAPI Changes: Cross-subsystem Changes: - dma-buf: Warn when reserving 0 fence slots, internal API enhancements for heaps Core Changes: Driver Changes: - atmel-hlcdc: Support XLCDC in sam9x7 - msm: Validate registers XML description against schema in CI - v3d: Fix build warning - bridges: - analogix_dp: Various improvements - panels: - New panel: WL-355608-A8 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240606-vivid-amphibian-jackrabbit-40b1d1@houat
2024-06-21Merge tag 'drm-misc-next-2024-05-30' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.11: UAPI Changes: - Deprecate DRM date and return a 0 date in DRM_IOCTL_VERSION Core Changes: - connector: Create a set of helpers to help with HDMI support - fbdev: Create memory manager optimized fbdev emulation - panic: Allow to select fonts, improve drm_fb_dma_get_scanout_buffer Driver Changes: - Remove driver owner assignments - Allow more drivers to compile with COMPILE_TEST - Conversions to drm_edid - ivpu: hardware scheduler support, profiling support, improvements to the platform support layer - mgag200: general reworks and improvements - nouveau: Add NVreg_RegistryDwords command line option - rockchip: Conversion to the hdmi helpers - sun4i: Conversion to the hdmi helpers - vc4: Conversion to the hdmi helpers - v3d: Perf counters improvements - zynqmp: IRQ and debugfs improvements - bridge: - Remove redundant checks on bridge->encoder - panels: - Switch panels from register table initialization to proper code - Now that the panel code tracks the panel state, remove every ad-hoc implementation in the panel drivers - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology 13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0, BOE nv110wum-l60, IVO t109nw41 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240530-hilarious-flat-magpie-5fa186@houat
2024-06-20fs: Add initial atomic write support info to statxPrasad Singamsetty
Extend statx system call to return additional info for atomic write support support for a file. Helper function generic_fill_statx_atomic_writes() can be used by FSes to fill in the relevant statx fields. For now atomic_write_segments_max will always be 1, otherwise some rules would need to be imposed on iovec length and alignment, which we don't want now. Signed-off-by: Prasad Singamsetty <prasad.singamsetty@oracle.com> jpg: relocate bdev support to another patch Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20240620125359.2684798-5-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-20fs: Initial atomic write supportPrasad Singamsetty
An atomic write is a write issued with torn-write protection, meaning that for a power failure or any other hardware failure, all or none of the data from the write will be stored, but never a mix of old and new data. Userspace may add flag RWF_ATOMIC to pwritev2() to indicate that the write is to be issued with torn-write prevention, according to special alignment and length rules. For any syscall interface utilizing struct iocb, add IOCB_ATOMIC for iocb->ki_flags field to indicate the same. A call to statx will give the relevant atomic write info for a file: - atomic_write_unit_min - atomic_write_unit_max - atomic_write_segments_max Both min and max values must be a power-of-2. Applications can avail of atomic write feature by ensuring that the total length of a write is a power-of-2 in size and also sized between atomic_write_unit_min and atomic_write_unit_max, inclusive. Applications must ensure that the write is at a naturally-aligned offset in the file wrt the total write length. The value in atomic_write_segments_max indicates the upper limit for IOV_ITER iovcnt. Add file mode flag FMODE_CAN_ATOMIC_WRITE, so files which do not have the flag set will have RWF_ATOMIC rejected and not just ignored. Add a type argument to kiocb_set_rw_flags() to allows reads which have RWF_ATOMIC set to be rejected. Helper function generic_atomic_write_valid() can be used by FSes to verify compliant writes. There we check for iov_iter type is for ubuf, which implies iovcnt==1 for pwritev2(), which is an initial restriction for atomic_write_segments_max. Initially the only user will be bdev file operations write handler. We will rely on the block BIO submission path to ensure write sizes are compliant for the bdev, so we don't need to check atomic writes sizes yet. Signed-off-by: Prasad Singamsetty <prasad.singamsetty@oracle.com> jpg: merge into single patch and much rewrite Acked-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20240620125359.2684798-4-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-20can: isotp: remove ISO 15675-2 specification version where possibleOliver Hartkopp
With the new ISO 15765-2:2024 release the former documentation and comments have to be reworked. This patch removes the ISO specification version/date where possible. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Acked-by: Francesco Valla <valla.francesco@gmail.com> Link: https://lore.kernel.org/all/20240420194746.4885-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-06-19io_uring: Introduce IORING_OP_LISTENGabriel Krisman Bertazi
IORING_OP_LISTEN provides the semantic of listen(2) via io_uring. While this is an essentially synchronous system call, the main point is to enable a network path to execute fully with io_uring registered and descriptorless files. Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Link: https://lore.kernel.org/r/20240614163047.31581-4-krisman@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-19io_uring: Introduce IORING_OP_BINDGabriel Krisman Bertazi
IORING_OP_BIND provides the semantic of bind(2) via io_uring. While this is an essentially synchronous system call, the main point is to enable a network path to execute fully with io_uring registered and descriptorless files. Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Link: https://lore.kernel.org/r/20240614163047.31581-3-krisman@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-18drm/xe/oa/uapi: Query OA unit propertiesAshutosh Dixit
Implement query for properties of OA units present on a device. v2: Clean up reserved/pad fields (Umesh) Follow the same scheme as other query structs v3: Skip reporting reserved engines attached to OA units v4: Expose oa_buf_size via DRM_XE_PERF_IOCTL_INFO (Umesh) v5: Don't expose capabilities as OR of properties (Umesh) v6: Add extensions to query output structs: drm_xe_oa_unit, drm_xe_query_oa_units and drm_xe_oa_stream_info v7: Change oa_units[] array to __u64 type Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240618014609.3233427-13-ashutosh.dixit@intel.com
2024-06-18drm/xe/oa/uapi: Read file_operationAshutosh Dixit
Implement the OA stream read file_operation. Both blocking and non-blocking reads are supported. As part of read system call, the read copies OA perf data from the OA buffer to the user buffer, after appending packet headers for status and data packets. v2: Drop OA report headers, implement DRM_XE_PERF_IOCTL_STATUS (Umesh) v3: Introduce 'struct drm_xe_oa_stream_status' v4: Define oa_status register bitfields (Umesh) v5: Add extensions to 'struct drm_xe_oa_stream_status' v6: Minor cleanup, eliminate report32 variable v7: Use -EIO to signal to userspace to read OASTATUS using DRM_XE_PERF_IOCTL_STATUS, change previous sites returning -EIO to return -EINVAL Make drm_xe_oa_stream_status bits contiguous (Jose, Umesh) rmw oa_status bits (Umesh) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240618014609.3233427-10-ashutosh.dixit@intel.com
2024-06-18drm/xe/oa/uapi: Expose OA stream fdAshutosh Dixit
The OA stream open perf op returns an fd with its own file_operations for the newly initialized OA stream. These file_operations allow userspace to enable or disable the stream, as well as apply a different metric configuration for the OA stream. Userspace can also poll for data availability. OA stream initialization is completed in this commit by enabling the OA stream. When sampling is enabled this starts a hrtimer which periodically checks for data availablility. v2: Use stream properties for stream reconfiguration with DRM_XE_PERF_IOCTL_CONFIG v3: Hold runtime_pm reference across oa buffer alloc/free v4: Fix 32 bit build Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240618014609.3233427-9-ashutosh.dixit@intel.com
2024-06-18drm/xe/oa/uapi: Define and parse OA stream propertiesAshutosh Dixit
Properties for OA streams are specified by user space, when the stream is opened, as a chain of drm_xe_ext_set_property struct's. Parse and validate these stream properties. v2: Remove struct drm_xe_oa_open_param (Harish Chegondi) Drop DRM_XE_OA_PROPERTY_POLL_OA_PERIOD_US (Umesh) Eliminate comparison with xe_oa_max_sample_rate (Umesh) Drop 'struct drm_xe_oa_record_header' (Umesh) v3: s/DRM_XE_OA_PROPERTY_OA_EXPONENT/ \ DRM_XE_OA_PROPERTY_OA_PERIOD_EXPONENT/ (Jose) v4: Fix 32 bit build v5: Add non-static function kernel doc (Michal) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240618014609.3233427-7-ashutosh.dixit@intel.com
2024-06-18drm/xe/oa/uapi: Add/remove OA config perf opsAshutosh Dixit
Introduce add/remove config perf ops for OA. OA configurations consist of a set of event/counter select register address/value pairs. The add_config perf op validates and stores such configurations and also exposes them in the metrics sysfs. These configurations will be programmed to OA unit HW when an OA stream using a configuration is opened. The OA stream can also switch to other stored configurations. v2: Start config id's from 1 and other minor review comments (Umesh) v3: Add 32 bit build v4: Add kernel doc for non-static functions (Michal) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240618014609.3233427-6-ashutosh.dixit@intel.com
2024-06-18drm/xe/oa/uapi: Initialize OA unitsAshutosh Dixit
Initialize OA unit data struct's for each gt during device probe. Also assign OA units for hardware engines. v2: Remove XE_OA_UNIT_OAG/XE_OA_UNIT_OAM_SAMEDIA_0 enum (Umesh) Change mtl_oa_base to 0x13000 (Umesh) v3: Switch to drmm_ functions and other cleanups (Michal) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240618014609.3233427-5-ashutosh.dixit@intel.com
2024-06-18drm/xe/oa/uapi: Add OA data formatsAshutosh Dixit
Add and initialize supported OA data formats for various platforms (including Xe2). User can request OA data in any supported format. Bspec: 52198, 60942, 61101 v2: Start 'xe_oa_format_name' enum from 0 (Umesh) Fix error rewind with OA (Umesh) v3: Use graphics versions rather than absolute platform names v4: Add missing kernel doc for struct memebers and enum and other minor changes (Michal) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240618014609.3233427-4-ashutosh.dixit@intel.com
2024-06-18drm/xe/perf/uapi: "Perf" layer to support multiple perf counter stream typesAshutosh Dixit
In Xe, the plan is to support multiple types of perf counter streams (OA is only one type of these streams). Rather than introduce NxM ioctls for these (N perf streams with M ioctl's per perf stream), we decide to multiplex these (N different stream types and the M ops for each of these stream types) through a single PERF ioctl. This multiplexing is the purpose of the PERF layer. In addition to PERF DRM ioctl's, another set of ioctl's on the PERF fd are defined. These are expected to be common to different PERF stream types and therefore defined at the PERF layer itself. v2: Add param_size to 'struct drm_xe_perf_param' (Umesh) v3: Rename 'enum drm_xe_perf_ops' to 'enum drm_xe_perf_ioctls' (Guy Zadicario) Add DRM_ prefix to ioctl names to indicate uapi names v4: Add 'enum drm_xe_perf_op' previously missed out (Guy Zadicario) v5: Squash the ops and PERF layer patches into a single patch (Umesh) Remove param_size from struct 'drm_xe_perf_param' (Umesh) v6: Add DRM_XE_PERF_IOCTL_STATUS v7: Add DRM_XE_PERF_IOCTL_INFO v8: Fix Copyright years, fix DRM_XE_PERF_TYPE_MAX, move '#include "xe_perf.h"' to xe_perf.c, add kernel doc (Michal) Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Guy Zadicario <gzadicario@habana.ai> Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240618014609.3233427-2-ashutosh.dixit@intel.com
2024-06-18KVM: Ensure new code that references immediate_exit gets extra scrutinyDavid Matlack
Ensure that any new KVM code that references immediate_exit gets extra scrutiny by renaming it to immediate_exit__unsafe in kernel code. All fields in struct kvm_run are subject to TOCTOU races since they are mapped into userspace, which may be malicious or buggy. To protect KVM, introduces a new macro that appends __unsafe to select field names in struct kvm_run, hinting to developers and reviewers that accessing such fields must be done carefully. Apply the new macro to immediate_exit, since userspace can make immediate_exit inconsistent with vcpu->wants_to_run, i.e. accessing immediate_exit directly could lead to unexpected bugs in the future. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20240503181734.1467938-3-dmatlack@google.com [sean: massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-17net/smc: Introduce IPPROTO_SMCD. Wythe
This patch allows to create smc socket via AF_INET, similar to the following code, /* create v4 smc sock */ v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC); /* create v6 smc sock */ v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC); There are several reasons why we believe it is appropriate here: 1. For smc sockets, it actually use IPv4 (AF-INET) or IPv6 (AF-INET6) address. There is no AF_SMC address at all. 2. Create smc socket in the AF_INET(6) path, which allows us to reuse the infrastructure of AF_INET(6) path, such as common ebpf hooks. Otherwise, smc have to implement it again in AF_SMC path. Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> Tested-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-16Merge branch 'mana-shared' of ↵Leon Romanovsky
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Leon Romanovsky says: ==================== net: mana: Allow variable size indirection table Like we talked, I created new shared branch for this patch: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=mana-shared * 'mana-shared' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: net: mana: Allow variable size indirection table ==================== Link: https://lore.kernel.org/all/20240612183051.GE4966@unreal Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-06-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts, no adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-13bpf: Add CHECKSUM_COMPLETE to bpf test progsVadim Fedorenko
Add special flag to validate that TC BPF program properly updates checksum information in skb. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240606145851.229116-1-vadfed@meta.com
2024-06-12wifi: cfg80211: add regulatory flag to allow VLP AP operationJohannes Berg
Add a regulatory flag to allow VLP AP operation even on channels otherwise marked NO_IR, which may be possible in some regulatory domains/countries. Note that this requires checking also when the beacon is changed, since that may change the regulatory power type. Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://msgid.link/20240523120945.63792ce19790.Ie2a02750d283b78fbf3c686b10565fb0388889e2@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-06-12uprobe: Wire up uretprobe system callJiri Olsa
Wiring up uretprobe system call, which comes in following changes. We need to do the wiring before, because the uretprobe implementation needs the syscall number. Note at the moment uretprobe syscall is supported only for native 64-bit process. Link: https://lore.kernel.org/all/20240611112158.40795-3-jolsa@kernel.org/ Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-06-11Merge tag 'vfs-6.10-rc4.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "Misc: - Restore debugfs behavior of ignoring unknown mount options - Fix kernel doc for netfs_wait_for_oustanding_io() - Fix struct statx comment after new addition for this cycle - Fix a check in find_next_fd() iomap: - Fix data zeroing behavior when an extent spans the block that contains i_size - Restore i_size increasing in iomap_write_end() for now to avoid stale data exposure on xfs with a realtime device Cachefiles: - Remove unneeded fdtable.h include - Improve trace output for cachefiles_obj_{get,put}_ondemand_fd() - Remove requests from the request list to prevent accessing already freed requests - Fix UAF when issuing restore command while the daemon is still alive by adding an additional reference count to requests - Fix UAF by grabbing a reference during xarray lookup with xa_lock() held - Simplify error handling in cachefiles_ondemand_daemon_read() - Add consistency checks read and open requests to avoid crashes - Add a spinlock to protect ondemand_id variable which is used to determine whether an anonymous cachefiles fd has already been closed - Make on-demand reads killable allowing to handle broken cachefiles daemon better - Flush all requests after the kernel has been marked dead via CACHEFILES_DEAD to avoid hung-tasks - Ensure that closed requests are marked as such to avoid reusing them with a reopen request - Defer fd_install() until after copy_to_user() succeeded and thereby get rid of having to use close_fd() - Ensure that anonymous cachefiles on-demand fds are reused while they are valid to avoid pinning already freed cookies" * tag 'vfs-6.10-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iomap: Fix iomap_adjust_read_range for plen calculation iomap: keep on increasing i_size in iomap_write_end() cachefiles: remove unneeded include of <linux/fdtable.h> fs/file: fix the check in find_next_fd() cachefiles: make on-demand read killable cachefiles: flush all requests after setting CACHEFILES_DEAD cachefiles: Set object to close if ondemand_id < 0 in copen cachefiles: defer exposing anon_fd until after copy_to_user() succeeds cachefiles: never get a new anonymous fd if ondemand_id is valid cachefiles: add spin_lock for cachefiles_ondemand_info cachefiles: add consistency check for copen/cread cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read() cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read() cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd() cachefiles: remove requests from xarray during flushing requests cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd statx: Update offset commentary for struct statx netfs: fix kernel doc for nets_wait_for_outstanding_io() debugfs: continue to ignore unknown mount options
2024-06-11dlm: introduce DLM_LSFL_SOFTIRQ_SAFEAlexander Aring
Introduce a new external lockspace flag DLM_LSFL_SOFTIRQ_SAFE. A lockspace user will set this flag if it can handle dlm running the callback functions from softirq context. When not set, dlm will continue to run callback functions from the dlm_callback workqueue. The new lockspace flag cannot be used for user space lockspaces, so a uapi placeholder definition is used for the new flag value. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2024-06-11KVM: x86: Add KVM_RUN_X86_GUEST_MODE kvm_run flagThomas Prescher
When a vCPU is interrupted by a signal while running a nested guest, KVM will exit to userspace with L2 state. However, userspace has no way to know whether it sees L1 or L2 state (besides calling KVM_GET_STATS_FD, which does not have a stable ABI). This causes multiple problems: The simplest one is L2 state corruption when userspace marks the sregs as dirty. See this mailing list thread [1] for a complete discussion. Another problem is that if userspace decides to continue by emulating instructions, it will unknowingly emulate with L2 state as if L1 doesn't exist, which can be considered a weird guest escape. Introduce a new flag, KVM_RUN_X86_GUEST_MODE, in the kvm_run data structure, which is set when the vCPU exited while running a nested guest. Also introduce a new capability, KVM_CAP_X86_GUEST_MODE, to advertise the functionality to userspace. [1] https://lore.kernel.org/kvm/20240416123558.212040-1-julian.stecklina@cyberus-technology.de/T/#m280aadcb2e10ae02c191a7dc4ed4b711a74b1f55 Signed-off-by: Thomas Prescher <thomas.prescher@cyberus-technology.de> Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de> Link: https://lore.kernel.org/r/20240508132502.184428-1-julian.stecklina@cyberus-technology.de Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-11Merge tag 'amd-drm-next-6.11-2024-06-07' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.11-2024-06-07: amdgpu: - DCN 4.0.x support - DCN 3.5 updates - GC 12.0 support - DP MST fixes - Cursor fixes - MES11 updates - MMHUB 4.1 support - DML2 Updates - DCN 3.1.5 fixes - IPS fixes - Various code cleanups - GMC 12.0 support - SDMA 7.0 support - SMU 13 updates - SR-IOV fixes - VCN 5.x fixes - MES12 support - SMU 14.x updates - Devcoredump improvements - Fixes for HDP flush on platforms with >4k pages - GC 9.4.3 fixes - RAS ACA updates - Silence UBSAN flex array warnings - MMHUB 3.3 updates amdkfd: - Contiguous VRAM allocations - GC 12.0 support - SDMA 7.0 support - SR-IOV fixes radeon: - Backlight workaround for iMac - Silence UBSAN flex array warnings UAPI: - GFX12 modifier and DCC support Proposed Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29510 - KFD GFX ALU exceptions Proposed ROCdebugger changes: https://github.com/ROCm/ROCdbgapi/commit/08c760622b6601abf906f75abbc5e21d9fd425df https://github.com/ROCm/ROCgdb/commit/944fe1c1414a68700414e86e32273b6bfa62ba6f - KFD Contiguous VRAM allocation flag Proposed ROCr/HIP changes: https://github.com/ROCm/ROCT-Thunk-Interface/commit/f7b4a269914a3ab4f1e2453c2879adb97b5cc9e5 https://github.com/ROCm/ROCR-Runtime/pull/214/commits/26e8530d05a775872cb06dde6693db72be0c454a https://github.com/ROCm/clr/commit/1d48f2a1ab38b632919c4b7274899b3faf4279ff Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240607195900.902537-1-alexander.deucher@amd.com
2024-06-11Merge tag 'drm-xe-next-2024-06-06' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next UAPI Changes: - Expose the L3 bank mask (Francois) Cross-subsystem Changes: - Update Xe driver maintainers (Oded) Display (i915): - Add missing include to intel_vga.c (Michal Wajdeczko) Driver Changes: - Fix Display (xe-only) detection for ADL-N (Lucas) - Runtime PM fixes that enabled PC-10 and D3Cold (Francois, Rodrigo) - Fix unexpected silent drm backmerge issues (Thomas) - More (a lot more) preparation for SR-IOV support (Michal Wajdeczko) - Devcoredump fixes and improvements (Jose, Tejas, Matt Brost) - Introduce device 'wedged' state (Rodrigo) - Improve debug and info messages (Michal Wajdeczko, Rodrigo, Nirmoy) - Adding or fixing workarounds (Tejas, Shekhar, Lucas, Bommu) - Check result of drmm_mutex_init (Michal Wajdeczko) - Enlarge the critical dma fence area for preempt fences (Matt Auld) - Prevent UAF in VM's rebind work (Matt Auld) - GuC submit related clean-ups and fixes (Matt Brost, Himal, Jonathan, Niranjana) - Prefer local helpers to perform dma reservation locking (Himal) - Spelling and typo fixes (Colin, Francois) - Prep patches for 1 job per VM bind IOCTL (no uapi change yet) (Matt Brost) - Remove uninitialized end var from xe_gt_tlb_invalidation_range (Nirmoy) - GSC related changes targeting LNL support (Daniele) - Fix assert in L3 bank mask generation (Francois) - Perform dma_map when moving system buffer objects to TT (Thomas) - Add helpers for manipulating macro arguments (Michal Wajdeczko) - Refactor default device atomic settings (Nirmoy) - Add debugfs node to dump mocs (Janga) - Use ordered WQ for G2H handler (Matt Brost) - Clean up and fixes in header includes (Michal Wajdeczko) - Prefer flexible-array over deprecated zero-lenght ones (Lucas) - Add Indirect Ring State support (Niranjana) - Fix UBSAN shift-out-of-bounds failure (Shuicheng) - HWMon fixes and additions (Karthik) - Clean-up refactor around probe init functions (Lucas, Michal Wajdeczko) - Fix PCODE init function (Himal) - Only use reserved BCS instances for usm migrate exec queue (Matt Brost) - Only zap PTEs as needed (Matt Brost) - Per client usage info (Lucas) - Core hotunplug improvements converting stuff towards devm (Matt Auld) - Don't emit false error if running in execlist mode (Michal Wajdeczko) - Remove unused struct (Dr. David) - Support/debug for slow GuC loads (John Harrison) - Decouple job seqno and lrc seqno (Matt Brost) - Allow migrate vm gpu submissions from reclaim context (Thomas) - Rename drm-client running time to run_ticks and fix a UAF (Umesh) - Check empty pinned BO list with lock held (Nirmoy) - Drop undesired prefix from the platform name (Michal Wajdeczko) - Remove unwanted mutex locking on xe file close (Niranjana) - Replace format-less snprintf() with strscpy() (Arnd) - Other general clean-ups on registers definitions and function names (Michal Wajdeczko) - Add kernel-doc to some xe_lrc interfaces (Niranajana) - Use missing lock in relay_needs_worker (Nirmoy) - Drop redundant W=1 warnings from Makefile (Jani) - Simplify if condition in preempt fences code (Thorsten) - Flush engine buffers before signalling user fence on all engines (Andrzej) - Don't overmap identity VRAM mapping (Matt Brost) - Do not dereference NULL job->fence in trace points (Matt Brost) - Add synchronous gt reset debugfs (Jonathan) - Xe gt_idle fixes (Riana) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZmItmuf7vq_xvRjJ@intel.com
2024-06-10media: v4l2-ctrls: Add average QP controlMing Qian
Add a control V4L2_CID_MPEG_VIDEO_AVERAGE_QP to report the average QP value of the current encoded frame. The value applies to the last dequeued capture buffer. Signed-off-by: Ming Qian <ming.qian@nxp.com> Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sebastian Fricke <sebastian.fricke@collabora.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2024-06-08Merge tag 'for-linus-2024060801' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - fix potential read out of bounds in hid-asus (Andrew Ballance) - fix endian-conversion on little endian systems in intel-ish-hid (Arnd Bergmann) - A couple of new input event codes (Aseda Aboagye) - errors handling fixes in hid-nvidia-shield (Chen Ni), hid-nintendo (Christophe JAILLET), hid-logitech-dj (José Expósito) - current leakage fix while the device is in suspend on a i2c-hid laptop (Johan Hovold) - other assorted smaller fixes and device ID / quirk entry additions * tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: Ignore battery for ELAN touchscreens 2F2C and 4116 HID: i2c-hid: elan: fix reset suspend current leakage dt-bindings: HID: i2c-hid: elan: add 'no-reset-on-power-off' property dt-bindings: HID: i2c-hid: elan: add Elan eKTH5015M dt-bindings: HID: i2c-hid: add dedicated Ilitek ILI2901 schema input: Add support for "Do Not Disturb" input: Add event code for accessibility key hid: asus: asus_report_fixup: fix potential read out of bounds HID: logitech-hidpp: add missing MODULE_DESCRIPTION() macro HID: intel-ish-hid: fix endian-conversion HID: nintendo: Fix an error handling path in nintendo_hid_probe() HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode() HID: core: remove unnecessary WARN_ON() in implement() HID: nvidia-shield: Add missing check for input_ff_create_memless HID: intel-ish-hid: Fix build error for COMPILE_TEST
2024-06-07input: Add support for "Do Not Disturb"Aseda Aboagye
HUTRR94 added support for a new usage titled "System Do Not Disturb" which toggles a system-wide Do Not Disturb setting. This commit simply adds a new event code for the usage. Signed-off-by: Aseda Aboagye <aaboagye@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-gUHE70s7wCAoB@google.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-07input: Add event code for accessibility keyAseda Aboagye
HUTRR116 added support for a new usage titled "System Accessibility Binding" which toggles a system-wide bound accessibility UI or command. This commit simply adds a new event code for the usage. Signed-off-by: Aseda Aboagye <aaboagye@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-e97O9nvudco5z@google.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/pensando/ionic/ionic_txrx.c d9c04209990b ("ionic: Mark error paths in the data path as unlikely") 491aee894a08 ("ionic: fix kernel panic in XDP_TX action") net/ipv6/ip6_fib.c b4cb4a1391dc ("net: use unrcu_pointer() helper") b01e1c030770 ("ipv6: fix possible race in __fib6_drop_pcpu_from()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05drm/amdgpu: define new gfx12 uapi flagsMarek Olšák
define new gfx12 uapi flags Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05KVM: x86: Add a capability to configure bus frequency for APIC timerIsaku Yamahata
Add KVM_CAP_X86_APIC_BUS_CYCLES_NS capability to configure the APIC bus clock frequency for APIC timer emulation. Allow KVM_ENABLE_CAPABILITY(KVM_CAP_X86_APIC_BUS_CYCLES_NS) to set the frequency in nanoseconds. When using this capability, the user space VMM should configure CPUID leaf 0x15 to advertise the frequency. Vishal reported that the TDX guest kernel expects a 25MHz APIC bus frequency but ends up getting interrupts at a significantly higher rate. The TDX architecture hard-codes the core crystal clock frequency to 25MHz and mandates exposing it via CPUID leaf 0x15. The TDX architecture does not allow the VMM to override the value. In addition, per Intel SDM: "The APIC timer frequency will be the processor’s bus clock or core crystal clock frequency (when TSC/core crystal clock ratio is enumerated in CPUID leaf 0x15) divided by the value specified in the divide configuration register." The resulting 25MHz APIC bus frequency conflicts with the KVM hardcoded APIC bus frequency of 1GHz. The KVM doesn't enumerate CPUID leaf 0x15 to the guest unless the user space VMM sets it using KVM_SET_CPUID. If the CPUID leaf 0x15 is enumerated, the guest kernel uses it as the APIC bus frequency. If not, the guest kernel measures the frequency based on other known timers like the ACPI timer or the legacy PIT. As reported by Vishal the TDX guest kernel expects a 25MHz timer frequency but gets timer interrupt more frequently due to the 1GHz frequency used by KVM. To ensure that the guest doesn't have a conflicting view of the APIC bus frequency, allow the userspace to tell KVM to use the same frequency that TDX mandates instead of the default 1Ghz. Reported-by: Vishal Annapurve <vannapurve@google.com> Closes: https://lore.kernel.org/lkml/20231006011255.4163884-1-vannapurve@google.com Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Co-developed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/6748a4c12269e756f0c48680da8ccc5367c31ce7.1714081726.git.reinette.chatre@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-05dma-buf: align fd_flags and heap_flags with dma_heap_allocation_dataBarry Song
dma_heap_allocation_data defines the UAPI as follows: struct dma_heap_allocation_data { __u64 len; __u32 fd; __u32 fd_flags; __u64 heap_flags; }; However, dma_heap_buffer_alloc() casts both fd_flags and heap_flags into unsigned int. We're inconsistent with types in the non UAPI arguments. This patch fixes it. Signed-off-by: Barry Song <v-songbaohua@oppo.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240605012605.5341-1-21cnbao@gmail.com
2024-06-04net/sched: cls_flower: add support for matching tunnel control flagsDavide Caratti
extend cls_flower to match TUNNEL_FLAGS_PRESENT bits in tunnel metadata. Suggested-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-04m68k: amiga: Turn off Warp1260 interrupts during bootPaolo Pisati
On an Amiga 1200 equipped with a Warp1260 accelerator, an interrupt storm coming from the accelerator board causes the machine to crash in local_irq_enable() or auto_irq_enable(). Disabling interrupts for the Warp1260 in amiga_parse_bootinfo() fixes the problem. Link: https://lore.kernel.org/r/ZkjwzVwYeQtyAPrL@amaterasu.local Cc: stable <stable@kernel.org> Signed-off-by: Paolo Pisati <p.pisati@gmail.com> Reviewed-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20240601153254.186225-1-p.pisati@gmail.com Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2024-06-01Merge tag 'tty-6.10-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty fix from Greg KH: "Here is a single revert for a much-reported regression in 6.10-rc1 when it comes to a few older architectures. Turns out that the VT ioctls don't work the same across all cpu types because of some old compatibility requrements for stuff like alpha and powerpc. So revert the change that attempted to have them use the _IO() macros and go back to the known-working values instead. This has NOT been in linux-next but has had many reports that it fixes the issue with 6.10-rc1" * tag 'tty-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: Revert "VT: Use macros to define ioctls"
2024-06-01Revert "VT: Use macros to define ioctls"Greg Kroah-Hartman
This reverts commit 8c467f3300591a206fa8dcc6988d768910799872. Turns out this breaks many architectures as the vt ioctls do not all match up everywhere due to historical reasons, so the original commit is invalid for many values. Reported-by: Nick Bowler <nbowler@draconx.ca> Reported-by: Arnd Bergmann <arnd@kernel.org> Reported-by: Jiri Slaby <jirislaby@kernel.org> Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexey Gladkov <legion@kernel.org> Link: https://lore.kernel.org/r/ad4e561c-1d49-4f25-882c-7a36c6b1b5c0@draconx.ca Link: https://lore.kernel.org/r/0da9785e-ba44-4718-9d08-4e96c1ba7ab2@kernel.org Link: https://lore.kernel.org/all/34d848f4-670b-4493-bf21-130ef862521b@xenosoft.de/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/ti/icssg/icssg_classifier.c abd5576b9c57 ("net: ti: icssg-prueth: Add support for ICSSG switch firmware") 56a5cf538c3f ("net: ti: icssg-prueth: Fix start counter for ft1 filter") https://lore.kernel.org/all/20240531123822.3bb7eadf@canb.auug.org.au/ No other adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-30Merge tag 'net-6.10-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - gro: initialize network_offset in network layer - tcp: reduce accepted window in NEW_SYN_RECV state Current release - new code bugs: - eth: mlx5e: do not use ptp structure for tx ts stats when not initialized - eth: ice: check for unregistering correct number of devlink params Previous releases - regressions: - bpf: Allow delete from sockmap/sockhash only if update is allowed - sched: taprio: extend minimum interval restriction to entire cycle too - netfilter: ipset: add list flush to cancel_gc - ipv4: fix address dump when IPv4 is disabled on an interface - sock_map: avoid race between sock_map_close and sk_psock_put - eth: mlx5: use mlx5_ipsec_rx_status_destroy to correctly delete status rules Previous releases - always broken: - core: fix __dst_negative_advice() race - bpf: - fix multi-uprobe PID filtering logic - fix pkt_type override upon netkit pass verdict - netfilter: tproxy: bail out if IP has been disabled on the device - af_unix: annotate data-race around unix_sk(sk)->addr - eth: mlx5e: fix UDP GSO for encapsulated packets - eth: idpf: don't enable NAPI and interrupts prior to allocating Rx buffers - eth: i40e: fully suspend and resume IO operations in EEH case - eth: octeontx2-pf: free send queue buffers incase of leaf to inner - eth: ipvlan: dont Use skb->sk in ipvlan_process_v{4,6}_outbound" * tag 'net-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) netdev: add qstat for csum complete ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outbound net: ena: Fix redundant device NUMA node override ice: check for unregistering correct number of devlink params ice: fix 200G PHY types to link speed mapping i40e: Fully suspend and resume IO operations in EEH case i40e: factoring out i40e_suspend/i40e_resume e1000e: move force SMBUS near the end of enable_ulp function net: dsa: microchip: fix RGMII error in KSZ DSA driver ipv4: correctly iterate over the target netns in inet_dump_ifaddr() net: fix __dst_negative_advice() race nfc/nci: Add the inconsistency check between the input data length and count MAINTAINERS: dwmac: starfive: update Maintainer net/sched: taprio: extend minimum interval restriction to entire cycle too net/sched: taprio: make q->picos_per_byte available to fill_sched_entry() netfilter: nft_fib: allow from forward/input without iif selector netfilter: tproxy: bail out if IP has been disabled on the device netfilter: nft_payload: skbuff vlan metadata mangle support net: ti: icssg-prueth: Fix start counter for ft1 filter sock_map: avoid race between sock_map_close and sk_psock_put ...
2024-05-30RDMA/mana_ib: Implement uapi to create and destroy RC QPKonstantin Taranov
Implement user requests to create and destroy an RC QP. As the user does not have an FMR queue, it is skipped and NO_FMR flag is used. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1716366242-558-3-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-30RDMA/bnxt_re: Expose the MSN table capability for user librarySelvin Xavier
BNXT_RE_COMP_MASK_UCNTX_HW_RETX_ENABLED was introduced to share the HW retransmit capability between driver and lib. The main difference in implementation for HW Retransmit support is the usage of MSN table or PSN table . When HW retrans is enabled, HW expects MSN table to be allocated by driver/lib, else PSN table (for older adapters). FW expose a new field which gives MSN capability. Drivers and libs can depend on the new field instead of HW Retrasns capability. For adapters which support HW_RETX feature, MSN table capability will be set. For older adapters, this value will be 0(to maintain backward compatibility with older FW). Rename UAPI just to capture the correct name of the HW capability that driver/library is interested in. No functional impact even if older rdma-core is used. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1716876697-25970-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-30netdev: add qstat for csum completeJakub Kicinski
Recent commit 0cfe71f45f42 ("netdev: add queue stats") added a lot of useful stats, but only those immediately needed by virtio. Presumably virtio does not support CHECKSUM_COMPLETE, so statistic for that form of checksumming wasn't included. Other drivers will definitely need it, in fact we expect it to be needed in net-next soon (mlx5). So let's add the definition of the counter for CHECKSUM_COMPLETE to uAPI in net already, so that the counters are in a more natural order (all subsequent counters have not been present in any released kernel, yet). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Fixes: 0cfe71f45f42 ("netdev: add queue stats") Link: https://lore.kernel.org/r/20240529163547.3693194-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>