summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2026-04-27f2fs: fix use-after-free of sbi in f2fs_compress_write_end_io()George Saad
commit 39d4ee19c1e7d753dd655aebee632271b171f43a upstream. In f2fs_compress_write_end_io(), dec_page_count(sbi, type) can bring the F2FS_WB_CP_DATA counter to zero, unblocking f2fs_wait_on_all_pages() in f2fs_put_super() on a concurrent unmount CPU. The unmount path then proceeds to call f2fs_destroy_page_array_cache(sbi), which destroys sbi->page_array_slab via kmem_cache_destroy(), and eventually kfree(sbi). Meanwhile, the bio completion callback is still executing: when it reaches page_array_free(sbi, ...), it dereferences sbi->page_array_slab — a destroyed slab cache — to call kmem_cache_free(), causing a use-after-free. This is the same class of bug as CVE-2026-23234 (which fixed the equivalent race in f2fs_write_end_io() in data.c), but in the compressed writeback completion path that was not covered by that fix. Fix this by moving dec_page_count() to after page_array_free(), so that all sbi accesses complete before the counter decrement that can unblock unmount. For non-last folios (where atomic_dec_return on cic->pending_pages is nonzero), dec_page_count is called immediately before returning — page_array_free is not reached on this path, so there is no post-decrement sbi access. For the last folio, page_array_free runs while the F2FS_WB_CP_DATA counter is still nonzero (this folio has not yet decremented it), keeping sbi alive, and dec_page_count runs as the final operation. Fixes: 4c8ff7095bef ("f2fs: support data compression") Cc: stable@vger.kernel.org Signed-off-by: George Saad <geoo115@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27writeback: Fix use after free in inode_switch_wbs_work_fn()Jan Kara
commit 6689f01d6740cf358932b3e97ee968c6099800d9 upstream. inode_switch_wbs_work_fn() has a loop like: wb_get(new_wb); while (1) { list = llist_del_all(&new_wb->switch_wbs_ctxs); /* Nothing to do? */ if (!list) break; ... process the items ... } Now adding of items to the list looks like: wb_queue_isw() if (llist_add(&isw->list, &wb->switch_wbs_ctxs)) queue_work(isw_wq, &wb->switch_work); Because inode_switch_wbs_work_fn() loops when processing isw items, it can happen that wb->switch_work is pending while wb->switch_wbs_ctxs is empty. This is a problem because in that case wb can get freed (no isw items -> no wb reference) while the work is still pending causing use-after-free issues. We cannot just fix this by cancelling work when freeing wb because that could still trigger problematic 0 -> 1 transitions on wb refcount due to wb_get() in inode_switch_wbs_work_fn(). It could be all handled with more careful code but that seems unnecessarily complex so let's avoid that until it is proven that the looping actually brings practical benefit. Just remove the loop from inode_switch_wbs_work_fn() instead. That way when wb_queue_isw() queues work, we are guaranteed we have added the first item to wb->switch_wbs_ctxs and nobody is going to remove it (and drop the wb reference it holds) until the queued work runs. Fixes: e1b849cfa6b6 ("writeback: Avoid contention on wb->list_lock when switching inodes") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260413093618.17244-2-jack@suse.cz Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27ksmbd: reset rcount per connection in ksmbd_conn_wait_idle_sess_id()DaeMyung Kang
commit def036ef87f8641c1c525d5ae17438d7a1006491 upstream. rcount is intended to be connection-specific: 2 for curr_conn, 1 for every other connection sharing the same session. However, it is initialised only once before the hash iteration and is never reset. After the loop visits curr_conn, later sibling connections are also checked against rcount == 2, so a sibling with req_running == 1 is incorrectly treated as idle. This makes the outcome depend on the hash iteration order: whether a given sibling is checked against the loose (< 2) or the strict (< 1) threshold is decided by whether it happens to be visited before or after curr_conn. The function's contract is "wait until every connection sharing this session is idle" so that destroy_previous_session() can safely tear the session down. The latched rcount violates that contract and reopens the teardown race window the wait logic was meant to close: destroy_previous_session() may proceed before sibling channels have actually quiesced, overlapping session teardown with in-flight work on those connections. Recompute rcount inside the loop so each connection is compared against its own threshold regardless of iteration order. This is a code-inspection fix for an iteration-order-dependent logic error; a targeted reproducer would require SMB3 multichannel with in-flight work on a sibling channel landing after curr_conn in hash order, which is not something that can be triggered reliably. Fixes: 76e98a158b20 ("ksmbd: fix race condition between destroy_previous_session() and smb2 operations()") Cc: stable@vger.kernel.org Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27ksmbd: use check_add_overflow() to prevent u16 DACL size overflowTristan Madani
commit 299f962c0b02d048fb45d248b4da493d03f3175d upstream. set_posix_acl_entries_dacl() and set_ntacl_dacl() accumulate ACE sizes in u16 variables. When a file has many POSIX ACL entries, the accumulated size can wrap past 65535, causing the pointer arithmetic (char *)pndace + *size to land within already-written ACEs. Subsequent writes then overwrite earlier entries, and pndacl->size gets a truncated value. Use check_add_overflow() at each accumulation point to detect the wrap before it corrupts the buffer, consistent with existing check_mul_overflow() usage elsewhere in smbacl.c. Cc: stable@vger.kernel.org Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Tristan Madani <tristan@talencesecurity.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27ksmbd: fix out-of-bounds write in smb2_get_ea() EA alignmentTristan Madani
commit 30010c952077a1c89ecdd71fc4d574c75a8f5617 upstream. smb2_get_ea() applies 4-byte alignment padding via memset() after writing each EA entry. The bounds check on buf_free_len is performed before the value memcpy, but the alignment memset fires unconditionally afterward with no check on remaining space. When the EA value exactly fills the remaining buffer (buf_free_len == 0 after value subtraction), the alignment memset writes 1-3 NUL bytes past the buf_free_len boundary. In compound requests where the response buffer is shared across commands, the first command (e.g., READ) can consume most of the buffer, leaving a tight remainder for the QUERY_INFO EA response. The alignment memset then overwrites past the physical kvmalloc allocation into adjacent kernel heap memory. Add a bounds check before the alignment memset to ensure buf_free_len can accommodate the padding bytes. This is the same bug pattern fixed by commit beef2634f81f ("ksmbd: fix potencial OOB in get_file_all_info() for compound requests") and commit fda9522ed6af ("ksmbd: fix OOB write in QUERY_INFO for compound requests"), both of which added bounds checks before unconditional writes in QUERY_INFO response handlers. Cc: stable@vger.kernel.org Fixes: e2b76ab8b5c9 ("ksmbd: add support for read compound") Signed-off-by: Tristan Madani <tristan@talencesecurity.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27ksmbd: validate num_aces and harden ACE walk in smb_inherit_dacl()Michael Bommarito
commit 3e4e2ea2a781018ed5d75f969e3e5606beb66e48 upstream. smb_inherit_dacl() trusts the on-disk num_aces value from the parent directory's DACL xattr and uses it to size a heap allocation: aces_base = kmalloc(sizeof(struct smb_ace) * num_aces * 2, ...); num_aces is a u16 read from le16_to_cpu(parent_pdacl->num_aces) without checking that it is consistent with the declared pdacl_size. An authenticated client whose parent directory's security.NTACL is tampered (e.g. via offline xattr corruption or a concurrent path that bypasses parse_dacl()) can present num_aces = 65535 with minimal actual ACE data. This causes a ~8 MB allocation (not kzalloc, so uninitialized) that the subsequent loop only partially populates, and may also overflow the three-way size_t multiply on 32-bit kernels. Additionally, the ACE walk loop uses the weaker offsetof(struct smb_ace, access_req) minimum size check rather than the minimum valid on-wire ACE size, and does not reject ACEs whose declared size is below the minimum. Reproduced on UML + KASAN + LOCKDEP against the real ksmbd code path. A legitimate mount.cifs client creates a parent directory over SMB (ksmbd writes a valid security.NTACL xattr), then the NTACL blob on the backing filesystem is rewritten to set num_aces = 0xFFFF while keeping the posix_acl_hash bytes intact so ksmbd_vfs_get_sd_xattr()'s hash check still passes. A subsequent SMB2 CREATE of a child under that parent drives smb2_open() into smb_inherit_dacl() (share has "vfs objects = acl_xattr" set), which fails the page allocator: WARNING: mm/page_alloc.c:5226 at __alloc_frozen_pages_noprof+0x46c/0x9c0 Workqueue: ksmbd-io handle_ksmbd_work __alloc_frozen_pages_noprof+0x46c/0x9c0 ___kmalloc_large_node+0x68/0x130 __kmalloc_large_node_noprof+0x24/0x70 __kmalloc_noprof+0x4c9/0x690 smb_inherit_dacl+0x394/0x2430 smb2_open+0x595d/0xabe0 handle_ksmbd_work+0x3d3/0x1140 With the patch applied the added guard rejects the tampered value with -EINVAL before any large allocation runs, smb2_open() falls back to smb2_create_sd_buffer(), and the child is created with a default SD. No warning, no splat. Fix by: 1. Validating num_aces against pdacl_size using the same formula applied in parse_dacl(). 2. Replacing the raw kmalloc(sizeof * num_aces * 2) with kmalloc_array(num_aces * 2, sizeof(...)) for overflow-safe allocation. 3. Tightening the per-ACE loop guard to require the minimum valid ACE size (offsetof(smb_ace, sid) + CIFS_SID_BASE_SIZE) and rejecting under-sized ACEs, matching the hardening in smb_check_perm_dacl() and parse_dacl(). v1 -> v2: - Replace the synthetic test-module splat in the changelog with a real-path UML + KASAN reproduction driven through mount.cifs and SMB2 CREATE; Namjae flagged the kcifs3_test_inherit_dacl_old name in v1 since it does not exist in ksmbd. - Drop the commit-hash citation from the code comment per Namjae's review; keep the parse_dacl() pointer. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27ksmbd: validate response sizes in ipc_validate_msg()Michael Bommarito
commit d6a6aa81eac2c9bff66dc6e191179cb69a14426b upstream. ipc_validate_msg() computes the expected message size for each response type by adding (or multiplying) attacker-controlled fields from the daemon response to a fixed struct size in unsigned int arithmetic. Three cases can overflow: KSMBD_EVENT_RPC_REQUEST: msg_sz = sizeof(struct ksmbd_rpc_command) + resp->payload_sz; KSMBD_EVENT_SHARE_CONFIG_REQUEST: msg_sz = sizeof(struct ksmbd_share_config_response) + resp->payload_sz; KSMBD_EVENT_LOGIN_REQUEST_EXT: msg_sz = sizeof(struct ksmbd_login_response_ext) + resp->ngroups * sizeof(gid_t); resp->payload_sz is __u32 and resp->ngroups is __s32. Each addition can wrap in unsigned int; the multiplication by sizeof(gid_t) mixes signed and size_t, so a negative ngroups is converted to SIZE_MAX before the multiply. A wrapped value of msg_sz that happens to equal entry->msg_sz bypasses the size check on the next line, and downstream consumers (smb2pdu.c:6742 memcpy using rpc_resp->payload_sz, kmemdup in ksmbd_alloc_user using resp_ext->ngroups) then trust the unverified length. Use check_add_overflow() on the RPC_REQUEST and SHARE_CONFIG_REQUEST paths to detect integer overflow without constraining functional payload size; userspace ksmbd-tools grows NDR responses in 4096-byte chunks for calls like NetShareEnumAll, so a hard transport cap is unworkable on the response side. For LOGIN_REQUEST_EXT, reject resp->ngroups outside the signed [0, NGROUPS_MAX] range up front and report the error from ipc_validate_msg() so it fires at the IPC boundary; with that bound the subsequent multiplication and addition stay well below UINT_MAX. The now-redundant ngroups check and pr_err in ksmbd_alloc_user() are removed. This is the response-side analogue of aab98e2dbd64 ("ksmbd: fix integer overflows on 32 bit systems"), which hardened the request side. Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Fixes: a77e0e02af1c ("ksmbd: add support for supplementary groups") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27smb: client: fix OOB read in smb2_ioctl_query_info QUERY_INFO pathMichael Bommarito
commit a58c5af19ff0d6f44f6e9fe31e33a2c92223f77e upstream. smb2_ioctl_query_info() has two response-copy branches: PASSTHRU_FSCTL and the default QUERY_INFO path. The QUERY_INFO branch clamps qi.input_buffer_length to the server-reported OutputBufferLength and then copies qi.input_buffer_length bytes from qi_rsp->Buffer to userspace, but it never verifies that the flexible-array payload actually fits within rsp_iov[1].iov_len. A malicious server can return OutputBufferLength larger than the actual QUERY_INFO response, causing copy_to_user() to walk past the response buffer and expose adjacent kernel heap to userspace. Guard the QUERY_INFO copy with a bounds check on the actual Buffer payload. Use struct_size(qi_rsp, Buffer, qi.input_buffer_length) rather than an open-coded addition so the guard cannot overflow on 32-bit builds. Fixes: f5778c398713 ("SMB3: Allow SMB3 FSCTL queries to be sent to server from tools") Cc: stable@vger.kernel.org Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27smb: client: validate the whole DACL before rewriting it in cifsaclMichael Bommarito
commit 0a8cf165566ba55a39fd0f4de172119dd646d39a upstream. build_sec_desc() and id_mode_to_cifs_acl() derive a DACL pointer from a server-supplied dacloffset and then use the incoming ACL to rebuild the chmod/chown security descriptor. The original fix only checked that the struct smb_acl header fits before reading dacl_ptr->size or dacl_ptr->num_aces. That avoids the immediate header-field OOB read, but the rewrite helpers still walk ACEs based on pdacl->num_aces with no structural validation of the incoming DACL body. A malicious server can return a truncated DACL that still contains a header, claims one or more ACEs, and then drive replace_sids_and_copy_aces() or set_chmod_dacl() past the validated extent while they compare or copy attacker-controlled ACEs. Factor the DACL structural checks into validate_dacl(), extend them to validate each ACE against the DACL bounds, and use the shared validator before the chmod/chown rebuild paths. parse_dacl() reuses the same validator so the read-side parser and write-side rewrite paths agree on what constitutes a well-formed incoming DACL. Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27smb: client: require a full NFS mode SID before reading mode bitsMichael Bommarito
commit 2757ad3e4b6f9e0fed4c7739594e702abc5cab21 upstream. parse_dacl() treats an ACE SID matching sid_unix_NFS_mode as an NFS mode SID and reads sid.sub_auth[2] to recover the mode bits. That assumes the ACE carries three subauthorities, but compare_sids() only compares min(a, b) subauthorities. A malicious server can return an ACE with num_subauth = 2 and sub_auth[] = {88, 3}, which still matches sid_unix_NFS_mode and then drives the sub_auth[2] read four bytes past the end of the ACE. Require num_subauth >= 3 before treating the ACE as an NFS mode SID. This keeps the fix local to the special-SID mode path without changing compare_sids() semantics for the rest of cifsacl. Fixes: e2f8fbfb8d09 ("cifs: get mode bits from special sid on stat") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27smb: server: fix max_connections off-by-one in tcp accept pathDaeMyung Kang
commit ce23158bfe584bd90d1918f279fdf9de57802012 upstream. The global max_connections check in ksmbd's TCP accept path counts the newly accepted connection with atomic_inc_return(), but then rejects the connection when the result is greater than or equal to server_conf.max_connections. That makes the effective limit one smaller than configured. For example: - max_connections=1 rejects the first connection - max_connections=2 allows only one connection The per-IP limit in the same function uses <= correctly because it counts only pre-existing connections. The global limit instead checks the post-increment total, so it should reject only when that total exceeds the configured maximum. Fix this by changing the comparison from >= to >, so exactly max_connections simultaneous connections are allowed and the next one is rejected. This matches the documented meaning of max_connections in fs/smb/server/ksmbd_netlink.h as the "Number of maximum simultaneous connections". Fixes: 0d0d4680db22 ("ksmbd: add max connections parameter") Cc: stable@vger.kernel.org Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27smb: client: fix dir separator in SMB1 UNIX mountsPaulo Alcantara
commit c4d3fc5844d685441befd0caaab648321013cdfd upstream. When calling cifs_mount_get_tcon() with SMB1 UNIX mounts, @cifs_sb->mnt_cifs_flags needs to be read or updated only after calling reset_cifs_unix_caps(), otherwise it might end up with missing CIFS_MOUNT_POSIXACL and CIFS_MOUNT_POSIX_PATHS bits. This fixes the wrong dir separator used in paths caused by the missing CIFS_MOUNT_POSIX_PATHS bit in cifs_sb_info::mnt_cifs_flags. Reported-by: "Kris Karas (Bug Reporting)" <bugs-a21@moonlit-rail.com> Closes: https://lore.kernel.org/r/f758f4ff-4d54-4244-931d-38f469c3ff14@moonlit-rail.com Fixes: 4fc3a433c139 ("smb: client: use atomic_t for mnt_cifs_flags") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Cc: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27smb: server: fix active_num_conn leak on transport allocation failureMichael Bommarito
commit 6551300dc452ac16a855a83dbd1e74899542d3b3 upstream. Commit 77ffbcac4e56 ("smb: server: fix leak of active_num_conn in ksmbd_tcp_new_connection()") addressed the kthread_run() failure path. The earlier alloc_transport() == NULL path in the same function has the same leak, is reachable pre-authentication via any TCP connect to port 445, and was empirically reproduced on UML (ARCH=um, v7.0-rc7): a small number of forced allocation failures were sufficient to put ksmbd into a state where every subsequent connection attempt was rejected for the remainder of the boot. ksmbd_kthread_fn() increments active_num_conn before calling ksmbd_tcp_new_connection() and discards the return value, so when alloc_transport() returns NULL the socket is released and -ENOMEM returned without decrementing the counter. Each such failure permanently consumes one slot from the max_connections pool; once cumulative failures reach the cap, atomic_inc_return() hits the threshold on every subsequent accept and every new connection is rejected. The counter is only reset by module reload. An unauthenticated remote attacker can drive the server toward the memory pressure that makes alloc_transport() fail by holding open connections with large RFC1002 lengths up to MAX_STREAM_PROT_LEN (0x00FFFFFF); natural transient allocation failures on a loaded host produce the same drift more slowly. Mirror the existing rollback pattern in ksmbd_kthread_fn(): on the alloc_transport() failure path, decrement active_num_conn gated on server_conf.max_connections. Repro details: with the patch reverted, forced alloc_transport() NULL returns leaked counter slots and subsequent connection attempts -- including legitimate connects issued after the forced-fail window had closed -- were all rejected with "Limit the maximum number of connections". With this patch applied, the same connect sequence produces no rejections and the counter cycles cleanly between zero and one on every accept. Fixes: 0d0d4680db22 ("ksmbd: add max connections parameter") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27ksmbd: require minimum ACE size in smb_check_perm_dacl()Michael Bommarito
commit d07b26f39246a82399661936dd0c853983cfade7 upstream. Both ACE-walk loops in smb_check_perm_dacl() only guard against an under-sized remaining buffer, not against an ACE whose declared `ace->size` is smaller than the struct it claims to describe: if (offsetof(struct smb_ace, access_req) > aces_size) break; ace_size = le16_to_cpu(ace->size); if (ace_size > aces_size) break; The first check only requires the 4-byte ACE header to be in bounds; it does not require access_req (4 bytes at offset 4) to be readable. An attacker who has set a crafted DACL on a file they own can declare ace->size == 4 with aces_size == 4, pass both checks, and then granted |= le32_to_cpu(ace->access_req); /* upper loop */ compare_sids(&sid, &ace->sid); /* lower loop */ reads access_req at offset 4 (OOB by up to 4 bytes) and ace->sid at offset 8 (OOB by up to CIFS_SID_BASE_SIZE + SID_MAX_SUB_AUTHORITIES * 4 bytes). Tighten both loops to require ace_size >= offsetof(struct smb_ace, sid) + CIFS_SID_BASE_SIZE which is the smallest valid on-wire ACE layout (4-byte header + 4-byte access_req + 8-byte sid base with zero sub-auths). Also reject ACEs whose sid.num_subauth exceeds SID_MAX_SUB_AUTHORITIES before letting compare_sids() dereference sub_auth[] entries. parse_sec_desc() already enforces an equivalent check (lines 441-448); smb_check_perm_dacl() simply grew weaker validation over time. Reachability: authenticated SMB client with permission to set an ACL on a file. On a subsequent CREATE against that file, the kernel walks the stored DACL via smb_check_perm_dacl() and triggers the OOB read. Not pre-auth, and the OOB read is not reflected to the attacker, but KASAN reports and kernel state corruption are possible. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27fuse: fuse_dev_ioctl_clone() should wait for device file to be initializedMiklos Szeredi
commit da6fcc6dbddbef80e603d2f0c1554a9f2ac03742 upstream. Use fuse_get_dev() not __fuse_get_dev() on the old fd, since in the case of synchronous INIT the caller will want to wait for the device file to be available for cloning, just like I/O wants to wait instead of returning an error. Fixes: dfb84c330794 ("fuse: allow synchronous FUSE_INIT") Cc: stable@vger.kernel.org # v6.18 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27fuse: quiet down complaints in fuse_conn_limit_writeDarrick J. Wong
commit 129a45f9755a89f573c6a513a6b9e3d234ce89b0 upstream. gcc 15 complains about an uninitialized variable val that is passed by reference into fuse_conn_limit_write: control.c: In function ‘fuse_conn_congestion_threshold_write’: include/asm-generic/rwonce.h:55:37: warning: ‘val’ may be used uninitialized [-Wmaybe-uninitialized] 55 | *(volatile typeof(x) *)&(x) = (val); \ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~ include/asm-generic/rwonce.h:61:9: note: in expansion of macro ‘__WRITE_ONCE’ 61 | __WRITE_ONCE(x, val); \ | ^~~~~~~~~~~~ control.c:178:9: note: in expansion of macro ‘WRITE_ONCE’ 178 | WRITE_ONCE(fc->congestion_threshold, val); | ^~~~~~~~~~ control.c:166:18: note: ‘val’ was declared here 166 | unsigned val; | ^~~ Unfortunately there's enough macro spew involved in kstrtoul_from_user that I think gcc gives up on its analysis and sprays the above warning. AFAICT it's not actually a bug, but we could just zero-initialize the variable to enable using -Wmaybe-uninitialized to find real problems. Previously we would use some weird uninitialized_var annotation to quiet down the warnings, so clearly this code has been like this for quite some time. Cc: stable@vger.kernel.org # v5.9 Fixes: 3f649ab728cda8 ("treewide: Remove uninitialized_var() usage") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27fuse: Check for large folio with SPLICE_F_MOVEBernd Schubert
commit 59ba47b6be9cd0146ef9a55c6e32e337e11e7625 upstream. xfstest generic/074 and generic/075 complain result in kernel warning messages / page dumps. This is easily reproducible (on 6.19) with CONFIG_TRANSPARENT_HUGEPAGE_SHMEM_HUGE_ALWAYS=y CONFIG_TRANSPARENT_HUGEPAGE_TMPFS_HUGE_ALWAYS=y This just adds a test for large folios fuse_try_move_folio with the same page copy fallback, but to avoid the warnings from fuse_check_folio(). Cc: stable@vger.kernel.org Signed-off-by: Bernd Schubert <bschubert@ddn.com> Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27fuse: abort on fatal signal during sync initMiklos Szeredi
commit 204aa22a686bfee48daca7db620c1e017615f2ff upstream. When sync init is used and the server exits for some reason (error, crash) while processing FUSE_INIT, the filesystem creation will hang. The reason is that while all other threads will exit, the mounting thread (or process) will keep the device fd open, which will prevent an abort from happening. This is a regression from the async mount case, where the mount was done first, and the FUSE_INIT processing afterwards, in which case there's no such recursive syscall keeping the fd open. Fixes: dfb84c330794 ("fuse: allow synchronous FUSE_INIT") Cc: stable@vger.kernel.org # v6.18 Reviewed-by: Joanne Koong <joannelkoong@gmail.com> Reviewed-by: Bernd Schubert <bernd@bsbernd.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27fuse: reject oversized dirents in page cacheSamuel Page
commit 51a8de6c50bf947c8f534cd73da4c8f0a13e7bed upstream. fuse_add_dirent_to_cache() computes a serialized dirent size from the server-controlled namelen field and copies the dirent into a single page-cache page. The existing logic only checks whether the dirent fits in the remaining space of the current page and advances to a fresh page if not. It never checks whether the dirent itself exceeds PAGE_SIZE. As a result, a malicious FUSE server can return a dirent with namelen=4095, producing a serialized record size of 4120 bytes. On 4 KiB page systems this causes memcpy() to overflow the cache page by 24 bytes into the following kernel page. Reject dirents that cannot fit in a single page before copying them into the readdir cache. Fixes: 69e34551152a ("fuse: allow caching readdir") Cc: stable@vger.kernel.org # v6.16+ Assisted-by: Bynario AI Signed-off-by: Samuel Page <sam@bynar.io> Reported-by: Qi Tang <tpluszz77@gmail.com> Reported-by: Zijun Hu <nightu@northwestern.edu> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://patch.msgid.link/20260420090139.662772-1-mszeredi@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27f2fs: fix to avoid uninit-value access in f2fs_sanity_check_node_footerChao Yu
commit 7b9161a605e91d0987e2596a245dc1f21621b23f upstream. syzbot reported a f2fs bug as below: BUG: KMSAN: uninit-value in f2fs_sanity_check_node_footer+0x374/0xa20 fs/f2fs/node.c:1520 f2fs_sanity_check_node_footer+0x374/0xa20 fs/f2fs/node.c:1520 f2fs_finish_read_bio+0xe1e/0x1d60 fs/f2fs/data.c:177 f2fs_read_end_io+0x6ab/0x2220 fs/f2fs/data.c:-1 bio_endio+0x1006/0x1160 block/bio.c:1792 submit_bio_noacct+0x533/0x2960 block/blk-core.c:891 submit_bio+0x57a/0x620 block/blk-core.c:926 blk_crypto_submit_bio include/linux/blk-crypto.h:203 [inline] f2fs_submit_read_bio+0x12c/0x360 fs/f2fs/data.c:557 f2fs_submit_page_bio+0xee2/0x1450 fs/f2fs/data.c:775 read_node_folio+0x384/0x4b0 fs/f2fs/node.c:1481 __get_node_folio+0x5db/0x15d0 fs/f2fs/node.c:1576 f2fs_get_inode_folio+0x40/0x50 fs/f2fs/node.c:1623 do_read_inode fs/f2fs/inode.c:425 [inline] f2fs_iget+0x1209/0x9380 fs/f2fs/inode.c:596 f2fs_fill_super+0x8f5a/0xb2e0 fs/f2fs/super.c:5184 get_tree_bdev_flags+0x6e6/0x920 fs/super.c:1694 get_tree_bdev+0x38/0x50 fs/super.c:1717 f2fs_get_tree+0x35/0x40 fs/f2fs/super.c:5436 vfs_get_tree+0xb3/0x5d0 fs/super.c:1754 fc_mount fs/namespace.c:1193 [inline] do_new_mount_fc fs/namespace.c:3763 [inline] do_new_mount+0x885/0x1dd0 fs/namespace.c:3839 path_mount+0x7a2/0x20b0 fs/namespace.c:4159 do_mount fs/namespace.c:4172 [inline] __do_sys_mount fs/namespace.c:4361 [inline] __se_sys_mount+0x704/0x7f0 fs/namespace.c:4338 __x64_sys_mount+0xe4/0x150 fs/namespace.c:4338 x64_sys_call+0x39f0/0x3ea0 arch/x86/include/generated/asm/syscalls_64.h:166 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x134/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f The root cause is: in f2fs_finish_read_bio(), we may access uninit data in folio if we failed to read the data from device into folio, let's add a check condition to avoid such issue. Cc: stable@kernel.org Fixes: 50ac3ecd8e05 ("f2fs: fix to do sanity check on node footer in {read,write}_end_io") Reported-by: syzbot+9aac813cdc456cdd49f8@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/69a9ca26.a70a0220.305d9a.0000.GAE@google.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27f2fs: fix to avoid memory leak in f2fs_rename()Chao Yu
commit 3cf11e6f36c170050c12171dd6fd3142711478fc upstream. syzbot reported a f2fs bug as below: BUG: memory leak unreferenced object 0xffff888127f70830 (size 16): comm "syz.0.23", pid 6144, jiffies 4294943712 hex dump (first 16 bytes): 3c af 57 72 5b e6 8f ad 6e 8e fd 33 42 39 03 ff <.Wr[...n..3B9.. backtrace (crc 925f8a80): kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline] slab_post_alloc_hook mm/slub.c:4520 [inline] slab_alloc_node mm/slub.c:4844 [inline] __do_kmalloc_node mm/slub.c:5237 [inline] __kmalloc_noprof+0x3bd/0x560 mm/slub.c:5250 kmalloc_noprof include/linux/slab.h:954 [inline] fscrypt_setup_filename+0x15e/0x3b0 fs/crypto/fname.c:364 f2fs_setup_filename+0x52/0xb0 fs/f2fs/dir.c:143 f2fs_rename+0x159/0xca0 fs/f2fs/namei.c:961 f2fs_rename2+0xd5/0xf20 fs/f2fs/namei.c:1308 vfs_rename+0x7ff/0x1250 fs/namei.c:6026 filename_renameat2+0x4f4/0x660 fs/namei.c:6144 __do_sys_renameat2 fs/namei.c:6173 [inline] __se_sys_renameat2 fs/namei.c:6168 [inline] __x64_sys_renameat2+0x59/0x80 fs/namei.c:6168 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f The root cause is in commit 40b2d55e0452 ("f2fs: fix to create selinux label during whiteout initialization"), we added a call to f2fs_setup_filename() without a matching call to f2fs_free_filename(), fix it. Fixes: 40b2d55e0452 ("f2fs: fix to create selinux label during whiteout initialization") Cc: stable@kernel.org Reported-by: syzbot+cf7946ab25b21abc4b66@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/69a75fe1.a70a0220.b118c.0014.GAE@google.com Suggested-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27f2fs: fix UAF caused by decrementing sbi->nr_pages[] in f2fs_write_end_io()Yongpeng Yang
commit 2d9c4a4ed4eef1f82c5b16b037aee8bad819fd53 upstream. The xfstests case "generic/107" and syzbot have both reported a NULL pointer dereference. The concurrent scenario that triggers the panic is as follows: F2FS_WB_CP_DATA write callback umount - f2fs_write_checkpoint - f2fs_wait_on_all_pages(sbi, F2FS_WB_CP_DATA) - blk_mq_end_request - bio_endio - f2fs_write_end_io : dec_page_count(sbi, F2FS_WB_CP_DATA) : wake_up(&sbi->cp_wait) - kill_f2fs_super - kill_block_super - f2fs_put_super : iput(sbi->node_inode) : sbi->node_inode = NULL : f2fs_in_warm_node_list - is_node_folio // sbi->node_inode is NULL and panic The root cause is that f2fs_put_super() calls iput(sbi->node_inode) and sets sbi->node_inode to NULL after sbi->nr_pages[F2FS_WB_CP_DATA] is decremented to zero. As a result, f2fs_in_warm_node_list() may dereference a NULL node_inode when checking whether a folio belongs to the node inode, leading to a panic. This patch fixes the issue by calling f2fs_in_warm_node_list() before decrementing sbi->nr_pages[F2FS_WB_CP_DATA], thus preventing the use-after-free condition. Cc: stable@kernel.org Fixes: 50fa53eccf9f ("f2fs: fix to avoid broken of dnode block list") Reported-by: syzbot+6e4cb1cac5efc96ea0ca@syzkaller.appspotmail.com Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27f2fs: fix to do sanity check on dcc->discard_cmd_cnt conditionallyChao Yu
commit 6af249c996f7d73a3435f9e577956fa259347d18 upstream. Syzbot reported a f2fs bug as below: ------------[ cut here ]------------ kernel BUG at fs/f2fs/segment.c:1900! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 1 UID: 0 PID: 6527 Comm: syz.5.110 Not tainted syzkaller #0 PREEMPT_{RT,(full)} Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026 RIP: 0010:f2fs_issue_discard_timeout+0x59b/0x5a0 fs/f2fs/segment.c:1900 Code: d9 80 e1 07 80 c1 03 38 c1 0f 8c d6 fe ff ff 48 89 df e8 a8 5e fa fd e9 c9 fe ff ff e8 4e 46 94 fd 90 0f 0b e8 46 46 94 fd 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc9000494f940 EFLAGS: 00010283 RAX: ffffffff843009ca RBX: 0000000000000001 RCX: 0000000000080000 RDX: ffffc9001ca78000 RSI: 00000000000029f3 RDI: 00000000000029f4 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: dffffc0000000000 R11: ffffed100893a431 R12: 1ffff1100893a430 R13: 1ffff1100c2b702c R14: dffffc0000000000 R15: ffff8880449d2160 FS: 00007ffa35fed6c0(0000) GS:ffff88812643d000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2b68634000 CR3: 0000000039f62000 CR4: 00000000003526f0 Call Trace: <TASK> __f2fs_remount fs/f2fs/super.c:2960 [inline] f2fs_reconfigure+0x108a/0x1710 fs/f2fs/super.c:5443 reconfigure_super+0x227/0x8a0 fs/super.c:1080 do_remount fs/namespace.c:3391 [inline] path_mount+0xdc5/0x10e0 fs/namespace.c:4151 do_mount fs/namespace.c:4172 [inline] __do_sys_mount fs/namespace.c:4361 [inline] __se_sys_mount+0x31d/0x420 fs/namespace.c:4338 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ffa37dbda0a The root cause is there will be race condition in between f2fs_ioc_fitrim() and f2fs_remount(): - f2fs_remount - f2fs_ioc_fitrim - f2fs_issue_discard_timeout - __issue_discard_cmd - __drop_discard_cmd - __wait_all_discard_cmd - f2fs_trim_fs - f2fs_write_checkpoint - f2fs_clear_prefree_segments - f2fs_issue_discard - __issue_discard_async - __queue_discard_cmd - __update_discard_tree_range - __insert_discard_cmd - __create_discard_cmd : atomic_inc(&dcc->discard_cmd_cnt); - sanity check on dcc->discard_cmd_cnt (expect discard_cmd_cnt to be zero) This will only happen when fitrim races w/ remount rw, if we remount to readonly filesystem, remount will wait until mnt_pcp.mnt_writers to zero, that means fitrim is not in process at that time. Cc: stable@kernel.org Fixes: 2482c4325dfe ("f2fs: detect bug_on in f2fs_wait_discard_bios") Reported-by: syzbot+62538b67389ee582837a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/69b07d7c.050a0220.8df7.09a1.GAE@google.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27fs/ntfs3: validate rec->used in journal-replay file record checkGreg Kroah-Hartman
commit 0ca0485e4b2e837ebb6cbd4f2451aba665a03e4b upstream. check_file_record() validates rec->total against the record size but never validates rec->used. The do_action() journal-replay handlers read rec->used from disk and use it to compute memmove lengths: DeleteAttribute: memmove(attr, ..., used - asize - roff) CreateAttribute: memmove(..., attr, used - roff) change_attr_size: memmove(..., used - PtrOffset(rec, next)) When rec->used is smaller than the offset of a validated attribute, or larger than the record size, these subtractions can underflow allowing us to copy huge amounts of memory in to a 4kb buffer, generally considered a bad idea overall. This requires a corrupted filesystem, which isn't a threat model the kernel really needs to worry about, but checking for such an obvious out-of-bounds value is good to keep things robust, especially on journal replay Fix this up by bounding rec->used correctly. This is much like commit b2bc7c44ed17 ("fs/ntfs3: Fix slab-out-of-bounds read in DeleteIndexEntryRoot") which checked different values in this same switch statement. Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Fixes: b46acd6a6a62 ("fs/ntfs3: Add NTFS journal") Cc: stable <stable@kernel.org> Assisted-by: gregkh_clanker_t1000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27ksmbd: validate owner of durable handle on reconnectNamjae Jeon
[ Upstream commit 49110a8ce654bbe56bef7c5e44cce31f4b102b8a ] Currently, ksmbd does not verify if the user attempting to reconnect to a durable handle is the same user who originally opened the file. This allows any authenticated user to hijack an orphaned durable handle by predicting or brute-forcing the persistent ID. According to MS-SMB2, the server MUST verify that the SecurityContext of the reconnect request matches the SecurityContext associated with the existing open. Add a durable_owner structure to ksmbd_file to store the original opener's UID, GID, and account name. and catpure the owner information when a file handle becomes orphaned. and implementing ksmbd_vfs_compare_durable_owner() to validate the identity of the requester during SMB2_CREATE (DHnC). Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2") Reported-by: Davide Ornaghi <d.ornaghi97@gmail.com> Reported-by: Navaneeth K <knavaneeth786@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-27ksmbd: fix use-after-free in __ksmbd_close_fd() via durable scavengerNamjae Jeon
[ Upstream commit 235e32320a470fcd3998fb3774f2290a0eb302a1 ] When a durable file handle survives session disconnect (TCP close without SMB2_LOGOFF), session_fd_check() sets fp->conn = NULL to preserve the handle for later reconnection. However, it did not clean up the byte-range locks on fp->lock_list. Later, when the durable scavenger thread times out and calls __ksmbd_close_fd(NULL, fp), the lock cleanup loop did: spin_lock(&fp->conn->llist_lock); This caused a slab use-after-free because fp->conn was NULL and the original connection object had already been freed by ksmbd_tcp_disconnect(). The root cause is asymmetric cleanup: lock entries (smb_lock->clist) were left dangling on the freed conn->lock_list while fp->conn was nulled out. To fix this issue properly, we need to handle the lifetime of smb_lock->clist across three paths: - Safely skip clist deletion when list is empty and fp->conn is NULL. - Remove the lock from the old connection's lock_list in session_fd_check() - Re-add the lock to the new connection's lock_list in ksmbd_reopen_durable_fd(). Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2") Co-developed-by: munan Huang <munanevil@gmail.com> Signed-off-by: munan Huang <munanevil@gmail.com> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Stable-dep-of: 49110a8ce654 ("ksmbd: validate owner of durable handle on reconnect") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22nilfs2: fix NULL i_assoc_inode dereference in nilfs_mdt_save_to_shadow_mapDeepanshu Kartikey
commit 4a4e0328edd9e9755843787d28f16dd4165f8b48 upstream. The DAT inode's btree node cache (i_assoc_inode) is initialized lazily during btree operations. However, nilfs_mdt_save_to_shadow_map() assumes i_assoc_inode is already initialized when copying dirty pages to the shadow map during GC. If NILFS_IOCTL_CLEAN_SEGMENTS is called immediately after mount before any btree operation has occurred on the DAT inode, i_assoc_inode is NULL leading to a general protection fault. Fix this by calling nilfs_attach_btree_node_cache() on the DAT inode in nilfs_dat_read() at mount time, ensuring i_assoc_inode is always initialized before any GC operation can use it. Reported-by: syzbot+4b4093b1f24ad789bf37@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4b4093b1f24ad789bf37 Tested-by: syzbot+4b4093b1f24ad789bf37@syzkaller.appspotmail.com Fixes: e897be17a441 ("nilfs2: fix lockdep warnings in page operations for btree nodes") Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ocfs2: handle invalid dinode in ocfs2_group_extendZhengYuan Huang
commit 4a1c0ddc6e7bcf2e9db0eeaab9340dcfe97f448f upstream. [BUG] kernel BUG at fs/ocfs2/resize.c:308! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI RIP: 0010:ocfs2_group_extend+0x10aa/0x1ae0 fs/ocfs2/resize.c:308 Code: 8b8520ff ffff83f8 860f8580 030000e8 5cc3c1fe Call Trace: ... ocfs2_ioctl+0x175/0x6e0 fs/ocfs2/ioctl.c:869 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x197/0x1e0 fs/ioctl.c:583 x64_sys_call+0x1144/0x26a0 arch/x86/include/generated/asm/syscalls_64.h:17 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x93/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x76/0x7e ... [CAUSE] ocfs2_group_extend() assumes that the global bitmap inode block returned from ocfs2_inode_lock() has already been validated and BUG_ONs when the signature is not a dinode. That assumption is too strong for crafted filesystems because the JBD2-managed buffer path can bypass structural validation and return an invalid dinode to the resize ioctl. [FIX] Validate the dinode explicitly in ocfs2_group_extend(). If the global bitmap buffer does not contain a valid dinode, report filesystem corruption with ocfs2_error() and fail the resize operation instead of crashing the kernel. Link: https://lkml.kernel.org/r/20260401092303.3709187-1-gality369@gmail.com Fixes: 10995aa2451a ("ocfs2: Morph the haphazard OCFS2_IS_VALID_DINODE() checks.") Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRYTejas Bharambe
commit 7de554cabf160e331e4442e2a9ad874ca9875921 upstream. filemap_fault() may drop the mmap_lock before returning VM_FAULT_RETRY, as documented in mm/filemap.c: "If our return value has VM_FAULT_RETRY set, it's because the mmap_lock may be dropped before doing I/O or by lock_folio_maybe_drop_mmap()." When this happens, a concurrent munmap() can call remove_vma() and free the vm_area_struct via RCU. The saved 'vma' pointer in ocfs2_fault() then becomes a dangling pointer, and the subsequent trace_ocfs2_fault() call dereferences it -- a use-after-free. Fix this by saving ip_blkno as a plain integer before calling filemap_fault(), and removing vma from the trace event. Since ip_blkno is copied by value before the lock can be dropped, it remains valid regardless of what happens to the vma or inode afterward. Link: https://lkml.kernel.org/r/20260410083816.34951-1-tejas.bharambe@outlook.com Fixes: 614a9e849ca6 ("ocfs2: Remove FILE_IO from masklog.") Signed-off-by: Tejas Bharambe <tejas.bharambe@outlook.com> Reported-by: syzbot+a49010a0e8fcdeea075f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a49010a0e8fcdeea075f Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ocfs2: fix possible deadlock between unlink and dio_end_io_writeJoseph Qi
commit b02da26a992db0c0e2559acbda0fc48d4a2fd337 upstream. ocfs2_unlink takes orphan dir inode_lock first and then ip_alloc_sem, while in ocfs2_dio_end_io_write, it acquires these locks in reverse order. This creates an ABBA lock ordering violation on lock classes ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] and ocfs2_file_ip_alloc_sem_key. Lock Chain #0 (orphan dir inode_lock -> ip_alloc_sem): ocfs2_unlink ocfs2_prepare_orphan_dir ocfs2_lookup_lock_orphan_dir inode_lock(orphan_dir_inode) <- lock A __ocfs2_prepare_orphan_dir ocfs2_prepare_dir_for_insert ocfs2_extend_dir ocfs2_expand_inline_dir down_write(&oi->ip_alloc_sem) <- Lock B Lock Chain #1 (ip_alloc_sem -> orphan dir inode_lock): ocfs2_dio_end_io_write down_write(&oi->ip_alloc_sem) <- Lock B ocfs2_del_inode_from_orphan() inode_lock(orphan_dir_inode) <- Lock A Deadlock Scenario: CPU0 (unlink) CPU1 (dio_end_io_write) ------ ------ inode_lock(orphan_dir_inode) down_write(ip_alloc_sem) down_write(ip_alloc_sem) inode_lock(orphan_dir_inode) Since ip_alloc_sem is to protect allocation changes, which is unrelated with operations in ocfs2_del_inode_from_orphan. So move ocfs2_del_inode_from_orphan out of ip_alloc_sem to fix the deadlock. Link: https://lkml.kernel.org/r/20260306032211.1016452-1-joseph.qi@linux.alibaba.com Reported-by: syzbot+67b90111784a3eac8c04@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=67b90111784a3eac8c04 Fixes: a86a72a4a4e0 ("ocfs2: take ip_alloc_sem in ocfs2_dio_get_block & ocfs2_dio_end_io_write") Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22dcache: Limit the minimal number of bucket to twoZhihao Cheng
commit f08fe8891c3eeb63b73f9f1f6d97aa629c821579 upstream. There is an OOB read problem on dentry_hashtable when user sets 'dhash_entries=1': BUG: unable to handle page fault for address: ffff888b30b774b0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page Oops: Oops: 0000 [#1] SMP PTI RIP: 0010:__d_lookup+0x56/0x120 Call Trace: d_lookup.cold+0x16/0x5d lookup_dcache+0x27/0xf0 lookup_one_qstr_excl+0x2a/0x180 start_dirop+0x55/0xa0 simple_start_creating+0x8d/0xa0 debugfs_start_creating+0x8c/0x180 debugfs_create_dir+0x1d/0x1c0 pinctrl_init+0x6d/0x140 do_one_initcall+0x6d/0x3d0 kernel_init_freeable+0x39f/0x460 kernel_init+0x2a/0x260 There will be only one bucket in dentry_hashtable when dhash_entries is set as one, and d_hash_shift is calculated as 32 by dcache_init(). Then, following process will access more than one buckets(which memory region is not allocated) in dentry_hashtable: d_lookup b = d_hash(hash) dentry_hashtable + ((u32)hashlen >> d_hash_shift) // The C standard defines the behavior of right shift amounts // exceeding the bit width of the operand as undefined. The // result of '(u32)hashlen >> d_hash_shift' becomes 'hashlen', // so 'b' will point to an unallocated memory region. hlist_bl_for_each_entry_rcu(b) hlist_bl_first_rcu(head) h->first // read OOB! Fix it by limiting the minimal number of dentry_hashtable bucket to two, so that 'd_hash_shift' won't exceeds the bit width of type u32. Cc: stable@vger.kernel.org Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Link: https://patch.msgid.link/20260130034853.215819-1-chengzhihao1@huawei.com Reviewed-by: Yang Erkun <yangerkun@huawei.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22smb: server: avoid double-free in smb_direct_free_sendmsg after ↵Stefan Metzmacher
smb_direct_flush_send_list() commit 84ff995ae826aa6bbcc6c7b9ea569ff67c021d72 upstream. smb_direct_flush_send_list() already calls smb_direct_free_sendmsg(), so we should not call it again after post_sendmsg() moved it to the batch list. Reported-by: Ruikai Peng <ruikai@pwno.io> Closes: https://lore.kernel.org/linux-cifs/CAFD3drNOSJ05y3A+jNXSDxW-2w09KHQ0DivhxQ_pcc7immVVOQ@mail.gmail.com/ Fixes: 34abd408c8ba ("smb: server: make use of smbdirect_socket.send_io.bcredits") Cc: stable@kernel.org Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Ruikai Peng <ruikai@pwno.io> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Cc: security@kernel.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Tested-by: Ruikai Peng <ruikai@pwno.io> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22smb: client: avoid double-free in smbd_free_send_io() after ↵Stefan Metzmacher
smbd_send_batch_flush() commit 27b7c3e916218b5eb2ee350211140e961bfc49be upstream. smbd_send_batch_flush() already calls smbd_free_send_io(), so we should not call it again after smbd_post_send() moved it to the batch list. Reported-by: Ruikai Peng <ruikai@pwno.io> Closes: https://lore.kernel.org/linux-cifs/CAFD3drNOSJ05y3A+jNXSDxW-2w09KHQ0DivhxQ_pcc7immVVOQ@mail.gmail.com/ Fixes: 21538121efe6 ("smb: client: make use of smbdirect_socket.send_io.bcredits") Cc: stable@kernel.org Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Ruikai Peng <ruikai@pwno.io> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Cc: security@kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Tested-by: Ruikai Peng <ruikai@pwno.io> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ksmbd: fix mechToken leak when SPNEGO decode fails after token allocGreg Kroah-Hartman
commit ad0057fb91218914d6c98268718ceb9d59b388e1 upstream. The kernel ASN.1 BER decoder calls action callbacks incrementally as it walks the input. When ksmbd_decode_negTokenInit() reaches the mechToken [2] OCTET STRING element, ksmbd_neg_token_alloc() allocates conn->mechToken immediately via kmemdup_nul(). If a later element in the same blob is malformed, then the decoder will return nonzero after the allocation is already live. This could happen if mechListMIC [3] overrunse the enclosing SEQUENCE. decode_negotiation_token() then sets conn->use_spnego = false because both the negTokenInit and negTokenTarg grammars failed. The cleanup at the bottom of smb2_sess_setup() is gated on use_spnego: if (conn->use_spnego && conn->mechToken) { kfree(conn->mechToken); conn->mechToken = NULL; } so the kfree is skipped, causing the mechToken to never be freed. This codepath is reachable pre-authentication, so untrusted clients can cause slow memory leaks on a server without even being properly authenticated. Fix this up by not checking check for use_spnego, as it's not required, so the memory will always be properly freed. At the same time, always free the memory in ksmbd_conn_free() incase some other failure path forgot to free it. Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: <stable@kernel.org> Assisted-by: gregkh_clanker_t1000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ksmbd: require 3 sub-authorities before reading sub_auth[2]Greg Kroah-Hartman
commit 53370cf9090777774e07fd9a8ebce67c6cc333ab upstream. parse_dacl() compares each ACE SID against sid_unix_NFS_mode and on match reads sid.sub_auth[2] as the file mode. If sid_unix_NFS_mode is the prefix S-1-5-88-3 with num_subauth = 2 then compare_sids() compares only min(num_subauth, 2) sub-authorities so a client SID with num_subauth = 2 and sub_auth = {88, 3} will match. If num_subauth = 2 and the ACE is placed at the very end of the security descriptor, sub_auth[2] will be 4 bytes past end_of_acl. The out-of-band bytes will then be masked to the low 9 bits and applied as the file's POSIX mode, probably not something that is good to have happen. Fix this up by forcing the SID to actually carry a third sub-authority before reading it at all. Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: <stable@kernel.org> Assisted-by: gregkh_clanker_t1000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22ksmbd: validate EaNameLength in smb2_get_ea()Greg Kroah-Hartman
commit 66751841212c2cc196577453c37f7774ff363f02 upstream. smb2_get_ea() reads ea_req->EaNameLength from the client request and passes it directly to strncmp() as the comparison length without verifying that the length of the name really is the size of the input buffer received. Fix this up by properly checking the size of the name based on the value received and the overall size of the request, to prevent a later strncmp() call to use the length as a "trusted" size of the buffer. Without this check, uninitialized heap values might be slowly leaked to the client. Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: <stable@kernel.org> Assisted-by: gregkh_clanker_t1000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22smb: client: fix OOB reads parsing symlink error responseGreg Kroah-Hartman
commit 3df690bba28edec865cf7190be10708ad0ddd67e upstream. When a CREATE returns STATUS_STOPPED_ON_SYMLINK, smb2_check_message() returns success without any length validation, leaving the symlink parsers as the only defense against an untrusted server. symlink_data() walks SMB 3.1.1 error contexts with the loop test "p < end", but reads p->ErrorId at offset 4 and p->ErrorDataLength at offset 0. When the server-controlled ErrorDataLength advances p to within 1-7 bytes of end, the next iteration will read past it. When the matching context is found, sym->SymLinkErrorTag is read at offset 4 from p->ErrorContextData with no check that the symlink header itself fits. smb2_parse_symlink_response() then bounds-checks the substitute name using SMB2_SYMLINK_STRUCT_SIZE as the offset of PathBuffer from iov_base. That value is computed as sizeof(smb2_err_rsp) + sizeof(smb2_symlink_err_rsp), which is correct only when ErrorContextCount == 0. With at least one error context the symlink data sits 8 bytes deeper, and each skipped non-matching context shifts it further by 8 + ALIGN(ErrorDataLength, 8). The check is too short, allowing the substitute name read to run past iov_len. The out-of-bound heap bytes are UTF-16-decoded into the symlink target and returned to userspace via readlink(2). Fix this all up by making the loops test require the full context header to fit, rejecting sym if its header runs past end, and bound the substitute name against the actual position of sym->PathBuffer rather than a fixed offset. Because sub_offs and sub_len are 16bits, the pointer math will not overflow here with the new greater-than. Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Shyam Prasad N <sprasad@microsoft.com> Cc: Tom Talpey <tom@talpey.com> Cc: Bharath SM <bharathsm@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Cc: stable <stable@kernel.org> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Assisted-by: gregkh_clanker_t1000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-22smb: client: fix off-by-8 bounds check in check_wsl_eas()Greg Kroah-Hartman
commit 3d8b9d06bd3ac4c6846f5498800b0f5f8062e53b upstream. The bounds check uses (u8 *)ea + nlen + 1 + vlen as the end of the EA name and value, but ea_data sits at offset sizeof(struct smb2_file_full_ea_info) = 8 from ea, not at offset 0. The strncmp() later reads ea->ea_data[0..nlen-1] and the value bytes follow at ea_data[nlen+1..nlen+vlen], so the actual end is ea->ea_data + nlen + 1 + vlen. Isn't pointer math fun? The earlier check (u8 *)ea > end - sizeof(*ea) only guarantees the 8-byte header is in bounds, but since the last EA is placed within 8 bytes of the end of the response, the name and value bytes are read past the end of iov. Fix this mess all up by using ea->ea_data as the base for the bounds check. An "untrusted" server can use this to leak up to 8 bytes of kernel heap into the EA name comparison and influence which WSL xattr the data is interpreted as. Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Shyam Prasad N <sprasad@microsoft.com> Cc: Tom Talpey <tom@talpey.com> Cc: Bharath SM <bharathsm@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Cc: stable <stable@kernel.org> Assisted-by: gregkh_clanker_t1000 Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-10Merge tag 'vfs-7.0-rc8.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "The kernfs rbtree is keyed by (hash, ns, name) where the hash is seeded with the raw namespace pointer via init_name_hash(ns). The resulting hash values are exposed to userspace through readdir seek positions, and the pointer-based ordering in kernfs_name_compare() is observable through entry order. Switch from raw pointers to ns_common::ns_id for both hashing and comparison. A preparatory commit first replaces all const void * namespace parameters with const struct ns_common * throughout kernfs, sysfs, and kobject so the code can access ns->ns_id. Also compare the ns_id when hashes match in the rbtree to handle crafted collisions. Also fix eventpoll RCU grace period issue and a cachefiles refcount problem" * tag 'vfs-7.0-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: kernfs: make directory seek namespace-aware kernfs: use namespace id instead of pointer for hashing and comparison kernfs: pass struct ns_common instead of const void * for namespace tags eventpoll: defer struct eventpoll free to RCU grace period cachefiles: fix incorrect dentry refcount in cachefiles_cull()
2026-04-09kernfs: make directory seek namespace-awareChristian Brauner
The rbtree backing kernfs directories is ordered by (hash, ns_id, name) but kernfs_dir_pos() only searches by hash when seeking to a position during readdir. When two nodes from different namespaces share the same hash value, the binary search can land on a node in the wrong namespace. The subsequent skip-forward loop walks rb_next() and may overshoot the correct node, silently dropping an entry from the readdir results. With the recent switch from raw namespace pointers to public namespace ids as hash seeds, computing hash collisions became an offline operation. An unprivileged user could unshare into a new network namespace, create a single interface whose name-hash collides with a target entry in init_net, and cause a victim's seekdir/readdir on /sys/class/net to miss that entry. Fix this by extending the rbtree search in kernfs_dir_pos() to also compare namespace ids when hashes match. Since the rbtree is already ordered by (hash, ns_id, name), this makes the seek land directly in the correct namespace's range, eliminating the wrong-namespace overshoot. Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-04-09kernfs: use namespace id instead of pointer for hashing and comparisonChristian Brauner
kernfs uses the namespace tag as both a hash seed (via init_name_hash()) and a comparison key in the rbtree. The resulting hash values are exposed to userspace through directory seek positions (ctx->pos), and the raw pointer comparisons in kernfs_name_compare() encode kernel pointer ordering into the rbtree layout. This constitutes a KASLR information leak since the hash and ordering derived from kernel pointers can be observed from userspace. Fix this by using the 64-bit namespace id (ns_common::ns_id) instead of the raw pointer value for both hashing and comparison. The namespace id is a stable, non-secret identifier that is already exposed to userspace through other interfaces (e.g., /proc/pid/ns/, ioctl NS_GET_NSID). Introduce kernfs_ns_id() as a helper that extracts the namespace id from a potentially-NULL ns_common pointer, returning 0 for the no-namespace case. All namespace equality checks in the directory iteration and dentry revalidation paths are also switched from pointer comparison to ns_id comparison for consistency. Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-04-09kernfs: pass struct ns_common instead of const void * for namespace tagsChristian Brauner
kernfs has historically used const void * to pass around namespace tags used for directory-level namespace filtering. The only current user of this is sysfs network namespace tagging where struct net pointers are cast to void *. Replace all const void * namespace parameters with const struct ns_common * throughout the kernfs, sysfs, and kobject namespace layers. This includes the kobj_ns_type_operations callbacks, kobject_namespace(), and all sysfs/kernfs APIs that accept or return namespace tags. Passing struct ns_common is needed because various codepaths require access to the underlying namespace. A struct ns_common can always be converted back to the concrete namespace type (e.g., struct net) via container_of() or to_ns_common() in the reverse direction. This is a preparatory change for switching to ns_id-based directory iteration to prevent a KASLR pointer leak through the current use of raw namespace pointers as hash seeds and comparison keys. Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-04-07Merge tag 'mm-hotfixes-stable-2026-04-06-15-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Eight hotfixes. All are cc:stable and seven are for MM. All are singletons - please see the changelogs for details" * tag 'mm-hotfixes-stable-2026-04-06-15-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: ocfs2: fix out-of-bounds write in ocfs2_write_end_inline mm/damon/stat: deallocate damon_call() failure leaking damon_ctx mm/vma: fix memory leak in __mmap_region() mm/memory_hotplug: maintain N_NORMAL_MEMORY during hotplug mm/damon/sysfs: dealloc repeat_call_control if damon_call() fails mm: reinstate unconditional writeback start in balance_dirty_pages() liveupdate: propagate file deserialization failures mm: filemap: fix nr_pages calculation overflow in filemap_map_pages()
2026-04-06ocfs2: fix out-of-bounds write in ocfs2_write_end_inlineJoseph Qi
KASAN reports a use-after-free write of 4086 bytes in ocfs2_write_end_inline, called from ocfs2_write_end_nolock during a copy_file_range splice fallback on a corrupted ocfs2 filesystem mounted on a loop device. The actual bug is an out-of-bounds write past the inode block buffer, not a true use-after-free. The write overflows into an adjacent freed page, which KASAN reports as UAF. The root cause is that ocfs2_try_to_write_inline_data trusts the on-disk id_count field to determine whether a write fits in inline data. On a corrupted filesystem, id_count can exceed the physical maximum inline data capacity, causing writes to overflow the inode block buffer. Call trace (crash path): vfs_copy_file_range (fs/read_write.c:1634) do_splice_direct splice_direct_to_actor iter_file_splice_write ocfs2_file_write_iter generic_perform_write ocfs2_write_end ocfs2_write_end_nolock (fs/ocfs2/aops.c:1949) ocfs2_write_end_inline (fs/ocfs2/aops.c:1915) memcpy_from_folio <-- KASAN: write OOB So add id_count upper bound check in ocfs2_validate_inode_block() to alongside the existing i_size check to fix it. Link: https://lkml.kernel.org/r/20260403063830.3662739-1-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: syzbot+62c1793956716ea8b28a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=62c1793956716ea8b28a Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-02Merge tag 'v7.0-rc6-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fix from Steve French: - Fix potential out of bounds read in mount * tag 'v7.0-rc6-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6: fs/smb/client: fix out-of-bounds read in cifs_sanitize_prepath
2026-04-02eventpoll: defer struct eventpoll free to RCU grace periodNicholas Carlini
In certain situations, ep_free() in eventpoll.c will kfree the epi->ep eventpoll struct while it still being used by another concurrent thread. Defer the kfree() to an RCU callback to prevent UAF. Fixes: f2e467a48287 ("eventpoll: Fix semi-unbounded recursion") Signed-off-by: Nicholas Carlini <nicholas@carlini.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-04-02Merge tag 'v7.0-rc6-ksmbd-server-fix' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fix from Steve French: - Fix out of bound write * tag 'v7.0-rc6-ksmbd-server-fix' of git://git.samba.org/ksmbd: ksmbd: fix OOB write in QUERY_INFO for compound requests
2026-04-02Merge tag 'for-7.0-rc6-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One more fix for a potential extent tree corruption due to an unexpected error value. When the search for an extent item failed, it under some circumstances was reported as a success to the caller" * tag 'for-7.0-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix incorrect return value after changing leaf in lookup_extent_data_ref()
2026-03-31fs/smb/client: fix out-of-bounds read in cifs_sanitize_prepathFredric Cover
When cifs_sanitize_prepath is called with an empty string or a string containing only delimiters (e.g., "/"), the current logic attempts to check *(cursor2 - 1) before cursor2 has advanced. This results in an out-of-bounds read. This patch adds an early exit check after stripping prepended delimiters. If no path content remains, the function returns NULL. The bug was identified via manual audit and verified using a standalone test case compiled with AddressSanitizer, which triggered a SEGV on affected inputs. Signed-off-by: Fredric Cover <FredTheDude@proton.me> Reviewed-by: Henrique Carvalho <[2]henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-03-31Merge tag 'fs_for_v7.0-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull udf fix from Jan Kara: "Fix for a race in UDF that can lead to memory corruption" * tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Fix race between file type conversion and writeback mpage: Provide variant of mpage_writepages() with own optional folio handler