<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/net/core/sock_map.c, branch linux-6.2.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.2.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.2.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2023-05-11T14:11:14Z</updated>
<entry>
<title>bpf, sockmap: Revert buggy deadlock fix in the sockhash and sockmap</title>
<updated>2023-05-11T14:11:14Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2023-04-13T18:28:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=3cd6ab3b5451d12f2d98c54a1c955fd84cc5cf49'/>
<id>urn:sha1:3cd6ab3b5451d12f2d98c54a1c955fd84cc5cf49</id>
<content type='text'>
[ Upstream commit 8c5c2a4898e3d6bad86e29d471e023c8a19ba799 ]

syzbot reported a splat and bisected it to recent commit ed17aa92dc56 ("bpf,
sockmap: fix deadlocks in the sockhash and sockmap"):

  [...]
  WARNING: CPU: 1 PID: 9280 at kernel/softirq.c:376 __local_bh_enable_ip+0xbe/0x130 kernel/softirq.c:376
  Modules linked in:
  CPU: 1 PID: 9280 Comm: syz-executor.1 Not tainted 6.2.0-syzkaller-13249-gd319f344561d #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/30/2023
  RIP: 0010:__local_bh_enable_ip+0xbe/0x130 kernel/softirq.c:376
  [...]
  Call Trace:
  &lt;TASK&gt;
  spin_unlock_bh include/linux/spinlock.h:395 [inline]
  sock_map_del_link+0x2ea/0x510 net/core/sock_map.c:165
  sock_map_unref+0xb0/0x1d0 net/core/sock_map.c:184
  sock_hash_delete_elem+0x1ec/0x2a0 net/core/sock_map.c:945
  map_delete_elem kernel/bpf/syscall.c:1536 [inline]
  __sys_bpf+0x2edc/0x53e0 kernel/bpf/syscall.c:5053
  __do_sys_bpf kernel/bpf/syscall.c:5166 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:5164 [inline]
  __x64_sys_bpf+0x79/0xc0 kernel/bpf/syscall.c:5164
  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
  do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
  RIP: 0033:0x7fe8f7c8c169
  &lt;/TASK&gt;
  [...]

Revert for now until we have a proper solution.

Fixes: ed17aa92dc56 ("bpf, sockmap: fix deadlocks in the sockhash and sockmap")
Reported-by: syzbot+49f6cef45247ff249498@syzkaller.appspotmail.com
Cc: Hsin-Wei Hung &lt;hsinweih@uci.edu&gt;
Cc: Xin Liu &lt;liuxin350@huawei.com&gt;
Cc: John Fastabend &lt;john.fastabend@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/000000000000f1db9605f939720e@google.com/
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf, sockmap: fix deadlocks in the sockhash and sockmap</title>
<updated>2023-05-11T14:11:13Z</updated>
<author>
<name>Xin Liu</name>
<email>liuxin350@huawei.com</email>
</author>
<published>2023-04-06T12:26:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d0e34aaea13391e28069c50de5519300d408e985'/>
<id>urn:sha1:d0e34aaea13391e28069c50de5519300d408e985</id>
<content type='text'>
[ Upstream commit ed17aa92dc56b6d8883e4b7a8f1c6fbf5ed6cd29 ]

When huang uses sched_switch tracepoint, the tracepoint
does only one thing in the mounted ebpf program, which
deletes the fixed elements in sockhash ([0])

It seems that elements in sockhash are rarely actively
deleted by users or ebpf program. Therefore, we do not
pay much attention to their deletion. Compared with hash
maps, sockhash only provides spin_lock_bh protection.
This causes it to appear to have self-locking behavior
in the interrupt context.

  [0]:https://lore.kernel.org/all/CABcoxUayum5oOqFMMqAeWuS8+EzojquSOSyDA3J_2omY=2EeAg@mail.gmail.com/

Reported-by: Hsin-Wei Hung &lt;hsinweih@uci.edu&gt;
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Xin Liu &lt;liuxin350@huawei.com&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Link: https://lore.kernel.org/r/20230406122622.109978-1-liuxin350@huawei.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: return long from bpf_map_ops funcs</title>
<updated>2023-05-11T14:11:09Z</updated>
<author>
<name>JP Kobryn</name>
<email>inwardvessel@gmail.com</email>
</author>
<published>2023-03-22T19:47:54Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=468ce1c96d7d431bd603eb76e911cbc74f39cf6c'/>
<id>urn:sha1:468ce1c96d7d431bd603eb76e911cbc74f39cf6c</id>
<content type='text'>
[ Upstream commit d7ba4cc900bf1eea2d8c807c6b1fc6bd61f41237 ]

This patch changes the return types of bpf_map_ops functions to long, where
previously int was returned. Using long allows for bpf programs to maintain
the sign bit in the absence of sign extension during situations where
inlined bpf helper funcs make calls to the bpf_map_ops funcs and a negative
error is returned.

The definitions of the helper funcs are generated from comments in the bpf
uapi header at `include/uapi/linux/bpf.h`. The return type of these
helpers was previously changed from int to long in commit bdb7b79b4ce8. For
any case where one of the map helpers call the bpf_map_ops funcs that are
still returning 32-bit int, a compiler might not include sign extension
instructions to properly convert the 32-bit negative value a 64-bit
negative value.

For example:
bpf assembly excerpt of an inlined helper calling a kernel function and
checking for a specific error:

; err = bpf_map_update_elem(&amp;mymap, &amp;key, &amp;val, BPF_NOEXIST);
  ...
  46:	call   0xffffffffe103291c	; htab_map_update_elem
; if (err &amp;&amp; err != -EEXIST) {
  4b:	cmp    $0xffffffffffffffef,%rax ; cmp -EEXIST,%rax

kernel function assembly excerpt of return value from
`htab_map_update_elem` returning 32-bit int:

movl $0xffffffef, %r9d
...
movl %r9d, %eax

...results in the comparison:
cmp $0xffffffffffffffef, $0x00000000ffffffef

Fixes: bdb7b79b4ce8 ("bpf: Switch most helper return values from 32-bit int to 64-bit long")
Tested-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Signed-off-by: JP Kobryn &lt;inwardvessel@gmail.com&gt;
Link: https://lore.kernel.org/r/20230322194754.185781-3-inwardvessel@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself</title>
<updated>2023-01-25T05:32:55Z</updated>
<author>
<name>Jakub Sitnicki</name>
<email>jakub@cloudflare.com</email>
</author>
<published>2023-01-21T12:41:43Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5b4a79ba65a1ab479903fff2e604865d229b70a9'/>
<id>urn:sha1:5b4a79ba65a1ab479903fff2e604865d229b70a9</id>
<content type='text'>
sock_map proto callbacks should never call themselves by design. Protect
against bugs like [1] and break out of the recursive loop to avoid a stack
overflow in favor of a resource leak.

[1] https://lore.kernel.org/all/00000000000073b14905ef2e7401@google.com/

Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Link: https://lore.kernel.org/r/20230113-sockmap-fix-v2-1-1e0ee7ac2f90@cloudflare.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf, sockmap: fix race in sock_map_free()</title>
<updated>2022-12-05T02:53:51Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2022-12-02T11:16:40Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0a182f8d607464911756b4dbef5d6cad8de22469'/>
<id>urn:sha1:0a182f8d607464911756b4dbef5d6cad8de22469</id>
<content type='text'>
sock_map_free() calls release_sock(sk) without owning a reference
on the socket. This can cause use-after-free as syzbot found [1]

Jakub Sitnicki already took care of a similar issue
in sock_hash_free() in commit 75e68e5bf2c7 ("bpf, sockhash:
Synchronize delete from bucket list on map free")

[1]
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 0 PID: 3785 at lib/refcount.c:31 refcount_warn_saturate+0x17c/0x1a0 lib/refcount.c:31
Modules linked in:
CPU: 0 PID: 3785 Comm: kworker/u4:6 Not tainted 6.1.0-rc7-syzkaller-00103-gef4d3ea40565 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Workqueue: events_unbound bpf_map_free_deferred
RIP: 0010:refcount_warn_saturate+0x17c/0x1a0 lib/refcount.c:31
Code: 68 8b 31 c0 e8 75 71 15 fd 0f 0b e9 64 ff ff ff e8 d9 6e 4e fd c6 05 62 9c 3d 0a 01 48 c7 c7 80 bb 68 8b 31 c0 e8 54 71 15 fd &lt;0f&gt; 0b e9 43 ff ff ff 89 d9 80 e1 07 80 c1 03 38 c1 0f 8c a2 fe ff
RSP: 0018:ffffc9000456fb60 EFLAGS: 00010246
RAX: eae59bab72dcd700 RBX: 0000000000000004 RCX: ffff8880207057c0
RDX: 0000000000000000 RSI: 0000000000000201 RDI: 0000000000000000
RBP: 0000000000000004 R08: ffffffff816fdabd R09: fffff520008adee5
R10: fffff520008adee5 R11: 1ffff920008adee4 R12: 0000000000000004
R13: dffffc0000000000 R14: ffff88807b1c6c00 R15: 1ffff1100f638dcf
FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30c30000 CR3: 000000000d08e000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
&lt;TASK&gt;
__refcount_dec include/linux/refcount.h:344 [inline]
refcount_dec include/linux/refcount.h:359 [inline]
__sock_put include/net/sock.h:779 [inline]
tcp_release_cb+0x2d0/0x360 net/ipv4/tcp_output.c:1092
release_sock+0xaf/0x1c0 net/core/sock.c:3468
sock_map_free+0x219/0x2c0 net/core/sock_map.c:356
process_one_work+0x81c/0xd10 kernel/workqueue.c:2289
worker_thread+0xb14/0x1330 kernel/workqueue.c:2436
kthread+0x266/0x300 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
&lt;/TASK&gt;

Fixes: 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Cc: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Cc: John Fastabend &lt;john.fastabend@gmail.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Link: https://lore.kernel.org/r/20221202111640.2745533-1-edumazet@google.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf, sock_map: Move cancel_work_sync() out of sock lock</title>
<updated>2022-11-03T12:51:06Z</updated>
<author>
<name>Cong Wang</name>
<email>cong.wang@bytedance.com</email>
</author>
<published>2022-11-02T04:34:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8bbabb3fddcd0f858be69ed5abc9b470a239d6f2'/>
<id>urn:sha1:8bbabb3fddcd0f858be69ed5abc9b470a239d6f2</id>
<content type='text'>
Stanislav reported a lockdep warning, which is caused by the
cancel_work_sync() called inside sock_map_close(), as analyzed
below by Jakub:

psock-&gt;work.func = sk_psock_backlog()
  ACQUIRE psock-&gt;work_mutex
    sk_psock_handle_skb()
      skb_send_sock()
        __skb_send_sock()
          sendpage_unlocked()
            kernel_sendpage()
              sock-&gt;ops-&gt;sendpage = inet_sendpage()
                sk-&gt;sk_prot-&gt;sendpage = tcp_sendpage()
                  ACQUIRE sk-&gt;sk_lock
                    tcp_sendpage_locked()
                  RELEASE sk-&gt;sk_lock
  RELEASE psock-&gt;work_mutex

sock_map_close()
  ACQUIRE sk-&gt;sk_lock
  sk_psock_stop()
    sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED)
    cancel_work_sync()
      __cancel_work_timer()
        __flush_work()
          // wait for psock-&gt;work to finish
  RELEASE sk-&gt;sk_lock

We can move the cancel_work_sync() out of the sock lock protection,
but still before saved_close() was called.

Fixes: 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
Reported-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Signed-off-by: Cong Wang &lt;cong.wang@bytedance.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Tested-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Acked-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Link: https://lore.kernel.org/bpf/20221102043417.279409-1-xiyou.wangcong@gmail.com
</content>
</entry>
<entry>
<title>Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next</title>
<updated>2022-08-18T03:29:36Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2022-08-18T03:29:34Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=3f5f728a7296b5d5b87117d85d3020cc9640f6dd'/>
<id>urn:sha1:3f5f728a7296b5d5b87117d85d3020cc9640f6dd</id>
<content type='text'>
Andrii Nakryiko says:

====================
bpf-next 2022-08-17

We've added 45 non-merge commits during the last 14 day(s) which contain
a total of 61 files changed, 986 insertions(+), 372 deletions(-).

The main changes are:

1) New bpf_ktime_get_tai_ns() BPF helper to access CLOCK_TAI, from Kurt
   Kanzenbach and Jesper Dangaard Brouer.

2) Few clean ups and improvements for libbpf 1.0, from Andrii Nakryiko.

3) Expose crash_kexec() as kfunc for BPF programs, from Artem Savkov.

4) Add ability to define sleepable-only kfuncs, from Benjamin Tissoires.

5) Teach libbpf's bpf_prog_load() and bpf_map_create() to gracefully handle
   unsupported names on old kernels, from Hangbin Liu.

6) Allow opting out from auto-attaching BPF programs by libbpf's BPF skeleton,
   from Hao Luo.

7) Relax libbpf's requirement for shared libs to be marked executable, from
   Henqgi Chen.

8) Improve bpf_iter internals handling of error returns, from Hao Luo.

9) Few accommodations in libbpf to support GCC-BPF quirks, from James Hilliard.

10) Fix BPF verifier logic around tracking dynptr ref_obj_id, from Joanne Koong.

11) bpftool improvements to handle full BPF program names better, from Manu
    Bretelle.

12) bpftool fixes around libcap use, from Quentin Monnet.

13) BPF map internals clean ups and improvements around memory allocations,
    from Yafang Shao.

14) Allow to use cgroup_get_from_file() on cgroupv1, allowing BPF cgroup
    iterator to work on cgroupv1, from Yosry Ahmed.

15) BPF verifier internal clean ups, from Dave Marchevsky and Joanne Koong.

16) Various fixes and clean ups for selftests/bpf and vmtest.sh, from Daniel
    Xu, Artem Savkov, Joanne Koong, Andrii Nakryiko, Shibin Koikkara Reeny.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (45 commits)
  selftests/bpf: Few fixes for selftests/bpf built in release mode
  libbpf: Clean up deprecated and legacy aliases
  libbpf: Streamline bpf_attr and perf_event_attr initialization
  libbpf: Fix potential NULL dereference when parsing ELF
  selftests/bpf: Tests libbpf autoattach APIs
  libbpf: Allows disabling auto attach
  selftests/bpf: Fix attach point for non-x86 arches in test_progs/lsm
  libbpf: Making bpf_prog_load() ignore name if kernel doesn't support
  selftests/bpf: Update CI kconfig
  selftests/bpf: Add connmark read test
  selftests/bpf: Add existing connection bpf_*_ct_lookup() test
  bpftool: Clear errno after libcap's checks
  bpf: Clear up confusion in bpf_skb_adjust_room()'s documentation
  bpftool: Fix a typo in a comment
  libbpf: Add names for auxiliary maps
  bpf: Use bpf_map_area_alloc consistently on bpf map creation
  bpf: Make __GFP_NOWARN consistent in bpf map creation
  bpf: Use bpf_map_area_free instread of kvfree
  bpf: Remove unneeded memset in queue_stack_map creation
  libbpf: preserve errno across pr_warn/pr_info/pr_debug
  ...
====================

Link: https://lore.kernel.org/r/20220817215656.1180215-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Use bpf_map_area_alloc consistently on bpf map creation</title>
<updated>2022-08-10T18:50:43Z</updated>
<author>
<name>Yafang Shao</name>
<email>laoar.shao@gmail.com</email>
</author>
<published>2022-08-10T15:18:29Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=73cf09a36bf7bfb3e5a3ff23755c36d49137c44d'/>
<id>urn:sha1:73cf09a36bf7bfb3e5a3ff23755c36d49137c44d</id>
<content type='text'>
Let's use the generic helper bpf_map_area_alloc() instead of the
open-coded kzalloc helpers in bpf maps creation path.

Signed-off-by: Yafang Shao &lt;laoar.shao@gmail.com&gt;
Link: https://lore.kernel.org/r/20220810151840.16394-5-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Make __GFP_NOWARN consistent in bpf map creation</title>
<updated>2022-08-10T18:49:25Z</updated>
<author>
<name>Yafang Shao</name>
<email>laoar.shao@gmail.com</email>
</author>
<published>2022-08-10T15:18:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=992c9e13f5939437037627c67bcb51e674b64265'/>
<id>urn:sha1:992c9e13f5939437037627c67bcb51e674b64265</id>
<content type='text'>
Some of the bpf maps are created with __GFP_NOWARN, i.e. arraymap,
bloom_filter, bpf_local_storage, bpf_struct_ops, lpm_trie,
queue_stack_maps, reuseport_array, stackmap and xskmap, while others are
created without __GFP_NOWARN, i.e. cpumap, devmap, hashtab,
local_storage, offload, ringbuf and sock_map. But there are not key
differences between the creation of these maps. So let make this
allocation flag consistent in all bpf maps creation. Then we can use a
generic helper to alloc all bpf maps.

Signed-off-by: Yafang Shao &lt;laoar.shao@gmail.com&gt;
Link: https://lore.kernel.org/r/20220810151840.16394-4-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Acquire map uref in .init_seq_private for sock{map,hash} iterator</title>
<updated>2022-08-10T17:12:48Z</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2022-08-10T08:05:33Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f0d2b2716d71778d0b0c8eaa433c073287d69d93'/>
<id>urn:sha1:f0d2b2716d71778d0b0c8eaa433c073287d69d93</id>
<content type='text'>
sock_map_iter_attach_target() acquires a map uref, and the uref may be
released before or in the middle of iterating map elements. For example,
the uref could be released in sock_map_iter_detach_target() as part of
bpf_link_release(), or could be released in bpf_map_put_with_uref() as
part of bpf_map_release().

Fixing it by acquiring an extra map uref in .init_seq_private and
releasing it in .fini_seq_private.

Fixes: 0365351524d7 ("net: Allow iterating sockmap and sockhash")
Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Acked-by: Yonghong Song &lt;yhs@fb.com&gt;
Link: https://lore.kernel.org/r/20220810080538.1845898-5-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
</feed>
