| Age | Commit message (Collapse) | Author |
|
commit 22f56b8e890d4e2835951b437bb6eeebfd1cb18b upstream.
xas_find_chunk() calls find_next_bit(), which is defined in find.h,
included from bitmap.h. Inside the kernel, this isn't a problem because
bitmap.h is included from cpumask.h which is dragged in (eventually)
by gfp.h. When building the test-suite, that doesn't happen, so we need
to include bitmap.h explicitly.
Fixes: 4ade0818cf04 ("tools: sync tools/bitmap with mother linux")
Reported-by: Liam Howlett <liam.howlett@oracle.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5974ea7ce0f9a5987fc8cf5e08ad6e3e70bb542e upstream.
A NVMe subsystem with multiple controller can have private namespaces
that use the same NSID under some conditions:
"If Namespace Management, ANA Reporting, or NVM Sets are supported, the
NSIDs shall be unique within the NVM subsystem. If the Namespace
Management, ANA Reporting, and NVM Sets are not supported, then NSIDs:
a) for shared namespace shall be unique; and
b) for private namespace are not required to be unique."
Reference: Section 6.1.6 NSID and Namespace Usage; NVM Express 1.4c spec.
Make sure this specific setup is supported in Linux.
Fixes: 9ad1927a3bc2 ("nvme: always search for namespace head")
Signed-off-by: Sungup Moon <sungup.moon@samsung.com>
[hch: refactored and fixed the controller vs subsystem based naming
conflict]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d4da1f27396fb1dde079447a3612f4f512caed07 upstream.
The pcon_dsc_dpcd array holds 13 registers (0x92 through 0x9E). Fix the
math to calculate the max size. Found from a -Warray-bounds build:
drivers/gpu/drm/drm_dp_helper.c: In function 'drm_dp_pcon_dsc_bpp_incr':
drivers/gpu/drm/drm_dp_helper.c:3130:28: error: array subscript 12 is outside array bounds of 'const u8[12]' {aka 'const unsigned char[12]'} [-Werror=array-bounds]
3130 | buf = pcon_dsc_dpcd[DP_PCON_DSC_BPP_INCR - DP_PCON_DSC_ENCODER];
| ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/drm_dp_helper.c:3126:39: note: while referencing 'pcon_dsc_dpcd'
3126 | int drm_dp_pcon_dsc_bpp_incr(const u8 pcon_dsc_dpcd[DP_PCON_DSC_ENCODER_CAP_SIZE])
| ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Fixes: e2e16da398d9 ("drm/dp_helper: Add support for Configuring DSC for HDMI2.1 Pcon")
Cc: stable@vger.kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/lkml/20211214001849.GA62559@embeddedor/
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220105173310.2420598-1-keescook@chromium.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225035610.2552144-2-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0c470db0399e17310ed2ba54dd1c25cfa16ce0d3 ]
The NHLT information can be used to figure out which SSPs are enabled
in a platform.
The 'SSP' link type is too broad for machine drivers, since it can
cover the Bluetooth sideband and the analog audio codec connections,
so this helper exposes a parameter to filter with the device
type (DEVICE_I2S refers to analog audio codec in NHLT parlance).
The helper returns a mask, since more than one SSP may be used for
analog audio, e.g. the NHLT spec describes the use of SSP0 for
amplifiers and SSP1 for headset codec. Note that if more than one bit
is set, it's impossible to determine which SSP is connected to what
external component. Additional platform-specific information based on
e.g. DMI quirks would still be required in the machine driver to
configure the relevant dailinks.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220308192610.392950-5-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 25795ef6299f07ce3838f3253a9cb34f64efcfae ]
TASK_RTLOCK_WAIT currently isn't part of TASK_REPORT, thus a task blocking
on an rtlock will appear as having a task state == 0, IOW TASK_RUNNING.
The actual state is saved in p->saved_state, but reading it after reading
p->__state has a few issues:
o that could still be TASK_RUNNING in the case of e.g. rt_spin_lock
o ttwu_state_match() might have changed that to TASK_RUNNING
As pointed out by Eric, adding TASK_RTLOCK_WAIT to TASK_REPORT implies
exposing a new state to userspace tools which way not know what to do with
them. The only information that needs to be conveyed here is that a task is
waiting on an rt_mutex, which matches TASK_UNINTERRUPTIBLE - there's no
need for a new state.
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20220120162520.570782-3-valentin.schneider@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dc1b4df09acdca7a89806b28f235cd6d8dcd3d24 ]
Arnd reports that on 32-bit architectures, the fallbacks for
atomic64_read_acquire() and atomic64_set_release() are broken as they
use smp_load_acquire() and smp_store_release() respectively, which do
not work on types larger than the native word size.
Since those contain compiletime_assert_atomic_type(), any attempt to use
those fallbacks will result in a build-time error. e.g. with the
following added to arch/arm/kernel/setup.c:
| void test_atomic64(atomic64_t *v)
| {
| atomic64_set_release(v, 5);
| atomic64_read_acquire(v);
| }
The compiler will complain as follows:
| In file included from <command-line>:
| In function 'arch_atomic64_set_release',
| inlined from 'test_atomic64' at ./include/linux/atomic/atomic-instrumented.h:669:2:
| ././include/linux/compiler_types.h:346:38: error: call to '__compiletime_assert_9' declared with attribute error: Need native word sized stores/loads for atomicity.
| 346 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| | ^
| ././include/linux/compiler_types.h:327:4: note: in definition of macro '__compiletime_assert'
| 327 | prefix ## suffix(); \
| | ^~~~~~
| ././include/linux/compiler_types.h:346:2: note: in expansion of macro '_compiletime_assert'
| 346 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| | ^~~~~~~~~~~~~~~~~~~
| ././include/linux/compiler_types.h:349:2: note: in expansion of macro 'compiletime_assert'
| 349 | compiletime_assert(__native_word(t), \
| | ^~~~~~~~~~~~~~~~~~
| ./include/asm-generic/barrier.h:133:2: note: in expansion of macro 'compiletime_assert_atomic_type'
| 133 | compiletime_assert_atomic_type(*p); \
| | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| ./include/asm-generic/barrier.h:164:55: note: in expansion of macro '__smp_store_release'
| 164 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
| | ^~~~~~~~~~~~~~~~~~~
| ./include/linux/atomic/atomic-arch-fallback.h:1270:2: note: in expansion of macro 'smp_store_release'
| 1270 | smp_store_release(&(v)->counter, i);
| | ^~~~~~~~~~~~~~~~~
| make[2]: *** [scripts/Makefile.build:288: arch/arm/kernel/setup.o] Error 1
| make[1]: *** [scripts/Makefile.build:550: arch/arm/kernel] Error 2
| make: *** [Makefile:1831: arch/arm] Error 2
Fix this by only using smp_load_acquire() and smp_store_release() for
native atomic types, and otherwise falling back to the regular barriers
necessary for acquire/release semantics, as we do in the more generic
acquire and release fallbacks.
Since the fallback templates are used to generate the atomic64_*() and
atomic_*() operations, the __native_word() check is added to both. For
the atomic_*() operations, which are always 32-bit, the __native_word()
check is redundant but not harmful, as it is always true.
For the example above this works as expected on 32-bit, e.g. for arm
multi_v7_defconfig:
| <test_atomic64>:
| push {r4, r5}
| dmb ish
| pldw [r0]
| mov r2, #5
| mov r3, #0
| ldrexd r4, [r0]
| strexd r4, r2, [r0]
| teq r4, #0
| bne 484 <test_atomic64+0x14>
| ldrexd r2, [r0]
| dmb ish
| pop {r4, r5}
| bx lr
... and also on 64-bit, e.g. for arm64 defconfig:
| <test_atomic64>:
| bti c
| paciasp
| mov x1, #0x5
| stlr x1, [x0]
| ldar x0, [x0]
| autiasp
| ret
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/20220207101943.439825-1-mark.rutland@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d645552e9bd96671079b27015294ec7f9748fa2b ]
Otherwise packets in egress chains seem like they are being received by
the interface, not sent out via it.
Fixes: 42df6e1d221dd ("netfilter: Introduce egress hook")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 89f42494f92f448747bd8a7ab1ae8b5d5520577d ]
Avoid socket state races due to repeated calls to ->connect() using the
same socket. If connect() returns 0 due to the connection having
completed, but we are in fact in a closing state, then we may leave the
XPRT_CONNECTING flag set on the transport.
Reported-by: Enrico Scholz <enrico.scholz@sigma-chemnitz.de>
Fixes: 3be232f11a3c ("SUNRPC: Prevent immediate close+reconnect")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f58c252e30cf74f68b0054293adc03b5923b9f0e ]
When 8250 UART is using DMA, x_char (XON/XOFF) is never sent
to the wire. After this change, x_char is injected correctly.
Create uart_xchar_out() helper for sending the x_char out and
accounting related to it. It seems that almost every driver
does these same steps with x_char. Except for 8250, however,
almost all currently lack .serial_out so they cannot immediately
take advantage of this new helper.
The downside of this patch is that it might reintroduce
the problems some devices faced with mixed DMA/non-DMA transfer
which caused revert f967fc8f165f (Revert "serial: 8250_dma:
don't bother DMA with small transfers"). However, the impact
should be limited to cases with XON/XOFF (that didn't work
with DMA capable devices to begin with so this problem is not
very likely to cause a major issue, if any at all).
Fixes: 9ee4b83e51f74 ("serial: 8250: Add support for dmaengine")
Reported-by: Gilles Buloz <gilles.buloz@kontron.com>
Tested-by: Gilles Buloz <gilles.buloz@kontron.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220314091432.4288-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a603ca60cebff8589882427a67f870ed946b3fc8 ]
Commit 54da3e381c2b ("serial: 8250_aspeed_vuart: use UPF_IOREMAP to
set up register mapping") fixed a bug that had, as a side-effect,
prevented the 8250_aspeed_vuart driver from enabling the VUART's
FIFOs. However, fixing that (and hence enabling the FIFOs) has in
turn revealed what appears to be a hardware bug in the ASPEED VUART in
which the host-side THRE bit doesn't get if the BMC-side receive FIFO
trigger level is set to anything but one byte. This causes problems
for polled-mode writes from the host -- for example, Linux kernel
console writes proceed at a glacial pace (less than 100 bytes per
second) because the write path waits for a 10ms timeout to expire
after every character instead of being able to continue on to the next
character upon seeing THRE asserted. (GRUB behaves similarly.)
As a workaround, introduce a new port type for the ASPEED VUART that's
identical to PORT_16550A as it had previously been using, but with
UART_FCR_R_TRIG_00 instead to set the receive FIFO trigger level to
one byte, which (experimentally) seems to avoid the problematic THRE
behavior.
Fixes: 54da3e381c2b ("serial: 8250_aspeed_vuart: use UPF_IOREMAP to set up register mapping")
Tested-by: Konstantin Aladyshev <aladyshev22@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Link: https://lore.kernel.org/r/20220211004203.14915-1-zev@bewilderbeest.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 31d0bb9763efad30377505f3467f958d1ebe1e3d ]
The function sets the pernet boolean to avoid the spurious warning from
nf_ct_lookup_helper() when assigning conntrack helpers via nftables.
Fixes: 1a64edf54f55 ("netfilter: nft_ct: add helper set support")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0492d857636e1c52cd71594a723c4b26a7b31978 ]
nf_flow_offload_inet_hook() does not check for 802.1q and PPPoE.
Fetch inner ethertype from these encapsulation protocols.
Fixes: 72efd585f714 ("netfilter: flowtable: add pppoe support")
Fixes: 4cd91f7c290f ("netfilter: flowtable: add vlan support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 938d3480b92fa5e454b7734294f12a7b75126f09 ]
If tcp_bpf_sendmsg is running during a tear down operation we may enqueue
data on the ingress msg queue while tear down is trying to free it.
sk1 (redirect sk2) sk2
------------------- ---------------
tcp_bpf_sendmsg()
tcp_bpf_send_verdict()
tcp_bpf_sendmsg_redir()
bpf_tcp_ingress()
sock_map_close()
lock_sock()
lock_sock() ... blocking
sk_psock_stop
sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
release_sock(sk);
lock_sock()
sk_mem_charge()
get_page()
sk_psock_queue_msg()
sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED);
drop_sk_msg()
release_sock()
While drop_sk_msg(), the msg has charged memory form sk by sk_mem_charge
and has sg pages need to put. To fix we use sk_msg_free() and then kfee()
msg.
This issue can cause the following info:
WARNING: CPU: 0 PID: 9202 at net/core/stream.c:205 sk_stream_kill_queues+0xc8/0xe0
Call Trace:
<IRQ>
inet_csk_destroy_sock+0x55/0x110
tcp_rcv_state_process+0xe5f/0xe90
? sk_filter_trim_cap+0x10d/0x230
? tcp_v4_do_rcv+0x161/0x250
tcp_v4_do_rcv+0x161/0x250
tcp_v4_rcv+0xc3a/0xce0
ip_protocol_deliver_rcu+0x3d/0x230
ip_local_deliver_finish+0x54/0x60
ip_local_deliver+0xfd/0x110
? ip_protocol_deliver_rcu+0x230/0x230
ip_rcv+0xd6/0x100
? ip_local_deliver+0x110/0x110
__netif_receive_skb_one_core+0x85/0xa0
process_backlog+0xa4/0x160
__napi_poll+0x29/0x1b0
net_rx_action+0x287/0x300
__do_softirq+0xff/0x2fc
do_softirq+0x79/0x90
</IRQ>
WARNING: CPU: 0 PID: 531 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x175/0x1b0
Call Trace:
<TASK>
__sk_destruct+0x24/0x1f0
sk_psock_destroy+0x19b/0x1c0
process_one_work+0x1b3/0x3c0
? process_one_work+0x3c0/0x3c0
worker_thread+0x30/0x350
? process_one_work+0x3c0/0x3c0
kthread+0xe6/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
Fixes: 9635720b7c88 ("bpf, sockmap: Fix memleak on ingress msg enqueue")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220304081145.2037182-2-wangyufen@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 92c45b63ce22c8898aa41806e8d6692bcd577510 ]
For hardware that only supports 32-bit writes to PCI there is the
possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
messages was introduced by fb2659230120, but rate-limiting is not the best
choice here. Some devices may not show the warnings they should if another
device has just produced a bunch of warnings. Also, the number of messages
can be a nuisance on devices which are otherwise working fine.
Change the ratelimit to a single warning per bus. This ensures no bus is
'starved' of emitting a warning and also that there isn't a continuous
stream of warnings. It would be preferable to have a warning per device,
but the pci_dev structure is not available here, and a lookup from devfn
would be far too slow.
Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
Link: https://lore.kernel.org/r/20200806041455.11070-1-mark.tomlinson@alliedtelesis.co.nz
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a2151490cc6c57b368d7974ffd447a8b36ade639 ]
The link_status array was not large enough to read the Adjust Request
Post Cursor2 register, so remove the common helper function to avoid
an OOB read, found with a -Warray-bounds build:
drivers/gpu/drm/drm_dp_helper.c: In function 'drm_dp_get_adjust_request_post_cursor':
drivers/gpu/drm/drm_dp_helper.c:59:27: error: array subscript 10 is outside array bounds of 'const u8[6]' {aka 'const unsigned char[6]'} [-Werror=array-bounds]
59 | return link_status[r - DP_LANE0_1_STATUS];
| ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/drm_dp_helper.c:147:51: note: while referencing 'link_status'
147 | u8 drm_dp_get_adjust_request_post_cursor(const u8 link_status[DP_LINK_STATUS_SIZE],
| ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Replace the only user of the helper with an open-coded fetch and decode,
similar to drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c.
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Fixes: 79465e0ffeb9 ("drm/dp: Add helper to get post-cursor adjustments")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20220105173507.2420910-1-keescook@chromium.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4adc33f36d80489339f1b43dfeee96bb9ea8e459 ]
The current code assumes that the RGB444 and YUV444 formats are the
same, but the HDMI 2.0 specification states that:
The three DC_XXbit bits above only indicate support for RGB 4:4:4 at
that pixel size. Support for YCBCR 4:4:4 in Deep Color modes is
indicated with the DC_Y444 bit. If DC_Y444 is set, then YCBCR 4:4:4
is supported for all modes indicated by the DC_XXbit flags.
So if we have YUV444 support and any DC_XXbit flag set but the DC_Y444
flag isn't, we'll assume that we support that deep colour mode for
YUV444 which breaks the specification.
In order to fix this, let's split the edid_hdmi_dc_modes field in struct
drm_display_info into two fields, one for RGB444 and one for YUV444.
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: d0c94692e0a3 ("drm/edid: Parse and handle HDMI deep color modes.")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220120151625.594595-4-maxime@cerno.tech
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6f043b5969a4d6d385ca429388ded37e30e0d179 ]
The stack_depot member was added without kernel-doc, leading to below
warning. Fix it.
./include/drm/drm_modeset_lock.h:74: warning: Function parameter or
member 'stack_depot' not described in 'drm_modeset_acquire_ctx'
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: cd06ab2fd48f ("drm/locking: add backtrace for locking contended locks without backoff")
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://patchwork.freedesktop.org/patch/msgid/20220120094856.3004147-1-jani.nikula@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
defined
[ Upstream commit 043cfff99a18933fda2fb2e163daee73cc07910b ]
Remove an extra ";" which breaks compilation.
Fixes: 53bf2b0e4e4c ("firmware: ti_sci: Add support for getting resource with subtype")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Nishanth Menon <nm@ti.com>
Link: https://lore.kernel.org/r/e6c3cb793e1a6a2a0ae2528d5a5650dfe6a4b6ff.1640276505.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bfdf4e6208051ed7165b2e92035b4bf11f43eb63 ]
The rseq rseq_cs.ptr.{ptr32,padding} uapi endianness handling is
entirely wrong on 32-bit little endian: a preprocessor logic mistake
wrongly uses the big endian field layout on 32-bit little endian
architectures.
Fortunately, those ptr32 accessors were never used within the kernel,
and only meant as a convenience for user-space.
Remove those and replace the whole rseq_cs union by a __u64 type, as
this is the only thing really needed to express the ABI. Document how
32-bit architectures are meant to interact with this field.
Fixes: ec9c82e03a74 ("rseq: uapi: Declare rseq_cs field as union, update includes")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220127152720.25898-1-mathieu.desnoyers@efficios.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5e50f5d4ff31e95599d695df1f0a4e7d2d6fef99 ]
security_sctp_assoc_established() is added to replace
security_inet_conn_established() called in
sctp_sf_do_5_1E_ca(), so that asoc can be accessed in security
subsystem and save the peer secid to asoc->peer_secid.
Fixes: 72e89f50084c ("security: Add support for SCTP security hooks")
Reported-by: Prashanth Prahlad <pprahlad@redhat.com>
Based-on-patch-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Richard Haines <richard_c_haines@btinternet.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit efa90c11f62e6b7252fb75efe2787056872a627c ]
All supported versions of Clang perform auto-init of __builtin_alloca()
when stack auto-init is on (CONFIG_INIT_STACK_ALL_{ZERO,PATTERN}).
add_random_kstack_offset() uses __builtin_alloca() to add a stack
offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
enabled, add_random_kstack_offset() will auto-init that unused portion
of the stack used to add an offset.
There are several problems with this:
1. These offsets can be as large as 1023 bytes. Performing
memset() on them isn't exactly cheap, and this is done on
every syscall entry.
2. Architectures adding add_random_kstack_offset() to syscall
entry implemented in C require them to be 'noinstr' (e.g. see
x86 and s390). The potential problem here is that a call to
memset may occur, which is not noinstr.
A x86_64 defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:
| vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
| vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
| vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
| vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section
Clang 14 (unreleased) will introduce a way to skip alloca initialization
via __builtin_alloca_uninitialized() (https://reviews.llvm.org/D115440).
Constrain RANDOMIZE_KSTACK_OFFSET to only be enabled if no stack
auto-init is enabled, the compiler is GCC, or Clang is version 14+. Use
__builtin_alloca_uninitialized() if the compiler provides it, as is done
by Clang 14.
Link: https://lkml.kernel.org/r/YbHTKUjEejZCLyhX@elver.google.com
Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall")
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220131090521.1947110-2-elver@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit d4568fc8525897e683983806f813be1ae9eedaed upstream.
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields. Wrap the target region
in struct_group(). This additionally fixes a theoretical misalignment
of the copy (since the size of "buf" changes between 64-bit and 32-bit,
but this is likely never built for 64-bit).
FWIW, I think this code is totally broken on 64-bit (which appears to
not be a "real" build configuration): it would either always fail (with
an uninitialized data->buf_size) or would cause corruption in userspace
due to the copy_to_user() in the call path against an uninitialized
data->buf value:
omap3isp_stat_request_statistics_time32(...)
struct omap3isp_stat_data data64;
...
omap3isp_stat_request_statistics(stat, &data64);
int omap3isp_stat_request_statistics(struct ispstat *stat,
struct omap3isp_stat_data *data)
...
buf = isp_stat_buf_get(stat, data);
static struct ispstat_buffer *isp_stat_buf_get(struct ispstat *stat,
struct omap3isp_stat_data *data)
...
if (buf->buf_size > data->buf_size) {
...
return ERR_PTR(-EINVAL);
}
...
rval = copy_to_user(data->buf,
buf->virt_addr,
buf->buf_size);
Regardless, additionally initialize data64 to be zero-filled to avoid
undefined behavior.
Link: https://lore.kernel.org/lkml/20211215220505.GB21862@embeddedor
Cc: Arnd Bergmann <arnd@arndb.de>
Fixes: 378e3f81cb56 ("media: omap3isp: support 64-bit version of omap3isp_stat_data")
Cc: stable@vger.kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8126b1c73108bc691f5643df19071a59a69d0bc6 upstream.
pstore_dump() is *always* invoked in atomic context (nowadays in an RCU
read-side critical section, before that under a spinlock).
It doesn't make sense to try to use semaphores here.
This is mostly a revert of commit ea84b580b955 ("pstore: Convert buf_lock
to semaphore"), except that two parts aren't restored back exactly as they
were:
- keep the lock initialization in pstore_register
- in efi_pstore_write(), always set the "block" flag to false
- omit "is_locked", that was unnecessary since
commit 959217c84c27 ("pstore: Actually give up during locking failure")
- fix the bailout message
The actual problem that the buggy commit was trying to address may have
been that the use of preemptible() in efi_pstore_write() was wrong - it
only looks at preempt_count() and the state of IRQs, but __rcu_read_lock()
doesn't touch either of those under CONFIG_PREEMPT_RCU.
(Sidenote: CONFIG_PREEMPT_RCU means that the scheduler can preempt tasks in
RCU read-side critical sections, but you're not allowed to actively
block/reschedule.)
Lockdep probably never caught the problem because it's very rare that you
actually hit the contended case, so lockdep always just sees the
down_trylock(), not the down_interruptible(), and so it can't tell that
there's a problem.
Fixes: ea84b580b955 ("pstore: Convert buf_lock to semaphore")
Cc: stable@vger.kernel.org
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220314185953.2068993-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 54f586a9153201c6cff55e1f561990c78bd99aa7 upstream.
Again new complaints surfaced that we had broken the ABI here,
although previously all the userspace tools had agreed that it
was their mistake and fixed it. Yet now there are cases (e.g.
RHEL) that want to run old userspace with newer kernels, and
thus are broken.
Since this is a bit of a whack-a-mole thing, change the whole
extensibility scheme of rfkill to no longer just rely on the
message lengths, but instead require userspace to opt in via a
new ioctl to a given maximum event size that it is willing to
understand.
By default, set that to RFKILL_EVENT_SIZE_V1 (8), so that the
behaviour for userspace not calling the ioctl will look as if
it's just running on an older kernel.
Fixes: 14486c82612a ("rfkill: add a reason to the HW rfkill state")
Cc: stable@vger.kernel.org # 5.11+
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220316212749.16491491b270.Ifcb1950998330a596f29a2a162e00b7546a1d6d0@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 27599aacbaefcbf2af7b06b0029459bbf682000d upstream.
Hot-unplug all firmware-framebuffer devices as part of removing
them via remove_conflicting_framebuffers() et al. Releases all
memory regions to be acquired by native drivers.
Firmware, such as EFI, install a framebuffer while posting the
computer. After removing the firmware-framebuffer device from fbdev,
a native driver takes over the hardware and the firmware framebuffer
becomes invalid.
Firmware-framebuffer drivers, specifically simplefb, don't release
their device from Linux' device hierarchy. It still owns the firmware
framebuffer and blocks the native drivers from loading. This has been
observed in the vmwgfx driver. [1]
Initiating a device removal (i.e., hot unplug) as part of
remove_conflicting_framebuffers() removes the underlying device and
returns the memory range to the system.
[1] https://lore.kernel.org/dri-devel/20220117180359.18114-1-zack@kde.org/
v2:
* rename variable 'dev' to 'device' (Javier)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reported-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
CC: stable@vger.kernel.org # v5.11+
Link: https://patchwork.freedesktop.org/patch/msgid/20220125091222.21457-2-tzimmermann@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6b2b04590b51aa4cf395fcd185ce439cab5961dc upstream.
blk-iocost and iolatency are cgroup aware rq-qos policies but they didn't
disable merges across different cgroups. This obviously can lead to
accounting and control errors but more importantly to priority inversions -
e.g. an IO which belongs to a higher priority cgroup or IO class may end up
getting throttled incorrectly because it gets merged to an IO issued from a
low priority cgroup.
Fix it by adding blk_cgroup_mergeable() which is called from merge paths and
rejects cross-cgroup and cross-issue_as_root merges.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: d70675121546 ("block: introduce blk-iolatency io controller")
Cc: stable@vger.kernel.org # v4.19+
Cc: Josef Bacik <jbacik@fb.com>
Link: https://lore.kernel.org/r/Yi/eE/6zFNyWJ+qd@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit aa1b46dcdc7baaf5fec0be25782ef24b26aa209e upstream.
a647a524a467 ("block: don't call rq_qos_ops->done_bio if the bio isn't
tracked") made bio_endio() skip rq_qos_done_bio() if BIO_TRACKED is not set.
While this fixed a potential oops, it also broke blk-iocost by skipping the
done_bio callback for merged bios.
Before, whether a bio goes through rq_qos_throttle() or rq_qos_merge(),
rq_qos_done_bio() would be called on the bio on completion with BIO_TRACKED
distinguishing the former from the latter. rq_qos_done_bio() is not called
for bios which wenth through rq_qos_merge(). This royally confuses
blk-iocost as the merged bios never finish and are considered perpetually
in-flight.
One reliably reproducible failure mode is an intermediate cgroup geting
stuck active preventing its children from being activated due to the
leaf-only rule, leading to loss of control. The following is from
resctl-bench protection scenario which emulates isolating a web server like
workload from a memory bomb run on an iocost configuration which should
yield a reasonable level of protection.
# cat /sys/block/nvme2n1/device/model
Samsung SSD 970 PRO 512GB
# cat /sys/fs/cgroup/io.cost.model
259:0 ctrl=user model=linear rbps=834913556 rseqiops=93622 rrandiops=102913 wbps=618985353 wseqiops=72325 wrandiops=71025
# cat /sys/fs/cgroup/io.cost.qos
259:0 enable=1 ctrl=user rpct=95.00 rlat=18776 wpct=95.00 wlat=8897 min=60.00 max=100.00
# resctl-bench -m 29.6G -r out.json run protection::scenario=mem-hog,loops=1
...
Memory Hog Summary
==================
IO Latency: R p50=242u:336u/2.5m p90=794u:1.4m/7.5m p99=2.7m:8.0m/62.5m max=8.0m:36.4m/350m
W p50=221u:323u/1.5m p90=709u:1.2m/5.5m p99=1.5m:2.5m/9.5m max=6.9m:35.9m/350m
Isolation and Request Latency Impact Distributions:
min p01 p05 p10 p25 p50 p75 p90 p95 p99 max mean stdev
isol% 15.90 15.90 15.90 40.05 57.24 59.07 60.01 74.63 74.63 90.35 90.35 58.12 15.82
lat-imp% 0 0 0 0 0 4.55 14.68 15.54 233.5 548.1 548.1 53.88 143.6
Result: isol=58.12:15.82% lat_imp=53.88%:143.6 work_csv=100.0% missing=3.96%
The isolation result of 58.12% is close to what this device would show
without any IO control.
Fix it by introducing a new flag BIO_QOS_MERGED to mark merged bios and
calling rq_qos_done_bio() on them too. For consistency and clarity, rename
BIO_TRACKED to BIO_QOS_THROTTLED. The flag checks are moved into
rq_qos_done_bio() so that it's next to the code paths that set the flags.
With the patch applied, the above same benchmark shows:
# resctl-bench -m 29.6G -r out.json run protection::scenario=mem-hog,loops=1
...
Memory Hog Summary
==================
IO Latency: R p50=123u:84.4u/985u p90=322u:256u/2.5m p99=1.6m:1.4m/9.5m max=11.1m:36.0m/350m
W p50=429u:274u/995u p90=1.7m:1.3m/4.5m p99=3.4m:2.7m/11.5m max=7.9m:5.9m/26.5m
Isolation and Request Latency Impact Distributions:
min p01 p05 p10 p25 p50 p75 p90 p95 p99 max mean stdev
isol% 84.91 84.91 89.51 90.73 92.31 94.49 96.36 98.04 98.71 100.0 100.0 94.42 2.81
lat-imp% 0 0 0 0 0 2.81 5.73 11.11 13.92 17.53 22.61 4.10 4.68
Result: isol=94.42:2.81% lat_imp=4.10%:4.68 work_csv=58.34% missing=0%
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: a647a524a467 ("block: don't call rq_qos_ops->done_bio if the bio isn't tracked")
Cc: stable@vger.kernel.org # v5.15+
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/Yi7rdrzQEHjJLGKB@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7af1974af0a9ba8a8ed2e3e947d87dd4d9a78d27 upstream.
ftrace's __print_symbolic() requires that any enum values used in the
symbol to string translation table be wrapped in a TRACE_DEFINE_ENUM
so that the enum value can be decoded from the ftrace ring buffer by
user space tooling.
This patch also fixes few other problems found in this trace point.
e.g. dereferencing structures in TP_printk which should not be done
at any cost.
Also to avoid checkpatch warnings, this patch removes those
whitespaces/tab stops issues.
Cc: stable@kernel.org
Fixes: aa75f4d3daae ("ext4: main fast-commit commit path")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/b4b9691414c35c62e570b723e661c80674169f9a.1647057583.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 734c15700cdf9062ae98d8b131c6fe873dfad26d upstream.
Abhishek reported that after patch [1], hotplug operations are taking
roughly double the expected time. [2]
The reason behind is that the CPU callbacks that
migrate_on_reclaim_init() sets always call set_migration_target_nodes()
whenever a CPU is brought up/down.
But we only care about numa nodes going from having cpus to become
cpuless, and vice versa, as that influences the demotion_target order.
We do already have two CPU callbacks (vmstat_cpu_online() and
vmstat_cpu_dead()) that check exactly that, so get rid of the CPU
callbacks in migrate_on_reclaim_init() and only call
set_migration_target_nodes() from vmstat_cpu_{dead,online}() whenever a
numa node change its N_CPU state.
[1] https://lore.kernel.org/linux-mm/20210721063926.3024591-2-ying.huang@intel.com/
[2] https://lore.kernel.org/linux-mm/eb438ddd-2919-73d4-bd9f-b7eecdd9577a@linux.vnet.ibm.com/
[osalvador@suse.de: add feedback from Huang Ying]
Link: https://lkml.kernel.org/r/20220314150945.12694-1-osalvador@suse.de
Link: https://lkml.kernel.org/r/20220310120749.23077-1-osalvador@suse.de
Fixes: 884a6e5d1f93b ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reported-by: Abhishek Goel <huntbag@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Abhishek Goel <huntbag@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bc55cfd5718c7c23e5524582e9fa70b4d10f2433 upstream.
syzbot caught a potential deadlock between the PCM
runtime->buffer_mutex and the mm->mmap_lock. It was brought by the
recent fix to cover the racy read/write and other ioctls, and in that
commit, I overlooked a (hopefully only) corner case that may take the
revert lock, namely, the OSS mmap. The OSS mmap operation
exceptionally allows to re-configure the parameters inside the OSS
mmap syscall, where mm->mmap_mutex is already held. Meanwhile, the
copy_from/to_user calls at read/write operations also take the
mm->mmap_lock internally, hence it may lead to a AB/BA deadlock.
A similar problem was already seen in the past and we fixed it with a
refcount (in commit b248371628aa). The former fix covered only the
call paths with OSS read/write and OSS ioctls, while we need to cover
the concurrent access via both ALSA and OSS APIs now.
This patch addresses the problem above by replacing the buffer_mutex
lock in the read/write operations with a refcount similar as we've
used for OSS. The new field, runtime->buffer_accessing, keeps the
number of concurrent read/write operations. Unlike the former
buffer_mutex protection, this protects only around the
copy_from/to_user() calls; the other codes are basically protected by
the PCM stream lock. The refcount can be a negative, meaning blocked
by the ioctls. If a negative value is seen, the read/write aborts
with -EBUSY. In the ioctl side, OTOH, they check this refcount, too,
and set to a negative value for blocking unless it's already being
accessed.
Reported-by: syzbot+6e5c88838328e99c7e1c@syzkaller.appspotmail.com
Fixes: dca947d4d26d ("ALSA: pcm: Fix races among concurrent read/write and buffer changes")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/000000000000381a0d05db622a81@google.com
Link: https://lore.kernel.org/r/20220330120903.4738-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit af4edb1d50c6d1044cb34bc43621411b7ba2cffe upstream.
Kernel messages produced during runtime PM can cause a never-ending cycle
because user space utilities (e.g. journald or rsyslog) write the messages
back to storage, causing runtime resume, more messages, and so on.
Messages that tell of things that are expected to happen are arguably
unnecessary, so add a flag to suppress them. This flag is used by the UFS
driver.
Link: https://lore.kernel.org/r/20220228113652.970857-2-adrian.hunter@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8cba323437a49a45756d661f500b324fc2d486fe upstream.
Prevent rawnand access while in a suspended state.
Commit 013e6292aaf5 ("mtd: rawnand: Simplify the locking") allows the
rawnand layer to return errors rather than waiting in a blocking wait.
Tested on a iMX6ULL.
Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking")
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220208085213.1838273-1-sean@geanix.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 23a9dbbe0faf124fc4c139615633b9d12a3a89ef upstream.
On a 32 bit system, the "len * sizeof(*p)" operation can have an
integer overflow.
Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 421ab1be43bd015ffe744f4ea25df4f19d1ce6fe upstream.
Do not cast the struct xprt to a sock_xprt unless we know it is a UDP or
TCP transport. Otherwise the call to lock the mutex will scribble over
whatever structure is actually there. This has been seen to cause hard
system lockups when the underlying transport was RDMA.
Fixes: b49ea673e119 ("SUNRPC: lock against ->sock changing during sysfs read")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bddac7c1e02ba47f0570e494c9289acea3062cc1 upstream.
This reverts commit aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13.
It turns out this breaks at least the ath9k wireless driver, and
possibly others.
What the ath9k driver does on packet receive is to set up the DMA
transfer with:
int ath_rx_init(..)
..
bf->bf_buf_addr = dma_map_single(sc->dev, skb->data,
common->rx_bufsize,
DMA_FROM_DEVICE);
and then the receive logic (through ath_rx_tasklet()) will fetch
incoming packets
static bool ath_edma_get_buffers(..)
..
dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
common->rx_bufsize, DMA_FROM_DEVICE);
ret = ath9k_hw_process_rxdesc_edma(ah, rs, skb->data);
if (ret == -EINPROGRESS) {
/*let device gain the buffer again*/
dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
common->rx_bufsize, DMA_FROM_DEVICE);
return false;
}
and it's worth noting how that first DMA sync:
dma_sync_single_for_cpu(..DMA_FROM_DEVICE);
is there to make sure the CPU can read the DMA buffer (possibly by
copying it from the bounce buffer area, or by doing some cache flush).
The iommu correctly turns that into a "copy from bounce bufer" so that
the driver can look at the state of the packets.
In the meantime, the device may continue to write to the DMA buffer, but
we at least have a snapshot of the state due to that first DMA sync.
But that _second_ DMA sync:
dma_sync_single_for_device(..DMA_FROM_DEVICE);
is telling the DMA mapping that the CPU wasn't interested in the area
because the packet wasn't there. In the case of a DMA bounce buffer,
that is a no-op.
Note how it's not a sync for the CPU (the "for_device()" part), and it's
not a sync for data written by the CPU (the "DMA_FROM_DEVICE" part).
Or rather, it _should_ be a no-op. That's what commit aa6f8dcbab47
broke: it made the code bounce the buffer unconditionally, and changed
the DMA_FROM_DEVICE to just unconditionally and illogically be
DMA_TO_DEVICE.
[ Side note: purely within the confines of the swiotlb driver it wasn't
entirely illogical: The reason it did that odd DMA_FROM_DEVICE ->
DMA_TO_DEVICE conversion thing is because inside the swiotlb driver,
it uses just a swiotlb_bounce() helper that doesn't care about the
whole distinction of who the sync is for - only which direction to
bounce.
So it took the "sync for device" to mean that the CPU must have been
the one writing, and thought it meant DMA_TO_DEVICE. ]
Also note how the commentary in that commit was wrong, probably due to
that whole confusion, claiming that the commit makes the swiotlb code
"bounce unconditionally (that is, also
when dir == DMA_TO_DEVICE) in order do avoid synchronising back stale
data from the swiotlb buffer"
which is nonsensical for two reasons:
- that "also when dir == DMA_TO_DEVICE" is nonsensical, as that was
exactly when it always did - and should do - the bounce.
- since this is a sync for the device (not for the CPU), we're clearly
fundamentally not coping back stale data from the bounce buffers at
all, because we'd be copying *to* the bounce buffers.
So that commit was just very confused. It confused the direction of the
synchronization (to the device, not the cpu) with the direction of the
DMA (from the device).
Reported-and-bisected-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: Olha Cherevyk <olha.cherevyk@gmail.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Toke Høiland-Jørgensen <toke@toke.dk>
Cc: Maxime Bizon <mbizon@freebox.fr>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0eaecfb2e4814d51ab172df3823e35d7c488b6d2 upstream.
Some controllers have problems with being sent a command to clear
all filtering. While the HCI code does not unconditionally
send a clear-all anymore at BR/EDR setup (after the state machine
refactor), there might be more ways of hitting these codepaths
in the future as the kernel develops.
Cc: stable@vger.kernel.org
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 92ee3c60ec9fe64404dc035e7c41277d74aa26cb upstream.
Currently we have neither proper check nor protection against the
concurrent calls of PCM hw_params and hw_free ioctls, which may result
in a UAF. Since the existing PCM stream lock can't be used for
protecting the whole ioctl operations, we need a new mutex to protect
those racy calls.
This patch introduced a new mutex, runtime->buffer_mutex, and applies
it to both hw_params and hw_free ioctl code paths. Along with it, the
both functions are slightly modified (the mmap_count check is moved
into the state-check block) for code simplicity.
Reported-by: Hu Jiahui <kirin.say@gmail.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20220322170720.3529-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, ipsec, and wireless.
A few last minute revert / disable and fix patches came down from our
sub-trees. We're not waiting for any fixes at this point.
Current release - regressions:
- Revert "netfilter: nat: force port remap to prevent shadowing
well-known ports", restore working conntrack on asymmetric paths
- Revert "ath10k: drop beacon and probe response which leak from
other channel", restore working AP and mesh mode on QCA9984
- eth: intel: fix hang during reboot/shutdown
Current release - new code bugs:
- netfilter: nf_tables: disable register tracking, it needs more work
to cover all corner cases
Previous releases - regressions:
- ipv6: fix skb_over_panic in __ip6_append_data when (admin-only)
extension headers get specified
- esp6: fix ESP over TCP/UDP, interpret ipv6_skip_exthdr's return
value more selectively
- bnx2x: fix driver load failure when FW not present in initrd
Previous releases - always broken:
- vsock: stop destroying unrelated sockets in nested virtualization
- packet: fix slab-out-of-bounds access in packet_recvmsg()
Misc:
- add Paolo Abeni to networking maintainers!"
* tag 'net-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (26 commits)
iavf: Fix hang during reboot/shutdown
net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower offload
net: bcmgenet: skip invalid partial checksums
bnx2x: fix built-in kernel driver load failure
net: phy: mscc: Add MODULE_FIRMWARE macros
net: dsa: Add missing of_node_put() in dsa_port_parse_of
net: handle ARPHRD_PIMREG in dev_is_mac_header_xmit()
Revert "ath10k: drop beacon and probe response which leak from other channel"
hv_netvsc: Add check for kvmalloc_array
iavf: Fix double free in iavf_reset_task
ice: destroy flow director filter mutex after releasing VSIs
ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats()
Add Paolo Abeni to networking maintainers
atm: eni: Add check for dma_map_single
net/packet: fix slab-out-of-bounds access in packet_recvmsg()
net: mdio: mscc-miim: fix duplicate debugfs entry
net: phy: marvell: Fix invalid comparison in the resume and suspend functions
esp6: fix check on ipv6_skip_exthdr's return value
net: dsa: microchip: add spi_device_id tables
netfilter: nf_tables: disable register tracking
...
|
|
This kind of interface doesn't have a mac header. This patch fixes
bpf_redirect() to a PIM interface.
Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20220315092008.31423-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net coming late
in the 5.17-rc process:
1) Revert port remap to mitigate shadowing service ports, this is causing
problems in existing setups and this mitigation can be achieved with
explicit ruleset, eg.
... tcp sport < 16386 tcp dport >= 32768 masquerade random
This patches provided a built-in policy similar to the one described above.
2) Disable register tracking infrastructure in nf_tables. Florian reported
two issues:
- Existing expressions with no implemented .reduce interface
that causes data-store on register should cancel the tracking.
- Register clobbering might be possible storing data on registers that
are larger than 32-bits.
This might lead to generating incorrect ruleset bytecode. These two
issues are scheduled to be addressed in the next release cycle.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: disable register tracking
Revert "netfilter: conntrack: tag conntracks picked up in local out hook"
Revert "netfilter: nat: force port remap to prevent shadowing well-known ports"
====================
Link: https://lore.kernel.org/r/20220312220315.64531-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When iterating over sockets using vsock_for_each_connected_socket, make
sure that a transport filters out sockets that don't belong to the
transport.
There actually was an issue caused by this; in a nested VM
configuration, destroying the nested VM (which often involves the
closing of /dev/vhost-vsock if there was h2g connections to the nested
VM) kills not only the h2g connections, but also all existing g2h
connections to the (outmost) host which are totally unrelated.
Tested: Executed the following steps on Cuttlefish (Android running on a
VM) [1]: (1) Enter into an `adb shell` session - to have a g2h
connection inside the VM, (2) open and then close /dev/vhost-vsock by
`exec 3< /dev/vhost-vsock && exec 3<&-`, (3) observe that the adb
session is not reset.
[1] https://android.googlesource.com/device/google/cuttlefish/
Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jiyong Park <jiyong@google.com>
Link: https://lore.kernel.org/r/20220311020017.1509316-1-jiyong@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Merge misc fixes from David Howells:
"A set of patches for watch_queue filter issues noted by Jann. I've
added in a cleanup patch from Christophe Jaillet to convert to using
formal bitmap specifiers for the note allocation bitmap.
Also two filesystem fixes (afs and cachefiles)"
* emailed patches from David Howells <dhowells@redhat.com>:
cachefiles: Fix volume coherency attribute
afs: Fix potential thrashing in afs writeback
watch_queue: Make comment about setting ->defunct more accurate
watch_queue: Fix lack of barrier/sync/lock between post and read
watch_queue: Free the alloc bitmap when the watch_queue is torn down
watch_queue: Fix the alloc bitmap size to reflect notes allocated
watch_queue: Use the bitmap API when applicable
watch_queue: Fix to always request a pow-of-2 pipe ring size
watch_queue: Fix to release page in ->release()
watch_queue, pipe: Free watchqueue state after clearing pipe ring
watch_queue: Fix filter limit check
|
|
A network filesystem may set coherency data on a volume cookie, and if
given, cachefiles will store this in an xattr on the directory in the
cache corresponding to the volume.
The function that sets the xattr just stores the contents of the volume
coherency buffer directly into the xattr, with nothing added; the
checking function, on the other hand, has a cut'n'paste error whereby it
tries to interpret the xattr contents as would be the xattr on an
ordinary file (using the cachefiles_xattr struct). This results in a
failure to match the coherency data because the buffer ends up being
shifted by 18 bytes.
Fix this by defining a structure specifically for the volume xattr and
making both the setting and checking functions use it.
Since the volume coherency doesn't work if used, take the opportunity to
insert a reserved field for future use, set it to 0 and check that it is
0. Log mismatch through the appropriate tracepoint.
Note that this only affects cifs; 9p, afs, ceph and nfs don't use the
volume coherency data at the moment.
Fixes: 32e150037dce ("fscache, cachefiles: Store the volume coherency data")
Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: Steve French <smfrench@gmail.com>
cc: linux-cifs@vger.kernel.org
cc: linux-cachefs@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In watch_queue_set_filter(), there are a couple of places where we check
that the filter type value does not exceed what the type_filter bitmap
can hold. One place calculates the number of bits by:
if (tf[i].type >= sizeof(wfilter->type_filter) * 8)
which is fine, but the second does:
if (tf[i].type >= sizeof(wfilter->type_filter) * BITS_PER_LONG)
which is not. This can lead to a couple of out-of-bounds writes due to
a too-large type:
(1) __set_bit() on wfilter->type_filter
(2) Writing more elements in wfilter->filters[] than we allocated.
Fix this by just using the proper WATCH_TYPE__NR instead, which is the
number of types we actually know about.
The bug may cause an oops looking something like:
BUG: KASAN: slab-out-of-bounds in watch_queue_set_filter+0x659/0x740
Write of size 4 at addr ffff88800d2c66bc by task watch_queue_oob/611
...
Call Trace:
<TASK>
dump_stack_lvl+0x45/0x59
print_address_description.constprop.0+0x1f/0x150
...
kasan_report.cold+0x7f/0x11b
...
watch_queue_set_filter+0x659/0x740
...
__x64_sys_ioctl+0x127/0x190
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
Allocated by task 611:
kasan_save_stack+0x1e/0x40
__kasan_kmalloc+0x81/0xa0
watch_queue_set_filter+0x23a/0x740
__x64_sys_ioctl+0x127/0x190
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at ffff88800d2c66a0
which belongs to the cache kmalloc-32 of size 32
The buggy address is located 28 bytes inside of
32-byte region [ffff88800d2c66a0, ffff88800d2c66c0)
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth, and ipsec.
Current release - regressions:
- Bluetooth: fix unbalanced unlock in set_device_flags()
- Bluetooth: fix not processing all entries on cmd_sync_work, make
connect with qualcomm and intel adapters reliable
- Revert "xfrm: state and policy should fail if XFRMA_IF_ID 0"
- xdp: xdp_mem_allocator can be NULL in trace_mem_connect()
- eth: ice: fix race condition and deadlock during interface enslave
Current release - new code bugs:
- tipc: fix incorrect order of state message data sanity check
Previous releases - regressions:
- esp: fix possible buffer overflow in ESP transformation
- dsa: unlock the rtnl_mutex when dsa_master_setup() fails
- phy: meson-gxl: fix interrupt handling in forced mode
- smsc95xx: ignore -ENODEV errors when device is unplugged
Previous releases - always broken:
- xfrm: fix tunnel mode fragmentation behavior
- esp: fix inter address family tunneling on GSO
- tipc: fix null-deref due to race when enabling bearer
- sctp: fix kernel-infoleak for SCTP sockets
- eth: macb: fix lost RX packet wakeup race in NAPI receive
- eth: intel stop disabling VFs due to PF error responses
- eth: bcmgenet: don't claim WOL when its not available"
* tag 'net-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
xdp: xdp_mem_allocator can be NULL in trace_mem_connect().
ice: Fix race condition during interface enslave
net: phy: meson-gxl: improve link-up behavior
net: bcmgenet: Don't claim WOL when its not available
net: arc_emac: Fix use after free in arc_mdio_probe()
sctp: fix kernel-infoleak for SCTP sockets
net: phy: correct spelling error of media in documentation
net: phy: DP83822: clear MISR2 register to disable interrupts
gianfar: ethtool: Fix refcount leak in gfar_get_ts_info
selftests: pmtu.sh: Kill nettest processes launched in subshell.
selftests: pmtu.sh: Kill tcpdump processes launched by subshell.
NFC: port100: fix use-after-free in port100_send_complete
net/mlx5e: SHAMPO, reduce TIR indication
net/mlx5e: Lag, Only handle events from highest priority multipath entry
net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
net/mlx5: Fix a race on command flush flow
net/mlx5: Fix size field in bufferx_reg struct
ax25: Fix NULL pointer dereference in ax25_kill_by_device
net: marvell: prestera: Add missing of_node_put() in prestera_switch_set_base_mac_addr
net: ethernet: lpc_eth: Handle error for clk_enable
...
|
|
The header file incorrectly referenced "median-independant interface"
instead of media. Correct this typo.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Fixes: 4069a572d423 ("net: phy: Document core PHY structures")
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220309062544.3073-1-colin.foster@in-advantage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Several Linux PV device frontends are using the grant table interfaces
for removing access rights of the backends in ways being subject to
race conditions, resulting in potential data leaks, data corruption by
malicious backends, and denial of service triggered by malicious
backends:
- blkfront, netfront, scsifront and the gntalloc driver are testing
whether a grant reference is still in use. If this is not the case,
they assume that a following removal of the granted access will
always succeed, which is not true in case the backend has mapped
the granted page between those two operations.
As a result the backend can keep access to the memory page of the
guest no matter how the page will be used after the frontend I/O
has finished. The xenbus driver has a similar problem, as it
doesn't check the success of removing the granted access of a
shared ring buffer.
- blkfront, netfront, scsifront, usbfront, dmabuf, xenbus, 9p,
kbdfront, and pvcalls are using a functionality to delay freeing a
grant reference until it is no longer in use, but the freeing of
the related data page is not synchronized with dropping the granted
access.
As a result the backend can keep access to the memory page even
after it has been freed and then re-used for a different purpose.
- netfront will fail a BUG_ON() assertion if it fails to revoke
access in the rx path.
This will result in a Denial of Service (DoS) situation of the
guest which can be triggered by the backend"
* tag 'xsa396-5.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
xen/gnttab: fix gnttab_end_foreign_access() without page specified
xen/pvcalls: use alloc/free_pages_exact()
xen/9p: use alloc/free_pages_exact()
xen/usb: don't use gnttab_end_foreign_access() in xenhcd_gnttab_done()
xen: remove gnttab_query_foreign_access()
xen/gntalloc: don't use gnttab_query_foreign_access()
xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
xen/netfront: don't use gnttab_query_foreign_access() for mapped status
xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
xen/grant-table: add gnttab_try_end_foreign_access()
xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
|
|
SHAMPO is an RQ / WQ feature, an indication was added to the TIR in the
first place to enforce suitability between connected TIR and RQ, this
enforcement does not exist in current the Firmware implementation and was
redundant in the first place.
Fixes: 83439f3c37aa ("net/mlx5e: Add HW-GRO offload")
Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
According to HW spec the field "size" should be 16 bits
in bufferx register.
Fixes: e281682bf294 ("net/mlx5_core: HW data structs/types definitions cleanup")
Signed-off-by: Mohammad Kabat <mohammadkab@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2022-03-09
1) Fix IPv6 PMTU discovery for xfrm interfaces.
From Lina Wang.
2) Revert failing for policies and states that are
configured with XFRMA_IF_ID 0. It broke a
user configuration. From Kai Lueke.
3) Fix a possible buffer overflow in the ESP output path.
4) Fix ESP GSO for tunnel and BEET mode on inter address
family tunnels.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|