summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2021-03-04KEYS: trusted: Reserve TPM for seal and unseal operationsJarkko Sakkinen
commit 8c657a0590de585b1115847c17b34a58025f2f4b upstream. When TPM 2.0 trusted keys code was moved to the trusted keys subsystem, the operations were unwrapped from tpm_try_get_ops() and tpm_put_ops(), which are used to take temporarily the ownership of the TPM chip. The ownership is only taken inside tpm_send(), but this is not sufficient, as in the key load TPM2_CC_LOAD, TPM2_CC_UNSEAL and TPM2_FLUSH_CONTEXT need to be done as a one single atom. Take the TPM chip ownership before sending anything with tpm_try_get_ops() and tpm_put_ops(), and use tpm_transmit_cmd() to send TPM commands instead of tpm_send(), reverting back to the old behaviour. Fixes: 2e19e10131a0 ("KEYS: trusted: Move TPM2 trusted keys code") Reported-by: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: stable@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: Sumit Garg <sumit.garg@linaro.org> Acked-by Sumit Garg <sumit.garg@linaro.org> Tested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04mm/rmap: fix potential pte_unmap on an not mapped pteMiaohe Lin
[ Upstream commit 5d5d19eda6b0ee790af89c45e3f678345be6f50f ] For PMD-mapped page (usually THP), pvmw->pte is NULL. For PTE-mapped THP, pvmw->pte is mapped. But for HugeTLB pages, pvmw->pte is not mapped and set to the relevant page table entry. So in page_vma_mapped_walk_done(), we may do pte_unmap() for HugeTLB pte which is not mapped. Fix this by checking pvmw->page against PageHuge before trying to do pte_unmap(). Link: https://lkml.kernel.org/r/20210127093349.39081-1-linmiaohe@huawei.com Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()") Signed-off-by: Hongxiang Lou <louhongxiang@huawei.com> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michel Lespinasse <walken@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Brian Geffon <bgeffon@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04mm: fix memory_failure() handling of dax-namespace metadataDan Williams
[ Upstream commit 34dc45be4563f344d59ba0428416d0d265aa4f4d ] Given 'struct dev_pagemap' spans both data pages and metadata pages be careful to consult the altmap if present to delineate metadata. In fact the pfn_first() helper already identifies the first valid data pfn, so export that helper for other code paths via pgmap_pfn_valid(). Other usage of get_dev_pagemap() are not a concern because those are operating on known data pfns having been looked up by get_user_pages(). I.e. metadata pfns are never user mapped. Link: https://lkml.kernel.org/r/161058501758.1840162.4239831989762604527.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: 6100e34b2526 ("mm, memory_failure: Teach memory_failure() about dev_pagemap pages") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04mm,thp,shmem: make khugepaged obey tmpfs mount flagsRik van Riel
[ Upstream commit cd89fb06509903f942a0ffe97ffa63034671ed0c ] Currently if thp enabled=[madvise], mounting a tmpfs filesystem with huge=always and mmapping files from that tmpfs does not result in khugepaged collapsing those mappings, despite the mount flag indicating that it should. Fix that by breaking up the blocks of tests in hugepage_vma_check a little bit, and testing things in the correct order. Link: https://lkml.kernel.org/r/20201124194925.623931-4-riel@surriel.com Fixes: c2231020ea7b ("mm: thp: register mm for khugepaged when merging vma for shmem") Signed-off-by: Rik van Riel <riel@surriel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xu Yu <xuyu@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04soundwire: export sdw_write/read_no_pm functionsBard Liao
[ Upstream commit 167790abb90fa073d8341ee0e408ccad3d2109cd ] sdw_write_no_pm and sdw_read_no_pm are useful when we want to do IO without touching PM. Fixes: 0231453bc08f ('soundwire: bus: add clock stop helpers') Fixes: 60ee9be25571 ('soundwire: bus: add PM/no-PM versions of read/write functions') Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20210122070634.12825-5-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04regulator: bd718x7, bd71828, Fix dvs voltage levelsMatti Vaittinen
[ Upstream commit c294554111a835598b557db789d9ad2379b512a2 ] The ROHM BD718x7 and BD71828 drivers support setting HW state specific voltages from device-tree. This is used also by various in-tree DTS files. These drivers do incorrectly try to compose bit-map using enum values. By a chance this works for first two valid levels having values 1 and 2 - but setting values for the rest of the levels do indicate capability of setting values for first levels as well. Luckily the regulators which support setting values for SUSPEND/LPSR do usually also support setting values for RUN and IDLE too - thus this has not been such a fatal issue. Fix this by defining the old enum values as bits and fixing the parsing code. This allows keeping existing IC specific drivers intact and only slightly changing the rohm-regulator.c Fixes: 21b72156ede8b ("regulator: bd718x7: Split driver to common and bd718x7 specific parts") Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Acked-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20210212080023.GA880728@localhost.localdomain Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04iommu: Switch gather->end to the inclusive endYong Wu
[ Upstream commit 862c3715de8f3e5350489240c951d697f04bd8c9 ] Currently gather->end is "unsigned long" which may be overflow in arch32 in the corner case: 0xfff00000 + 0x100000(iova + size). Although it doesn't affect the size(end - start), it affects the checking "gather->end < end" This patch changes this "end" to the real end address (end = start + size - 1). Correspondingly, update the length to "end - start + 1". Fixes: a7d20dc19d9e ("iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes") Signed-off-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210107122909.16317-5-yong.wu@mediatek.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04scsi: libsas: Introduce a _gfp() variant of event notifiersAhmed S. Darwish
[ Upstream commit c2d0f1a65ab9fbabebb463bf36f50ea8f4633386 ] sas_alloc_event() uses in_interrupt() to decide which allocation should be used. The usage of in_interrupt() in drivers is phased out and Linus clearly requested that code which changes behaviour depending on context should either be separated or the context be conveyed in an argument passed by the caller, which usually knows the context. The in_interrupt() check is also only partially correct, because it fails to choose the correct code path when just preemption or interrupts are disabled. For example, as in the following call chain: mvsas/mv_sas.c: mvs_work_queue() [process context] spin_lock_irqsave(mvs_info::lock, ) -> libsas/sas_event.c: sas_notify_phy_event() -> sas_alloc_event() -> in_interrupt() = false -> invalid GFP_KERNEL allocation -> libsas/sas_event.c: sas_notify_port_event() -> sas_alloc_event() -> in_interrupt() = false -> invalid GFP_KERNEL allocation Introduce sas_alloc_event_gfp(), sas_notify_port_event_gfp(), and sas_notify_phy_event_gfp(), which all behave like the non _gfp() variants but use a caller-passed GFP mask for allocations. For bisectability, all callers will be modified first to pass GFP context, then the non _gfp() libsas API variants will be modified to take a gfp_t by default. Link: https://lore.kernel.org/r/20210118100955.1761652-4-a.darwish@linutronix.de Fixes: 1c393b970e0f ("scsi: libsas: Use dynamic alloced work to avoid sas event lost") Cc: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04scsi: libsas: Remove notifier indirectionJohn Garry
[ Upstream commit 121181f3f839c29d8dd9fdc3cc9babbdc74227f8 ] LLDDs report events to libsas with .notify_port_event and .notify_phy_event callbacks. These callbacks are fixed and so there is no reason why the functions cannot be called directly, so do that. This neatens the code slightly, makes it more obvious, and reduces function pointer usage, which is generally a good thing. Downside is that there are 2x more symbol exports. [a.darwish@linutronix.de: Remove the now unused "sas_ha" local variables] Link: https://lore.kernel.org/r/20210118100955.1761652-3-a.darwish@linutronix.de Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04certs: Fix blacklist flag type confusionDavid Howells
[ Upstream commit 4993e1f9479a4161fd7d93e2b8b30b438f00cb0f ] KEY_FLAG_KEEP is not meant to be passed to keyring_alloc() or key_alloc(), as these only take KEY_ALLOC_* flags. KEY_FLAG_KEEP has the same value as KEY_ALLOC_BYPASS_RESTRICTION, but fortunately only key_create_or_update() uses it. LSMs using the key_alloc hook don't check that flag. KEY_FLAG_KEEP is then ignored but fortunately (again) the root user cannot write to the blacklist keyring, so it is not possible to remove a key/hash from it. Fix this by adding a KEY_ALLOC_SET_KEEP flag that tells key_alloc() to set KEY_FLAG_KEEP on the new key. blacklist_init() can then, correctly, pass this to keyring_alloc(). We can also use this in ima_mok_init() rather than setting the flag manually. Note that this doesn't fix an observable bug with the current implementation but it is required to allow addition of new hashes to the blacklist in the future without making it possible for them to be removed. Fixes: 734114f8782f ("KEYS: Add a system blacklist keyring") Reported-by: Mickaël Salaün <mic@linux.microsoft.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Mickaël Salaün <mic@linux.microsoft.com> cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04ima: Free IMA measurement buffer after kexec syscallLakshmi Ramasubramanian
[ Upstream commit f31e3386a4e92ba6eda7328cb508462956c94c64 ] IMA allocates kernel virtual memory to carry forward the measurement list, from the current kernel to the next kernel on kexec system call, in ima_add_kexec_buffer() function. This buffer is not freed before completing the kexec system call resulting in memory leak. Add ima_buffer field in "struct kimage" to store the virtual address of the buffer allocated for the IMA measurement list. Free the memory allocated for the IMA measurement list in kimage_file_post_load_cleanup() function. Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Fixes: 7b8589cc29e7 ("ima: on soft reboot, save the measurement list") Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04drm/fourcc: fix Amlogic format modifier masksSimon Ser
[ Upstream commit cc3283f8f41f741fbaef63d0503d8fb4a7919100 ] The comment says the layout and options use 8 bits, and the shift uses 8 bits. However the mask is 0xf, ie. 0b00001111 (4 bits). This could be surprising when introducing new layouts or options that take more than 4 bits, as this would silently drop the high bits. Make the masks consistent with the comment and the shift. Found when writing a drm_info patch [1]. [1]: https://github.com/ascent12/drm_info/pull/67 Signed-off-by: Simon Ser <contact@emersion.fr> Fixes: d6528ec88309 ("drm/fourcc: Add modifier definitions for describing Amlogic Video Framebuffer Compression") Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Kevin Hilman <khilman@baylibre.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210110125103.15447-1-contact@emersion.fr Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04drm: document that user-space should force-probe connectorsSimon Ser
[ Upstream commit a7e2e1c50450c6a0f020b35960edecbe25dde520 ] It seems like we can't have nice things, so let's just document the disappointing behaviour instead. The previous version assumed the kernel would perform the probing work when appropriate, however this is not the case today. Update the documentation to reflect reality. v2: - Improve commit message to explain why this change is made (Pekka) - Keep the bit about flickering (Daniel) - Explain when user-space should force-probe, and when it shouldn't (Daniel) Signed-off-by: Simon Ser <contact@emersion.fr> Fixes: 2ac5ef3b2362 ("drm: document drm_mode_get_connector") Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.com> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/AxqLnTAsFCRishOVB5CLsqIesmrMrm7oytnOVB7oPA@cp7-web-043.plabs.ch Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04tty: convert tty_ldisc_ops 'read()' function to take a kernel pointerLinus Torvalds
[ Upstream commit 3b830a9c34d5897be07176ce4e6f2d75e2c8cfd7 ] The tty line discipline .read() function was passed the final user pointer destination as an argument, which doesn't match the 'write()' function, and makes it very inconvenient to do a splice method for ttys. This is a conversion to use a kernel buffer instead. NOTE! It does this by passing the tty line discipline ->read() function an additional "cookie" to fill in, and an offset into the cookie data. The line discipline can fill in the cookie data with its own private information, and then the reader will repeat the read until either the cookie is cleared or it runs out of data. The only real user of this is N_HDLC, which can use this to handle big packets, even if the kernel buffer is smaller than the whole packet. Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04bpf: Declare __bpf_free_used_maps() unconditionallyAndrii Nakryiko
[ Upstream commit 936f8946bdb48239f4292812d4d2e26c6d328c95 ] __bpf_free_used_maps() is always defined in kernel/bpf/core.c, while include/linux/bpf.h is guarding it behind CONFIG_BPF_SYSCALL. Move it out of that guard region and fix compiler warning. Fixes: a2ea07465c8d ("bpf: Fix missing prog untrack in release_maps") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210112075520.4103414-4-andrii@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04tcp: fix SO_RCVLOWAT related hangs under mem pressureEric Dumazet
[ Upstream commit f969dc5a885736842c3511ecdea240fbb02d25d9 ] While commit 24adbc1676af ("tcp: fix SO_RCVLOWAT hangs with fat skbs") fixed an issue vs too small sk_rcvbuf for given sk_rcvlowat constraint, it missed to address issue caused by memory pressure. 1) If we are under memory pressure and socket receive queue is empty. First incoming packet is allowed to be queued, after commit 76dfa6082032 ("tcp: allow one skb to be received per socket under memory pressure") But we do not send EPOLLIN yet, in case tcp_data_ready() sees sk_rcvlowat is bigger than skb length. 2) Then, when next packet comes, it is dropped, and we directly call sk->sk_data_ready(). 3) If application is using poll(), tcp_poll() will then use tcp_stream_is_readable() and decide the socket receive queue is not yet filled, so nothing will happen. Even when sender retransmits packets, phases 2) & 3) repeat and flow is effectively frozen, until memory pressure is off. Fix is to consider tcp_under_memory_pressure() to take care of global memory pressure or memcg pressure. Fixes: 24adbc1676af ("tcp: fix SO_RCVLOWAT hangs with fat skbs") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Arjun Roy <arjunroy@google.com> Suggested-by: Wei Wang <weiwan@google.com> Reviewed-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04ACPICA: Fix exception code class checksMaximilian Luz
[ Upstream commit 3dfaea3811f8b6a89a347e8da9ab862cdf3e30fe ] ACPICA commit 1a3a549286ea9db07d7ec700e7a70dd8bcc4354e The macros to classify different AML exception codes are broken. For instance, ACPI_ENV_EXCEPTION(Status) will always evaluate to zero due to #define AE_CODE_ENVIRONMENTAL 0x0000 #define ACPI_ENV_EXCEPTION(Status) (Status & AE_CODE_ENVIRONMENTAL) Similarly, ACPI_AML_EXCEPTION(Status) will evaluate to a non-zero value for error codes of type AE_CODE_PROGRAMMER, AE_CODE_ACPI_TABLES, as well as AE_CODE_AML, and not just AE_CODE_AML as the name suggests. This commit fixes those checks. Fixes: d46b6537f0ce ("ACPICA: AML Parser: ignore all exceptions resulting from incorrect AML during table load") Link: https://github.com/acpica/acpica/commit/1a3a5492 Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Erik Kaneda <erik.kaneda@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04bpf: Avoid warning when re-casting __bpf_call_base into __bpf_call_base_argsAndrii Nakryiko
[ Upstream commit 6943c2b05bf09fd5c5729f7d7d803bf3f126cb9a ] BPF interpreter uses extra input argument, so re-casts __bpf_call_base into __bpf_call_base_args. Avoid compiler warning about incompatible function prototypes by casting to void * first. Fixes: 1ea47e01ad6e ("bpf: add support for bpf_call to interpreter") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210112075520.4103414-3-andrii@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04bpf: Add bpf_patch_call_args prototype to include/linux/bpf.hAndrii Nakryiko
[ Upstream commit a643bff752dcf72a07e1b2ab2f8587e4f51118be ] Add bpf_patch_call_args() prototype. This function is called from BPF verifier and only if CONFIG_BPF_JIT_ALWAYS_ON is not defined. This fixes compiler warning about missing prototype in some kernel configurations. Fixes: 1ea47e01ad6e ("bpf: add support for bpf_call to interpreter") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210112075520.4103414-2-andrii@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04vmlinux.lds.h: Define SANTIZER_DISCARDS with CONFIG_GCOV_KERNEL=yNathan Chancellor
commit f5b6a74d9c08b19740ca056876bf6584acdba582 upstream. clang produces .eh_frame sections when CONFIG_GCOV_KERNEL is enabled, even when -fno-asynchronous-unwind-tables is in KBUILD_CFLAGS: $ make CC=clang vmlinux ... ld: warning: orphan section `.eh_frame' from `init/main.o' being placed in section `.eh_frame' ld: warning: orphan section `.eh_frame' from `init/version.o' being placed in section `.eh_frame' ld: warning: orphan section `.eh_frame' from `init/do_mounts.o' being placed in section `.eh_frame' ld: warning: orphan section `.eh_frame' from `init/do_mounts_initrd.o' being placed in section `.eh_frame' ld: warning: orphan section `.eh_frame' from `init/initramfs.o' being placed in section `.eh_frame' ld: warning: orphan section `.eh_frame' from `init/calibrate.o' being placed in section `.eh_frame' ld: warning: orphan section `.eh_frame' from `init/init_task.o' being placed in section `.eh_frame' ... $ rg "GCOV_KERNEL|GCOV_PROFILE_ALL" .config CONFIG_GCOV_KERNEL=y CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y CONFIG_GCOV_PROFILE_ALL=y This was already handled for a couple of other options in commit d812db78288d ("vmlinux.lds.h: Avoid KASAN and KCSAN's unwanted sections") and there is an open LLVM bug for this issue. Take advantage of that section for this config as well so that there are no more orphan warnings. Link: https://bugs.llvm.org/show_bug.cgi?id=46478 Link: https://github.com/ClangBuiltLinux/linux/issues/1069 Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Fangrui Song <maskray@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Fixes: d812db78288d ("vmlinux.lds.h: Avoid KASAN and KCSAN's unwanted sections") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210130004650.2682422-1-nathan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04zsmalloc: account the number of compacted pages correctlyRokudo Yan
commit 2395928158059b8f9858365fce7713ce7fef62e4 upstream. There exists multiple path may do zram compaction concurrently. 1. auto-compaction triggered during memory reclaim 2. userspace utils write zram<id>/compaction node So, multiple threads may call zs_shrinker_scan/zs_compact concurrently. But pages_compacted is a per zsmalloc pool variable and modification of the variable is not serialized(through under class->lock). There are two issues here: 1. the pages_compacted may not equal to total number of pages freed(due to concurrently add). 2. zs_shrinker_scan may not return the correct number of pages freed(issued by current shrinker). The fix is simple: 1. account the number of pages freed in zs_compact locally. 2. use actomic variable pages_compacted to accumulate total number. Link: https://lkml.kernel.org/r/20210202122235.26885-1-wu-yan@tcl.com Fixes: 860c707dca155a56 ("zsmalloc: account the number of compacted pages") Signed-off-by: Rokudo Yan <wu-yan@tcl.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04vmlinux.lds.h: add DWARF v5 sectionsNick Desaulniers
commit 3c4fa46b30c551b1df2fb1574a684f68bc22067c upstream. We expect toolchains to produce these new debug info sections as part of DWARF v5. Add explicit placements to prevent the linker warnings from --orphan-section=warn. Compilers may produce such sections with explicit -gdwarf-5, or based on the implicit default version of DWARF when -g is used via DEBUG_INFO. This implicit default changes over time, and has changed to DWARF v5 with GCC 11. .debug_sup was mentioned in review, but without compilers producing it today, let's wait to add it until it becomes necessary. Cc: stable@vger.kernel.org Link: https://bugzilla.redhat.com/show_bug.cgi?id=1922707 Reported-by: Chris Murphy <lists@colorremedies.com> Suggested-by: Fangrui Song <maskray@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Mark Wielaard <mark@klomp.org> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-26mm: provide a saner PTE walking API for modulesPaolo Bonzini
commit 9fd6dad1261a541b3f5fa7dc5b152222306e6702 upstream. Currently, the follow_pfn function is exported for modules but follow_pte is not. However, follow_pfn is very easy to misuse, because it does not provide protections (so most of its callers assume the page is writable!) and because it returns after having already unlocked the page table lock. Provide instead a simplified version of follow_pte that does not have the pmdpp and range arguments. The older version survives as follow_invalidate_pte() for use by fs/dax.c. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-23Xen/gntdev: correct error checking in gntdev_map_grant_pages()Jan Beulich
commit ebee0eab08594b2bd5db716288a4f1ae5936e9bc upstream. Failure of the kernel part of the mapping operation should also be indicated as an error to the caller, or else it may assume the respective kernel VA is okay to access. Furthermore gnttab_map_refs() failing still requires recording successfully mapped handles, so they can be unmapped subsequently. This in turn requires there to be a way to tell full hypercall failure from partial success - preset map_op status fields such that they won't "happen" to look as if the operation succeeded. Also again use GNTST_okay instead of implying its value (zero). This is part of XSA-361. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-12Merge tag 'for-linus-5.11-rc8-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "A single fix for an issue introduced this development cycle: when running as a Xen guest on Arm systems the kernel will hang during boot" * tag 'for-linus-5.11-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: arm/xen: Don't probe xenbus as part of an early initcall
2021-02-11arm/xen: Don't probe xenbus as part of an early initcallJulien Grall
After Commit 3499ba8198cad ("xen: Fix event channel callback via INTX/GSI"), xenbus_probe() will be called too early on Arm. This will recent to a guest hang during boot. If the hang wasn't there, we would have ended up to call xenbus_probe() twice (the second time is in xenbus_probe_initcall()). We don't need to initialize xenbus_probe() early for Arm guest. Therefore, the call in xen_guest_init() is now removed. After this change, there is no more external caller for xenbus_probe(). So the function is turned to a static one. Interestingly there were two prototypes for it. Cc: stable@vger.kernel.org Fixes: 3499ba8198cad ("xen: Fix event channel callback via INTX/GSI") Reported-by: Ian Jackson <iwj@xenproject.org> Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/20210210170654.5377-1-julien@xen.org Signed-off-by: Juergen Gross <jgross@suse.com>
2021-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: "Another pile of networing fixes: 1) ath9k build error fix from Arnd Bergmann 2) dma memory leak fix in mediatec driver from Lorenzo Bianconi. 3) bpf int3 kprobe fix from Alexei Starovoitov. 4) bpf stackmap integer overflow fix from Bui Quang Minh. 5) Add usb device ids for Cinterion MV31 to qmi_qwwan driver, from Christoph Schemmel. 6) Don't update deleted entry in xt_recent netfilter module, from Jazsef Kadlecsik. 7) Use after free in nftables, fix from Pablo Neira Ayuso. 8) Header checksum fix in flowtable from Sven Auhagen. 9) Validate user controlled length in qrtr code, from Sabyrzhan Tasbolatov. 10) Fix race in xen/netback, from Juergen Gross, 11) New device ID in cxgb4, from Raju Rangoju. 12) Fix ring locking in rxrpc release call, from David Howells. 13) Don't return LAPB error codes from x25_open(), from Xie He. 14) Missing error returns in gsi_channel_setup() from Alex Elder. 15) Get skb_copy_and_csum_datagram working properly with odd segment sizes, from Willem de Bruijn. 16) Missing RFS/RSS table init in enetc driver, from Vladimir Oltean. 17) Do teardown on probe failure in DSA, from Vladimir Oltean. 18) Fix compilation failures of txtimestamp selftest, from Vadim Fedorenko. 19) Limit rx per-napi gro queue size to fix latency regression, from Eric Dumazet. 20) dpaa_eth xdp fixes from Camelia Groza. 21) Missing txq mode update when switching CBS off, in stmmac driver, from Mohammad Athari Bin Ismail. 22) Failover pending logic fix in ibmvnic driver, from Sukadev Bhattiprolu. 23) Null deref fix in vmw_vsock, from Norbert Slusarek. 24) Missing verdict update in xdp paths of ena driver, from Shay Agroskin. 25) seq_file iteration fix in sctp from Neil Brown. 26) bpf 32-bit src register truncation fix on div/mod, from Daniel Borkmann. 27) Fix jmp32 pruning in bpf verifier, from Daniel Borkmann. 28) Fix locking in vsock_shutdown(), from Stefano Garzarella. 29) Various missing index bound checks in hns3 driver, from Yufeng Mo. 30) Flush ports on .phylink_mac_link_down() in dsa felix driver, from Vladimir Oltean. 31) Don't mix up stp and mrp port states in bridge layer, from Horatiu Vultur. 32) Fix locking during netif_tx_disable(), from Edwin Peer" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits) bpf: Fix 32 bit src register truncation on div/mod bpf: Fix verifier jmp32 pruning decision logic bpf: Fix verifier jsgt branch analysis on max bound vsock: fix locking in vsock_shutdown() net: hns3: add a check for index in hclge_get_rss_key() net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx() net: hns3: add a check for queue_id in hclge_reset_vf_queue() net: dsa: felix: implement port flushing on .phylink_mac_link_down switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state net: watchdog: hold device global xmit lock during tx disable netfilter: nftables: relax check for stateful expressions in set definition netfilter: conntrack: skip identical origin tuple in same zone only vsock/virtio: update credit only if socket is not closed net: fix iteration for sctp transport seq_files net: ena: Update XDP verdict upon failure net/vmw_vsock: improve locking in vsock_connect_timeout() net/vmw_vsock: fix NULL pointer dereference ibmvnic: Clear failover_pending if unable to schedule net: stmmac: set TxQ mode back to DCB after disabling CBS ...
2021-02-09firmware_loader: align .builtin_fw to 8Fangrui Song
arm64 references the start address of .builtin_fw (__start_builtin_fw) with a pair of R_AARCH64_ADR_PREL_PG_HI21/R_AARCH64_LDST64_ABS_LO12_NC relocations. The compiler is allowed to emit the R_AARCH64_LDST64_ABS_LO12_NC relocation because struct builtin_fw in include/linux/firmware.h is 8-byte aligned. The R_AARCH64_LDST64_ABS_LO12_NC relocation requires the address to be a multiple of 8, which may not be the case if .builtin_fw is empty. Unconditionally align .builtin_fw to fix the linker error. 32-bit architectures could use ALIGN(4) but that would add unnecessary complexity, so just use ALIGN(8). Link: https://lkml.kernel.org/r/20201208054646.2913063-1-maskray@google.com Link: https://github.com/ClangBuiltLinux/linux/issues/1204 Fixes: 5658c76 ("firmware: allow firmware files to be built into kernel image") Signed-off-by: Fangrui Song <maskray@google.com> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Douglas Anderson <dianders@chromium.org> Acked-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-09net: dsa: felix: implement port flushing on .phylink_mac_link_downVladimir Oltean
There are several issues which may be seen when the link goes down while forwarding traffic, all of which can be attributed to the fact that the port flushing procedure from the reference manual was not closely followed. With flow control enabled on both the ingress port and the egress port, it may happen when a link goes down that Ethernet packets are in flight. In flow control mode, frames are held back and not dropped. When there is enough traffic in flight (example: iperf3 TCP), then the ingress port might enter congestion and never exit that state. This is a problem, because it is the egress port's link that went down, and that has caused the inability of the ingress port to send packets to any other port. This is solved by flushing the egress port's queues when it goes down. There is also a problem when performing stream splitting for IEEE 802.1CB traffic (not yet upstream, but a sort of multicast, basically). There, if one port from the destination ports mask goes down, splitting the stream towards the other destinations will no longer be performed. This can be traced down to this line: ocelot_port_writel(ocelot_port, 0, DEV_MAC_ENA_CFG); which should have been instead, as per the reference manual: ocelot_port_rmwl(ocelot_port, 0, DEV_MAC_ENA_CFG_RX_ENA, DEV_MAC_ENA_CFG); Basically only DEV_MAC_ENA_CFG_RX_ENA should be disabled, but not DEV_MAC_ENA_CFG_TX_ENA - I don't have further insight into why that is the case, but apparently multicasting to several ports will cause issues if at least one of them doesn't have DEV_MAC_ENA_CFG_TX_ENA set. I am not sure what the state of the Ocelot VSC7514 driver is, but probably not as bad as Felix/Seville, since VSC7514 uses phylib and has the following in ocelot_adjust_link: if (!phydev->link) return; therefore the port is not really put down when the link is lost, unlike the DSA drivers which use .phylink_mac_link_down for that. Nonetheless, I put ocelot_port_flush() in the common ocelot.c because it needs to access some registers from drivers/net/ethernet/mscc/ocelot_rew.h which are not exported in include/soc/mscc/ and a bugfix patch should probably not move headers around. Fixes: bdeced75b13f ("net: dsa: felix: Add PCS operations for PHYLINK") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-08switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STATHoratiu Vultur
Now that MRP started to use also SWITCHDEV_ATTR_ID_PORT_STP_STATE to notify HW, then SWITCHDEV_ATTR_ID_MRP_PORT_STAT is not used anywhere else, therefore we can remove it. Fixes: c284b545900830 ("switchdev: mrp: Extend switchdev API to offload MRP") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-08net: watchdog: hold device global xmit lock during tx disableEdwin Peer
Prevent netif_tx_disable() running concurrently with dev_watchdog() by taking the device global xmit lock. Otherwise, the recommended: netif_carrier_off(dev); netif_tx_disable(dev); driver shutdown sequence can happen after the watchdog has already checked carrier, resulting in possible false alarms. This is because netif_tx_lock() only sets the frozen bit without maintaining the locks on the individual queues. Fixes: c3f26a269c24 ("netdev: Fix lockdep warnings in multiqueue configurations.") Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-07Merge tag 'irq_urgent_for_v5.11_rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Prevent device managed IRQ allocation helpers from returning IRQ 0 - A fix for MSI activation of PCI endpoints with multiple MSIs * tag 'irq_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Prevent [devm_]irq_alloc_desc from returning irq 0 genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
2021-02-07Merge tag 'core_urgent_for_v5.11_rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull syscall entry fixes from Borislav Petkov: - For syscall user dispatch, separate prctl operation from syscall redirection range specification before the API has been made official in 5.11. - Ensure tasks using the generic syscall code do trap after returning from a syscall when single-stepping is requested. * tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: entry: Use different define for selector variable in SUD entry: Ensure trap after single-step on system call return
2021-02-06entry: Use different define for selector variable in SUDGabriel Krisman Bertazi
Michael Kerrisk suggested that, from an API perspective, it is a bad idea to share the PR_SYS_DISPATCH_ defines between the prctl operation and the selector variable. Therefore, define two new constants to be used by SUD's selector variable and update the corresponding documentation and test cases. While this changes the API syscall user dispatch has never been part of a Linux release, it will show up for the first time in 5.11. Suggested-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210205184321.2062251-1-krisman@collabora.com
2021-02-06entry: Ensure trap after single-step on system call returnGabriel Krisman Bertazi
Commit 299155244770 ("entry: Drop usage of TIF flags in the generic syscall code") introduced a bug on architectures using the generic syscall entry code, in which processes stopped by PTRACE_SYSCALL do not trap on syscall return after receiving a TIF_SINGLESTEP. The reason is that the meaning of TIF_SINGLESTEP flag is overloaded to cause the trap after a system call is executed, but since the above commit, the syscall call handler only checks for the SYSCALL_WORK flags on the exit work. Split the meaning of TIF_SINGLESTEP such that it only means single-step mode, and create a new type of SYSCALL_WORK to request a trap immediately after a syscall in single-step mode. In the current implementation, the SYSCALL_WORK flag shadows the TIF_SINGLESTEP flag for simplicity. Update x86 to flip this bit when a tracer enables single stepping. Fixes: 299155244770 ("entry: Drop usage of TIF flags in the generic syscall code") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Kyle Huey <me@kylehuey.com> Link: https://lore.kernel.org/r/87h7mtc9pr.fsf_-_@collabora.com
2021-02-05Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "18 patches. Subsystems affected by this patch series: mm (hugetlb, compaction, vmalloc, shmem, memblock, pagecache, kasan, and hugetlb), mailmap, gcov, ubsan, and MAINTAINERS" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: MAINTAINERS/.mailmap: use my @kernel.org address mm: hugetlb: fix missing put_page in gather_surplus_pages() ubsan: implement __ubsan_handle_alignment_assumption kasan: make addr_has_metadata() return true for valid addresses kasan: add explicit preconditions to kasan_report() mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked() mailmap: add entries for Manivannan Sadhasivam mailmap: fix name/email for Viresh Kumar memblock: do not start bottom-up allocations with kernel_end mm: thp: fix MADV_REMOVE deadlock on shmem THP init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov mm/vmalloc: separate put pages and flush VM flags mm, compaction: move high_pfn to the for loop scope mm: migrate: do not migrate HugeTLB page whose refcount is one mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active mm: hugetlb: fix a race between isolating and freeing page mm: hugetlb: fix a race between freeing and dissolving the page mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
2021-02-05genirq: Prevent [devm_]irq_alloc_desc from returning irq 0Hans de Goede
Since commit a85a6c86c25b ("driver core: platform: Clarify that IRQ 0 is invalid"), having a linux-irq with number 0 will trigger a WARN() when calling platform_get_irq*() to retrieve that linux-irq. Since [devm_]irq_alloc_desc allocs a single irq and since irq 0 is not used on some systems, it can return 0, triggering that WARN(). This happens e.g. on Intel Bay Trail and Cherry Trail devices using the LPE audio engine for HDMI audio: 0 is an invalid IRQ number WARNING: CPU: 3 PID: 472 at drivers/base/platform.c:238 platform_get_irq_optional+0x108/0x180 Modules linked in: snd_hdmi_lpe_audio(+) ... Call Trace: platform_get_irq+0x17/0x30 hdmi_lpe_audio_probe+0x4a/0x6c0 [snd_hdmi_lpe_audio] ---[ end trace ceece38854223a0b ]--- Change the 'from' parameter passed to __[devm_]irq_alloc_descs() by the [devm_]irq_alloc_desc macros from 0 to 1, so that these macros will no longer return 0. Fixes: a85a6c86c25b ("driver core: platform: Clarify that IRQ 0 is invalid") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201221185647.226146-1-hdegoede@redhat.com
2021-02-05kasan: add explicit preconditions to kasan_report()Vincenzo Frascino
Patch series "kasan: Fix metadata detection for KASAN_HW_TAGS", v5. With the introduction of KASAN_HW_TAGS, kasan_report() currently assumes that every location in memory has valid metadata associated. This is due to the fact that addr_has_metadata() returns always true. As a consequence of this, an invalid address (e.g. NULL pointer address) passed to kasan_report() when KASAN_HW_TAGS is enabled, leads to a kernel panic. Example below, based on arm64: BUG: KASAN: invalid-access in 0x0 Read at addr 0000000000000000 by task swapper/0/1 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x96000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 ... Call trace: mte_get_mem_tag+0x24/0x40 kasan_report+0x1a4/0x410 alsa_sound_last_init+0x8c/0xa4 do_one_initcall+0x50/0x1b0 kernel_init_freeable+0x1d4/0x23c kernel_init+0x14/0x118 ret_from_fork+0x10/0x34 Code: d65f03c0 9000f021 f9428021 b6cfff61 (d9600000) ---[ end trace 377c8bb45bdd3a1a ]--- hrtimer: interrupt took 48694256 ns note: swapper/0[1] exited with preempt_count 1 Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b SMP: stopping secondary CPUs Kernel Offset: 0x35abaf140000 from 0xffff800010000000 PHYS_OFFSET: 0x40000000 CPU features: 0x0a7e0152,61c0a030 Memory Limit: none ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]--- This series fixes the behavior of addr_has_metadata() that now returns true only when the address is valid. This patch (of 2): With the introduction of KASAN_HW_TAGS, kasan_report() accesses the metadata only when addr_has_metadata() succeeds. Add a comment to make sure that the preconditions to the function are explicitly clarified. Link: https://lkml.kernel.org/r/20210126134409.47894-1-vincenzo.frascino@arm.com Link: https://lkml.kernel.org/r/20210126134409.47894-2-vincenzo.frascino@arm.com Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05mm/vmalloc: separate put pages and flush VM flagsRick Edgecombe
When VM_MAP_PUT_PAGES was added, it was defined with the same value as VM_FLUSH_RESET_PERMS. This doesn't seem like it will cause any big functional problems other than some excess flushing for VM_MAP_PUT_PAGES allocations. Redefine VM_MAP_PUT_PAGES to have its own value. Also, rearrange things so flags are less likely to be missed in the future. Link: https://lkml.kernel.org/r/20210122233706.9304-1-rick.p.edgecombe@intel.com Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap") Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Axtens <dja@axtens.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB pageMuchun Song
If a new hugetlb page is allocated during fallocate it will not be marked as active (set_page_huge_active) which will result in a later isolate_huge_page failure when the page migration code would like to move that page. Such a failure would be unexpected and wrong. Only export set_page_huge_active, just leave clear_page_huge_active as static. Because there are no external users. Link: https://lkml.kernel.org/r/20210115124942.46403-3-songmuchun@bytedance.com Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate()) Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05Merge tag 'iommu-fixes-v5.11-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fix from Joerg Roedel: "Fix a possible NULL-ptr dereference in dev_iommu_priv_get() which is too easy to accidentially trigger from IOMMU drivers. In the current case the AMD IOMMU driver triggered it on some machines in the IO-page-fault path, so fix it once and for all" * tag 'iommu-fixes-v5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Check dev->iommu in dev_iommu_priv_get() before dereferencing it
2021-02-05Merge tag 'drm-fixes-2021-02-05-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Fixes for rc7, bit bigger than I'd like at this stage, but most of the i915 stuff and some amdgpu is destined for staging and I'd rather not hold it up, the i915 changes also pulled in a few precusor code movement patches to make things cleaner, but nothing seems that horrible, and I've checked over all of it. Otherwise there is a nouveau dma-api warning regression, and a ttm page allocation warning fix, and some fixes for a bridge chip, ttm: - fix huge page warning regression i915: - Skip vswing programming for TBT - Power up combo PHY lanes for HDMI - Fix double YUV range correction on HDR planes - Fix the MST PBN divider calculation - Fix LTTPR vswing/pre-emp setting in non-transparent mode - Move the breadcrumb to the signaler if completed upon cancel - Close race between enable_breadcrumbs and cancel_breadcrumbs - Drop lru bumping on display unpinning amdgpu: - Fix retry in gem create - Vangogh fixes - Fix for display from shared buffers - Various display fixes amdkfd: - Fix regression in buffer free nouveau: - fix DMA API warning regression drm/bridge/lontium-lt9611uxc: - EDID fixes - Don't handle hotplug events in IRQ handler" * tag 'drm-fixes-2021-02-05-1' of git://anongit.freedesktop.org/drm/drm: (29 commits) drm/nouveau: fix dma syncing warning with debugging on. drm/amd/display: Decrement refcount of dc_sink before reassignment drm/amd/display: Free atomic state after drm_atomic_commit drm/amd/display: Fix dc_sink kref count in emulated_link_detect drm/amd/display: Release DSC before acquiring drm/amd/display: Revert "Fix EDID parsing after resume from suspend" drm/amd/display: Add more Clock Sources to DCN2.1 drm/amd/display: reuse current context instead of recreating one drm/amd/display: Fix DPCD translation for LTTPR AUX_RD_INTERVAL drm/amdgpu: enable freesync for A+A configs drm/amd/pm: fill in the data member of v2 gpu metrics table for vangogh drm/amdgpu/gfx10: update CGTS_TCC_DISABLE and CGTS_USER_TCC_DISABLE register offsets for VGH drm/amdkfd: fix null pointer panic while free buffer in kfd drm/amdgpu: fix the issue that retry constantly once the buffer is oversize drm/i915/dp: Fix LTTPR vswing/pre-emp setting in non-transparent mode drm/i915/dp: Move intel_dp_set_signal_levels() to intel_dp_link_training.c drm/i915: Fix the MST PBN divider calculation drm/dp/mst: Export drm_dp_get_vc_payload_bw() drm/i915/gem: Drop lru bumping on display unpinning drm/i915/gt: Close race between enable_breadcrumbs and cancel_breadcrumbs ...
2021-02-04udp: fix skb_copy_and_csum_datagram with odd segment sizesWillem de Bruijn
When iteratively computing a checksum with csum_block_add, track the offset "pos" to correctly rotate in csum_block_add when offset is odd. The open coded implementation of skb_copy_and_csum_datagram did this. With the switch to __skb_datagram_iter calling csum_and_copy_to_iter, pos was reinitialized to 0 on each call. Bring back the pos by passing it along with the csum to the callback. Changes v1->v2 - pass csum value, instead of csump pointer (Alexander Duyck) Link: https://lore.kernel.org/netdev/20210128152353.GB27281@optiplex/ Fixes: 950fcaecd5cc ("datagram: consolidate datagram copy to iter helpers") Reported-by: Oliver Graute <oliver.graute@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20210203192952.1849843-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-03Merge tag 'trace-v5.11-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Initialize tracing-graph-pause at task creation, not start of function tracing, to avoid corrupting the pause counter. - Set "pause-on-trace" for latency tracers as that option breaks their output (regression). - Fix the wrong error return for setting kretprobes on future modules (before they are loaded). - Fix re-registering the same kretprobe. - Add missing value check for added RCU variable reload. * tag 'trace-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracepoint: Fix race between tracing and removing tracepoint kretprobe: Avoid re-registration of the same kretprobe earlier tracing/kprobe: Fix to support kretprobe events on unloaded modules tracing: Use pause-on-trace with the latency tracers fgraph: Initialize tracing_graph_pause at task creation
2021-02-02Merge tag 'net-5.11-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.11-rc7, including fixes from bpf and mac80211 trees. Current release - regressions: - ip_tunnel: fix mtu calculation - mlx5: fix function calculation for page trees Previous releases - regressions: - vsock: fix the race conditions in multi-transport support - neighbour: prevent a dead entry from updating gc_list - dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add Previous releases - always broken: - bpf, cgroup: two copy_{from,to}_user() warn_on_once splats for BPF cgroup getsockopt infra when user space is trying to race against optlen, from Loris Reiff. - bpf: add missing fput() in BPF inode storage map update helper - udp: ipv4: manipulate network header of NATed UDP GRO fraglist - mac80211: fix station rate table updates on assoc - r8169: work around RTL8125 UDP HW bug - igc: report speed and duplex as unknown when device is runtime suspended - rxrpc: fix deadlock around release of dst cached on udp tunnel" * tag 'net-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits) net: hsr: align sup_multicast_addr in struct hsr_priv to u16 boundary net: ipa: fix two format specifier errors net: ipa: use the right accessor in ipa_endpoint_status_skip() net: ipa: be explicit about endianness net: ipa: add a missing __iomem attribute net: ipa: pass correct dma_handle to dma_free_coherent() r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set net/rds: restrict iovecs length for RDS_CMSG_RDMA_ARGS net: mvpp2: TCAM entry enable should be written after SRAM data net: lapb: Copy the skb before sending a packet net/mlx5e: Release skb in case of failure in tc update skb net/mlx5e: Update max_opened_tc also when channels are closed net/mlx5: Fix leak upon failure of rule creation net/mlx5: Fix function calculation for page trees docs: networking: swap words in icmp_errors_use_inbound_ifaddr doc udp: ipv4: manipulate network header of NATed UDP GRO fraglist net: ip_tunnel: fix mtu calculation vsock: fix the race conditions in multi-transport support net: sched: replaced invalid qdisc tree flush helper in qdisc_replace ibmvnic: device remove has higher precedence over reset ...
2021-02-02drm/dp/mst: Export drm_dp_get_vc_payload_bw()Imre Deak
This function will be needed by the next patch where the driver calculates the BW based on driver specific parameters, so export it. At the same time sanitize the function params, passing the more natural link rate instead of the encoding of the same rate. v2: - Fix function documentation. (Lyude) Cc: Lyude Paul <lyude@redhat.com> Cc: Ville Syrjala <ville.syrjala@intel.com> Cc: <stable@vger.kernel.org> Cc: dri-devel@lists.freedesktop.org Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210125173636.1733812-1-imre.deak@intel.com (cherry picked from commit a321fc2b4e60fc1b39517d26c8104351636a6062) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-02-02iommu: Check dev->iommu in dev_iommu_priv_get() before dereferencing itJoerg Roedel
The dev_iommu_priv_get() needs a similar check to dev_iommu_fwspec_get() to make sure no NULL-ptr is dereferenced. Fixes: 05a0542b456e1 ("iommu/amd: Store dev_data as device iommu private data") Cc: stable@vger.kernel.org # v5.8+ Link: https://lore.kernel.org/r/20210202145419.29143-1-joro@8bytes.org Reference: https://bugzilla.kernel.org/show_bug.cgi?id=211241 Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-02tracepoint: Fix race between tracing and removing tracepointAlexey Kardashevskiy
When executing a tracepoint, the tracepoint's func is dereferenced twice - in __DO_TRACE() (where the returned pointer is checked) and later on in __traceiter_##_name where the returned pointer is dereferenced without checking which leads to races against tracepoint_removal_sync() and crashes. This adds a check before referencing the pointer in tracepoint_ptr_deref. Link: https://lkml.kernel.org/r/20210202072326.120557-1-aik@ozlabs.ru Cc: stable@vger.kernel.org Fixes: d25e37d89dd2f ("tracepoint: Optimize using static_call()") Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-01udp: ipv4: manipulate network header of NATed UDP GRO fraglistDongseok Yi
UDP/IP header of UDP GROed frag_skbs are not updated even after NAT forwarding. Only the header of head_skb from ip_finish_output_gso -> skb_gso_segment is updated but following frag_skbs are not updated. A call path skb_mac_gso_segment -> inet_gso_segment -> udp4_ufo_fragment -> __udp_gso_segment -> __udp_gso_segment_list does not try to update UDP/IP header of the segment list but copy only the MAC header. Update port, addr and check of each skb of the segment list in __udp_gso_segment_list. It covers both SNAT and DNAT. Fixes: 9fd1ff5d2ac7 (udp: Support UDP fraglist GRO/GSO.) Signed-off-by: Dongseok Yi <dseok.yi@samsung.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Link: https://lore.kernel.org/r/1611962007-80092-1-git-send-email-dseok.yi@samsung.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-01net: sched: replaced invalid qdisc tree flush helper in qdisc_replaceAlexander Ovechkin
Commit e5f0e8f8e456 ("net: sched: introduce and use qdisc tree flush/purge helpers") introduced qdisc tree flush/purge helpers, but erroneously used flush helper instead of purge helper in qdisc_replace function. This issue was found in our CI, that tests various qdisc setups by configuring qdisc and sending data through it. Call of invalid helper sporadically leads to corruption of vt_tree/cf_tree of hfsc_class that causes kernel oops: Oops: 0000 [#1] SMP PTI CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.11.0-8f6859df #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:rb_insert_color+0x18/0x190 Code: c3 31 c0 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 48 8b 07 48 85 c0 0f 84 05 01 00 00 48 8b 10 f6 c2 01 0f 85 34 01 00 00 <48> 8b 4a 08 49 89 d0 48 39 c1 74 7d 48 85 c9 74 32 f6 01 01 75 2d RSP: 0018:ffffc900000b8bb0 EFLAGS: 00010246 RAX: ffff8881ef4c38b0 RBX: ffff8881d956e400 RCX: ffff8881ef4c38b0 RDX: 0000000000000000 RSI: ffff8881d956f0a8 RDI: ffff8881d956e4b0 RBP: 0000000000000000 R08: 000000d5c4e249da R09: 1600000000000000 R10: ffffc900000b8be0 R11: ffffc900000b8b28 R12: 0000000000000001 R13: 000000000000005a R14: ffff8881f0905000 R15: ffff8881f0387d00 FS: 0000000000000000(0000) GS:ffff8881f8b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 00000001f4796004 CR4: 0000000000060ee0 Call Trace: <IRQ> init_vf.isra.19+0xec/0x250 [sch_hfsc] hfsc_enqueue+0x245/0x300 [sch_hfsc] ? fib_rules_lookup+0x12a/0x1d0 ? __dev_queue_xmit+0x4b6/0x930 ? hfsc_delete_class+0x250/0x250 [sch_hfsc] __dev_queue_xmit+0x4b6/0x930 ? ip6_finish_output2+0x24d/0x590 ip6_finish_output2+0x24d/0x590 ? ip6_output+0x6c/0x130 ip6_output+0x6c/0x130 ? __ip6_finish_output+0x110/0x110 mld_sendpack+0x224/0x230 mld_ifc_timer_expire+0x186/0x2c0 ? igmp6_group_dropped+0x200/0x200 call_timer_fn+0x2d/0x150 run_timer_softirq+0x20c/0x480 ? tick_sched_do_timer+0x60/0x60 ? tick_sched_timer+0x37/0x70 __do_softirq+0xf7/0x2cb irq_exit+0xa0/0xb0 smp_apic_timer_interrupt+0x74/0x150 apic_timer_interrupt+0xf/0x20 </IRQ> Fixes: e5f0e8f8e456 ("net: sched: introduce and use qdisc tree flush/purge helpers") Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru> Reported-by: Alexander Kuznetsov <wwfq@yandex-team.ru> Acked-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru> Acked-by: Dmitry Yakunin <zeil@yandex-team.ru> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://lore.kernel.org/r/20210201200049.299153-1-ovov@yandex-team.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>