commit 10ad6cfd57360760116cde00a8ef756e121367a9 Author: Greg Kroah-Hartman Date: Sat Sep 26 18:01:33 2020 +0200 Linux 4.19.148 Tested-by: Jon Hunter Tested-by: Shuah Khan Tested-by: Linux Kernel Functional Testing Tested-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman Link: https://lore.kernel.org/r/20200925124720.972208530@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit 8b4846ac1af4b0c99817aee7304e9f5dd6ffcb56 Author: Lukas Wunner Date: Tue May 12 14:40:01 2020 +0200 serial: 8250: Avoid error message on reprobe commit e0a851fe6b9b619527bd928aa93caaddd003f70c upstream. If the call to uart_add_one_port() in serial8250_register_8250_port() fails, a half-initialized entry in the serial_8250ports[] array is left behind. A subsequent reprobe of the same serial port causes that entry to be reused. Because uart->port.dev is set, uart_remove_one_port() is called for the half-initialized entry and bails out with an error message: bcm2835-aux-uart 3f215040.serial: Removing wrong port: (null) != (ptrval) The same happens on failure of mctrl_gpio_init() since commit 4a96895f74c9 ("tty/serial/8250: use mctrl_gpio helpers"). Fix by zeroing the uart->port.dev pointer in the probe error path. The bug was introduced in v2.6.10 by historical commit befff6f5bf5f ("[SERIAL] Add new port registration/unregistration functions."): https://git.kernel.org/tglx/history/c/befff6f5bf5f The commit added an unconditional call to uart_remove_one_port() in serial8250_register_port(). In v3.7, commit 835d844d1a28 ("8250_pnp: do pnp probe before legacy probe") made that call conditional on uart->port.dev which allows me to fix the issue by zeroing that pointer in the error path. Thus, the present commit will fix the problem as far back as v3.7 whereas still older versions need to also cherry-pick 835d844d1a28. Fixes: 835d844d1a28 ("8250_pnp: do pnp probe before legacy probe") Signed-off-by: Lukas Wunner Cc: stable@vger.kernel.org # v2.6.10 Cc: stable@vger.kernel.org # v2.6.10: 835d844d1a28: 8250_pnp: do pnp probe before legacy Reviewed-by: Andy Shevchenko Link: https://lore.kernel.org/r/b4a072013ee1a1d13ee06b4325afb19bda57ca1b.1589285873.git.lukas@wunner.de [iwamatsu: Backported to 4.14, 4.19: adjust context] Signed-off-by: Nobuhiro Iwamatsu (CIP) Signed-off-by: Greg Kroah-Hartman commit 610058f519b579e38f9be0715ec9f73697e5d40d Author: Priyaranjan Jha Date: Wed Jan 23 12:04:54 2019 -0800 tcp_bbr: adapt cwnd based on ack aggregation estimation commit 78dc70ebaa38aa303274e333be6c98eef87619e2 upstream. Aggregation effects are extremely common with wifi, cellular, and cable modem link technologies, ACK decimation in middleboxes, and LRO and GRO in receiving hosts. The aggregation can happen in either direction, data or ACKs, but in either case the aggregation effect is visible to the sender in the ACK stream. Previously BBR's sending was often limited by cwnd under severe ACK aggregation/decimation because BBR sized the cwnd at 2*BDP. If packets were acked in bursts after long delays (e.g. one ACK acking 5*BDP after 5*RTT), BBR's sending was halted after sending 2*BDP over 2*RTT, leaving the bottleneck idle for potentially long periods. Note that loss-based congestion control does not have this issue because when facing aggregation it continues increasing cwnd after bursts of ACKs, growing cwnd until the buffer is full. To achieve good throughput in the presence of aggregation effects, this algorithm allows the BBR sender to put extra data in flight to keep the bottleneck utilized during silences in the ACK stream that it has evidence to suggest were caused by aggregation. A summary of the algorithm: when a burst of packets are acked by a stretched ACK or a burst of ACKs or both, BBR first estimates the expected amount of data that should have been acked, based on its estimated bandwidth. Then the surplus ("extra_acked") is recorded in a windowed-max filter to estimate the recent level of observed ACK aggregation. Then cwnd is increased by the ACK aggregation estimate. The larger cwnd avoids BBR being cwnd-limited in the face of ACK silences that recent history suggests were caused by aggregation. As a sanity check, the ACK aggregation degree is upper-bounded by the cwnd (at the time of measurement) and a global max of BW * 100ms. The algorithm is further described by the following presentation: https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00 In our internal testing, we observed a significant increase in BBR throughput (measured using netperf), in a basic wifi setup. - Host1 (sender on ethernet) -> AP -> Host2 (receiver on wifi) - 2.4 GHz -> BBR before: ~73 Mbps; BBR after: ~102 Mbps; CUBIC: ~100 Mbps - 5.0 GHz -> BBR before: ~362 Mbps; BBR after: ~593 Mbps; CUBIC: ~601 Mbps Also, this code is running globally on YouTube TCP connections and produced significant bandwidth increases for YouTube traffic. This is based on Ian Swett's max_ack_height_ algorithm from the QUIC BBR implementation. Signed-off-by: Priyaranjan Jha Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a56eb38acc700684f365740993ce3ddedcbb9152 Author: Priyaranjan Jha Date: Wed Jan 23 12:04:53 2019 -0800 tcp_bbr: refactor bbr_target_cwnd() for general inflight provisioning commit 232aa8ec3ed979d4716891540c03a806ecab0c37 upstream. Because bbr_target_cwnd() is really a general-purpose BBR helper for computing some volume of inflight data as a function of the estimated BDP, refactor it into following helper functions: - bbr_bdp() - bbr_quantization_budget() - bbr_inflight() Signed-off-by: Priyaranjan Jha Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1aa7a9e5eebc5c40f0a5ea4e4cb8e8bd0267aea1 Author: Xunlei Pang Date: Fri Sep 4 16:35:27 2020 -0700 mm: memcg: fix memcg reclaim soft lockup commit e3336cab2579012b1e72b5265adf98e2d6e244ad upstream. We've met softlockup with "CONFIG_PREEMPT_NONE=y", when the target memcg doesn't have any reclaimable memory. It can be easily reproduced as below: watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204] CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12 Call Trace: shrink_lruvec+0x49f/0x640 shrink_node+0x2a6/0x6f0 do_try_to_free_pages+0xe9/0x3e0 try_to_free_mem_cgroup_pages+0xef/0x1f0 try_charge+0x2c1/0x750 mem_cgroup_charge+0xd7/0x240 __add_to_page_cache_locked+0x2fd/0x370 add_to_page_cache_lru+0x4a/0xc0 pagecache_get_page+0x10b/0x2f0 filemap_fault+0x661/0xad0 ext4_filemap_fault+0x2c/0x40 __do_fault+0x4d/0xf9 handle_mm_fault+0x1080/0x1790 It only happens on our 1-vcpu instances, because there's no chance for oom reaper to run to reclaim the to-be-killed process. Add a cond_resched() at the upper shrink_node_memcgs() to solve this issue, this will mean that we will get a scheduling point for each memcg in the reclaimed hierarchy without any dependency on the reclaimable memory in that memcg thus making it more predictable. Suggested-by: Michal Hocko Signed-off-by: Xunlei Pang Signed-off-by: Andrew Morton Acked-by: Chris Down Acked-by: Michal Hocko Acked-by: Johannes Weiner Link: http://lkml.kernel.org/r/1598495549-67324-1-git-send-email-xlpang@linux.alibaba.com Signed-off-by: Linus Torvalds Signed-off-by: Julius Hemanth Pitti Signed-off-by: Greg Kroah-Hartman commit 7aaf09fd5c63ee9dc86325896abdfa47c54d39a9 Author: Masahiro Yamada Date: Wed Apr 8 10:36:23 2020 +0900 kbuild: support LLVM=1 to switch the default tools to Clang/LLVM commit a0d1c951ef08ed24f35129267e3595d86f57f5d3 upstream. As Documentation/kbuild/llvm.rst implies, building the kernel with a full set of LLVM tools gets very verbose and unwieldy. Provide a single switch LLVM=1 to use Clang and LLVM tools instead of GCC and Binutils. You can pass it from the command line or as an environment variable. Please note LLVM=1 does not turn on the integrated assembler. You need to pass LLVM_IAS=1 to use it. When the upstream kernel is ready for the integrated assembler, I think we can make it default. We discussed what we need, and we agreed to go with a simple boolean flag that switches both target and host tools: https://lkml.org/lkml/2020/3/28/494 https://lkml.org/lkml/2020/4/3/43 Some items discussed, but not adopted: - LLVM_DIR When multiple versions of LLVM are installed, I just thought supporting LLVM_DIR=/path/to/my/llvm/bin/ might be useful. CC = $(LLVM_DIR)clang LD = $(LLVM_DIR)ld.lld ... However, we can handle this by modifying PATH. So, we decided to not do this. - LLVM_SUFFIX Some distributions (e.g. Debian) package specific versions of LLVM with naming conventions that use the version as a suffix. CC = clang$(LLVM_SUFFIX) LD = ld.lld(LLVM_SUFFIX) ... will allow a user to pass LLVM_SUFFIX=-11 to use clang-11 etc., but the suffixed versions in /usr/bin/ are symlinks to binaries in /usr/lib/llvm-#/bin/, so this can also be handled by PATH. Signed-off-by: Masahiro Yamada Reviewed-by: Nathan Chancellor Tested-by: Nathan Chancellor # build Tested-by: Nick Desaulniers Reviewed-by: Nick Desaulniers Signed-off-by: Nick Desaulniers [nd: conflict in exported vars list from not backporting commit e83b9f55448a ("kbuild: add ability to generate BTF type info for vmlinux")] [nd: hunk against Documentation/kbuild/kbuild.rst dropped due to not backporting commit cd238effefa2 ("docs: kbuild: convert docs to ReST and rename to *.rst")] Signed-off-by: Greg Kroah-Hartman commit 459c7a844fcba41ab70f6247a1c2b4304939c221 Author: Masahiro Yamada Date: Wed Apr 8 10:36:22 2020 +0900 kbuild: replace AS=clang with LLVM_IAS=1 commit 7e20e47c70f810d678d02941fa3c671209c4ca97 upstream. The 'AS' variable is unused for building the kernel. Only the remaining usage is to turn on the integrated assembler. A boolean flag is a better fit for this purpose. AS=clang was added for experts. So, I replaced it with LLVM_IAS=1, breaking the backward compatibility. Suggested-by: Nick Desaulniers Signed-off-by: Masahiro Yamada Reviewed-by: Nathan Chancellor Reviewed-by: Nick Desaulniers Signed-off-by: Nick Desaulniers Signed-off-by: Greg Kroah-Hartman commit 0fbcb1294d3b3c80110575a82b3fa2ab812719d6 Author: Masahiro Yamada Date: Thu Mar 26 14:57:18 2020 +0900 kbuild: remove AS variable commit aa824e0c962b532d5073cbb41b2efcd6f5e72bae upstream. As commit 5ef872636ca7 ("kbuild: get rid of misleading $(AS) from documents") noted, we rarely use $(AS) directly in the kernel build. Now that the only/last user of $(AS) in drivers/net/wan/Makefile was converted to $(CC), $(AS) is no longer used in the build process. You can still pass in AS=clang, which is just a switch to turn on the LLVM integrated assembler. Signed-off-by: Masahiro Yamada Reviewed-by: Nick Desaulniers Tested-by: Nick Desaulniers Reviewed-by: Nathan Chancellor Signed-off-by: Nick Desaulniers [nd: conflict in exported vars list from not backporting commit e83b9f55448a ("kbuild: add ability to generate BTF type info for vmlinux")] Signed-off-by: Greg Kroah-Hartman commit 621150689b0992bc02389b2351c2cbf0bc5bd700 Author: Dmitry Golovin Date: Thu Dec 5 00:54:41 2019 +0200 x86/boot: kbuild: allow readelf executable to be specified commit eefb8c124fd969e9a174ff2bedff86aa305a7438 upstream. Introduce a new READELF variable to top-level Makefile, so the name of readelf binary can be specified. Before this change the name of the binary was hardcoded to "$(CROSS_COMPILE)readelf" which might not be present for every toolchain. This allows to build with LLVM Object Reader by using make parameter READELF=llvm-readelf. Link: https://github.com/ClangBuiltLinux/linux/issues/771 Signed-off-by: Dmitry Golovin Reviewed-by: Nick Desaulniers Signed-off-by: Masahiro Yamada Signed-off-by: Nick Desaulniers [nd: conflict in exported vars list from not backporting commit e83b9f55448a ("kbuild: add ability to generate BTF type info for vmlinux")] Signed-off-by: Greg Kroah-Hartman commit a1c015990071258650b3ec45dfa182bd20378de5 Author: Masahiro Yamada Date: Thu Mar 26 14:57:16 2020 +0900 net: wan: wanxl: use $(M68KCC) instead of $(M68KAS) for rebuilding firmware commit 734f3719d3438f9cc181d674c33ca9762e9148a1 upstream. The firmware source, wanxlfw.S, is currently compiled by the combo of $(CPP) and $(M68KAS). This is not what we usually do for compiling *.S files. In fact, this Makefile is the only user of $(AS) in the kernel build. Instead of combining $(CPP) and (AS) from different tool sets, using $(M68KCC) as an assembler driver is simpler, and saner. Signed-off-by: Masahiro Yamada Signed-off-by: Nick Desaulniers Signed-off-by: Greg Kroah-Hartman commit fb181ac6fe194b4eb20007f65504c61ea22826fd Author: Masahiro Yamada Date: Thu Mar 26 14:57:15 2020 +0900 net: wan: wanxl: use allow to pass CROSS_COMPILE_M68k for rebuilding firmware commit 63b903dfebdea92aa92ad337d8451a6fbfeabf9d upstream. As far as I understood from the Kconfig help text, this build rule is used to rebuild the driver firmware, which runs on an old m68k-based chip. So, you need m68k tools for the firmware rebuild. wanxl.c is a PCI driver, but CONFIG_M68K does not select CONFIG_HAVE_PCI. So, you cannot enable CONFIG_WANXL_BUILD_FIRMWARE for ARCH=m68k. In other words, ifeq ($(ARCH),m68k) is false here. I am keeping the dead code for now, but rebuilding the firmware requires 'as68k' and 'ld68k', which I do not have in hand. Instead, the kernel.org m68k GCC [1] successfully built it. Allowing a user to pass in CROSS_COMPILE_M68K= is handier. [1] https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/9.2.0/x86_64-gcc-9.2.0-nolibc-m68k-linux.tar.xz Suggested-by: Geert Uytterhoeven Signed-off-by: Masahiro Yamada Signed-off-by: Nick Desaulniers Signed-off-by: Greg Kroah-Hartman commit 98aeb8d9716fba2fcc64c9c18d915f749e4d69f0 Author: Fangrui Song Date: Thu Apr 2 10:38:42 2020 -0700 Documentation/llvm: fix the name of llvm-size commit 0f44fbc162b737ff6251ae248184390ae2279fee upstream. The tool is called llvm-size, not llvm-objsize. Fixes: fcf1b6a35c16 ("Documentation/llvm: add documentation on building w/ Clang/LLVM") Signed-off-by: Fangrui Song Reviewed-by: Nick Desaulniers Reviewed-by: Nathan Chancellor Signed-off-by: Masahiro Yamada Signed-off-by: Nick Desaulniers Signed-off-by: Greg Kroah-Hartman commit 948f0c02039b0ed9a2da283893e96e41c18b07fc Author: Nick Desaulniers Date: Wed Feb 26 15:23:36 2020 -0800 Documentation/llvm: add documentation on building w/ Clang/LLVM commit fcf1b6a35c16ac500fa908a4022238e5d666eabf upstream. added to kbuild documentation. Provides more official info on building kernels with Clang and LLVM than our wiki. Suggested-by: Kees Cook Reviewed-by: Kees Cook Reviewed-by: Nathan Chancellor Reviewed-by: Sedat Dilek Signed-off-by: Nick Desaulniers Signed-off-by: Masahiro Yamada [nd: hunk against Documentation/kbuild/index.rst dropped due to not backporting commit cd238effefa2 ("docs: kbuild: convert docs to ReST and rename to *.rst")] Signed-off-by: Greg Kroah-Hartman commit 31030d63d5b6253aa15b31b7240e035fcd2704e0 Author: Vasily Gorbik Date: Mon Jan 21 13:54:39 2019 +0100 kbuild: add OBJSIZE variable for the size tool commit 7bac98707f65b93bf994ef4e99b1eb9e7dbb9c32 upstream. Define and export OBJSIZE variable for "size" tool from binutils to be used in architecture specific Makefiles (naming the variable just "SIZE" would be too risky). In particular this tool is useful to perform checks that early boot code is not using bss section (which might have not been zeroed yet or intersects with initrd or other files boot loader might have put right after the linux kernel). Link: http://lkml.kernel.org/r/patch-1.thread-2257a1.git-188f5a3d81d5.your-ad-here.call-01565088755-ext-5120@work.hours Acked-by: Masahiro Yamada Signed-off-by: Vasily Gorbik [nd: conflict in exported vars list from not backporting commit e83b9f55448a ("kbuild: add ability to generate BTF type info for vmlinux")] Signed-off-by: Greg Kroah-Hartman commit e711de542260e8b2a5158534ded59ab9b1072f41 Author: Nick Desaulniers Date: Fri Jun 28 12:07:12 2019 -0700 MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info commit 8708e13c6a0600625eea3aebd027c0715a5d2bb2 upstream. Add keyword support so that our mailing list gets cc'ed for clang/llvm patches. We're pretty active on our mailing list so far as code review. There are numerous Googlers like myself that are paid to support building the Linux kernel with Clang and LLVM. Link: http://lkml.kernel.org/r/20190620001907.255803-1-ndesaulniers@google.com Signed-off-by: Nick Desaulniers Reviewed-by: Nathan Chancellor Cc: Joe Perches Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 98776a365da509ad923083ae54b38ee521c52742 Author: David Ahern Date: Mon Sep 14 21:03:54 2020 -0600 ipv4: Update exception handling for multipath routes via same device [ Upstream commit 2fbc6e89b2f1403189e624cabaf73e189c5e50c6 ] Kfir reported that pmtu exceptions are not created properly for deployments where multipath routes use the same device. After some digging I see 2 compounding problems: 1. ip_route_output_key_hash_rcu is updating the flowi4_oif *after* the route lookup. This is the second use case where this has been a problem (the first is related to use of vti devices with VRF). I can not find any reason for the oif to be changed after the lookup; the code goes back to the start of git. It does not seem logical so remove it. 2. fib_lookups for exceptions do not call fib_select_path to handle multipath route selection based on the hash. The end result is that the fib_lookup used to add the exception always creates it based using the first leg of the route. An example topology showing the problem: | host1 +------+ | eth0 | .209 +------+ | +------+ switch | br0 | +------+ | +---------+---------+ | host2 | host3 +------+ +------+ | eth0 | .250 | eth0 | 192.168.252.252 +------+ +------+ +-----+ +-----+ | vti | .2 | vti | 192.168.247.3 +-----+ +-----+ \ / ================================= tunnels 192.168.247.1/24 for h in host1 host2 host3; do ip netns add ${h} ip -netns ${h} link set lo up ip netns exec ${h} sysctl -wq net.ipv4.ip_forward=1 done ip netns add switch ip -netns switch li set lo up ip -netns switch link add br0 type bridge stp 0 ip -netns switch link set br0 up for n in 1 2 3; do ip -netns switch link add eth-sw type veth peer name eth-h${n} ip -netns switch li set eth-h${n} master br0 up ip -netns switch li set eth-sw netns host${n} name eth0 done ip -netns host1 addr add 192.168.252.209/24 dev eth0 ip -netns host1 link set dev eth0 up ip -netns host1 route add 192.168.247.0/24 \ nexthop via 192.168.252.250 dev eth0 nexthop via 192.168.252.252 dev eth0 ip -netns host2 addr add 192.168.252.250/24 dev eth0 ip -netns host2 link set dev eth0 up ip -netns host2 addr add 192.168.252.252/24 dev eth0 ip -netns host3 link set dev eth0 up ip netns add tunnel ip -netns tunnel li set lo up ip -netns tunnel li add br0 type bridge ip -netns tunnel li set br0 up for n in $(seq 11 20); do ip -netns tunnel addr add dev br0 192.168.247.${n}/24 done for n in 2 3 do ip -netns tunnel link add vti${n} type veth peer name eth${n} ip -netns tunnel link set eth${n} mtu 1360 master br0 up ip -netns tunnel link set vti${n} netns host${n} mtu 1360 up ip -netns host${n} addr add dev vti${n} 192.168.247.${n}/24 done ip -netns tunnel ro add default nexthop via 192.168.247.2 nexthop via 192.168.247.3 ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.11 ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.15 ip -netns host1 ro ls cache Before this patch the cache always shows exceptions against the first leg in the multipath route; 192.168.252.250 per this example. Since the hash has an initial random seed, you may need to vary the final octet more than what is listed. In my tests, using addresses between 11 and 19 usually found 1 that used both legs. With this patch, the cache will have exceptions for both legs. Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions") Reported-by: Kfir Itzhak Signed-off-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f424617e01dce6a220892fce26afc0abef952e1b Author: Eric Dumazet Date: Wed Sep 9 01:27:40 2020 -0700 net: add __must_check to skb_put_padto() [ Upstream commit 4a009cb04aeca0de60b73f37b102573354214b52 ] skb_put_padto() and __skb_put_padto() callers must check return values or risk use-after-free. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 771443a2ffe189f5f51653d21e949b436c9da09a Author: Eric Dumazet Date: Wed Sep 9 01:27:39 2020 -0700 net: qrtr: check skb_put_padto() return value [ Upstream commit 3ca1a42a52ca4b4f02061683851692ad65fefac8 ] If skb_put_padto() returns an error, skb has been freed. Better not touch it anymore, as reported by syzbot [1] Note to qrtr maintainers : this suggests qrtr_sendmsg() should adjust sock_alloc_send_skb() second parameter to account for the potential added alignment to avoid reallocation. [1] BUG: KASAN: use-after-free in __skb_insert include/linux/skbuff.h:1907 [inline] BUG: KASAN: use-after-free in __skb_queue_before include/linux/skbuff.h:2016 [inline] BUG: KASAN: use-after-free in __skb_queue_tail include/linux/skbuff.h:2049 [inline] BUG: KASAN: use-after-free in skb_queue_tail+0x6b/0x120 net/core/skbuff.c:3146 Write of size 8 at addr ffff88804d8ab3c0 by task syz-executor.4/4316 CPU: 1 PID: 4316 Comm: syz-executor.4 Not tainted 5.9.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1d6/0x29e lib/dump_stack.c:118 print_address_description+0x66/0x620 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report+0x132/0x1d0 mm/kasan/report.c:530 __skb_insert include/linux/skbuff.h:1907 [inline] __skb_queue_before include/linux/skbuff.h:2016 [inline] __skb_queue_tail include/linux/skbuff.h:2049 [inline] skb_queue_tail+0x6b/0x120 net/core/skbuff.c:3146 qrtr_tun_send+0x1a/0x40 net/qrtr/tun.c:23 qrtr_node_enqueue+0x44f/0xc00 net/qrtr/qrtr.c:364 qrtr_bcast_enqueue+0xbe/0x140 net/qrtr/qrtr.c:861 qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] sock_write_iter+0x317/0x470 net/socket.c:998 call_write_iter include/linux/fs.h:1882 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0xa96/0xd10 fs/read_write.c:578 ksys_write+0x11b/0x220 fs/read_write.c:631 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45d5b9 Code: 5d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f84b5b81c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000038b40 RCX: 000000000045d5b9 RDX: 0000000000000055 RSI: 0000000020001240 RDI: 0000000000000003 RBP: 00007f84b5b81ca0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000f R13: 00007ffcbbf86daf R14: 00007f84b5b829c0 R15: 000000000118cf4c Allocated by task 4316: kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461 slab_post_alloc_hook+0x3e/0x290 mm/slab.h:518 slab_alloc mm/slab.c:3312 [inline] kmem_cache_alloc+0x1c1/0x2d0 mm/slab.c:3482 skb_clone+0x1b2/0x370 net/core/skbuff.c:1449 qrtr_bcast_enqueue+0x6d/0x140 net/qrtr/qrtr.c:857 qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] sock_write_iter+0x317/0x470 net/socket.c:998 call_write_iter include/linux/fs.h:1882 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0xa96/0xd10 fs/read_write.c:578 ksys_write+0x11b/0x220 fs/read_write.c:631 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 4316: kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track+0x3d/0x70 mm/kasan/common.c:56 kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355 __kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422 __cache_free mm/slab.c:3418 [inline] kmem_cache_free+0x82/0xf0 mm/slab.c:3693 __skb_pad+0x3f5/0x5a0 net/core/skbuff.c:1823 __skb_put_padto include/linux/skbuff.h:3233 [inline] skb_put_padto include/linux/skbuff.h:3252 [inline] qrtr_node_enqueue+0x62f/0xc00 net/qrtr/qrtr.c:360 qrtr_bcast_enqueue+0xbe/0x140 net/qrtr/qrtr.c:861 qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] sock_write_iter+0x317/0x470 net/socket.c:998 call_write_iter include/linux/fs.h:1882 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0xa96/0xd10 fs/read_write.c:578 ksys_write+0x11b/0x220 fs/read_write.c:631 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy address belongs to the object at ffff88804d8ab3c0 which belongs to the cache skbuff_head_cache of size 224 The buggy address is located 0 bytes inside of 224-byte region [ffff88804d8ab3c0, ffff88804d8ab4a0) The buggy address belongs to the page: page:00000000ea8cccfb refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88804d8abb40 pfn:0x4d8ab flags: 0xfffe0000000200(slab) raw: 00fffe0000000200 ffffea0002237ec8 ffffea00029b3388 ffff88821bb66800 raw: ffff88804d8abb40 ffff88804d8ab000 000000010000000b 0000000000000000 page dumped because: kasan: bad access detected Fixes: ce57785bf91b ("net: qrtr: fix len of skb_put_padto in qrtr_node_enqueue") Signed-off-by: Eric Dumazet Reported-by: syzbot Cc: Carl Huang Cc: Wen Gong Cc: Bjorn Andersson Cc: Manivannan Sadhasivam Acked-by: Manivannan Sadhasivam Reviewed-by: Bjorn Andersson Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e9ee8b696d116e9e5a375de811fb0929ef2a5139 Author: Florian Fainelli Date: Wed Sep 16 20:43:09 2020 -0700 net: phy: Avoid NPD upon phy_detach() when driver is unbound [ Upstream commit c2b727df7caa33876e7066bde090f40001b6d643 ] If we have unbound the PHY driver prior to calling phy_detach() (often via phy_disconnect()) then we can cause a NULL pointer de-reference accessing the driver owner member. The steps to reproduce are: echo unimac-mdio-0:01 > /sys/class/net/eth0/phydev/driver/unbind ip link set eth0 down Fixes: cafe8df8b9bc ("net: phy: Fix lack of reference count on PHY driver") Signed-off-by: Florian Fainelli Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ee0491c2906a352d1575bd6073ad7b3115568861 Author: Michael Chan Date: Sun Sep 20 21:08:56 2020 -0400 bnxt_en: Protect bnxt_set_eee() and bnxt_set_pauseparam() with mutex. [ Upstream commit a53906908148d64423398a62c4435efb0d09652c ] All changes related to bp->link_info require the protection of the link_lock mutex. It's not sufficient to rely just on RTNL. Fixes: 163e9ef63641 ("bnxt_en: Fix race when modifying pause settings.") Reviewed-by: Edwin Peer Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1627f9325dbea4778d150f0b83b01f5883129a16 Author: Edwin Peer Date: Sun Sep 20 21:08:55 2020 -0400 bnxt_en: return proper error codes in bnxt_show_temp [ Upstream commit d69753fa1ecb3218b56b022722f7a5822735b876 ] Returning "unknown" as a temperature value violates the hwmon interface rules. Appropriate error codes should be returned via device_attribute show instead. These will ultimately be propagated to the user via the file system interface. In addition to the corrected error handling, it is an even better idea to not present the sensor in sysfs at all if it is known that the read will definitely fail. Given that temp1_input is currently the only sensor reported, ensure no hwmon registration if TEMP_MONITOR_QUERY is not supported or if it will fail due to access permissions. Something smarter may be needed if and when other sensors are added. Fixes: 12cce90b934b ("bnxt_en: fix HWRM error when querying VF temperature") Signed-off-by: Edwin Peer Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b15fcca8eff903c4a9a50336f5bd8a208ca45df7 Author: Xin Long Date: Sun Sep 13 19:37:31 2020 +0800 tipc: use skb_unshare() instead in tipc_buf_append() [ Upstream commit ff48b6222e65ebdba5a403ef1deba6214e749193 ] In tipc_buf_append() it may change skb's frag_list, and it causes problems when this skb is cloned. skb_unclone() doesn't really make this skb's flag_list available to change. Shuang Li has reported an use-after-free issue because of this when creating quite a few macvlan dev over the same dev, where the broadcast packets will be cloned and go up to the stack: [ ] BUG: KASAN: use-after-free in pskb_expand_head+0x86d/0xea0 [ ] Call Trace: [ ] dump_stack+0x7c/0xb0 [ ] print_address_description.constprop.7+0x1a/0x220 [ ] kasan_report.cold.10+0x37/0x7c [ ] check_memory_region+0x183/0x1e0 [ ] pskb_expand_head+0x86d/0xea0 [ ] process_backlog+0x1df/0x660 [ ] net_rx_action+0x3b4/0xc90 [ ] [ ] Allocated by task 1786: [ ] kmem_cache_alloc+0xbf/0x220 [ ] skb_clone+0x10a/0x300 [ ] macvlan_broadcast+0x2f6/0x590 [macvlan] [ ] macvlan_process_broadcast+0x37c/0x516 [macvlan] [ ] process_one_work+0x66a/0x1060 [ ] worker_thread+0x87/0xb10 [ ] [ ] Freed by task 3253: [ ] kmem_cache_free+0x82/0x2a0 [ ] skb_release_data+0x2c3/0x6e0 [ ] kfree_skb+0x78/0x1d0 [ ] tipc_recvmsg+0x3be/0xa40 [tipc] So fix it by using skb_unshare() instead, which would create a new skb for the cloned frag and it'll be safe to change its frag_list. The similar things were also done in sctp_make_reassembled_event(), which is using skb_copy(). Reported-by: Shuang Li Fixes: 37e22164a8a3 ("tipc: rename and move message reassembly function") Signed-off-by: Xin Long Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0183a74c915882509f70c2ddc05bc9e6726cfb7c Author: Tetsuo Handa Date: Sat Sep 5 15:14:47 2020 +0900 tipc: fix shutdown() of connection oriented socket [ Upstream commit a4b5cc9e10803ecba64a7d54c0f47e4564b4a980 ] I confirmed that the problem fixed by commit 2a63866c8b51a3f7 ("tipc: fix shutdown() of connectionless socket") also applies to stream socket. ---------- #include #include #include int main(int argc, char *argv[]) { int fds[2] = { -1, -1 }; socketpair(PF_TIPC, SOCK_STREAM /* or SOCK_DGRAM */, 0, fds); if (fork() == 0) _exit(read(fds[0], NULL, 1)); shutdown(fds[0], SHUT_RDWR); /* This must make read() return. */ wait(NULL); /* To be woken up by _exit(). */ return 0; } ---------- Since shutdown(SHUT_RDWR) should affect all processes sharing that socket, unconditionally setting sk->sk_shutdown to SHUTDOWN_MASK will be the right behavior. Signed-off-by: Tetsuo Handa Acked-by: Ying Xue Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d82e08de23e36c37667f67a502b0cf4a3e3f61bd Author: Peilin Ye Date: Sun Sep 13 04:06:05 2020 -0400 tipc: Fix memory leak in tipc_group_create_member() [ Upstream commit bb3a420d47ab00d7e1e5083286cab15235a96680 ] tipc_group_add_to_tree() returns silently if `key` matches `nkey` of an existing node, causing tipc_group_create_member() to leak memory. Let tipc_group_add_to_tree() return an error in such a case, so that tipc_group_create_member() can handle it properly. Fixes: 75da2163dbb6 ("tipc: introduce communication groups") Reported-and-tested-by: syzbot+f95d90c454864b3b5bc9@syzkaller.appspotmail.com Cc: Hillf Danton Link: https://syzkaller.appspot.com/bug?id=048390604fe1b60df34150265479202f10e13aff Signed-off-by: Peilin Ye Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d4c5a31a63365fc86579d7a6ebe98ecc4cba9bd2 Author: Jakub Kicinski Date: Thu Sep 17 10:52:57 2020 -0700 nfp: use correct define to return NONE fec [ Upstream commit 5f6857e808a8bd078296575b417c4b9d160b9779 ] struct ethtool_fecparam carries bitmasks not bit numbers. We want to return 1 (NONE), not 0. Fixes: 0d0870938337 ("nfp: implement ethtool FEC mode settings") Signed-off-by: Jakub Kicinski Reviewed-by: Simon Horman Reviewed-by: Jesse Brandeburg Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 749cc0b0c7f3dcdfe5842f998c0274e54987384f Author: Yunsheng Lin Date: Tue Sep 8 19:02:34 2020 +0800 net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc [ Upstream commit 2fb541c862c987d02dfdf28f1545016deecfa0d5 ] Currently there is concurrent reset and enqueue operation for the same lockless qdisc when there is no lock to synchronize the q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in qdisc_deactivate() called by dev_deactivate_queue(), which may cause out-of-bounds access for priv->ring[] in hns3 driver if user has requested a smaller queue num when __dev_xmit_skb() still enqueue a skb with a larger queue_mapping after the corresponding qdisc is reset, and call hns3_nic_net_xmit() with that skb later. Reused the existing synchronize_net() in dev_deactivate_many() to make sure skb with larger queue_mapping enqueued to old qdisc(which is saved in dev_queue->qdisc_sleeping) will always be reset when dev_reset_queue() is called. Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking") Signed-off-by: Yunsheng Lin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fe916542565b7bbb529c1fa5151812c1fbb07631 Author: Necip Fazil Yildiran Date: Thu Sep 17 19:46:43 2020 +0300 net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC [ Upstream commit db7cd91a4be15e1485d6b58c6afc8761c59c4efb ] When IPV6_SEG6_HMAC is enabled and CRYPTO is disabled, it results in the following Kbuild warning: WARNING: unmet direct dependencies detected for CRYPTO_HMAC Depends on [n]: CRYPTO [=n] Selected by [y]: - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y] WARNING: unmet direct dependencies detected for CRYPTO_SHA1 Depends on [n]: CRYPTO [=n] Selected by [y]: - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y] WARNING: unmet direct dependencies detected for CRYPTO_SHA256 Depends on [n]: CRYPTO [=n] Selected by [y]: - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y] The reason is that IPV6_SEG6_HMAC selects CRYPTO_HMAC, CRYPTO_SHA1, and CRYPTO_SHA256 without depending on or selecting CRYPTO while those configs are subordinate to CRYPTO. Honor the kconfig menu hierarchy to remove kconfig dependency warnings. Fixes: bf355b8d2c30 ("ipv6: sr: add core files for SR HMAC support") Signed-off-by: Necip Fazil Yildiran Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 76fde30cf12ccf3f6d0e731972d15da174159b71 Author: Linus Walleij Date: Sat Sep 5 12:32:33 2020 +0200 net: dsa: rtl8366: Properly clear member config [ Upstream commit 4ddcaf1ebb5e4e99240f29d531ee69d4244fe416 ] When removing a port from a VLAN we are just erasing the member config for the VLAN, which is wrong: other ports can be using it. Just mask off the port and only zero out the rest of the member config once ports using of the VLAN are removed from it. Reported-by: Florian Fainelli Fixes: d8652956cf37 ("net: dsa: realtek-smi: Add Realtek SMI driver") Signed-off-by: Linus Walleij Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit d0c2f72526c6cf7ad090ee3a85226d3da8e62458 Author: Petr Machata Date: Thu Sep 10 14:09:05 2020 +0200 net: DCB: Validate DCB_ATTR_DCB_BUFFER argument [ Upstream commit 297e77e53eadb332d5062913447b104a772dc33b ] The parameter passed via DCB_ATTR_DCB_BUFFER is a struct dcbnl_buffer. The field prio2buffer is an array of IEEE_8021Q_MAX_PRIORITIES bytes, where each value is a number of a buffer to direct that priority's traffic to. That value is however never validated to lie within the bounds set by DCBX_MAX_BUFFERS. The only driver that currently implements the callback is mlx5 (maintainers CCd), and that does not do any validation either, in particual allowing incorrect configuration if the prio2buffer value does not fit into 4 bits. Instead of offloading the need to validate the buffer index to drivers, do it right there in core, and bounce the request if the value is too large. CC: Parav Pandit CC: Saeed Mahameed Fixes: e549f6f9c098 ("net/dcb: Add dcbnl buffer attribute") Signed-off-by: Petr Machata Reviewed-by: Ido Schimmel Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f2e5359dd3bffa434cba0f62179b1e72065183af Author: Eric Dumazet Date: Tue Sep 8 01:20:23 2020 -0700 ipv6: avoid lockdep issue in fib6_del() [ Upstream commit 843d926b003ea692468c8cc5bea1f9f58dfa8c75 ] syzbot reported twice a lockdep issue in fib6_del() [1] which I think is caused by net->ipv6.fib6_null_entry having a NULL fib6_table pointer. fib6_del() already checks for fib6_null_entry special case, we only need to return earlier. Bug seems to occur very rarely, I have thus chosen a 'bug origin' that makes backports not too complex. [1] WARNING: suspicious RCU usage 5.9.0-rc4-syzkaller #0 Not tainted ----------------------------- net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 4 locks held by syz-executor.5/8095: #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401 #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline] #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312 #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613 #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline] #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245 stack backtrace: CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x198/0x1fd lib/dump_stack.c:118 fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996 fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180 fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102 fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150 fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230 __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246 fib6_clean_all net/ipv6/ip6_fib.c:2257 [inline] fib6_run_gc+0x113/0x2d0 net/ipv6/ip6_fib.c:2320 ndisc_netdev_event+0x217/0x350 net/ipv6/ndisc.c:1805 notifier_call_chain+0xb5/0x200 kernel/notifier.c:83 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2033 call_netdevice_notifiers_extack net/core/dev.c:2045 [inline] call_netdevice_notifiers net/core/dev.c:2059 [inline] dev_close_many+0x30b/0x650 net/core/dev.c:1634 rollback_registered_many+0x3a8/0x1210 net/core/dev.c:9261 rollback_registered net/core/dev.c:9329 [inline] unregister_netdevice_queue+0x2dd/0x570 net/core/dev.c:10410 unregister_netdevice include/linux/netdevice.h:2774 [inline] ppp_release+0x216/0x240 drivers/net/ppp/ppp_generic.c:403 __fput+0x285/0x920 fs/file_table.c:281 task_work_run+0xdd/0x190 kernel/task_work.c:141 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_user_mode_loop kernel/entry/common.c:163 [inline] exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190 syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 421842edeaf6 ("net/ipv6: Add fib6_null_entry") Signed-off-by: Eric Dumazet Cc: David Ahern Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2fc322bf67594e240eb23b4e0c6c8a09c69f9918 Author: Wei Wang Date: Tue Sep 8 14:09:34 2020 -0700 ip: fix tos reflection in ack and reset packets [ Upstream commit ba9e04a7ddf4f22a10e05bf9403db6b97743c7bf ] Currently, in tcp_v4_reqsk_send_ack() and tcp_v4_send_reset(), we echo the TOS value of the received packets in the response. However, we do not want to echo the lower 2 ECN bits in accordance with RFC 3168 6.1.5 robustness principles. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Wei Wang Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 45676c0bc28eff8f46455b28e2db80a77676488b Author: Dan Carpenter Date: Wed Sep 9 12:46:48 2020 +0300 hdlc_ppp: add range checks in ppp_cp_parse_cr() [ Upstream commit 66d42ed8b25b64eb63111a2b8582c5afc8bf1105 ] There are a couple bugs here: 1) If opt[1] is zero then this results in a forever loop. If the value is less than 2 then it is invalid. 2) It assumes that "len" is more than sizeof(valid_accm) or 6 which can result in memory corruption. In the case of LCP_OPTION_ACCM, then we should check "opt[1]" instead of "len" because, if "opt[1]" is less than sizeof(valid_accm) then "nak_len" gets out of sync and it can lead to memory corruption in the next iterations through the loop. In case of LCP_OPTION_MAGIC, the only valid value for opt[1] is 6, but the code is trying to log invalid data so we should only discard the data when "len" is less than 6 because that leads to a read overflow. Reported-by: ChenNan Of Chaitin Security Research Lab Fixes: e022c2f07ae5 ("WAN: new synchronous PPP implementation for generic HDLC.") Signed-off-by: Dan Carpenter Reviewed-by: Eric Dumazet Reviewed-by: Greg Kroah-Hartman Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c797110d97c48054d1491251fd713900ff51615c Author: Mark Gray Date: Wed Sep 16 05:19:35 2020 -0400 geneve: add transport ports in route lookup for geneve [ Upstream commit 34beb21594519ce64a55a498c2fe7d567bc1ca20 ] This patch adds transport ports information for route lookup so that IPsec can select Geneve tunnel traffic to do encryption. This is needed for OVS/OVN IPsec with encrypted Geneve tunnels. This can be tested by configuring a host-host VPN using an IKE daemon and specifying port numbers. For example, for an Openswan-type configuration, the following parameters should be configured on both hosts and IPsec set up as-per normal: $ cat /etc/ipsec.conf conn in ... left=$IP1 right=$IP2 ... leftprotoport=udp/6081 rightprotoport=udp ... conn out ... left=$IP1 right=$IP2 ... leftprotoport=udp rightprotoport=udp/6081 ... The tunnel can then be setup using "ip" on both hosts (but changing the relevant IP addresses): $ ip link add tun type geneve id 1000 remote $IP2 $ ip addr add 192.168.0.1/24 dev tun $ ip link set tun up This can then be tested by pinging from $IP1: $ ping 192.168.0.2 Without this patch the traffic is unencrypted on the wire. Fixes: 2d07dc79fe04 ("geneve: add initial netdev driver for GENEVE tunnels") Signed-off-by: Qiuyu Xiao Signed-off-by: Mark Gray Reviewed-by: Greg Rose Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 35145dab2074abf12c1486317c912d8cff5a5fa8 Author: Ganji Aravind Date: Fri Sep 4 15:58:18 2020 +0530 cxgb4: Fix offset when clearing filter byte counters [ Upstream commit 94cc242a067a869c29800aa789d38b7676136e50 ] Pass the correct offset to clear the stale filter hit bytes counter. Otherwise, the counter starts incrementing from the stale information, instead of 0. Fixes: 12b276fbf6e0 ("cxgb4: add support to create hash filters") Signed-off-by: Ganji Aravind Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit ec56646e3b2a9a0c3a2fa63732fab731009a25af Author: Ralph Campbell Date: Fri Sep 18 21:20:24 2020 -0700 mm/thp: fix __split_huge_pmd_locked() for migration PMD [ Upstream commit ec0abae6dcdf7ef88607c869bf35a4b63ce1b370 ] A migrating transparent huge page has to already be unmapped. Otherwise, the page could be modified while it is being copied to a new page and data could be lost. The function __split_huge_pmd() checks for a PMD migration entry before calling __split_huge_pmd_locked() leading one to think that __split_huge_pmd_locked() can handle splitting a migrating PMD. However, the code always increments the page->_mapcount and adjusts the memory control group accounting assuming the page is mapped. Also, if the PMD entry is a migration PMD entry, the call to is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead of migration_entry_to_pfn(pmd_to_swp_entry(pmd)). Fix these problems by checking for a PMD migration entry. Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path") Signed-off-by: Ralph Campbell Signed-off-by: Andrew Morton Reviewed-by: Yang Shi Reviewed-by: Zi Yan Cc: Jerome Glisse Cc: John Hubbard Cc: Alistair Popple Cc: Christoph Hellwig Cc: Jason Gunthorpe Cc: Bharata B Rao Cc: Ben Skeggs Cc: Shuah Khan Cc: [4.14+] Link: https://lkml.kernel.org/r/20200903183140.19055-1-rcampbell@nvidia.com Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit d44a437826119e8307c3904c1e4f513095ea17cb Author: Muchun Song Date: Fri Sep 18 21:20:21 2020 -0700 kprobes: fix kill kprobe which has been marked as gone [ Upstream commit b0399092ccebd9feef68d4ceb8d6219a8c0caa05 ] If a kprobe is marked as gone, we should not kill it again. Otherwise, we can disarm the kprobe more than once. In that case, the statistics of kprobe_ftrace_enabled can unbalance which can lead to that kprobe do not work. Fixes: e8386a0cb22f ("kprobes: support probing module __exit function") Co-developed-by: Chengming Zhou Signed-off-by: Muchun Song Signed-off-by: Chengming Zhou Signed-off-by: Andrew Morton Acked-by: Masami Hiramatsu Cc: "Naveen N . Rao" Cc: Anil S Keshavamurthy Cc: David S. Miller Cc: Song Liu Cc: Steven Rostedt Cc: Link: https://lkml.kernel.org/r/20200822030055.32383-1-songmuchun@bytedance.com Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 19184bd06f488af62924ff1747614a8cb284ad63 Author: Rustam Kovhaev Date: Mon Sep 7 11:55:35 2020 -0700 KVM: fix memory leak in kvm_io_bus_unregister_dev() [ Upstream commit f65886606c2d3b562716de030706dfe1bea4ed5e ] when kmalloc() fails in kvm_io_bus_unregister_dev(), before removing the bus, we should iterate over all other devices linked to it and call kvm_iodevice_destructor() for them Fixes: 90db10434b16 ("KVM: kvm_io_bus_unregister_dev() should never fail") Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+f196caa45793d6374707@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=f196caa45793d6374707 Signed-off-by: Rustam Kovhaev Reviewed-by: Vitaly Kuznetsov Message-Id: <20200907185535.233114-1-rkovhaev@gmail.com> Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit b59a23d596807a5aa88d8dd5655a66c6843729b3 Author: Mark Salyzyn Date: Wed Jul 22 04:00:53 2020 -0700 af_key: pfkey_dump needs parameter validation commit 37bd22420f856fcd976989f1d4f1f7ad28e1fcac upstream. In pfkey_dump() dplen and splen can both be specified to access the xfrm_address_t structure out of bounds in__xfrm_state_filter_match() when it calls addr_match() with the indexes. Return EINVAL if either are out of range. Signed-off-by: Mark Salyzyn Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: kernel-team@android.com Cc: Steffen Klassert Cc: Herbert Xu Cc: "David S. Miller" Cc: Jakub Kicinski Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Steffen Klassert Signed-off-by: Greg Kroah-Hartman