commit a9da8725b7a744be3ff0ff44cab2547e4d1e6675 Author: Greg Kroah-Hartman Date: Wed Nov 21 09:22:14 2018 +0100 Linux 4.18.20 commit 55eac9e85deaa9e359faf8501abc7cbbcac74b01 Author: Greg Kroah-Hartman Date: Tue Nov 20 10:08:18 2018 +0100 Revert "ACPICA: AML interpreter: add region addresses in global list during initialization" This reverts commit 7876d54ad642fbbd1857d37528aa1ec8c5a2c592 which is commit 4abb951b73ff0a8a979113ef185651aa3c8da19b upstream. Jean writes: This commit was tagged with: Link: https://bugzilla.kernel.org/show_bug.cgi?id=200011 Tested-by: Jean-Marc Lenoir Cc: All applicable making it sound like it was fixing an actual bug. This is not the case. The commit fixes a side issue discovered while investigating bug #200011. It does NOT fix bug #200011 itself (as explicitly reported by Jean-Marc at https://bugzilla.kernel.org/show_bug.cgi?id=200011#c65 ). It does however cause regressions, despite what the commit message says. See: https://bugzilla.kernel.org/show_bug.cgi?id=201721 and I expect more similar regressions, as ACPI resource conflicts are very frequent. This commit was not stable material to start with. It is intrusive, presents a risk of side effects, and does not solve an actual bug that is bothering users. Reported-by: Jean Delvare Cc: Jean-Marc Lenoir Cc: Erik Schmauss Cc: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 770271829fbe61d9846ec02bec90a1aa5c8ab99d Author: Stefano Stabellini Date: Wed Oct 31 16:11:49 2018 -0700 CONFIG_XEN_PV breaks xen_create_contiguous_region on ARM commit f9005571701920551bcf54a500973fb61f2e1eda upstream. xen_create_contiguous_region has now only an implementation if CONFIG_XEN_PV is defined. However, on ARM we never set CONFIG_XEN_PV but we do have an implementation of xen_create_contiguous_region which is required for swiotlb-xen to work correctly (although it just sets *dma_handle). [backport: remove change to xen_remap_pfn] Cc: # 4.12 Fixes: 16624390816c ("xen: create xen_create/destroy_contiguous_region() stubs for PVHVM only builds") Signed-off-by: Stefano Stabellini Reviewed-by: Juergen Gross CC: Jeff.Kubascik@dornerworks.com CC: Jarvis.Roach@dornerworks.com CC: Nathan.Studer@dornerworks.com CC: vkuznets@redhat.com CC: boris.ostrovsky@oracle.com CC: jgross@suse.com CC: julien.grall@arm.com Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman commit 355c0d23ff884aa0909ab6ba94bb39c05eb50dd3 Author: Lyude Paul Date: Tue Nov 6 16:30:12 2018 -0500 drm/i915: Fix possible race in intel_dp_add_mst_connector() commit 7c4512300cfa5a4dcc8c1c52ae61e3fa4bd11a39 upstream. This hasn't caused any issues yet that I'm aware of, but as Ville Syrjälä pointed out - we need to make sure that intel_connector->mst_port is set before initializing MST connectors, since in theory we could potentially check intel_connector->mst_port in i915_hpd_poll_init_work() after registering the connector but before having written it's value. Signed-off-by: Lyude Paul Reviewed-by: Ville Syrjälä Cc: Rodrigo Vivi Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20181106213017.14563-2-lyude@redhat.com (cherry picked from commit 66a5ab1034be801630816d1fa6cfc30db1a2f0b0) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 0400eb06d37c71d9ebaafd65b407214791cc1a3a Author: Chris Wilson Date: Thu Nov 8 08:17:38 2018 +0000 drm/i915/execlists: Force write serialisation into context image vs execution commit 0a823e8fd4fd67726697854578f3584ee3a49b1d upstream. Ensure that the writes into the context image are completed prior to the register mmio to trigger execution. Although previously we were assured by the SDM that all writes are flushed before an uncached memory transaction (our mmio write to submit the context to HW for execution), we have empirical evidence to believe that this is not actually the case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656 References: https://bugs.freedesktop.org/show_bug.cgi?id=108315 References: https://bugs.freedesktop.org/show_bug.cgi?id=106887 Signed-off-by: Chris Wilson Cc: Mika Kuoppala Cc: Tvrtko Ursulin Acked-by: Mika Kuoppala Link: https://patchwork.freedesktop.org/patch/msgid/20181108081740.25615-1-chris@chris-wilson.co.uk Cc: stable@vger.kernel.org (cherry picked from commit 987abd5c62f92ee4970b45aa077f47949974e615) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 06e562e7f515292ea7721475950f23554214adde Author: Chris Wilson Date: Mon Nov 5 09:43:05 2018 +0000 drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5 commit fb5bbae9b1333d44023713946fdd28db0cd85751 upstream. Exercising the gpu reloc path strenuously revealed an issue where the updated relocations (from MI_STORE_DWORD_IMM) were not being observed upon execution. After some experiments with adding pipecontrols (a lot of pipecontrols (32) as gen4/5 do not have a bit to wait on earlier pipe controls or even the current on), it was discovered that we merely needed to delay the EMIT_INVALIDATE by several flushes. It is important to note that it is the EMIT_INVALIDATE as opposed to the EMIT_FLUSH that needs the delay as opposed to what one might first expect -- that the delay is required for the TLB invalidation to take effect (one presumes to purge any CS buffers) as opposed to a delay after flushing to ensure the writes have landed before triggering invalidation. Testcase: igt/gem_tiled_fence_blits Signed-off-by: Chris Wilson Cc: stable@vger.kernel.org Reviewed-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20181105094305.5767-1-chris@chris-wilson.co.uk (cherry picked from commit 55f99bf2a9c331838c981694bc872cd1ec4070b2) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 41a2334c224ead8203d8f6235eb4915536bf700f Author: Chris Wilson Date: Fri Nov 2 16:12:09 2018 +0000 drm/i915: Mark pin flags as u64 commit 0014868b9c3c1dda1de6711cf58c3486fb422d07 upstream. Since the flags are being used to operate on a u64 variable, they too need to be marked as such so that the inverses are full width (and not zero extended on 32b kernels and bdw+). Reported-by: Sergii Romantsov Signed-off-by: Chris Wilson Cc: stable@vger.kernel.org Reviewed-by: Lionel Landwerlin Reviewed-by: Michal Wajdeczko Link: https://patchwork.freedesktop.org/patch/msgid/20181102161232.17742-2-chris@chris-wilson.co.uk (cherry picked from commit 83b466b1dc5f0b4d33f0a901e8b00197a8f3582d) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit a4820798a2edaa38164bd829214aefe1aa8eb3af Author: Ville Syrjälä Date: Mon Nov 5 21:46:04 2018 +0200 drm/i915: Don't oops during modeset shutdown after lpe audio deinit commit 6a8915d0f8cf323e1beb792a33095cf652db4056 upstream. We deinit the lpe audio device before we call drm_atomic_helper_shutdown(), which means the platform device may already be gone when it comes time to shut down the crtc. As we don't know when the last reference to the platform device gets dropped by the audio driver we can't assume that the device and its data are still around when turning off the crtc. Mark the platform device as gone as soon as we do the audio deinit. Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20181105194604.6994-1-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson (cherry picked from commit f45a7977d1140c11f334e01a9f77177ed68e3bfa) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit d0baf6ac9fccfe563186461f65a91b6f6d0cbdc3 Author: Chris Wilson Date: Thu Oct 25 10:18:23 2018 +0100 drm/i915: Compare user's 64b GTT offset even on 32b commit 085603287452fc96376ed4888bf29f8c095d2b40 upstream. Beware mixing unsigned long constants and 64b values, as on 32b the constant will be zero extended and discard the high 32b when used as a mask! Reported-by: Sergii Romantsov Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108282 Signed-off-by: Chris Wilson Cc: Joonas Lahtinen Cc: stable@vger.kernel.org Reviewed-by: Matthew Auld Link: https://patchwork.freedesktop.org/patch/msgid/20181025091823.20571-2-chris@chris-wilson.co.uk (cherry picked from commit 6fc4e48f9ed46e9adff236a0c350074aafa3b7fa) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit b9d3cae0b45150d68b7d93ea2a6d97c3a3ea3387 Author: Ville Syrjälä Date: Thu Oct 25 16:05:36 2018 +0300 drm/i915: Fix ilk+ watermarks when disabling pipes commit df5e31c204b34e8d9e5ec33f5b28e960c4f25e14 upstream. We're no longer programming any watermarks when we're disabling a pipe. That means ilk_wm_merge() & co. will keep considering the any pipe that is getting disabled as still enabled. Thus we either get no LP1+ watermakrs (ilk-ivb), or we get suboptimal ones (hsw-bdw). This seems to have been broken by commit b6b178a77210 ("drm/i915: Calculate ironlake intermediate watermarks correctly, v2."). Before that we apparently had some difference between the intermediate and optimal watermarks and so we would program the optiomal ones. Now intermediate and optimal are identical for disabled pipes and so we don't program either. Fix this by programming the intermediate watermarks even for disabled pipes. We were already doing that for skl+. We'll leave out gmch platforms for now since those do the merging in a different manner and should work as is. We'll want to unify this eventually, but play it safe for now and just put in a FIXME. Cc: stable@vger.kernel.org Cc: Matt Roper Cc: Maarten Lankhorst Fixes: b6b178a77210 ("drm/i915: Calculate ironlake intermediate watermarks correctly, v2.") Signed-off-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20181025130536.29024-1-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst #irc (cherry picked from commit a748faea3bfd7fd1d1485bc1c426c7d460cc6503) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 877e75bc93ecb8d7c7eeeaa7304cd3d030e36a58 Author: Ville Syrjälä Date: Mon Oct 29 16:00:31 2018 +0200 drm/i915: Fix error handling for the NV12 fb dimensions check commit f42f343887016330b321dd40eebc68c7292e4f1b upstream. Let's not leak obj->framebuffer_references when we decide that the framebuffer domensions are not suitable for NV12. Cc: stable@vger.kernel.org Cc: Maarten Lankhorst Cc: Vidya Srinivas Fixes: e44134f2673c ("drm/i915: Add NV12 support to intel_framebuffer_init") Signed-off-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20181029140031.11765-1-ville.syrjala@linux.intel.com Reviewed-by: Matt Roper (cherry picked from commit 3b90946fcb6f13b65888c380461793a9dea9d1f4) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 63f4972b9b22664a385f5623034a1ac64946743a Author: Clint Taylor Date: Thu Oct 25 11:52:00 2018 -0700 drm/i915/hdmi: Add HDMI 2.0 audio clock recovery N values commit 6503493145cba4413ecd3d4d153faeef4a1e9b85 upstream. HDMI 2.0 594Mhz modes were incorrectly selecting 25.200Mhz Automatic N value mode instead of HDMI specification values. V2: Fix 88.2 Hz N value Cc: Jani Nikula Cc: stable@vger.kernel.org Signed-off-by: Clint Taylor Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/1540493521-1746-2-git-send-email-clinton.a.taylor@intel.com (cherry picked from commit 5a400aa3c562c4a726b4da286e63c96db905ade1) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit df00d4ac7d26931702c753f167e6f99d05dbd80e Author: Dhinakaran Pandiyan Date: Thu Sep 27 13:57:31 2018 -0700 drm/i915/dp: Restrict link retrain workaround to external monitors commit f9776280c29e77a18cbc7ebb6d48f7885e494990 upstream. Commit '3cf71bc9904d ("drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse"")' applies a work around for sinks that don't signal link loss. The work around does not need to have to be that broad as the issue was seen with only one particular monitor; limit this only for external displays as eDP features like PSR turn off the link and the driver ends up retraining the link seeeing that link is not synchronized. Cc: Lyude Paul Cc: Jan-Marek Glogowski Cc: Ville Syrjälä Cc: Rodrigo Vivi References: 3cf71bc9904d ("drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse"") Signed-off-by: Dhinakaran Pandiyan Reviewed-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20180927205735.16651-2-dhinakaran.pandiyan@intel.com (cherry picked from commit f24f6eb95807bca0dbd8dc5b2f3a4099000f4472) Fixes: 399334708b4f ("drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse"") Cc: stable@vger.kernel.org Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 6440b1a7f8f2698f7f128c9503ae103247a17151 Author: Dhinakaran Pandiyan Date: Thu Sep 27 13:57:30 2018 -0700 drm/i915/dp: Fix link retraining comment in intel_dp_long_pulse() commit 49af5d95b9b3c21a84ad115a9db9acbc036d849a upstream. Comment claims link needs to be retrained because the connected sink raised a long pulse to indicate link loss. If the sink did so, intel_dp_hotplug() would have handled link retraining. Looking at the logs in Bugzilla referenced in commit '3cf71bc9904d ("drm/i915: Re-apply Perform link quality check, unconditionally during long pulse"")', the issue is that the sink does not trigger an interrupt. What we want is ->detect() from user space to check link status and retrain. Ville's review for the original patch also indicates the same root cause. So, rewrite the comment. v2: Patch split and rewrote comment. Cc: Lyude Paul Cc: Ville Syrjälä Cc: Jani Nikula Cc: Rodrigo Vivi Cc: Jan-Marek Glogowski References: 3cf71bc9904d ("drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse"") Signed-off-by: Dhinakaran Pandiyan Reviewed-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20180927205735.16651-1-dhinakaran.pandiyan@intel.com (cherry picked from commit 9ebd8202393dde9d3678c9ec162c1aa63ba17eac) Fixes: 399334708b4f ("drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse"") Cc: stable@vger.kernel.org Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit f28285d100212361ddc832742b760169b75dce07 Author: Chris Wilson Date: Fri Oct 12 15:02:28 2018 +0100 drm/i915: Large page offsets for pread/pwrite commit ab0d6a141843e0b4b2709dfd37b53468b5452c3a upstream. Handle integer overflow when computing the sub-page length for shmem backed pread/pwrite. Reported-by: Tvrtko Ursulin Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin Link: https://patchwork.freedesktop.org/patch/msgid/20181012140228.29783-1-chris@chris-wilson.co.uk (cherry picked from commit a5e856a5348f6cd50889d125c40bbeec7328e466) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 874d2275ee9405e7355cb1b3b1a80102c0ac7549 Author: Lyude Paul Date: Mon Oct 8 19:24:33 2018 -0400 drm/i915: Skip vcpi allocation for MSTB ports that are gone commit c02ba4ef16eefe663fdefcccaa57fad32d5481bf upstream. Since we need to be able to allow DPMS on->off prop changes after an MST port has disappeared from the system, we need to be able to make sure we can compute a config for the resulting atomic commit. Currently this is impossible when the port has disappeared, since the VCPI slot searching we try to do in intel_dp_mst_compute_config() will fail with -EINVAL. Since the only commits we want to allow on no-longer-present MST ports are ones that shut off display hardware, we already know that no VCPI allocations are needed. So, hardcode the VCPI slot count to 0 when intel_dp_mst_compute_config() is called on an MST port that's gone. Changes since V4: - Don't use mst_port_gone at all, just check whether or not the drm connector is registered - Daniel Vetter Signed-off-by: Lyude Paul Reviewed-by: Daniel Vetter Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20181008232437.5571-5-lyude@redhat.com (cherry picked from commit f67207d78ceaf98b7531bc22df6f21328559c8d4) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 0cf4813b215f2450e844f9431936048293da631f Author: Lyude Paul Date: Mon Oct 8 19:24:32 2018 -0400 drm/i915: Don't unset intel_connector->mst_port commit 80c188695a77eddaa6e8885510ff4ef59fd478c3 upstream. Currently we set intel_connector->mst_port to NULL to signify that the MST port has been removed from the system so that we can prevent further action on the port such as connector probes, mode probing, etc. However, we're going to need access to intel_connector->mst_port in order to fixup ->best_encoder() so that it can always return the correct encoder for an MST port to prevent legacy DPMS prop changes from failing. This should be safe, so instead keep intel_connector->mst_port always set and instead just check the status of drm_connector->regustered to signify whether or not the connector has disappeared from the system. Changes since v2: - Add a comment to mst_port_gone (Jani Nikula) - Change mst_port_gone to a u8 instead of a bool, per the kernel bot. Apparently bool is discouraged in structs these days Changes since v4: - Don't use mst_port_gone at all! Just check if the connector is registered or not - Daniel Vetter Signed-off-by: Lyude Paul Reviewed-by: Daniel Vetter Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20181008232437.5571-4-lyude@redhat.com (cherry picked from commit 6ed5bb1fbad34382c8cfe9a9bf737e9a43053df5) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 6aba99c5574823f27962731a0aed4701c41d1a4e Author: Ville Syrjälä Date: Wed Oct 3 17:49:51 2018 +0300 drm/i915: Restore vblank interrupts earlier commit 7cada4d0b7a0fb813dbc9777fec092e9ed0546e9 upstream. Plane sanitation needs vblank interrupts (on account of CxSR disable). So let's restore vblank interrupts earlier. v2: Make it actually build v3: Add comment to explain why we need this (Daniel) Cc: stable@vger.kernel.org Cc: Dennis Tested-by: Dennis Tested-by: Peter Nowee Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105637 Fixes: b1e01595a66d ("drm/i915: Redo plane sanitation during readout") Signed-off-by: Ville Syrjälä Reviewed-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20181003144951.4397-1-ville.syrjala@linux.intel.com (cherry picked from commit 68bc30deac625b8be8d3950b30dc93d09a3645f5) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit 68b4918e70729a4662cdbc2dc86d45348ab4fa15 Author: Manasi Navare Date: Tue Oct 9 14:28:04 2018 -0700 drm/i915/dp: Link train Fallback on eDP only if fallback link BW can fit panel's native mode commit 041444458835d7fb2c9f042598bfe16bf375b15d upstream. This patch fixes the original commit c0cfb10d9e1de49 ("drm/i915/edp: Do not do link training fallback or prune modes on EDP") that causes a blank screen in case of certain eDP panels (Eg: seen on Dell XPS13 9350) where first link training fails and a retraining is required by falling back to lower link rate/lane count. In case of some panels they advertise higher link rate/lane count than whats required for supporting the panel's native mode. But we always link train at highest link rate/lane count for eDP and if that fails we can still fallback to lower link rate/lane count as long as the fallback link BW still fits the native mode to avoid pruning the panel's native mode yet retraining at fallback values to recover from a blank screen. v3: * Add const for fixed_mode (Ville) v2: * Send uevent if link failure on eDP unconditionally Fixes: c0cfb10d9e1d ("drm/i915/edp: Do not do link training fallback or prune modes on EDP") Cc: Clinton Taylor Cc: Jani Nikula Cc: Ville Syrjala Cc: Daniel Vetter Cc: Lucas De Marchi Cc: # v4.17+ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107489 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105338 Signed-off-by: Manasi Navare Tested-by: Alexander Wilson Reviewed-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20181009212804.702-1-manasi.d.navare@intel.com (cherry picked from commit 1e712535c51ab025ebc776d4405683d81521996d) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman commit a90d6f083b7a0b5cdf932562d041bb3b056eabe3 Author: Hans de Goede Date: Fri Oct 12 12:16:10 2018 +0200 drm: panel-orientation-quirks: Add quirk for Acer One 10 (S1003) commit 0e8afefd5da4875ddea9aa4ad17a2540a2bf9736 upstream. The Acer One 10 uses a clamshell design with a detachable keyboard. As such in normal operating mode, with the keyboard attach the device is in landscape mode (and the Acer logo at boot also shows in landscape mode). But the device uses a portrait screen rotated 90 degrees (sigh). This commit adds a quirk for this device so that we shown the fbcon the right way up and that we hint userspace to also show e.g. plymouth and gdm the right way up. Cc: stable@vger.kernel.org Acked-by: Daniel Vetter Signed-off-by: Hans de Goede Link: https://patchwork.freedesktop.org/patch/msgid/20181012101610.29100-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman commit 4a7daecdaab98554a1cc51ae57f51636434e4f0c Author: Stanislav Lisovskiy Date: Fri Nov 9 11:00:12 2018 +0200 drm/dp_mst: Check if primary mstb is null commit 23d8003907d094f77cf959228e2248d6db819fa7 upstream. Unfortunately drm_dp_get_mst_branch_device which is called from both drm_dp_mst_handle_down_rep and drm_dp_mst_handle_up_rep seem to rely on that mgr->mst_primary is not NULL, which seem to be wrong as it can be cleared with simultaneous mode set, if probing fails or in other case. mgr->lock mutex doesn't protect against that as it might just get assigned to NULL right before, not simultaneously. There are currently bugs 107738, 108616 bugs which crash in drm_dp_get_mst_branch_device, caused by this issue. v2: Refactored the code, as it was nicely noticed. Fixed Bugzilla bug numbers(second was 108616, but not 108816) and added links. [changed title and added stable cc] Signed-off-by: Lyude Paul Signed-off-by: Stanislav Lisovskiy Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108616 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107738 Link: https://patchwork.freedesktop.org/patch/msgid/20181109090012.24438-1-stanislav.lisovskiy@intel.com Signed-off-by: Greg Kroah-Hartman commit fbea4573dc0bfcd4cc03cbda76648b67e17c01aa Author: Lucas Stach Date: Thu Oct 4 11:37:00 2018 +0200 drm/etnaviv: fix bogus fence complete check in timeout handler commit 6fce3a406108ee6c8a61e2a33e52e9198a626ea0 upstream. The GPU hardware fences and the job out-fences are on different timelines so it's wrong to compare them. Fix this by only looking at the out-fence. Cc: Fixes: 2c83a726d6fb (drm/etnaviv: bring back progress check in job timeout handler) Signed-off-by: Lucas Stach Signed-off-by: Greg Kroah-Hartman commit 13b3707bafea4857df5f55c3018e08015174000e Author: Akshu Agrawal Date: Mon Sep 24 15:48:02 2018 +0530 drm/amd/powerplay: Enable/Disable NBPSTATE on On/OFF of UVD commit 51ef434a15b450bfbef1e06cc87ee4e98a224486 upstream. We observe black lines (underflow) on display when playing a 4K video with UVD. On Disabling Low memory P state this issue is not seen. Multiple runs of power measurement shows no imapct. Signed-off-by: Akshu Agrawal Signed-off-by: Satyajit Sahu Acked-by: Alex Deucher Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 1cca6c472fcc9e707872c12699f39473dfc3c2d5 Author: Lyude Paul Date: Mon Oct 8 19:24:31 2018 -0400 drm/nouveau: Fix nv50_mstc->best_encoder() commit 7b0f61e91b6056c71649efa3204112a4b6cf5fc8 upstream. As mentioned in the previous commit, we currently prevent new modesets on recently-removed MST connectors by returning no encoder from our ->best_encoder() callback once the MST port has disappeared. This is wrong however, because it prevents legacy modesetting users from being able to disable CRTCs on MST connectors after the connector's respective topology has disappeared. So, fix this by instead by just always returning a valid encoder. Changes since v2: - Remove usage of atomic MST helper for now, since that got replaced with a much simpler solution Signed-off-by: Lyude Paul Reviewed-by: Daniel Vetter Reviewed-by: Ben Skeggs Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20181008232437.5571-3-lyude@redhat.com (cherry picked from commit e87b0bbc9f0380d403f8f2f6abba0d51c74d944f) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 699242cf4acab2d9352418b488baa51c111166f4 Author: Lyude Paul Date: Thu Sep 6 17:43:21 2018 -0400 drm/nouveau: Check backlight IDs are >= 0, not > 0 commit dc854914999d5d52ac1b31740cb0ea8d89d0372e upstream. Remember, ida IDs start at 0, not 1! Signed-off-by: Lyude Paul Reviewed-by: Karol Herbst Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs Signed-off-by: Greg Kroah-Hartman commit bbf40af9c2d8ee78b174c5ec30e33ae928223f96 Author: Alex Deucher Date: Tue Aug 28 14:16:23 2018 -0500 drm/amdgpu: add missing CHIP_HAINAN in amdgpu_ucode_get_load_type commit d9997b64c52b70bd98c48f443f068253621d1ffc upstream. This caused a confusing error message, but there is functionally no problem since the default method is DIRECT. Reviewed-by: Michel Dänzer Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 842b99a831c58e062a3e9ab10b7dbb31e9a3eaf8 Author: Rex Zhu Date: Fri Oct 12 22:26:11 2018 +0800 drm/amdgpu: Fix typo in amdgpu_vmid_mgr_init commit 3df27645395e8f79c0dc20a15cf1da61f376000d upstream. fix a typo in for loop: i->j Reviewed-by: Christian König Signed-off-by: Rex Zhu Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 5b231a7b332666826b98b608ababa79eb3c3343d Author: Johan Hovold Date: Mon Aug 27 10:21:47 2018 +0200 drm/msm: fix OF child-node lookup commit f9a7082327e26f54067a49cac2316d31e0cc8ba7 upstream. Use the new of_get_compatible_child() helper to lookup the legacy pwrlevels child node instead of using of_find_compatible_node(), which searches the entire tree from a given start node and thus can return an unrelated (i.e. non-child) node. This also addresses a potential use-after-free (e.g. after probe deferral) as the tree-wide helper drops a reference to its first argument (i.e. the probed device's node). While at it, also fix the related child-node reference leak. Fixes: e2af8b6b0ca1 ("drm/msm: gpu: Use OPP tables if we can") Cc: stable # 4.12 Cc: Jordan Crouse Cc: Rob Clark Cc: David Airlie Signed-off-by: Johan Hovold Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit 0ad256e8ed36b48ad5f97c0479464a8f05fec767 Author: Marc Zyngier Date: Sun Aug 5 13:48:07 2018 +0100 drm/rockchip: Allow driver to be shutdown on reboot/kexec commit 7f3ef5dedb146e3d5063b6845781ad1bb59b92b5 upstream. Leaving the DRM driver enabled on reboot or kexec has the annoying effect of leaving the display generating transactions whilst the IOMMU has been shut down. In turn, the IOMMU driver (which shares its interrupt line with the VOP) starts warning either on shutdown or when entering the secondary kernel in the kexec case (nothing is expected on that front). A cheap way of ensuring that things are nicely shut down is to register a shutdown callback in the platform driver. Signed-off-by: Marc Zyngier Tested-by: Vicente Bergas Signed-off-by: Heiko Stuebner Link: https://patchwork.freedesktop.org/patch/msgid/20180805124807.18169-1-marc.zyngier@arm.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 6f680252dc25203697356d938f5047f6d69652a6 Author: Ard Biesheuvel Date: Wed Nov 14 09:55:42 2018 -0800 efi/arm/libstub: Pack FDT after populating it commit 72a58a63a164b4e9d2d914e65caeb551846883f1 upstream. Commit: 24d7c494ce46 ("efi/arm-stub: Round up FDT allocation to mapping size") increased the allocation size for the FDT image created by the stub to a fixed value of 2 MB, to simplify the former code that made several attempts with increasing values for the size. This is reasonable given that the allocation is of type EFI_LOADER_DATA, which is released to the kernel unless it is explicitly memblock_reserve()d by the early boot code. However, this allocation size leaked into the 'size' field of the FDT header metadata, and so the entire allocation remains occupied by the device tree binary, even if most of it is not used to store device tree information. So call fdt_pack() to shrink the FDT data structure to its minimum size after populating all the fields, so that the remaining memory is no longer wasted. Signed-off-by: Ard Biesheuvel Cc: # v4.12+ Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-efi@vger.kernel.org Fixes: 24d7c494ce46 ("efi/arm-stub: Round up FDT allocation to mapping size") Link: http://lkml.kernel.org/r/20181114175544.12860-4-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit e47645d9b0fb74f8663f78088952e8ad5e09ad61 Author: Vasily Averin Date: Fri Nov 16 15:08:11 2018 -0800 mm/swapfile.c: use kvzalloc for swap_info_struct allocation commit 873d7bcfd066663e3e50113dc4a0de19289b6354 upstream. Commit a2468cc9bfdf ("swap: choose swap device according to numa node") changed 'avail_lists' field of 'struct swap_info_struct' to an array. In popular linux distros it increased size of swap_info_struct up to 40 Kbytes and now swap_info_struct allocation requires order-4 page. Switch to kvzmalloc allows to avoid unexpected allocation failures. Link: http://lkml.kernel.org/r/fc23172d-3c75-21e2-d551-8b1808cbe593@virtuozzo.com Fixes: a2468cc9bfdf ("swap: choose swap device according to numa node") Signed-off-by: Vasily Averin Acked-by: Aaron Lu Acked-by: Michal Hocko Reviewed-by: Andrew Morton Cc: Huang Ying Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 82743f44a420d641174e09309d2a7ed024aea00a Author: Mike Kravetz Date: Fri Nov 16 15:08:04 2018 -0800 hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444! commit 5e41540c8a0f0e98c337dda8b391e5dda0cde7cf upstream. This bug has been experienced several times by the Oracle DB team. The BUG is in remove_inode_hugepages() as follows: /* * If page is mapped, it was faulted in after being * unmapped in caller. Unmap (again) now after taking * the fault mutex. The mutex will prevent faults * until we finish removing the page. * * This race can only happen in the hole punch case. * Getting here in a truncate operation is a bug. */ if (unlikely(page_mapped(page))) { BUG_ON(truncate_op); In this case, the elevated map count is not the result of a race. Rather it was incorrectly incremented as the result of a bug in the huge pmd sharing code. Consider the following: - Process A maps a hugetlbfs file of sufficient size and alignment (PUD_SIZE) that a pmd page could be shared. - Process B maps the same hugetlbfs file with the same size and alignment such that a pmd page is shared. - Process B then calls mprotect() to change protections for the mapping with the shared pmd. As a result, the pmd is 'unshared'. - Process B then calls mprotect() again to chage protections for the mapping back to their original value. pmd remains unshared. - Process B then forks and process C is created. During the fork process, we do dup_mm -> dup_mmap -> copy_page_range to copy page tables. Copying page tables for hugetlb mappings is done in the routine copy_hugetlb_page_range. In copy_hugetlb_page_range(), the destination pte is obtained by: dst_pte = huge_pte_alloc(dst, addr, sz); If pmd sharing is possible, the returned pointer will be to a pte in an existing page table. In the situation above, process C could share with either process A or process B. Since process A is first in the list, the returned pte is a pointer to a pte in process A's page table. However, the check for pmd sharing in copy_hugetlb_page_range is: /* If the pagetables are shared don't copy or take references */ if (dst_pte == src_pte) continue; Since process C is sharing with process A instead of process B, the above test fails. The code in copy_hugetlb_page_range which follows assumes dst_pte points to a huge_pte_none pte. It copies the pte entry from src_pte to dst_pte and increments this map count of the associated page. This is how we end up with an elevated map count. To solve, check the dst_pte entry for huge_pte_none. If !none, this implies PMD sharing so do not copy. Link: http://lkml.kernel.org/r/20181105212315.14125-1-mike.kravetz@oracle.com Fixes: c5c99429fa57 ("fix hugepages leak due to pagetable page sharing") Signed-off-by: Mike Kravetz Reviewed-by: Naoya Horiguchi Cc: Michal Hocko Cc: Hugh Dickins Cc: Andrea Arcangeli Cc: "Kirill A . Shutemov" Cc: Davidlohr Bueso Cc: Prakash Sangappa Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 2ca904bea902e003afa3fa592a20e5bfb68a1ee8 Author: Arnd Bergmann Date: Fri Nov 16 15:08:35 2018 -0800 lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn commit 1c23b4108d716cc848b38532063a8aca4f86add8 upstream. gcc-8 complains about the prototype for this function: lib/ubsan.c:432:1: error: ignoring attribute 'noreturn' in declaration of a built-in function '__ubsan_handle_builtin_unreachable' because it conflicts with attribute 'const' [-Werror=attributes] This is actually a GCC's bug. In GCC internals __ubsan_handle_builtin_unreachable() declared with both 'noreturn' and 'const' attributes instead of only 'noreturn': https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84210 Workaround this by removing the noreturn attribute. [aryabinin: add information about GCC bug in changelog] Link: http://lkml.kernel.org/r/20181107144516.4587-1-aryabinin@virtuozzo.com Signed-off-by: Arnd Bergmann Signed-off-by: Andrey Ryabinin Acked-by: Olof Johansson Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit f4ae32cf258910c6db2b7adb1832f50c6aae3929 Author: Eric Biggers Date: Sat Nov 3 14:56:00 2018 -0700 crypto: user - fix leaking uninitialized memory to userspace commit f43f39958beb206b53292801e216d9b8a660f087 upstream. All bytes of the NETLINK_CRYPTO report structures must be initialized, since they are copied to userspace. The change from strncpy() to strlcpy() broke this. As a minimal fix, change it back. Fixes: 4473710df1f8 ("crypto: user - Prepare for CRYPTO_MAX_ALG_NAME expansion") Cc: # v4.12+ Signed-off-by: Eric Biggers Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit f784bb167d5f059334751f9d1ebfc40a23a7b375 Author: Diego Viola Date: Mon Nov 12 17:22:52 2018 -0200 libata: blacklist SAMSUNG MZ7TD256HAFV-000L9 SSD commit 410b5c7b48368317af95f0113692561d01d8144e upstream. med_power_with_dipm still causes freezes after updating the firmware to the latest version (DXT04L5Q). Set model_rev to NULL and blacklist the device. Cc: stable@vger.kernel.org Signed-off-by: Diego Viola Reviewed-by: Hans de Goede Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 08f382aeba0c7f33a4db60608482192feb55ccaa Author: Andreas Gruenbacher Date: Thu Nov 8 20:14:29 2018 +0000 gfs2: Fix metadata read-ahead during truncate (2) commit e7445ceddfc220c1aede6d42758a5acb8844e9c3 upstream. The previous attempt to fix for metadata read-ahead during truncate was incorrect: for files with a height > 2 (1006989312 bytes with a block size of 4096 bytes), read-ahead requests were not being issued for some of the indirect blocks discovered while walking the metadata tree, leading to significant slow-downs when deleting large files. Fix that. In addition, only issue read-ahead requests in the first pass through the meta-data tree, while deallocating data blocks. Fixes: c3ce5aa9b0 ("gfs2: Fix metadata read-ahead during truncate") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Andreas Gruenbacher Signed-off-by: Greg Kroah-Hartman commit 47e7c3fc9b8bfc2c7048cb913e78f1935c519652 Author: Andreas Gruenbacher Date: Mon Nov 5 22:57:24 2018 +0000 gfs2: Put bitmap buffers in put_super commit 10283ea525d30f2e99828978fd04d8427876a7ad upstream. gfs2_put_super calls gfs2_clear_rgrpd to destroy the gfs2_rgrpd objects attached to the resource group glocks. That function should release the buffers attached to the gfs2_bitmap objects (bi_bh), but the call to gfs2_rgrp_brelse for doing that is missing. When gfs2_releasepage later runs across these buffers which are still referenced, it refuses to free them. This causes the pages the buffers are attached to to remain referenced as well. With enough mount/unmount cycles, the system will eventually run out of memory. Fix this by adding the missing call to gfs2_rgrp_brelse in gfs2_clear_rgrpd. (Also fix a gfs2_rgrp_relse -> gfs2_rgrp_brelse typo in a comment.) Fixes: 39b0f1e92908 ("GFS2: Don't brelse rgrp buffer_heads every allocation") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Andreas Gruenbacher Signed-off-by: Greg Kroah-Hartman commit bd9568c3484b2fb487ecfaa6eaaa9a495d823991 Author: Guenter Roeck Date: Sun Jul 1 13:56:54 2018 -0700 configfs: replace strncpy with memcpy commit 1823342a1f2b47a4e6f5667f67cd28ab6bc4d6cd upstream. gcc 8.1.0 complains: fs/configfs/symlink.c:67:3: warning: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length fs/configfs/symlink.c: In function 'configfs_get_link': fs/configfs/symlink.c:63:13: note: length computed here Using strncpy() is indeed less than perfect since the length of data to be copied has already been determined with strlen(). Replace strncpy() with memcpy() to address the warning and optimize the code a little. Signed-off-by: Guenter Roeck Signed-off-by: Christoph Hellwig Signed-off-by: Nobuhiro Iwamatsu Signed-off-by: Greg Kroah-Hartman commit 5fe5a24a8cc8aad54ea6fce78929c61cebfbd707 Author: Ondrej Mosnacek Date: Tue Nov 13 16:16:08 2018 +0100 selinux: check length properly in SCTP bind hook commit c138325fb8713472d5a0c3c7258b9131bab40725 upstream. selinux_sctp_bind_connect() must verify if the address buffer has sufficient length before accessing the 'sa_family' field. See __sctp_connect() for a similar check. The length of the whole address ('len') is already checked in the callees. Reported-by: Qian Cai Fixes: d452930fd3b9 ("selinux: Add SCTP support") Cc: # 4.17+ Cc: Richard Haines Signed-off-by: Ondrej Mosnacek Tested-by: Qian Cai Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit a80cb9b673349f65b5ce91661982d4a4fccc2e2e Author: Miklos Szeredi Date: Fri Nov 9 15:52:16 2018 +0100 fuse: fix leaked notify reply commit 7fabaf303458fcabb694999d6fa772cc13d4e217 upstream. fuse_request_send_notify_reply() may fail if the connection was reset for some reason (e.g. fs was unmounted). Don't leak request reference in this case. Besides leaking memory, this resulted in fc->num_waiting not being decremented and hence fuse_wait_aborted() left in a hanging and unkillable state. Fixes: 2d45ba381a74 ("fuse: add retrieve request") Fixes: b8f95e5d13f5 ("fuse: umount should wait for all requests") Reported-and-tested-by: syzbot+6339eda9cb4ebbc4c37b@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi Cc: #v2.6.36 Signed-off-by: Greg Kroah-Hartman commit e6fed825e0eb63fd8933842aeb827e5e15f8b965 Author: Lukas Czerner Date: Fri Nov 9 14:51:46 2018 +0100 fuse: fix use-after-free in fuse_direct_IO() commit ebacb81273599555a7a19f7754a1451206a5fc4f upstream. In async IO blocking case the additional reference to the io is taken for it to survive fuse_aio_complete(). In non blocking case this additional reference is not needed, however we still reference io to figure out whether to wait for completion or not. This is wrong and will lead to use-after-free. Fix it by storing blocking information in separate variable. This was spotted by KASAN when running generic/208 fstest. Signed-off-by: Lukas Czerner Reported-by: Zorro Lang Signed-off-by: Miklos Szeredi Fixes: 744742d692e3 ("fuse: Add reference counting for fuse_io_priv") Cc: # v4.6 Signed-off-by: Greg Kroah-Hartman commit 693a06b52aaba50349265857b45b5b1ebc5f11d9 Author: Maciej W. Rozycki Date: Mon Nov 5 03:48:25 2018 +0000 rtc: hctosys: Add missing range error reporting commit 7ce9a992ffde8ce93d5ae5767362a5c7389ae895 upstream. Fix an issue with the 32-bit range error path in `rtc_hctosys' where no error code is set and consequently the successful preceding call result from `rtc_read_time' is propagated to `rtc_hctosys_ret'. This in turn makes any subsequent call to `hctosys_show' incorrectly report in sysfs that the system time has been set from this RTC while it has not. Set the error to ERANGE then if we can't express the result due to an overflow. Signed-off-by: Maciej W. Rozycki Fixes: b3a5ac42ab18 ("rtc: hctosys: Ensure system time doesn't overflow time_t") Cc: stable@vger.kernel.org # 4.17+ Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit 651c5d16f3a90db286fb6d83a31eda29a2c07cda Author: Scott Mayhew Date: Thu Nov 8 11:11:36 2018 -0500 nfsd: COPY and CLONE operations require the saved filehandle to be set commit 01310bb7c9c98752cc763b36532fab028e0f8f81 upstream. Make sure we have a saved filehandle, otherwise we'll oops with a null pointer dereference in nfs4_preprocess_stateid_op(). Signed-off-by: Scott Mayhew Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields Signed-off-by: Greg Kroah-Hartman commit f194756edd6ce7c60b39b0f66fc8b59c48d312a8 Author: Trond Myklebust Date: Mon Nov 5 11:10:50 2018 -0500 NFSv4: Don't exit the state manager without clearing NFS4CLNT_MANAGER_RUNNING commit 21a446cf186570168b7281b154b1993968598aca upstream. If we exit the NFSv4 state manager due to a umount, then we can end up leaving the NFS4CLNT_MANAGER_RUNNING flag set. If another mount causes the nfs4_client to be rereferenced before it is destroyed, then we end up never being able to recover state. Fixes: 47c2199b6eb5 ("NFSv4.1: Ensure state manager thread dies on last ...") Signed-off-by: Trond Myklebust Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Greg Kroah-Hartman commit 7142f0dcc2c89e82b8ca2da47e3c54024043cd0a Author: Frank Sorenson Date: Tue Oct 30 15:10:40 2018 -0500 sunrpc: correct the computation for page_ptr when truncating commit 5d7a5bcb67c70cbc904057ef52d3fcfeb24420bb upstream. When truncating the encode buffer, the page_ptr is getting advanced, causing the next page to be skipped while encoding. The page is still included in the response, so the response contains a page of bogus data. We need to adjust the page_ptr backwards to ensure we encode the next page into the correct place. We saw this triggered when concurrent directory modifications caused nfsd4_encode_direct_fattr() to return nfserr_noent, and the resulting call to xdr_truncate_encode() corrupted the READDIR reply. Signed-off-by: Frank Sorenson Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields Signed-off-by: Greg Kroah-Hartman commit 13c70ef5b3c7b501d239344fe8cb6caf9ff037f7 Author: Christophe Leroy Date: Thu Sep 27 17:17:57 2018 +0000 kdb: print real address of pointers instead of hashed addresses commit 568fb6f42ac6851320adaea25f8f1b94de14e40a upstream. Since commit ad67b74d2469 ("printk: hash addresses printed with %p"), all pointers printed with %p are printed with hashed addresses instead of real addresses in order to avoid leaking addresses in dmesg and syslog. But this applies to kdb too, with is unfortunate: Entering kdb (current=0x(ptrval), pid 329) due to Keyboard Entry kdb> ps 15 sleeping system daemon (state M) processes suppressed, use 'ps A' to see all. Task Addr Pid Parent [*] cpu State Thread Command 0x(ptrval) 329 328 1 0 R 0x(ptrval) *sh 0x(ptrval) 1 0 0 0 S 0x(ptrval) init 0x(ptrval) 3 2 0 0 D 0x(ptrval) rcu_gp 0x(ptrval) 4 2 0 0 D 0x(ptrval) rcu_par_gp 0x(ptrval) 5 2 0 0 D 0x(ptrval) kworker/0:0 0x(ptrval) 6 2 0 0 D 0x(ptrval) kworker/0:0H 0x(ptrval) 7 2 0 0 D 0x(ptrval) kworker/u2:0 0x(ptrval) 8 2 0 0 D 0x(ptrval) mm_percpu_wq 0x(ptrval) 10 2 0 0 D 0x(ptrval) rcu_preempt The whole purpose of kdb is to debug, and for debugging real addresses need to be known. In addition, data displayed by kdb doesn't go into dmesg. This patch replaces all %p by %px in kdb in order to display real addresses. Fixes: ad67b74d2469 ("printk: hash addresses printed with %p") Cc: Signed-off-by: Christophe Leroy Signed-off-by: Daniel Thompson Signed-off-by: Greg Kroah-Hartman commit 6514d22a21adf5467607770e0bdc45b4232d205b Author: Christophe Leroy Date: Thu Sep 27 17:17:49 2018 +0000 kdb: use correct pointer when 'btc' calls 'btt' commit dded2e159208a9edc21dd5c5f583afa28d378d39 upstream. On a powerpc 8xx, 'btc' fails as follows: Entering kdb (current=0x(ptrval), pid 282) due to Keyboard Entry kdb> btc btc: cpu status: Currently on cpu 0 Available cpus: 0 kdb_getarea: Bad address 0x0 when booting the kernel with 'debug_boot_weak_hash', it fails as well Entering kdb (current=0xba99ad80, pid 284) due to Keyboard Entry kdb> btc btc: cpu status: Currently on cpu 0 Available cpus: 0 kdb_getarea: Bad address 0xba99ad80 On other platforms, Oopses have been observed too, see https://github.com/linuxppc/linux/issues/139 This is due to btc calling 'btt' with %p pointer as an argument. This patch replaces %p by %px to get the real pointer value as expected by 'btt' Fixes: ad67b74d2469 ("printk: hash addresses printed with %p") Cc: Signed-off-by: Christophe Leroy Reviewed-by: Daniel Thompson Signed-off-by: Daniel Thompson Signed-off-by: Greg Kroah-Hartman commit 6520fe9389466b7d1b970f61b853c03dcd6082c5 Author: Benjamin Coddington Date: Wed Oct 3 10:18:33 2018 -0400 mnt: fix __detach_mounts infinite loop commit 1e9c75fb9c47a75a9aec0cd17db5f6dc36b58e00 upstream. Since commit ff17fa561a04 ("d_invalidate(): unhash immediately") immediately unhashes the dentry, we'll never return the mountpoint in lookup_mountpoint(), which can lead to an unbreakable loop in d_invalidate(). I have reports of NFS clients getting into this condition after the server removes an export of an existing mount created through follow_automount(), but I suspect there are various other ways to produce this problem if we hunt down users of d_invalidate(). For example, it is possible to get into this state by using XFS' d_invalidate() call in xfs_vn_unlink(): truncate -s 100m img{1,2} mkfs.xfs -q -n version=ci img1 mkfs.xfs -q -n version=ci img2 mkdir -p /mnt/xfs mount img1 /mnt/xfs mkdir /mnt/xfs/sub1 mount img2 /mnt/xfs/sub1 cat > /mnt/xfs/sub1/foo & umount -l /mnt/xfs/sub1 mount img2 /mnt/xfs/sub1 mount --make-private /mnt/xfs mkdir /mnt/xfs/sub2 mount --move /mnt/xfs/sub1 /mnt/xfs/sub2 rmdir /mnt/xfs/sub1 Fix this by moving the check for an unlinked dentry out of the detach_mounts() path. Fixes: ff17fa561a04 ("d_invalidate(): unhash immediately") Cc: stable@vger.kernel.org Reviewed-by: "Eric W. Biederman" Signed-off-by: Benjamin Coddington Signed-off-by: Eric W. Biederman Signed-off-by: Greg Kroah-Hartman commit d1a5f8e4d27e76bc68541641233e5ae9a761b2de Author: Eric W. Biederman Date: Thu Oct 25 12:05:11 2018 -0500 mount: Prevent MNT_DETACH from disconnecting locked mounts commit 9c8e0a1b683525464a2abe9fb4b54404a50ed2b4 upstream. Timothy Baldwin wrote: > As per mount_namespaces(7) unprivileged users should not be able to look under mount points: > > Mounts that come as a single unit from more privileged mount are locked > together and may not be separated in a less privileged mount namespace. > > However they can: > > 1. Create a mount namespace. > 2. In the mount namespace open a file descriptor to the parent of a mount point. > 3. Destroy the mount namespace. > 4. Use the file descriptor to look under the mount point. > > I have reproduced this with Linux 4.16.18 and Linux 4.18-rc8. > > The setup: > > $ sudo sysctl kernel.unprivileged_userns_clone=1 > kernel.unprivileged_userns_clone = 1 > $ mkdir -p A/B/Secret > $ sudo mount -t tmpfs hide A/B > > > "Secret" is indeed hidden as expected: > > $ ls -lR A > A: > total 0 > drwxrwxrwt 2 root root 40 Feb 12 21:08 B > > A/B: > total 0 > > > The attack revealing "Secret": > > $ unshare -Umr sh -c "exec unshare -m ls -lR /proc/self/fd/4/ 4 /proc/self/fd/4/: > total 0 > drwxr-xr-x 3 root root 60 Feb 12 21:08 B > > /proc/self/fd/4/B: > total 0 > drwxr-xr-x 2 root root 40 Feb 12 21:08 Secret > > /proc/self/fd/4/B/Secret: > total 0 I tracked this down to put_mnt_ns running passing UMOUNT_SYNC and disconnecting all of the mounts in a mount namespace. Fix this by factoring drop_mounts out of drop_collected_mounts and passing 0 instead of UMOUNT_SYNC. There are two possible behavior differences that result from this. - No longer setting UMOUNT_SYNC will no longer set MNT_SYNC_UMOUNT on the vfsmounts being unmounted. This effects the lazy rcu walk by kicking the walk out of rcu mode and forcing it to be a non-lazy walk. - No longer disconnecting locked mounts will keep some mounts around longer as they stay because the are locked to other mounts. There are only two users of drop_collected mounts: audit_tree.c and put_mnt_ns. In audit_tree.c the mounts are private and there are no rcu lazy walks only calls to iterate_mounts. So the changes should have no effect except for a small timing effect as the connected mounts are disconnected. In put_mnt_ns there may be references from process outside the mount namespace to the mounts. So the mounts remaining connected will be the bug fix that is needed. That rcu walks are allowed to continue appears not to be a problem especially as the rcu walk change was about an implementation detail not about semantics. Cc: stable@vger.kernel.org Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Reported-by: Timothy Baldwin Tested-by: Timothy Baldwin Signed-off-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit 376732709e5d09b68139032541ec7766a49f9c8e Author: Eric W. Biederman Date: Thu Oct 25 09:04:18 2018 -0500 mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts commit df7342b240185d58d3d9665c0bbf0a0f5570ec29 upstream. Jonathan Calmels from NVIDIA reported that he's able to bypass the mount visibility security check in place in the Linux kernel by using a combination of the unbindable property along with the private mount propagation option to allow a unprivileged user to see a path which was purposefully hidden by the root user. Reproducer: # Hide a path to all users using a tmpfs root@castiana:~# mount -t tmpfs tmpfs /sys/devices/ root@castiana:~# # As an unprivileged user, unshare user namespace and mount namespace stgraber@castiana:~$ unshare -U -m -r # Confirm the path is still not accessible root@castiana:~# ls /sys/devices/ # Make /sys recursively unbindable and private root@castiana:~# mount --make-runbindable /sys root@castiana:~# mount --make-private /sys # Recursively bind-mount the rest of /sys over to /mnnt root@castiana:~# mount --rbind /sys/ /mnt # Access our hidden /sys/device as an unprivileged user root@castiana:~# ls /mnt/devices/ breakpoint cpu cstate_core cstate_pkg i915 intel_pt isa kprobe LNXSYSTM:00 msr pci0000:00 platform pnp0 power software system tracepoint uncore_arb uncore_cbox_0 uncore_cbox_1 uprobe virtual Solve this by teaching copy_tree to fail if a mount turns out to be both unbindable and locked. Cc: stable@vger.kernel.org Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Reported-by: Jonathan Calmels Signed-off-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit afae7f7336d502e11437da66ec876c8c13a846ee Author: Eric W. Biederman Date: Mon Oct 22 10:21:38 2018 -0500 mount: Retest MNT_LOCKED in do_umount commit 25d202ed820ee347edec0bf3bf553544556bf64b upstream. It was recently pointed out that the one instance of testing MNT_LOCKED outside of the namespace_sem is in ksys_umount. Fix that by adding a test inside of do_umount with namespace_sem and the mount_lock held. As it helps to fail fails the existing test is maintained with an additional comment pointing out that it may be racy because the locks are not held. Cc: stable@vger.kernel.org Reported-by: Al Viro Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Signed-off-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit e1d8594f10d14ffc2b1b6aea2a583280796a560c Author: Vasily Averin Date: Wed Nov 7 22:36:23 2018 -0500 ext4: fix buffer leak in __ext4_read_dirblock() on error path commit de59fae0043f07de5d25e02ca360f7d57bfa5866 upstream. Fixes: dc6982ff4db1 ("ext4: refactor code to read directory blocks ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 3.9 Signed-off-by: Greg Kroah-Hartman commit 4f1e8732299183adce98e69fd0e4cd5ce86e6820 Author: Vasily Averin Date: Wed Nov 7 11:14:35 2018 -0500 ext4: fix buffer leak in ext4_expand_extra_isize_ea() on error path commit 53692ec074d00589c2cf1d6d17ca76ad0adce6ec upstream. Fixes: de05ca852679 ("ext4: move call to ext4_error() into ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 4.17 Signed-off-by: Greg Kroah-Hartman commit 73b05bc9cfdc51954093c30ba5606ce6d4fdb8a8 Author: Vasily Averin Date: Wed Nov 7 11:10:21 2018 -0500 ext4: fix buffer leak in ext4_xattr_move_to_block() on error path commit 6bdc9977fcdedf47118d2caf7270a19f4b6d8a8f upstream. Fixes: 3f2571c1f91f ("ext4: factor out xattr moving") Fixes: 6dd4ee7cab7e ("ext4: Expand extra_inodes space per ...") Reviewed-by: Jan Kara Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 2.6.23 Signed-off-by: Greg Kroah-Hartman commit 93b0fc8e900551e70ec77828a665981c5e61d939 Author: Vasily Averin Date: Wed Nov 7 11:07:01 2018 -0500 ext4: release bs.bh before re-using in ext4_xattr_block_find() commit 45ae932d246f721e6584430017176cbcadfde610 upstream. bs.bh was taken in previous ext4_xattr_block_find() call, it should be released before re-using Fixes: 7e01c8e5420b ("ext3/4: fix uninitialized bs in ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 2.6.26 Signed-off-by: Greg Kroah-Hartman commit ef3af4ee5b8150f84ddf7037253510b295f57109 Author: Vasily Averin Date: Wed Nov 7 11:01:33 2018 -0500 ext4: fix buffer leak in ext4_xattr_get_block() on error path commit ecaaf408478b6fb4d9986f9b6652f3824e374f4c upstream. Fixes: dec214d00e0d ("ext4: xattr inode deduplication") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 4.13 Signed-off-by: Greg Kroah-Hartman commit 2efa77a4df208ac375d2a438852035240a570973 Author: Vasily Averin Date: Wed Nov 7 10:56:28 2018 -0500 ext4: fix possible leak of s_journal_flag_rwsem in error path commit af18e35bfd01e6d65a5e3ef84ffe8b252d1628c5 upstream. Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 4.7 Signed-off-by: Greg Kroah-Hartman commit 891dd04dc1ccf85ec374d2be04afe4dbddc9a88b Author: Theodore Ts'o Date: Wed Nov 7 10:32:53 2018 -0500 ext4: fix possible leak of sbi->s_group_desc_leak in error path commit 9e463084cdb22e0b56b2dfbc50461020409a5fd3 upstream. Fixes: bfe0a5f47ada ("ext4: add more mount time checks of the superblock") Reported-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 4.18 Signed-off-by: Greg Kroah-Hartman commit 0ff14c16afc80359875490b1e8224242f21c21b7 Author: Theodore Ts'o Date: Tue Nov 6 17:18:17 2018 -0500 ext4: avoid possible double brelse() in add_new_gdb() on error path commit 4f32c38b4662312dd3c5f113d8bdd459887fb773 upstream. Fixes: b40971426a83 ("ext4: add error checking to calls to ...") Reported-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 2.6.38 Signed-off-by: Greg Kroah-Hartman commit 84d88fc7e751440a7433c7433039cde3c2275beb Author: Vasily Averin Date: Tue Nov 6 16:16:01 2018 -0500 ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizing commit f348e2241fb73515d65b5d77dd9c174128a7fbf2 upstream. Fixes: 117fff10d7f1 ("ext4: grow the s_flex_groups array as needed ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 3.7 Signed-off-by: Greg Kroah-Hartman commit 78805335fb174f20012d6e36b6a162f00a8a399b Author: Vasily Averin Date: Tue Nov 6 17:01:36 2018 -0500 ext4: avoid buffer leak in ext4_orphan_add() after prior errors commit feaf264ce7f8d54582e2f66eb82dd9dd124c94f3 upstream. Fixes: d745a8c20c1f ("ext4: reduce contention on s_orphan_lock") Fixes: 6e3617e579e0 ("ext4: Handle non empty on-disk orphan link") Cc: Dmitry Monakhov Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 2.6.34 Signed-off-by: Greg Kroah-Hartman commit aac055dfa4eeaeb534460d18be5141d6be97a35f Author: Vasily Averin Date: Tue Nov 6 16:49:50 2018 -0500 ext4: avoid buffer leak on shutdown in ext4_mark_iloc_dirty() commit a6758309a005060b8297a538a457c88699cb2520 upstream. ext4_mark_iloc_dirty() callers expect that it releases iloc->bh even if it returns an error. Fixes: 0db1ff222d40 ("ext4: add shutdown bit and check for it") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 4.11 Signed-off-by: Greg Kroah-Hartman commit d61417997e5037c3343c15820ed3472375a093de Author: Vasily Averin Date: Tue Nov 6 16:20:40 2018 -0500 ext4: fix possible inode leak in the retry loop of ext4_resize_fs() commit db6aee62406d9fbb53315fcddd81f1dc271d49fa upstream. Fixes: 1c6bd7173d66 ("ext4: convert file system to meta_bg if needed ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 3.7 Signed-off-by: Greg Kroah-Hartman commit dd6d368f6f18bf30dad1bebba52ed097fb295bed Author: Vasily Averin Date: Fri Nov 9 11:34:40 2018 -0500 ext4: missing !bh check in ext4_xattr_inode_write() commit eb6984fa4ce2837dcb1f66720a600f31b0bb3739 upstream. According to Ted Ts'o ext4_getblk() called in ext4_xattr_inode_write() should not return bh = NULL The only time that bh could be NULL, then, would be in the case of something really going wrong; a programming error elsewhere (perhaps a wild pointer dereference) or I/O error causing on-disk file system corruption (although that would be highly unlikely given that we had *just* allocated the blocks and so the metadata blocks in question probably would still be in the cache). Fixes: e50e5129f384 ("ext4: xattr-in-inode support") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 4.13 Signed-off-by: Greg Kroah-Hartman commit c802502a2e808b8eac4fc155109572427d27b803 Author: Vasily Averin Date: Sat Nov 3 16:13:17 2018 -0400 ext4: avoid potential extra brelse in setup_new_flex_group_blocks() commit 9e4028935cca3f9ef9b6a90df9da6f1f94853536 upstream. Currently bh is set to NULL only during first iteration of for cycle, then this pointer is not cleared after end of using. Therefore rollback after errors can lead to extra brelse(bh) call, decrements bh counter and later trigger an unexpected warning in __brelse() Patch moves brelse() calls in body of cycle to exclude requirement of brelse() call in rollback. Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 3.3+ Signed-off-by: Greg Kroah-Hartman commit 56415293423bedcce47a67d7dd81b6f249d9bc76 Author: Vasily Averin Date: Sat Nov 3 16:50:08 2018 -0400 ext4: add missing brelse() add_new_gdb_meta_bg()'s error path commit 61a9c11e5e7a0dab5381afa5d9d4dd5ebf18f7a0 upstream. Fixes: 01f795f9e0d6 ("ext4: add online resizing support for meta_bg ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 3.7 Signed-off-by: Greg Kroah-Hartman commit ab88f38df8037950fa7da4470c8b0168136f35db Author: Vasily Averin Date: Sat Nov 3 16:22:10 2018 -0400 ext4: add missing brelse() in set_flexbg_block_bitmap()'s error path commit cea5794122125bf67559906a0762186cf417099c upstream. Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...") Cc: stable@kernel.org # 3.3 Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 5ee0b3985aeccf35627f0ada491451d15ec155c8 Author: Vasily Averin Date: Sat Nov 3 17:11:19 2018 -0400 ext4: add missing brelse() update_backups()'s error path commit ea0abbb648452cdb6e1734b702b6330a7448fcf8 upstream. Fixes: ac27a0ec112a ("ext4: initial copy of files from ext3") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 2.6.19 Signed-off-by: Greg Kroah-Hartman commit d4b97e20c7ea4057bf1d006cecf2270f6e0ec7b1 Author: Michael Kelley Date: Sun Nov 4 03:48:54 2018 +0000 clockevents/drivers/i8253: Add support for PIT shutdown quirk commit 35b69a420bfb56b7b74cb635ea903db05e357bec upstream. Add support for platforms where pit_shutdown() doesn't work because of a quirk in the PIT emulation. On these platforms setting the counter register to zero causes the PIT to start running again, negating the shutdown. Provide a global variable that controls whether the counter register is zero'ed, which platform specific code can override. Signed-off-by: Michael Kelley Signed-off-by: Thomas Gleixner Cc: "gregkh@linuxfoundation.org" Cc: "devel@linuxdriverproject.org" Cc: "daniel.lezcano@linaro.org" Cc: "virtualization@lists.linux-foundation.org" Cc: "jgross@suse.com" Cc: "akataria@vmware.com" Cc: "olaf@aepfle.de" Cc: "apw@canonical.com" Cc: vkuznets Cc: "jasowang@redhat.com" Cc: "marcelo.cerri@canonical.com" Cc: KY Srinivasan Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1541303219-11142-2-git-send-email-mikelley@microsoft.com Signed-off-by: Greg Kroah-Hartman commit a77da38bfda3ddd99c16befbce48239798fe46de Author: Filipe Manana Date: Mon Nov 5 11:14:17 2018 +0000 Btrfs: fix data corruption due to cloning of eof block commit ac765f83f1397646c11092a032d4f62c3d478b81 upstream. We currently allow cloning a range from a file which includes the last block of the file even if the file's size is not aligned to the block size. This is fine and useful when the destination file has the same size, but when it does not and the range ends somewhere in the middle of the destination file, it leads to corruption because the bytes between the EOF and the end of the block have undefined data (when there is support for discard/trimming they have a value of 0x00). Example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ export foo_size=$((256 * 1024 + 100)) $ xfs_io -f -c "pwrite -S 0x3c 0 $foo_size" /mnt/foo $ xfs_io -f -c "pwrite -S 0xb5 0 1M" /mnt/bar $ xfs_io -c "reflink /mnt/foo 0 512K $foo_size" /mnt/bar $ od -A d -t x1 /mnt/bar 0000000 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 * 0524288 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c * 0786528 3c 3c 3c 3c 00 00 00 00 00 00 00 00 00 00 00 00 0786544 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 0790528 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 * 1048576 The bytes in the range from 786532 (512Kb + 256Kb + 100 bytes) to 790527 (512Kb + 256Kb + 4Kb - 1) got corrupted, having now a value of 0x00 instead of 0xb5. This is similar to the problem we had for deduplication that got recently fixed by commit de02b9f6bb65 ("Btrfs: fix data corruption when deduplicating between different files"). Fix this by not allowing such operations to be performed and return the errno -EINVAL to user space. This is what XFS is doing as well at the VFS level. This change however now makes us return -EINVAL instead of -EOPNOTSUPP for cases where the source range maps to an inline extent and the destination range's end is smaller then the destination file's size, since the detection of inline extents is done during the actual process of dropping file extent items (at __btrfs_drop_extents()). Returning the -EINVAL error is done early on and solely based on the input parameters (offsets and length) and destination file's size. This makes us consistent with XFS and anyone else supporting cloning since this case is now checked at a higher level in the VFS and is where the -EINVAL will be returned from starting with kernel 4.20 (the VFS changed was introduced in 4.20-rc1 by commit 07d19dc9fbe9 ("vfs: avoid problematic remapping requests into partial EOF block"). So this change is more geared towards stable kernels, as it's unlikely the new VFS checks get removed intentionally. A test case for fstests follows soon, as well as an update to filter existing tests that expect -EOPNOTSUPP to accept -EINVAL as well. CC: # 4.4+ Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 0c4fb98c8f6f770d945964edba99de4ff8be75f3 Author: Filipe Manana Date: Mon Nov 5 11:14:05 2018 +0000 Btrfs: fix infinite loop on inode eviction after deduplication of eof block commit 11023d3f5fdf89bba5e1142127701ca6e6014587 upstream. If we attempt to deduplicate the last block of a file A into the middle of a file B, and file A's size is not a multiple of the block size, we end rounding the deduplication length to 0 bytes, to avoid the data corruption issue fixed by commit de02b9f6bb65 ("Btrfs: fix data corruption when deduplicating between different files"). However a length of zero will cause the insertion of an extent state with a start value greater (by 1) then the end value, leading to a corrupt extent state that will trigger a warning and cause chaos such as an infinite loop during inode eviction. Example trace: [96049.833585] ------------[ cut here ]------------ [96049.833714] WARNING: CPU: 0 PID: 24448 at fs/btrfs/extent_io.c:436 insert_state+0x101/0x120 [btrfs] [96049.833767] CPU: 0 PID: 24448 Comm: xfs_io Not tainted 4.19.0-rc7-btrfs-next-39 #1 [96049.833768] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [96049.833780] RIP: 0010:insert_state+0x101/0x120 [btrfs] [96049.833783] RSP: 0018:ffffafd2c3707af0 EFLAGS: 00010282 [96049.833785] RAX: 0000000000000000 RBX: 000000000004dfff RCX: 0000000000000006 [96049.833786] RDX: 0000000000000007 RSI: ffff99045c143230 RDI: ffff99047b2168a0 [96049.833787] RBP: ffff990457851cd0 R08: 0000000000000001 R09: 0000000000000000 [96049.833787] R10: ffffafd2c3707ab8 R11: 0000000000000000 R12: ffff9903b93b12c8 [96049.833788] R13: 000000000004e000 R14: ffffafd2c3707b80 R15: ffffafd2c3707b78 [96049.833790] FS: 00007f5c14e7d700(0000) GS:ffff99047b200000(0000) knlGS:0000000000000000 [96049.833791] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [96049.833792] CR2: 00007f5c146abff8 CR3: 0000000115f4c004 CR4: 00000000003606f0 [96049.833795] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [96049.833796] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [96049.833796] Call Trace: [96049.833809] __set_extent_bit+0x46c/0x6a0 [btrfs] [96049.833823] lock_extent_bits+0x6b/0x210 [btrfs] [96049.833831] ? _raw_spin_unlock+0x24/0x30 [96049.833841] ? test_range_bit+0xdf/0x130 [btrfs] [96049.833853] lock_extent_range+0x8e/0x150 [btrfs] [96049.833864] btrfs_double_extent_lock+0x78/0xb0 [btrfs] [96049.833875] btrfs_extent_same_range+0x14e/0x550 [btrfs] [96049.833885] ? rcu_read_lock_sched_held+0x3f/0x70 [96049.833890] ? __kmalloc_node+0x2b0/0x2f0 [96049.833899] ? btrfs_dedupe_file_range+0x19a/0x280 [btrfs] [96049.833909] btrfs_dedupe_file_range+0x270/0x280 [btrfs] [96049.833916] vfs_dedupe_file_range_one+0xd9/0xe0 [96049.833919] vfs_dedupe_file_range+0x131/0x1b0 [96049.833924] do_vfs_ioctl+0x272/0x6e0 [96049.833927] ? __fget+0x113/0x200 [96049.833931] ksys_ioctl+0x70/0x80 [96049.833933] __x64_sys_ioctl+0x16/0x20 [96049.833937] do_syscall_64+0x60/0x1b0 [96049.833939] entry_SYSCALL_64_after_hwframe+0x49/0xbe [96049.833941] RIP: 0033:0x7f5c1478ddd7 [96049.833943] RSP: 002b:00007ffe15b196a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [96049.833945] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5c1478ddd7 [96049.833946] RDX: 00005625ece322d0 RSI: 00000000c0189436 RDI: 0000000000000004 [96049.833947] RBP: 0000000000000000 R08: 00007f5c14a46f48 R09: 0000000000000040 [96049.833948] R10: 0000000000000541 R11: 0000000000000202 R12: 0000000000000000 [96049.833949] R13: 0000000000000000 R14: 0000000000000004 R15: 00005625ece322d0 [96049.833954] irq event stamp: 6196 [96049.833956] hardirqs last enabled at (6195): [] console_unlock+0x503/0x640 [96049.833958] hardirqs last disabled at (6196): [] trace_hardirqs_off_thunk+0x1a/0x1c [96049.833959] softirqs last enabled at (6114): [] __do_softirq+0x370/0x421 [96049.833964] softirqs last disabled at (6095): [] irq_exit+0xcd/0xe0 [96049.833965] ---[ end trace db7b05f01b7fa10c ]--- [96049.935816] R13: 0000000000000000 R14: 00005562e5259240 R15: 00007ffff092b910 [96049.935822] irq event stamp: 6584 [96049.935823] hardirqs last enabled at (6583): [] console_unlock+0x503/0x640 [96049.935825] hardirqs last disabled at (6584): [] trace_hardirqs_off_thunk+0x1a/0x1c [96049.935827] softirqs last enabled at (6328): [] __do_softirq+0x370/0x421 [96049.935828] softirqs last disabled at (6313): [] irq_exit+0xcd/0xe0 [96049.935829] ---[ end trace db7b05f01b7fa123 ]--- [96049.935840] ------------[ cut here ]------------ [96049.936065] WARNING: CPU: 1 PID: 24463 at fs/btrfs/extent_io.c:436 insert_state+0x101/0x120 [btrfs] [96049.936107] CPU: 1 PID: 24463 Comm: umount Tainted: G W 4.19.0-rc7-btrfs-next-39 #1 [96049.936108] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [96049.936117] RIP: 0010:insert_state+0x101/0x120 [btrfs] [96049.936119] RSP: 0018:ffffafd2c3637bc0 EFLAGS: 00010282 [96049.936120] RAX: 0000000000000000 RBX: 000000000004dfff RCX: 0000000000000006 [96049.936121] RDX: 0000000000000007 RSI: ffff990445cf88e0 RDI: ffff99047b2968a0 [96049.936122] RBP: ffff990457851cd0 R08: 0000000000000001 R09: 0000000000000000 [96049.936123] R10: ffffafd2c3637b88 R11: 0000000000000000 R12: ffff9904574301e8 [96049.936124] R13: 000000000004e000 R14: ffffafd2c3637c50 R15: ffffafd2c3637c48 [96049.936125] FS: 00007fe4b87e72c0(0000) GS:ffff99047b280000(0000) knlGS:0000000000000000 [96049.936126] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [96049.936128] CR2: 00005562e52618d8 CR3: 00000001151c8005 CR4: 00000000003606e0 [96049.936129] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [96049.936131] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [96049.936131] Call Trace: [96049.936141] __set_extent_bit+0x46c/0x6a0 [btrfs] [96049.936154] lock_extent_bits+0x6b/0x210 [btrfs] [96049.936167] btrfs_evict_inode+0x1e1/0x5a0 [btrfs] [96049.936172] evict+0xbf/0x1c0 [96049.936174] dispose_list+0x51/0x80 [96049.936176] evict_inodes+0x193/0x1c0 [96049.936180] generic_shutdown_super+0x3f/0x110 [96049.936182] kill_anon_super+0xe/0x30 [96049.936189] btrfs_kill_super+0x13/0x100 [btrfs] [96049.936191] deactivate_locked_super+0x3a/0x70 [96049.936193] cleanup_mnt+0x3b/0x80 [96049.936195] task_work_run+0x93/0xc0 [96049.936198] exit_to_usermode_loop+0xfa/0x100 [96049.936201] do_syscall_64+0x17f/0x1b0 [96049.936202] entry_SYSCALL_64_after_hwframe+0x49/0xbe [96049.936204] RIP: 0033:0x7fe4b80cfb37 [96049.936206] RSP: 002b:00007ffff092b688 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [96049.936207] RAX: 0000000000000000 RBX: 00005562e5259060 RCX: 00007fe4b80cfb37 [96049.936208] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00005562e525faa0 [96049.936209] RBP: 00005562e525faa0 R08: 00005562e525f770 R09: 0000000000000015 [96049.936210] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fe4b85d1e64 [96049.936211] R13: 0000000000000000 R14: 00005562e5259240 R15: 00007ffff092b910 [96049.936211] R13: 0000000000000000 R14: 00005562e5259240 R15: 00007ffff092b910 [96049.936216] irq event stamp: 6616 [96049.936219] hardirqs last enabled at (6615): [] console_unlock+0x503/0x640 [96049.936219] hardirqs last disabled at (6616): [] trace_hardirqs_off_thunk+0x1a/0x1c [96049.936222] softirqs last enabled at (6328): [] __do_softirq+0x370/0x421 [96049.936222] softirqs last disabled at (6313): [] irq_exit+0xcd/0xe0 [96049.936223] ---[ end trace db7b05f01b7fa124 ]--- The second stack trace, from inode eviction, is repeated forever due to the infinite loop during eviction. This is the same type of problem fixed way back in 2015 by commit 113e8283869b ("Btrfs: fix inode eviction infinite loop after extent_same ioctl") and commit ccccf3d67294 ("Btrfs: fix inode eviction infinite loop after cloning into it"). So fix this by returning immediately if the deduplication range length gets rounded down to 0 bytes, as there is nothing that needs to be done in such case. Example reproducer: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ xfs_io -f -c "pwrite -S 0xe6 0 100" /mnt/foo $ xfs_io -f -c "pwrite -S 0xe6 0 1M" /mnt/bar # Unmount the filesystem and mount it again so that we start without any # extent state records when we ask for the deduplication. $ umount /mnt $ mount /dev/sdb /mnt $ xfs_io -c "dedupe /mnt/foo 0 500K 100" /mnt/bar # This unmount triggers the infinite loop. $ umount /mnt A test case for fstests will follow soon. Fixes: de02b9f6bb65 ("Btrfs: fix data corruption when deduplicating between different files") CC: # 4.19+ Reviewed-by: Nikolay Borisov Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 1064b11af79de484f9de6474f7af5aedf700ea6d Author: Robbie Ko Date: Tue Oct 30 18:04:04 2018 +0800 Btrfs: fix cur_offset in the error case for nocow commit 506481b20e818db40b6198815904ecd2d6daee64 upstream. When the cow_file_range fails, the related resources are unlocked according to the range [start..end), so the unlock cannot be repeated in run_delalloc_nocow. In some cases (e.g. cur_offset <= end && cow_start != -1), cur_offset is not updated correctly, so move the cur_offset update before cow_file_range. kernel BUG at mm/page-writeback.c:2663! Internal error: Oops - BUG: 0 [#1] SMP CPU: 3 PID: 31525 Comm: kworker/u8:7 Tainted: P O Hardware name: Realtek_RTD1296 (DT) Workqueue: writeback wb_workfn (flush-btrfs-1) task: ffffffc076db3380 ti: ffffffc02e9ac000 task.ti: ffffffc02e9ac000 PC is at clear_page_dirty_for_io+0x1bc/0x1e8 LR is at clear_page_dirty_for_io+0x14/0x1e8 pc : [] lr : [] pstate: 40000145 sp : ffffffc02e9af4f0 Process kworker/u8:7 (pid: 31525, stack limit = 0xffffffc02e9ac020) Call trace: [] clear_page_dirty_for_io+0x1bc/0x1e8 [] extent_clear_unlock_delalloc+0x1e4/0x210 [btrfs] [] run_delalloc_nocow+0x3b8/0x948 [btrfs] [] run_delalloc_range+0x250/0x3a8 [btrfs] [] writepage_delalloc.isra.21+0xbc/0x1d8 [btrfs] [] __extent_writepage+0xe8/0x248 [btrfs] [] extent_write_cache_pages.isra.17+0x164/0x378 [btrfs] [] extent_writepages+0x48/0x68 [btrfs] [] btrfs_writepages+0x20/0x30 [btrfs] [] do_writepages+0x30/0x88 [] __writeback_single_inode+0x34/0x198 [] writeback_sb_inodes+0x184/0x3c0 [] __writeback_inodes_wb+0x6c/0xc0 [] wb_writeback+0x1b8/0x1c0 [] wb_workfn+0x150/0x250 [] process_one_work+0x1dc/0x388 [] worker_thread+0x130/0x500 [] kthread+0x10c/0x110 [] ret_from_fork+0x10/0x40 Code: d503201f a9025bb5 a90363b7 f90023b9 (d4210000) CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana Signed-off-by: Robbie Ko Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 383ceb87733bc9483e0058d91c95c24183b4c21f Author: Lu Fengqi Date: Wed Oct 24 20:24:03 2018 +0800 btrfs: fix pinned underflow after transaction aborted commit fcd5e74288f7d36991b1f0fb96b8c57079645e38 upstream. When running generic/475, we may get the following warning in dmesg: [ 6902.102154] WARNING: CPU: 3 PID: 18013 at fs/btrfs/extent-tree.c:9776 btrfs_free_block_groups+0x2af/0x3b0 [btrfs] [ 6902.109160] CPU: 3 PID: 18013 Comm: umount Tainted: G W O 4.19.0-rc8+ #8 [ 6902.110971] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 [ 6902.112857] RIP: 0010:btrfs_free_block_groups+0x2af/0x3b0 [btrfs] [ 6902.118921] RSP: 0018:ffffc9000459bdb0 EFLAGS: 00010286 [ 6902.120315] RAX: ffff880175050bb0 RBX: ffff8801124a8000 RCX: 0000000000170007 [ 6902.121969] RDX: 0000000000000002 RSI: 0000000000170007 RDI: ffffffff8125fb74 [ 6902.123716] RBP: ffff880175055d10 R08: 0000000000000000 R09: 0000000000000000 [ 6902.125417] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880175055d88 [ 6902.127129] R13: ffff880175050bb0 R14: 0000000000000000 R15: dead000000000100 [ 6902.129060] FS: 00007f4507223780(0000) GS:ffff88017ba00000(0000) knlGS:0000000000000000 [ 6902.130996] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6902.132558] CR2: 00005623599cac78 CR3: 000000014b700001 CR4: 00000000003606e0 [ 6902.134270] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6902.135981] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 6902.137836] Call Trace: [ 6902.138939] close_ctree+0x171/0x330 [btrfs] [ 6902.140181] ? kthread_stop+0x146/0x1f0 [ 6902.141277] generic_shutdown_super+0x6c/0x100 [ 6902.142517] kill_anon_super+0x14/0x30 [ 6902.143554] btrfs_kill_super+0x13/0x100 [btrfs] [ 6902.144790] deactivate_locked_super+0x2f/0x70 [ 6902.146014] cleanup_mnt+0x3b/0x70 [ 6902.147020] task_work_run+0x9e/0xd0 [ 6902.148036] do_syscall_64+0x470/0x600 [ 6902.149142] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 6902.150375] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 6902.151640] RIP: 0033:0x7f45077a6a7b [ 6902.157324] RSP: 002b:00007ffd589f3e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [ 6902.159187] RAX: 0000000000000000 RBX: 000055e8eec732b0 RCX: 00007f45077a6a7b [ 6902.160834] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000055e8eec73490 [ 6902.162526] RBP: 0000000000000000 R08: 000055e8eec734b0 R09: 00007ffd589f26c0 [ 6902.164141] R10: 0000000000000000 R11: 0000000000000246 R12: 000055e8eec73490 [ 6902.165815] R13: 00007f4507ac61a4 R14: 0000000000000000 R15: 00007ffd589f40d8 [ 6902.167553] irq event stamp: 0 [ 6902.168998] hardirqs last enabled at (0): [<0000000000000000>] (null) [ 6902.170731] hardirqs last disabled at (0): [] copy_process.part.55+0x3b0/0x1f00 [ 6902.172773] softirqs last enabled at (0): [] copy_process.part.55+0x3b0/0x1f00 [ 6902.174671] softirqs last disabled at (0): [<0000000000000000>] (null) [ 6902.176407] ---[ end trace 463138c2986b275c ]--- [ 6902.177636] BTRFS info (device dm-3): space_info 4 has 273465344 free, is not full [ 6902.179453] BTRFS info (device dm-3): space_info total=276824064, used=4685824, pinned=18446744073708158976, reserved=0, may_use=0, readonly=65536 In the above line there's "pinned=18446744073708158976" which is an unsigned u64 value of -1392640, an obvious underflow. When transaction_kthread is running cleanup_transaction(), another fsstress is running btrfs_commit_transaction(). The btrfs_finish_extent_commit() may get the same range as btrfs_destroy_pinned_extent() got, which causes the pinned underflow. Fixes: d4b450cd4b33 ("Btrfs: fix race between transaction commit and empty block group removal") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik Signed-off-by: Lu Fengqi Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 57da76a94a8a8d85b633851daa1395f813a1cd6a Author: Mathieu Malaterre Date: Wed Jun 6 21:42:32 2018 +0200 watchdog/core: Add missing prototypes for weak functions commit 81bd415c91eb966118d773dddf254aebf3022411 upstream. The split out of the hard lockup detector exposed two new weak functions, but no prototypes for them, which triggers the build warning: kernel/watchdog.c:109:12: warning: no previous prototype for ‘watchdog_nmi_enable’ [-Wmissing-prototypes] kernel/watchdog.c:115:13: warning: no previous prototype for ‘watchdog_nmi_disable’ [-Wmissing-prototypes] Add the prototypes. Fixes: 73ce0511c436 ("kernel/watchdog.c: move hardlockup detector to separate file") Signed-off-by: Mathieu Malaterre Signed-off-by: Thomas Gleixner Cc: Babu Moger Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180606194232.17653-1-malat@debian.org Signed-off-by: Greg Kroah-Hartman commit 11cf9be2e473a4db1f8a90a80a106a75e80f5e21 Author: H. Peter Anvin (Intel) Date: Mon Oct 22 09:19:05 2018 -0700 arch/alpha, termios: implement BOTHER, IBSHIFT and termios2 commit d0ffb805b729322626639336986bc83fc2e60871 upstream. Alpha has had c_ispeed and c_ospeed, but still set speeds in c_cflags using arbitrary flags. Because BOTHER is not defined, the general Linux code doesn't allow setting arbitrary baud rates, and because CBAUDEX == 0, we can have an array overrun of the baud_rate[] table in drivers/tty/tty_baudrate.c if (c_cflags & CBAUD) == 037. Resolve both problems by #defining BOTHER to 037 on Alpha. However, userspace still needs to know if setting BOTHER is actually safe given legacy kernels (does anyone actually care about that on Alpha anymore?), so enable the TCGETS2/TCSETS*2 ioctls on Alpha, even though they use the same structure. Define struct termios2 just for compatibility; it is the exact same structure as struct termios. In a future patchset, this will be cleaned up so the uapi headers are usable from libc. Signed-off-by: H. Peter Anvin (Intel) Cc: Jiri Slaby Cc: Al Viro Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Matt Turner Cc: Thomas Gleixner Cc: Kate Stewart Cc: Philippe Ombredanne Cc: Eugene Syromiatnikov Cc: Cc: Cc: Johan Hovold Cc: Alan Cox Cc: Signed-off-by: Greg Kroah-Hartman commit 7d7a750eabf2eb0335524fdd5bb10587b2e906d3 Author: H. Peter Anvin Date: Mon Oct 22 09:19:04 2018 -0700 termios, tty/tty_baudrate.c: fix buffer overrun commit 991a25194097006ec1e0d2e0814ff920e59e3465 upstream. On architectures with CBAUDEX == 0 (Alpha and PowerPC), the code in tty_baudrate.c does not do any limit checking on the tty_baudrate[] array, and in fact a buffer overrun is possible on both architectures. Add a limit check to prevent that situation. This will be followed by a much bigger cleanup/simplification patch. Signed-off-by: H. Peter Anvin (Intel) Requested-by: Cc: Johan Hovold Cc: Jiri Slaby Cc: Al Viro Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Matt Turner Cc: Thomas Gleixner Cc: Kate Stewart Cc: Philippe Ombredanne Cc: Eugene Syromiatnikov Cc: Alan Cox Cc: stable Signed-off-by: Greg Kroah-Hartman commit d74a4fc841cf6384aa3f4b5cdf3ea86fd94e2bae Author: Michael Kelley Date: Sun Nov 4 03:48:57 2018 +0000 x86/hyper-v: Enable PIT shutdown quirk commit 1de72c706488b7be664a601cf3843bd01e327e58 upstream. Hyper-V emulation of the PIT has a quirk such that the normal PIT shutdown path doesn't work, because clearing the counter register restarts the timer. Disable the counter clearing on PIT shutdown. Signed-off-by: Michael Kelley Signed-off-by: Thomas Gleixner Cc: "gregkh@linuxfoundation.org" Cc: "devel@linuxdriverproject.org" Cc: "daniel.lezcano@linaro.org" Cc: "virtualization@lists.linux-foundation.org" Cc: "jgross@suse.com" Cc: "akataria@vmware.com" Cc: "olaf@aepfle.de" Cc: "apw@canonical.com" Cc: vkuznets Cc: "jasowang@redhat.com" Cc: "marcelo.cerri@canonical.com" Cc: KY Srinivasan Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1541303219-11142-3-git-send-email-mikelley@microsoft.com Signed-off-by: Greg Kroah-Hartman commit bcdff99a9348344424d25994dda9dedef5d3edd3 Author: Steven Rostedt (VMware) Date: Fri Nov 9 15:22:07 2018 -0500 x86/cpu/vmware: Do not trace vmware_sched_clock() commit 15035388439f892017d38b05214d3cda6578af64 upstream. When running function tracing on a Linux guest running on VMware Workstation, the guest would crash. This is due to tracing of the sched_clock internal call of the VMware vmware_sched_clock(), which causes an infinite recursion within the tracing code (clock calls must not be traced). Make vmware_sched_clock() not traced by ftrace. Fixes: 80e9a4f21fd7c ("x86/vmware: Add paravirt sched clock") Reported-by: GwanYeong Kim Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Borislav Petkov CC: Alok Kataria CC: GwanYeong Kim CC: "H. Peter Anvin" CC: Ingo Molnar Cc: stable@vger.kernel.org CC: Thomas Gleixner CC: virtualization@lists.linux-foundation.org CC: x86-ml Link: http://lkml.kernel.org/r/20181109152207.4d3e7d70@gandalf.local.home Signed-off-by: Greg Kroah-Hartman commit 9eaed90225025d854fbdb89bdc529a89e974b5a3 Author: John Garry Date: Thu Nov 8 18:17:03 2018 +0800 of, numa: Validate some distance map rules commit 89c38422e072bb453e3045b8f1b962a344c3edea upstream. Currently the NUMA distance map parsing does not validate the distance table for the distance-matrix rules 1-2 in [1]. However the arch NUMA code may enforce some of these rules, but not all. Such is the case for the arm64 port, which does not enforce the rule that the distance between separates nodes cannot equal LOCAL_DISTANCE. The patch adds the following rules validation: - distance of node to self equals LOCAL_DISTANCE - distance of separate nodes > LOCAL_DISTANCE This change avoids a yet-unresolved crash reported in [2]. A note on dealing with symmetrical distances between nodes: Validating symmetrical distances between nodes is difficult. If it were mandated in the bindings that every distance must be recorded in the table, then it would be easy. However, it isn't. In addition to this, it is also possible to record [b, a] distance only (and not [a, b]). So, when processing the table for [b, a], we cannot assert that current distance of [a, b] != [b, a] as invalid, as [a, b] distance may not be present in the table and current distance would be default at REMOTE_DISTANCE. As such, we maintain the policy that we overwrite distance [a, b] = [b, a] for b > a. This policy is different to kernel ACPI SLIT validation, which allows non-symmetrical distances (ACPI spec SLIT rules allow it). However, the distance debug message is dropped as it may be misleading (for a distance which is later overwritten). Some final notes on semantics: - It is implied that it is the responsibility of the arch NUMA code to reset the NUMA distance map for an error in distance map parsing. - It is the responsibility of the FW NUMA topology parsing (whether OF or ACPI) to enforce NUMA distance rules, and not arch NUMA code. [1] Documents/devicetree/bindings/numa.txt [2] https://www.spinics.net/lists/arm-kernel/msg683304.html Cc: stable@vger.kernel.org # 4.7 Signed-off-by: John Garry Acked-by: Will Deacon Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit 688995faeb6f810148e9db511cdfe4ebc32a39e4 Author: Thomas Richter Date: Tue Oct 23 17:16:16 2018 +0200 perf stat: Handle different PMU names with common prefix commit ea1fa48c055f833eb25f0c33188feecb7002ada5 upstream. On s390 the CPU Measurement Facility for counters now supports 2 PMUs named cpum_cf (CPU Measurement Facility for counters) and cpum_cf_diag (CPU Measurement Facility for diagnostic counters) for one and the same CPU. Running command [root@s35lp76 perf]# ./perf stat -e tx_c_tend \ -- ~/mytests/cf-tx-events 1 Measuring transactions TX_C_TABORT_NO_SPECIAL: 0 expected:0 TX_C_TABORT_SPECIAL: 0 expected:0 TX_C_TEND: 1 expected:1 TX_NC_TABORT: 11 expected:11 TX_NC_TEND: 1 expected:1 Performance counter stats for '/root/mytests/cf-tx-events 1': 2 tx_c_tend 0.002120091 seconds time elapsed 0.000121000 seconds user 0.002127000 seconds sys [root@s35lp76 perf]# displays output which is unexpected (and wrong): 2 tx_c_tend The test program definitely triggers only one transaction, as shown in line 'TX_C_TEND: 1 expected:1'. This is caused by the following call sequence: pmu_lookup() scans and installs a PMU. +--> pmu_aliases() parses all aliases in directory ...//events/* which are file names. +--> pmu_aliases_parse() Read each file in directory and create an new alias entry. This is done with +--> perf_pmu__new_alias() and +--> __perf_pmu__new_alias() which also check for identical alias names. After pmu_aliases() returns, a complete list of event names for this pmu has been created. Now function pmu_add_cpu_aliases() is called to add the events listed in the json | files to the alias list of the cpu. +--> perf_pmu__find_map() Returns a pointer to the json events. Now function pmu_add_cpu_aliases() scans through all events listed in the JSON files for this CPU. Each json event pmu name is compared with the current PMU being built up and if they mismatch, the json event is added to the current PMUs alias list. To avoid duplicate entries the following comparison is done: if (!is_arm_pmu_core(name)) { pname = pe->pmu ? pe->pmu : "cpu"; if (strncmp(pname, name, strlen(pname))) continue; } The culprit is the strncmp() function. Using current s390 PMU naming, the first PMU is 'cpum_cf' and a long list of events is added, among them 'tx_c_tend' When the second PMU named 'cpum_cf_diag' is added, only one event named 'CF_DIAG' is added by the pmu_aliases() function. Now function pmu_add_cpu_aliases() is invoked for PMU 'cpum_cf_diag'. Since the CPUID string is the same for both PMUs, json file events for PMU named 'cpum_cf' are added to the PMU 'cpm_cf_diag' This happens because the strncmp() actually compares: strncmp("cpum_cf", "cpum_cf_diag", 6); The first parameter is the pmu name taken from the event in the json file. The second parameter is the pmu name of the PMU currently being built. They are different, but the length of the compare only tests the common prefix and this returns 0(true) when it should return false. Now all events for PMU cpum_cf are added to the alias list for pmu cpum_cf_diag. Later on in function parse_events_add_pmu() the event 'tx_c_end' is searched in all available PMUs and found twice, adding it two times to the evsel_list global variable which is the root of all events. This results in a counter value of 2 instead of 1. Output with this patch: [root@s35lp76 perf]# ./perf stat -e tx_c_tend \ -- ~/mytests/cf-tx-events 1 Measuring transactions TX_C_TABORT_NO_SPECIAL: 0 expected:0 TX_C_TABORT_SPECIAL: 0 expected:0 TX_C_TEND: 1 expected:1 TX_NC_TABORT: 11 expected:11 TX_NC_TEND: 1 expected:1 Performance counter stats for '/root/mytests/cf-tx-events 1': 1 tx_c_tend 0.001815365 seconds time elapsed 0.000123000 seconds user 0.001756000 seconds sys [root@s35lp76 perf]# Signed-off-by: Thomas Richter Reviewed-by: Hendrik Brueckner Reviewed-by: Sebastien Boisvert Cc: Heiko Carstens Cc: Kan Liang Cc: Martin Schwidefsky Cc: stable@vger.kernel.org Fixes: 292c34c10249 ("perf pmu: Fix core PMU alias list for X86 platform") Link: http://lkml.kernel.org/r/20181023151616.78193-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Greg Kroah-Hartman commit b66ad9290fbf2003c6407cfc953b327160be24ba Author: Dmitry Osipenko Date: Wed Oct 24 22:37:13 2018 +0300 hwmon: (core) Fix double-free in __hwmon_device_register() commit 74e3512731bd5c9673176425a76a7cc5efa8ddb6 upstream. Fix double-free that happens when thermal zone setup fails, see KASAN log below. ================================================================== BUG: KASAN: double-free or invalid-free in __hwmon_device_register+0x5dc/0xa7c CPU: 0 PID: 132 Comm: kworker/0:2 Tainted: G B 4.19.0-rc8-next-20181016-00042-gb52cd80401e9-dirty #41 Hardware name: NVIDIA Tegra SoC (Flattened Device Tree) Workqueue: events deferred_probe_work_func Backtrace: [] (dump_backtrace) from [] (show_stack+0x20/0x24) [] (show_stack) from [] (dump_stack+0x9c/0xb0) [] (dump_stack) from [] (print_address_description+0x68/0x250) [] (print_address_description) from [] (kasan_report_invalid_free+0x68/0x88) [] (kasan_report_invalid_free) from [] (__kasan_slab_free+0x1f4/0x200) [] (__kasan_slab_free) from [] (kasan_slab_free+0x14/0x18) [] (kasan_slab_free) from [] (kfree+0x90/0x294) [] (kfree) from [] (__hwmon_device_register+0x5dc/0xa7c) [] (__hwmon_device_register) from [] (hwmon_device_register_with_info+0xa0/0xa8) [] (hwmon_device_register_with_info) from [] (devm_hwmon_device_register_with_info+0x74/0xb4) [] (devm_hwmon_device_register_with_info) from [] (lm90_probe+0x414/0x578) [] (lm90_probe) from [] (i2c_device_probe+0x35c/0x384) [] (i2c_device_probe) from [] (really_probe+0x290/0x3e4) [] (really_probe) from [] (driver_probe_device+0x80/0x1c4) [] (driver_probe_device) from [] (__device_attach_driver+0x104/0x11c) [] (__device_attach_driver) from [] (bus_for_each_drv+0xa4/0xc8) [] (bus_for_each_drv) from [] (__device_attach+0xf0/0x15c) [] (__device_attach) from [] (device_initial_probe+0x1c/0x20) [] (device_initial_probe) from [] (bus_probe_device+0xdc/0xec) [] (bus_probe_device) from [] (deferred_probe_work_func+0xa8/0xd4) [] (deferred_probe_work_func) from [] (process_one_work+0x3dc/0x96c) [] (process_one_work) from [] (worker_thread+0x4ec/0x8bc) [] (worker_thread) from [] (kthread+0x230/0x240) [] (kthread) from [] (ret_from_fork+0x14/0x38) Exception stack(0xcf743fb0 to 0xcf743ff8) 3fa0: 00000000 00000000 00000000 00000000 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 Allocated by task 132: kasan_kmalloc.part.1+0x58/0xf4 kasan_kmalloc+0x90/0xa4 kmem_cache_alloc_trace+0x90/0x2a0 __hwmon_device_register+0xbc/0xa7c hwmon_device_register_with_info+0xa0/0xa8 devm_hwmon_device_register_with_info+0x74/0xb4 lm90_probe+0x414/0x578 i2c_device_probe+0x35c/0x384 really_probe+0x290/0x3e4 driver_probe_device+0x80/0x1c4 __device_attach_driver+0x104/0x11c bus_for_each_drv+0xa4/0xc8 __device_attach+0xf0/0x15c device_initial_probe+0x1c/0x20 bus_probe_device+0xdc/0xec deferred_probe_work_func+0xa8/0xd4 process_one_work+0x3dc/0x96c worker_thread+0x4ec/0x8bc kthread+0x230/0x240 ret_from_fork+0x14/0x38 (null) Freed by task 132: __kasan_slab_free+0x12c/0x200 kasan_slab_free+0x14/0x18 kfree+0x90/0x294 hwmon_dev_release+0x1c/0x20 device_release+0x4c/0xe8 kobject_put+0xac/0x11c device_unregister+0x2c/0x30 __hwmon_device_register+0xa58/0xa7c hwmon_device_register_with_info+0xa0/0xa8 devm_hwmon_device_register_with_info+0x74/0xb4 lm90_probe+0x414/0x578 i2c_device_probe+0x35c/0x384 really_probe+0x290/0x3e4 driver_probe_device+0x80/0x1c4 __device_attach_driver+0x104/0x11c bus_for_each_drv+0xa4/0xc8 __device_attach+0xf0/0x15c device_initial_probe+0x1c/0x20 bus_probe_device+0xdc/0xec deferred_probe_work_func+0xa8/0xd4 process_one_work+0x3dc/0x96c worker_thread+0x4ec/0x8bc kthread+0x230/0x240 ret_from_fork+0x14/0x38 (null) Cc: # v4.15+ Fixes: 47c332deb8e8 ("hwmon: Deal with errors from the thermal subsystem") Signed-off-by: Dmitry Osipenko Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 1ef2d80cb3da62d3b2d5ff0a3f3fd7b62767941d Author: Arnd Bergmann Date: Thu Oct 11 13:06:16 2018 +0200 mtd: docg3: don't set conflicting BCH_CONST_PARAMS option commit be2e1c9dcf76886a83fb1c433a316e26d4ca2550 upstream. I noticed during the creation of another bugfix that the BCH_CONST_PARAMS option that is set by DOCG3 breaks setting variable parameters for any other users of the BCH library code. The only other user we have today is the MTD_NAND software BCH implementation (most flash controllers use hardware BCH these days and are not affected). I considered removing BCH_CONST_PARAMS entirely because of the inherent conflict, but according to the description in lib/bch.c there is a significant performance benefit in keeping it. To avoid the immediate problem of the conflict between MTD_NAND_BCH and DOCG3, this only sets the constant parameters if MTD_NAND_BCH is disabled, which should fix the problem for all cases that are affected. This should also work for all stable kernels. Note that there is only one machine that actually seems to use the DOCG3 driver (arch/arm/mach-pxa/mioa701.c), so most users should have the driver disabled, but it almost certainly shows up if we wanted to test random kernels on machines that use software BCH in MTD. Fixes: d13d19ece39f ("mtd: docg3: add ECC correction code") Cc: stable@vger.kernel.org Cc: Robert Jarzmik Signed-off-by: Arnd Bergmann Signed-off-by: Boris Brezillon Signed-off-by: Greg Kroah-Hartman commit ca589cb08cd6003f5dd5682dc0bfa3c90d0e41f0 Author: Boris Brezillon Date: Sun Oct 28 12:29:55 2018 +0100 mtd: nand: Fix nanddev_neraseblocks() commit d098093ba06eb032057d1aca1c2e45889e099d00 upstream. nanddev_neraseblocks() currently returns the number pages per LUN instead of the total number of eraseblocks. Fixes: 9c3736a3de21 ("mtd: nand: Add core infrastructure to deal with NAND devices") Cc: Signed-off-by: Boris Brezillon Reviewed-by: Miquel Raynal Signed-off-by: Greg Kroah-Hartman commit 98f1ce39c92c8c04d1bf53bff9fa9eb22b442f8c Author: Christophe JAILLET Date: Tue Oct 16 09:13:46 2018 +0200 mtd: spi-nor: cadence-quadspi: Return error code in cqspi_direct_read_execute() commit 91d7b67000c6e9bd605624079fee5a084238ad92 upstream. We return 0 unconditionally in 'cqspi_direct_read_execute()'. However, 'ret' is set to some error codes in several error handling paths. Return 'ret' instead to propagate the error code. Fixes: ffa639e069fb ("mtd: spi-nor: cadence-quadspi: Add DMA support for direct mode reads") Cc: Signed-off-by: Christophe JAILLET Signed-off-by: Boris Brezillon Signed-off-by: Greg Kroah-Hartman commit ab2b363733ac114797ae6e5378a0b3cd15501175 Author: Jarod Wilson Date: Sun Nov 4 14:59:46 2018 -0500 bonding/802.3ad: fix link_failure_count tracking commit ea53abfab960909d622ca37bcfb8e1c5378d21cc upstream. Commit 4d2c0cda07448ea6980f00102dc3964eb25e241c set slave->link to BOND_LINK_DOWN for 802.3ad bonds whenever invalid speed/duplex values were read, to fix a problem with slaves getting into weird states, but in the process, broke tracking of link failures, as going straight to BOND_LINK_DOWN when a link is indeed down (cable pulled, switch rebooted) means we broke out of bond_miimon_inspect()'s BOND_LINK_DOWN case because !link_state was already true, we never incremented commit, and never got a chance to call bond_miimon_commit(), where slave->link_failure_count would be incremented. I believe the simple fix here is to mark the slave as BOND_LINK_FAIL, and let bond_miimon_inspect() transition the link from _FAIL to either _UP or _DOWN, and in the latter case, we now get proper incrementing of link_failure_count again. Fixes: 4d2c0cda0744 ("bonding: speed/duplex update at NETDEV_UP event") CC: Mahesh Bandewar CC: David S. Miller CC: netdev@vger.kernel.org CC: stable@vger.kernel.org Signed-off-by: Jarod Wilson Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e36798fc10e22e771dd3de78b0bf801233e2b1f2 Author: Ard Biesheuvel Date: Mon Nov 5 14:54:56 2018 +0100 ARM: 8809/1: proc-v7: fix Thumb annotation of cpu_v7_hvc_switch_mm commit 6282e916f774e37845c65d1eae9f8c649004f033 upstream. Due to what appears to be a copy/paste error, the opening ENTRY() of cpu_v7_hvc_switch_mm() lacks a matching ENDPROC(), and instead, the one for cpu_v7_smc_switch_mm() is duplicated. Given that it is ENDPROC() that emits the Thumb annotation, the cpu_v7_hvc_switch_mm() routine will be called in ARM mode on a Thumb2 kernel, resulting in the following splat: Internal error: Oops - undefined instruction: 0 [#1] SMP THUMB2 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc1-00030-g4d28ad89189d-dirty #488 Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 PC is at cpu_v7_hvc_switch_mm+0x12/0x18 LR is at flush_old_exec+0x31b/0x570 pc : [] lr : [] psr: 00000013 sp : ee899e50 ip : 00000000 fp : 00000001 r10: eda28f34 r9 : eda31800 r8 : c12470e0 r7 : eda1fc00 r6 : eda53000 r5 : 00000000 r4 : ee88c000 r3 : c0316eec r2 : 00000001 r1 : eda53000 r0 : 6da6c000 Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Note the 'ISA ARM' in the last line. Fix this by using the correct name in ENDPROC(). Cc: Fixes: 10115105cb3a ("ARM: spectre-v2: add firmware based hardening") Reviewed-by: Dave Martin Acked-by: Marc Zyngier Signed-off-by: Ard Biesheuvel Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit 5d64390cff09abb62aad05eea92255cd701bae1d Author: Vasily Khoruzhick Date: Thu Oct 25 12:15:43 2018 -0700 netfilter: conntrack: fix calculation of next bucket number in early_drop commit f393808dc64149ccd0e5a8427505ba2974a59854 upstream. If there's no entry to drop in bucket that corresponds to the hash, early_drop() should look for it in other buckets. But since it increments hash instead of bucket number, it actually looks in the same bucket 8 times: hsize is 16k by default (14 bits) and hash is 32-bit value, so reciprocal_scale(hash, hsize) returns the same value for hash..hash+7 in most cases. Fix it by increasing bucket number instead of hash and rename _hash to bucket to avoid future confusion. Fixes: 3e86638e9a0b ("netfilter: conntrack: consider ct netns in early_drop logic") Cc: # v4.7+ Signed-off-by: Vasily Khoruzhick Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit ca1c5698fa94635e74ff9ea7b46906b0e04c4076 Author: Michal Hocko Date: Fri Nov 2 15:48:46 2018 -0700 memory_hotplug: cond_resched in __remove_pages commit dd33ad7b251f900481701b2a82d25de583867708 upstream. We have received a bug report that unbinding a large pmem (>1TB) can result in a soft lockup: NMI watchdog: BUG: soft lockup - CPU#9 stuck for 23s! [ndctl:4365] [...] Supported: Yes CPU: 9 PID: 4365 Comm: ndctl Not tainted 4.12.14-94.40-default #1 SLE12-SP4 Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.01.00.0833.051120182255 05/11/2018 task: ffff9cce7d4410c0 task.stack: ffffbe9eb1bc4000 RIP: 0010:__put_page+0x62/0x80 Call Trace: devm_memremap_pages_release+0x152/0x260 release_nodes+0x18d/0x1d0 device_release_driver_internal+0x160/0x210 unbind_store+0xb3/0xe0 kernfs_fop_write+0x102/0x180 __vfs_write+0x26/0x150 vfs_write+0xad/0x1a0 SyS_write+0x42/0x90 do_syscall_64+0x74/0x150 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7fd13166b3d0 It has been reported on an older (4.12) kernel but the current upstream code doesn't cond_resched in the hot remove code at all and the given range to remove might be really large. Fix the issue by calling cond_resched once per memory section. Link: http://lkml.kernel.org/r/20181031125840.23982-1-mhocko@kernel.org Signed-off-by: Michal Hocko Acked-by: Johannes Thumshirn Cc: Dan Williams Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit a01e1536fc68e47b6a87b43133a3a92956db283e Author: Andrea Arcangeli Date: Fri Nov 2 15:47:59 2018 -0700 mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings commit ac5b2c18911ffe95c08d69273917f90212cf5659 upstream. THP allocation might be really disruptive when allocated on NUMA system with the local node full or hard to reclaim. Stefan has posted an allocation stall report on 4.12 based SLES kernel which suggests the same issue: kvm: page allocation stalls for 194572ms, order:9, mode:0x4740ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null) kvm cpuset=/ mems_allowed=0-1 CPU: 10 PID: 84752 Comm: kvm Tainted: G W 4.12.0+98-ph 0000001 SLE15 (unreleased) Hardware name: Supermicro SYS-1029P-WTRT/X11DDW-NT, BIOS 2.0 12/05/2017 Call Trace: dump_stack+0x5c/0x84 warn_alloc+0xe0/0x180 __alloc_pages_slowpath+0x820/0xc90 __alloc_pages_nodemask+0x1cc/0x210 alloc_pages_vma+0x1e5/0x280 do_huge_pmd_wp_page+0x83f/0xf00 __handle_mm_fault+0x93d/0x1060 handle_mm_fault+0xc6/0x1b0 __do_page_fault+0x230/0x430 do_page_fault+0x2a/0x70 page_fault+0x7b/0x80 [...] Mem-Info: active_anon:126315487 inactive_anon:1612476 isolated_anon:5 active_file:60183 inactive_file:245285 isolated_file:0 unevictable:15657 dirty:286 writeback:1 unstable:0 slab_reclaimable:75543 slab_unreclaimable:2509111 mapped:81814 shmem:31764 pagetables:370616 bounce:0 free:32294031 free_pcp:6233 free_cma:0 Node 0 active_anon:254680388kB inactive_anon:1112760kB active_file:240648kB inactive_file:981168kB unevictable:13368kB isolated(anon):0kB isolated(file):0kB mapped:280240kB dirty:1144kB writeback:0kB shmem:95832kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 81225728kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no Node 1 active_anon:250583072kB inactive_anon:5337144kB active_file:84kB inactive_file:0kB unevictable:49260kB isolated(anon):20kB isolated(file):0kB mapped:47016kB dirty:0kB writeback:4kB shmem:31224kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 31897600kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no The defrag mode is "madvise" and from the above report it is clear that the THP has been allocated for MADV_HUGEPAGA vma. Andrea has identified that the main source of the problem is __GFP_THISNODE usage: : The problem is that direct compaction combined with the NUMA : __GFP_THISNODE logic in mempolicy.c is telling reclaim to swap very : hard the local node, instead of failing the allocation if there's no : THP available in the local node. : : Such logic was ok until __GFP_THISNODE was added to the THP allocation : path even with MPOL_DEFAULT. : : The idea behind the __GFP_THISNODE addition, is that it is better to : provide local memory in PAGE_SIZE units than to use remote NUMA THP : backed memory. That largely depends on the remote latency though, on : threadrippers for example the overhead is relatively low in my : experience. : : The combination of __GFP_THISNODE and __GFP_DIRECT_RECLAIM results in : extremely slow qemu startup with vfio, if the VM is larger than the : size of one host NUMA node. This is because it will try very hard to : unsuccessfully swapout get_user_pages pinned pages as result of the : __GFP_THISNODE being set, instead of falling back to PAGE_SIZE : allocations and instead of trying to allocate THP on other nodes (it : would be even worse without vfio type1 GUP pins of course, except it'd : be swapping heavily instead). Fix this by removing __GFP_THISNODE for THP requests which are requesting the direct reclaim. This effectivelly reverts 5265047ac301 on the grounds that the zone/node reclaim was known to be disruptive due to premature reclaim when there was memory free. While it made sense at the time for HPC workloads without NUMA awareness on rare machines, it was ultimately harmful in the majority of cases. The existing behaviour is similar, if not as widespare as it applies to a corner case but crucially, it cannot be tuned around like zone_reclaim_mode can. The default behaviour should always be to cause the least harm for the common case. If there are specialised use cases out there that want zone_reclaim_mode in specific cases, then it can be built on top. Longterm we should consider a memory policy which allows for the node reclaim like behavior for the specific memory ranges which would allow a [1] http://lkml.kernel.org/r/20180820032204.9591-1-aarcange@redhat.com Mel said: : Both patches look correct to me but I'm responding to this one because : it's the fix. The change makes sense and moves further away from the : severe stalling behaviour we used to see with both THP and zone reclaim : mode. : : I put together a basic experiment with usemem configured to reference a : buffer multiple times that is 80% the size of main memory on a 2-socket : box with symmetric node sizes and defrag set to "always". The defrag : setting is not the default but it would be functionally similar to : accessing a buffer with madvise(MADV_HUGEPAGE). Usemem is configured to : reference the buffer multiple times and while it's not an interesting : workload, it would be expected to complete reasonably quickly as it fits : within memory. The results were; : : usemem : vanilla noreclaim-v1 : Amean Elapsd-1 42.78 ( 0.00%) 26.87 ( 37.18%) : Amean Elapsd-3 27.55 ( 0.00%) 7.44 ( 73.00%) : Amean Elapsd-4 5.72 ( 0.00%) 5.69 ( 0.45%) : : This shows the elapsed time in seconds for 1 thread, 3 threads and 4 : threads referencing buffers 80% the size of memory. With the patches : applied, it's 37.18% faster for the single thread and 73% faster with two : threads. Note that 4 threads showing little difference does not indicate : the problem is related to thread counts. It's simply the case that 4 : threads gets spread so their workload mostly fits in one node. : : The overall view from /proc/vmstats is more startling : : 4.19.0-rc1 4.19.0-rc1 : vanillanoreclaim-v1r1 : Minor Faults 35593425 708164 : Major Faults 484088 36 : Swap Ins 3772837 0 : Swap Outs 3932295 0 : : Massive amounts of swap in/out without the patch : : Direct pages scanned 6013214 0 : Kswapd pages scanned 0 0 : Kswapd pages reclaimed 0 0 : Direct pages reclaimed 4033009 0 : : Lots of reclaim activity without the patch : : Kswapd efficiency 100% 100% : Kswapd velocity 0.000 0.000 : Direct efficiency 67% 100% : Direct velocity 11191.956 0.000 : : Mostly from direct reclaim context as you'd expect without the patch. : : Page writes by reclaim 3932314.000 0.000 : Page writes file 19 0 : Page writes anon 3932295 0 : Page reclaim immediate 42336 0 : : Writes from reclaim context is never good but the patch eliminates it. : : We should never have default behaviour to thrash the system for such a : basic workload. If zone reclaim mode behaviour is ever desired but on a : single task instead of a global basis then the sensible option is to build : a mempolicy that enforces that behaviour. This was a severe regression compared to previous kernels that made important workloads unusable and it starts when __GFP_THISNODE was added to THP allocations under MADV_HUGEPAGE. It is not a significant risk to go to the previous behavior before __GFP_THISNODE was added, it worked like that for years. This was simply an optimization to some lucky workloads that can fit in a single node, but it ended up breaking the VM for others that can't possibly fit in a single node, so going back is safe. [mhocko@suse.com: rewrote the changelog based on the one from Andrea] Link: http://lkml.kernel.org/r/20180925120326.24392-2-mhocko@kernel.org Fixes: 5265047ac301 ("mm, thp: really limit transparent hugepage allocation to local node") Signed-off-by: Andrea Arcangeli Signed-off-by: Michal Hocko Reported-by: Stefan Priebe Debugged-by: Andrea Arcangeli Reported-by: Alex Williamson Reviewed-by: Mel Gorman Tested-by: Mel Gorman Cc: Zi Yan Cc: Vlastimil Babka Cc: David Rientjes Cc: "Kirill A. Shutemov" Cc: [4.1+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit b74b152e94428c7f7e7b2845421d7900873ccefd Author: Wengang Wang Date: Fri Nov 16 15:08:25 2018 -0800 ocfs2: free up write context when direct IO failed commit 5040f8df56fb90c7919f1c9b0b6e54c843437456 upstream. The write context should also be freed even when direct IO failed. Otherwise a memory leak is introduced and entries remain in oi->ip_unwritten_list causing the following BUG later in unlink path: ERROR: bug expression: !list_empty(&oi->ip_unwritten_list) ERROR: Clear inode of 215043, inode has unwritten extents ... Call Trace: ? __set_current_blocked+0x42/0x68 ocfs2_evict_inode+0x91/0x6a0 [ocfs2] ? bit_waitqueue+0x40/0x33 evict+0xdb/0x1af iput+0x1a2/0x1f7 do_unlinkat+0x194/0x28f SyS_unlinkat+0x1b/0x2f do_syscall_64+0x79/0x1ae entry_SYSCALL_64_after_hwframe+0x151/0x0 This patch also logs, with frequency limit, direct IO failures. Link: http://lkml.kernel.org/r/20181102170632.25921-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang Reviewed-by: Junxiao Bi Reviewed-by: Changwei Ge Reviewed-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 8362d09a7ff3686ff5101d522734648a033bfcde Author: Changwei Ge Date: Fri Nov 2 15:48:15 2018 -0700 ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry commit 29aa30167a0a2e6045a0d6d2e89d8168132333d5 upstream. Somehow, file system metadata was corrupted, which causes ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el(). According to the original design intention, if above happens we should skip the problematic block and continue to retrieve dir entry. But there is obviouse misuse of brelse around related code. After failure of ocfs2_check_dir_entry(), current code just moves to next position and uses the problematic buffer head again and again during which the problematic buffer head is released for multiple times. I suppose, this a serious issue which is long-lived in ocfs2. This may cause other file systems which is also used in a the same host insane. So we should also consider about bakcporting this patch into linux -stable. Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.com Signed-off-by: Changwei Ge Suggested-by: Changkuo Shi Reviewed-by: Andrew Morton Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Joseph Qi Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit f458499f2c96196c1b1b983a6eafe74df193588d Author: Marc Zyngier Date: Wed Oct 31 08:41:34 2018 +0000 soc: ti: QMSS: Fix usage of irq_set_affinity_hint commit 832ad0e3da4510fd17f98804abe512ea9a747035 upstream. The Keystone QMSS driver is pretty damaged, in the sense that it does things like this: irq_set_affinity_hint(irq, to_cpumask(&cpu_map)); where cpu_map is a local variable. As we leave the function, this will point to nowhere-land, and things will end-up badly. Instead, let's use a proper cpumask that gets allocated, giving the driver a chance to actually work with things like irqbalance as well as have a hypothetical 64bit future. Cc: stable@vger.kernel.org Acked-by: Santosh Shilimkar Signed-off-by: Marc Zyngier Signed-off-by: Olof Johansson Signed-off-by: Greg Kroah-Hartman commit dd4f21dfb81db8fa82607db8fda76d597b191f36 Author: Christophe Leroy Date: Fri Oct 19 06:54:54 2018 +0000 Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP" commit cc4ebf5c0a3440ed0a32d25c55ebdb6ce5f3c0bc upstream. This reverts commit 4f94b2c7462d9720b2afa7e8e8d4c19446bb31ce. That commit was buggy, as it used rlwinm instead of rlwimi. Instead of fixing that bug, we revert the previous commit in order to reduce the dependency between L1 entries and L2 entries Fixes: 4f94b2c7462d9 ("powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 094e5a111d0e68595f26463413f36525f0e03590 Author: Ming Lei Date: Wed Nov 14 16:25:51 2018 +0800 SCSI: fix queue cleanup race before queue initialization is done commit 8dc765d438f1e42b3e8227b3b09fad7d73f4ec9a upstream. c2856ae2f315d ("blk-mq: quiesce queue before freeing queue") has already fixed this race, however the implied synchronize_rcu() in blk_mq_quiesce_queue() can slow down LUN probe a lot, so caused performance regression. Then 1311326cf4755c7 ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()") tried to quiesce queue for avoiding unnecessary synchronize_rcu() only when queue initialization is done, because it is usual to see lots of inexistent LUNs which need to be probed. However, turns out it isn't safe to quiesce queue only when queue initialization is done. Because when one SCSI command is completed, the user of sending command can be waken up immediately, then the scsi device may be removed, meantime the run queue in scsi_end_request() is still in-progress, so kernel panic can be caused. In Red Hat QE lab, there are several reports about this kind of kernel panic triggered during kernel booting. This patch tries to address the issue by grabing one queue usage counter during freeing one request and the following run queue. Fixes: 1311326cf4755c7 ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()") Cc: Andrew Jones Cc: Bart Van Assche Cc: linux-scsi@vger.kernel.org Cc: Martin K. Petersen Cc: Christoph Hellwig Cc: James E.J. Bottomley Cc: stable Cc: jianchao.wang Signed-off-by: Ming Lei Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit aef93deeff58652cc39edb99dba77ae97c7ede38 Author: Quinn Tran Date: Tue Nov 6 00:51:21 2018 -0800 scsi: qla2xxx: Initialize port speed to avoid setting lower speed commit f635e48e866ee1a47d2d42ce012fdcc07bf55853 upstream. This patch initializes port speed so that firmware does not set lower operating speed. Setting lower speed in firmware impacts WRITE perfomance. Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery") Cc: Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani Tested-by: Laurence Oberman Reviewed-by: Ewan D. Milne Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 3fb039d37f96b2b62dbe6115658a17a57156dcf3 Author: Greg Edwards Date: Wed Aug 22 13:21:53 2018 -0600 vhost/scsi: truncate T10 PI iov_iter to prot_bytes commit 4542d623c7134bc1738f8a68ccb6dd546f1c264f upstream. Commands with protection information included were not truncating the protection iov_iter to the number of protection bytes in the command. This resulted in vhost_scsi mis-calculating the size of the protection SGL in vhost_scsi_calc_sgls(), and including both the protection and data SG entries in the protection SGL. Fixes: 09b13fa8c1a1 ("vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq") Signed-off-by: Greg Edwards Signed-off-by: Michael S. Tsirkin Fixes: 09b13fa8c1a1093e9458549ac8bb203a7c65c62a Cc: stable@vger.kernel.org Reviewed-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 543c52975eb8d946678840fdb7a590db3c04a6a5 Author: Gustavo A. R. Silva Date: Wed Jul 25 19:47:19 2018 -0500 reset: hisilicon: fix potential NULL pointer dereference commit e9a2310fb689151166df7fd9971093362d34bd79 upstream. There is a potential execution path in which function platform_get_resource() returns NULL. If this happens, we will end up having a NULL pointer dereference. Fix this by replacing devm_ioremap with devm_ioremap_resource, which has the NULL check and the memory region request. This code was detected with the help of Coccinelle. Cc: stable@vger.kernel.org Fixes: 97b7129cd2af ("reset: hisilicon: change the definition of hisi_reset_init") Signed-off-by: Gustavo A. R. Silva Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit 84d1d9cc1c0c61a8b8d254c23e31a91fbbb40698 Author: Dan Williams Date: Thu Nov 1 00:30:22 2018 -0700 acpi, nfit: Fix ARS overflow continuation commit 3fa58dcab50a0aa16817f16a8d38aee869eb3fb9 upstream. When the platform BIOS is unable to report all the media error records it requires the OS to restart the scrub at a prescribed location. The driver detects the overflow condition, but then fails to report it to the ARS state machine after reaping the records. Propagate -ENOSPC correctly to continue the ARS operation. Cc: Fixes: 1cf03c00e7c1 ("nfit: scrub and register regions in a workqueue") Reported-by: Jacek Zloch Reviewed-by: Dave Jiang Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman commit 13bb0de8e25c9f2fa846ee92c1f8a736a54423aa Author: Vishal Verma Date: Thu Oct 25 18:37:29 2018 -0600 acpi/nfit, x86/mce: Validate a MCE's address before using it commit e8a308e5f47e545e0d41d0686c00f5f5217c5f61 upstream. The NFIT machine check handler uses the physical address from the mce structure, and compares it against information in the ACPI NFIT table to determine whether that location lies on an NVDIMM. The mce->addr field however may not always be valid, and this is indicated by the MCI_STATUS_ADDRV bit in the status field. Export mce_usable_address() which already performs validation for the address, and use it in the NFIT handler. Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error") Reported-by: Robert Elliott Signed-off-by: Vishal Verma Signed-off-by: Borislav Petkov CC: Arnd Bergmann Cc: Dan Williams CC: Dave Jiang CC: elliott@hpe.com CC: "H. Peter Anvin" CC: Ingo Molnar CC: Len Brown CC: linux-acpi@vger.kernel.org CC: linux-edac CC: linux-nvdimm@lists.01.org CC: Qiuxu Zhuo CC: "Rafael J. Wysocki" CC: Ross Zwisler CC: stable CC: Thomas Gleixner CC: Tony Luck CC: x86-ml CC: Yazen Ghannam Link: http://lkml.kernel.org/r/20181026003729.8420-2-vishal.l.verma@intel.com Signed-off-by: Greg Kroah-Hartman commit cdd219834f93c2e83bf8045b212332e221727bda Author: Vishal Verma Date: Thu Oct 25 18:37:28 2018 -0600 acpi/nfit, x86/mce: Handle only uncorrectable machine checks commit 5d96c9342c23ee1d084802dcf064caa67ecaa45b upstream. The MCE handler for nfit devices is called for memory errors on a Non-Volatile DIMM and adds the error location to a 'badblocks' list. This list is used by the various NVDIMM drivers to avoid consuming known poison locations during IO. The MCE handler gets called for both corrected and uncorrectable errors. Until now, both kinds of errors have been added to the badblocks list. However, corrected memory errors indicate that the problem has already been fixed by hardware, and the resulting interrupt is merely a notification to Linux. As far as future accesses to that location are concerned, it is perfectly fine to use, and thus doesn't need to be included in the above badblocks list. Add a check in the nfit MCE handler to filter out corrected mce events, and only process uncorrectable errors. Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error") Reported-by: Omar Avelar Signed-off-by: Vishal Verma Signed-off-by: Borislav Petkov CC: Arnd Bergmann CC: Dan Williams CC: Dave Jiang CC: elliott@hpe.com CC: "H. Peter Anvin" CC: Ingo Molnar CC: Len Brown CC: linux-acpi@vger.kernel.org CC: linux-edac CC: linux-nvdimm@lists.01.org CC: Qiuxu Zhuo CC: "Rafael J. Wysocki" CC: Ross Zwisler CC: stable CC: Thomas Gleixner CC: Tony Luck CC: x86-ml CC: Yazen Ghannam Link: http://lkml.kernel.org/r/20181026003729.8420-1-vishal.l.verma@intel.com Signed-off-by: Greg Kroah-Hartman commit 69df28065f304fadb86f1f85128f9c56af19c5c2 Author: Mikulas Patocka Date: Mon Oct 8 12:57:35 2018 +0200 mach64: fix image corruption due to reading accelerator registers commit c09bcc91bb94ed91f1391bffcbe294963d605732 upstream. Reading the registers without waiting for engine idle returns unpredictable values. These unpredictable values result in display corruption - if atyfb_imageblit reads the content of DP_PIX_WIDTH with the bit DP_HOST_TRIPLE_EN set (from previous invocation), the driver would never ever clear the bit, resulting in display corruption. We don't want to wait for idle because it would degrade performance, so this patch modifies the driver so that it never reads accelerator registers. HOST_CNTL doesn't have to be read, we can just write it with HOST_BYTE_ALIGN because no other part of the driver cares if HOST_BYTE_ALIGN is set. DP_PIX_WIDTH is written in the functions atyfb_copyarea and atyfb_fillrect with the default value and in atyfb_imageblit with the value set according to the source image data. Signed-off-by: Mikulas Patocka Reviewed-by: Ville Syrjälä Cc: stable@vger.kernel.org Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Greg Kroah-Hartman commit 7f6c07f5402e27261431f473052a2b7d29c8138c Author: Mikulas Patocka Date: Mon Oct 8 12:57:34 2018 +0200 mach64: fix display corruption on big endian machines commit 3c6c6a7878d00a3ac997a779c5b9861ff25dfcc8 upstream. The code for manual bit triple is not endian-clean. It builds the variable "hostdword" using byte accesses, therefore we must read the variable with "le32_to_cpu". The patch also enables (hardware or software) bit triple only if the image is monochrome (image->depth). If we want to blit full-color image, we shouldn't use the triple code. Signed-off-by: Mikulas Patocka Reviewed-by: Ville Syrjälä Cc: stable@vger.kernel.org Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Greg Kroah-Hartman commit c5a92417978308ac8a551d07f99fb92c213756de Author: Dmitry Osipenko Date: Mon Aug 13 20:14:00 2018 +0300 thermal: core: Fix use-after-free in thermal_cooling_device_destroy_sysfs commit 3c587768271e9c20276522025729e4ebca51583b upstream. This patch fixes use-after-free that was detected by KASAN. The bug is triggered on a CPUFreq driver module unload by freeing 'cdev' on device unregister and then using the freed structure during of the cdev's sysfs data destruction. The solution is to unregister the sysfs at first, then destroy sysfs data and finally release the cooling device. Cc: # v4.17+ Fixes: 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs") Signed-off-by: Dmitry Osipenko Acked-by: Viresh Kumar Acked-by: Eduardo Valentin Signed-off-by: Zhang Rui Signed-off-by: Greg Kroah-Hartman commit bb34fbacd90ca34cc2f73844fbad5ebcd4a11a85 Author: Yan, Zheng Date: Thu Sep 27 21:16:05 2018 +0800 Revert "ceph: fix dentry leak in splice_dentry()" commit efe328230dc01aa0b1269aad0b5fae73eea4677a upstream. This reverts commit 8b8f53af1ed9df88a4c0fbfdf3db58f62060edf3. splice_dentry() is used by three places. For two places, req->r_dentry is passed to splice_dentry(). In the case of error, req->r_dentry does not get updated. So splice_dentry() should not drop reference. Cc: stable@vger.kernel.org # 4.18+ Signed-off-by: "Yan, Zheng" Signed-off-by: Ilya Dryomov Signed-off-by: Greg Kroah-Hartman commit e5d8d13800ca3f3008fc7c144ab3daad557b3fcd Author: Ilya Dryomov Date: Wed Sep 26 18:03:16 2018 +0200 libceph: bump CEPH_MSG_MAX_DATA_LEN commit 94e6992bb560be8bffb47f287194adf070b57695 upstream. If the read is large enough, we end up spinning in the messenger: libceph: osd0 192.168.122.1:6801 io error libceph: osd0 192.168.122.1:6801 io error libceph: osd0 192.168.122.1:6801 io error This is a receive side limit, so only reads were affected. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov Signed-off-by: Greg Kroah-Hartman commit 1189a221680017f7ffec6165085e66185f9b2581 Author: Enric Balletbo i Serra Date: Tue Oct 16 15:41:44 2018 +0200 clk: rockchip: Fix static checker warning in rockchip_ddrclk_get_parent call commit 665636b2940d0897c4130253467f5e8c42eea392 upstream. Fixes the signedness bug returning '(-22)' on the return type by removing the sanity checker in rockchip_ddrclk_get_parent(). The function should return and unsigned value only and it's safe to remove the sanity checker as the core functions that call get_parent like clk_core_get_parent_by_index already ensures the validity of the clk index returned (index >= core->num_parents). Fixes: a4f182bf81f18 ("clk: rockchip: add new clock-type for the ddrclk") Cc: stable@vger.kernel.org Signed-off-by: Enric Balletbo i Serra Reviewed-by: Stephen Boyd Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 91a17b82ae41023c5160a5bd4e650868932e61f8 Author: Ziyuan Xu Date: Thu Oct 11 15:26:43 2018 +0800 clk: rockchip: fix wrong mmc sample phase shift for rk3328 commit 82f4b67f018c88a7cc9337f0067ed3d6ec352648 upstream. mmc sample shift is 0 for RK3328 referring to the TRM. So fix them. Fixes: fe3511ad8a1c ("clk: rockchip: add clock controller for rk3328") Cc: stable@vger.kernel.org Signed-off-by: Ziyuan Xu Signed-off-by: Shawn Lin Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 7b659a3e511681476b7a5233abcff1d7d3d0c7dc Author: Icenowy Zheng Date: Thu Aug 9 01:19:52 2018 +0800 clk: sunxi-ng: h6: fix bus clocks' divider position commit 2852bfbf4f168fec27049ad9ed20941fc9e84b95 upstream. The bus clocks (AHB/APB) on Allwinner H6 have their second divider start at bit 8, according to the user manual and the BSP code. However, currently the divider offset is incorrectly set to 16, thus the divider is not correctly read and the clock frequency is not correctly calculated. Fix this bit offset on all affected bus clocks in ccu-sun50i-h6. Cc: stable@vger.kernel.org # v4.17.y Signed-off-by: Icenowy Zheng Signed-off-by: Chen-Yu Tsai Signed-off-by: Greg Kroah-Hartman commit 3a6f7116bf575da89a630a45ef9b2591d0b65a0b Author: Ronald Wahl Date: Wed Oct 10 15:54:54 2018 +0200 clk: at91: Fix division by zero in PLL recalc_rate() commit 0f5cb0e6225cae2f029944cb8c74617aab6ddd49 upstream. Commit a982e45dc150 ("clk: at91: PLL recalc_rate() now using cached MUL and DIV values") removed a check that prevents a division by zero. This now causes a stacktrace when booting the kernel on a at91 platform if the PLL DIV register contains zero. This commit reintroduces this check. Fixes: a982e45dc150 ("clk: at91: PLL recalc_rate() now using cached...") Cc: Signed-off-by: Ronald Wahl Acked-by: Ludovic Desroches Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit b4972d19dbf84eba123a4fcc3c56afb76a123adb Author: Krzysztof Kozlowski Date: Wed Aug 29 21:20:10 2018 +0200 clk: s2mps11: Fix matching when built as module and DT node contains compatible commit 8985167ecf57f97061599a155bb9652c84ea4913 upstream. When driver is built as module and DT node contains clocks compatible (e.g. "samsung,s2mps11-clk"), the module will not be autoloaded because module aliases won't match. The modalias from uevent: of:NclocksTCsamsung,s2mps11-clk The modalias from driver: platform:s2mps11-clk The devices are instantiated by parent's MFD. However both Device Tree bindings and parent define the compatible for clocks devices. In case of module matching this DT compatible will be used. The issue will not happen if this is a built-in (no need for module matching) or when clocks DT node does not contain compatible (not correct from bindings perspective but working for driver). Note when backporting to stable kernels: adjust the list of device ID entries. Cc: Fixes: 53c31b3437a6 ("mfd: sec-core: Add of_compatible strings for clock MFD cells") Signed-off-by: Krzysztof Kozlowski Acked-by: Stephen Boyd Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit 7b8b07a696c2ec670dfc755ac3199a045d4fec13 Author: Richard Weinberger Date: Fri Jun 15 16:42:54 2018 +0200 um: Drop own definition of PTRACE_SYSEMU/_SINGLESTEP commit 0676b957c24bfb6e495449ba7b7e72c5b5d79233 upstream. 32bit UML used to define PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP own its own because many years ago not all libcs had these request codes in their UAPI. These days PTRACE_SYSEMU/_SINGLESTEP is well known and part of glibc and our own define becomes problematic. With change c48831d0eebf ("linux/x86: sync sys/ptrace.h with Linux 4.14 [BZ #22433]") glibc turned PTRACE_SYSEMU/_SINGLESTEP into a enum and UML failed to build. Let's drop our define and rely on the fact that every libc has PTRACE_SYSEMU/_SINGLESTEP. Cc: Cc: Ritesh Raj Sarraf Reported-and-tested-by: Ritesh Raj Sarraf Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit 8e81ecdcbaafb40ffd246c4d2542e6eaffb0ac35 Author: Max Filippov Date: Tue Nov 13 23:46:42 2018 -0800 xtensa: fix boot parameters address translation commit 40dc948f234b73497c3278875eb08a01d5854d3f upstream. The bootloader may pass physical address of the boot parameters structure to the MMUv3 kernel in the register a2. Code in the _SetupMMU block in the arch/xtensa/kernel/head.S is supposed to map that physical address to the virtual address in the configured virtual memory layout. This code haven't been updated when additional 256+256 and 512+512 memory layouts were introduced and it may produce wrong addresses when used with these layouts. Cc: stable@vger.kernel.org Signed-off-by: Max Filippov Signed-off-by: Greg Kroah-Hartman commit b9474cabc3c463de37eb9a67999c989887e1a09f Author: Max Filippov Date: Sun Nov 4 01:46:00 2018 -0700 xtensa: make sure bFLT stack is 16 byte aligned commit 0773495b1f5f1c5e23551843f87b5ff37e7af8f7 upstream. Xtensa ABI requires stack alignment to be at least 16. In noMMU configuration ARCH_SLAB_MINALIGN is used to align stack. Make it at least 16. This fixes the following runtime error in noMMU configuration, caused by interaction between insufficiently aligned stack and alloca function, that results in corruption of on-stack variable in the libc function glob: Caught unhandled exception in 'sh' (pid = 47, pc = 0x02d05d65) - should not happen EXCCAUSE is 15 Cc: stable@vger.kernel.org Signed-off-by: Max Filippov Signed-off-by: Greg Kroah-Hartman commit 8501e03baaf2d41b27663dbc3b507f9990eb49fd Author: Max Filippov Date: Mon Oct 29 18:30:13 2018 -0700 xtensa: add NOTES section to the linker script commit 4119ba211bc4f1bf638f41e50b7a0f329f58aa16 upstream. This section collects all source .note.* sections together in the vmlinux image. Without it .note.Linux section may be placed at address 0, while the rest of the kernel is at its normal address, resulting in a huge vmlinux.bin image that may not be linked into the xtensa Image.elf. Cc: stable@vger.kernel.org Signed-off-by: Max Filippov Signed-off-by: Greg Kroah-Hartman commit 454dd3e93e3895460fb6ba1d35cbc724567597f4 Author: Huacai Chen Date: Wed Sep 5 17:33:09 2018 +0800 MIPS: Loongson-3: Fix BRIDGE irq delivery problem [ Upstream commit 360fe725f8849aaddc53475fef5d4a0c439b05ae ] After commit e509bd7da149dc349160 ("genirq: Allow migration of chained interrupts by installing default action") Loongson-3 fails at here: setup_irq(LOONGSON_HT1_IRQ, &cascade_irqaction); This is because both chained_action and cascade_irqaction don't have IRQF_SHARED flag. This will cause Loongson-3 resume fails because HPET timer interrupt can't be delivered during S3. So we set the irqchip of the chained irq to loongson_irq_chip which doesn't disable the chained irq in CP0.Status. Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen Signed-off-by: Paul Burton Patchwork: https://patchwork.linux-mips.org/patch/20434/ Cc: Ralf Baechle Cc: James Hogan Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang Cc: Zhangjin Wu Cc: Huacai Chen Signed-off-by: Sasha Levin commit 099ef76d3dd95ff7fa935fb35d77448815e375f4 Author: Huacai Chen Date: Wed Sep 5 17:33:08 2018 +0800 MIPS: Loongson-3: Fix CPU UART irq delivery problem [ Upstream commit d06f8a2f1befb5a3d0aa660ab1c05e9b744456ea ] Masking/unmasking the CPU UART irq in CP0_Status (and redirecting it to other CPUs) may cause interrupts be lost, especially in multi-package machines (Package-0's UART irq cannot be delivered to others). So make mask_loongson_irq() and unmask_loongson_irq() be no-ops. The original problem (UART IRQ may deliver to any core) is also because of masking/unmasking the CPU UART irq in CP0_Status. So it is safe to remove all of the stuff. Signed-off-by: Huacai Chen Signed-off-by: Paul Burton Patchwork: https://patchwork.linux-mips.org/patch/20433/ Cc: Ralf Baechle Cc: James Hogan Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang Cc: Zhangjin Wu Cc: Huacai Chen Signed-off-by: Sasha Levin commit 22f185cf0272f69ef8c8b0bfafe2dc7a11aa6d7b Author: Minchan Kim Date: Wed Nov 14 14:52:23 2018 +0900 zram: close udev startup race condition as default groups commit fef912bf860e upstream. commit 98af4d4df889 upstream. I got a report from Howard Chen that he saw zram and sysfs race(ie, zram block device file is created but sysfs for it isn't yet) when he tried to create new zram devices via hotadd knob. v4.20 kernel fixes it by [1, 2] but it's too large size to merge into -stable so this patch fixes the problem by registering defualt group by Greg KH's approach[3]. This patch should be applied to every stable tree [3.16+] currently existing from kernel.org because the problem was introduced at 2.6.37 by [4]. [1] fef912bf860e, block: genhd: add 'groups' argument to device_add_disk [2] 98af4d4df889, zram: register default groups with device_add_disk() [3] http://kroah.com/log/blog/2013/06/26/how-to-create-a-sysfs-file-correctly/ [4] 33863c21e69e9, Staging: zram: Replace ioctls with sysfs interface Cc: Sergey Senozhatsky Cc: Hannes Reinecke Tested-by: Howard Chen Signed-off-by: Minchan Kim Reviewed-by: Hannes Reinecke Signed-off-by: Sasha Levin commit b2405b2330d30927260d92caf76e61eed7fa0b78 Author: Jerome Brunet Date: Thu Nov 8 10:31:23 2018 +0100 clk: meson: axg: mark fdiv2 and fdiv3 as critical [ Upstream commit d6ee1e7e9004d3d246cdfa14196989e0a9466c16 ] Similar to gxbb and gxl platforms, axg SCPI Cortex-M co-processor uses the fdiv2 and fdiv3 to, among other things, provide the cpu clock. Until clock hand-off mechanism makes its way to CCF and the generic SCPI claims platform specific clocks, these clocks must be marked as critical to make sure they are never disabled when needed by the co-processor. Fixes: 05f814402d61 ("clk: meson: add fdiv clock gates") Signed-off-by: Jerome Brunet Acked-by: Neil Armstrong Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin commit b7dcf0d3a8f17826c2b06a7a2189a8163b9c46d0 Author: Christian Hewitt Date: Tue Nov 6 00:08:20 2018 +0100 clk: meson-gxbb: set fclk_div3 as CLK_IS_CRITICAL [ Upstream commit e2576c8bdfd462c34b8a46c0084e7c30b0851bf4 ] On the Khadas VIM2 (GXM) and LePotato (GXL) board there are problems with reboot; e.g. a ~60 second delay between issuing reboot and the board power cycling (and in some OS configurations reboot will fail and require manual power cycling). Similar to 'commit c987ac6f1f088663b6dad39281071aeb31d450a8 ("clk: meson-gxbb: set fclk_div2 as CLK_IS_CRITICAL")' the SCPI Cortex-M4 Co-Processor seems to depend on FCLK_DIV3 being operational. Until commit 05f814402d6174369b3b29832cbb5eb5ed287059 ("clk: meson: add fdiv clock gates"), this clock was modeled and left on by the bootloader. We don't have precise documentation about the SCPI Co-Processor and its clock requirement so we are learning things the hard way. Marking this clock as critical solves the problem but it should not be viewed as final solution. Ideally, the SCPI driver should claim these clocks. We also depends on some clock hand-off mechanism making its way to CCF, to make sure the clock stays on between its registration and the SCPI driver probe. Fixes: 05f814402d61 ("clk: meson: add fdiv clock gates") Signed-off-by: Christian Hewitt Signed-off-by: Jerome Brunet Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin commit 74cd739ed9876882add2edd699dd466e589ab714 Author: Aaro Koskinen Date: Mon Nov 12 14:50:22 2018 -0600 arm64: dts: stratix10: fix multicast filtering commit fd5ba6ee3187617287fb9cb187e3d6b3631210a3 upstream On Stratix 10, the EMAC has 256 hash buckets for multicast filtering. This needs to be specified in DTS, otherwise the stmmac driver defaults to 64 buckets and initializes the filter incorrectly. As a result, e.g. valid IPv6 multicast traffic ends up being dropped. Fixes: 78cd6a9d8e15 ("arm64: dts: Add base stratix 10 dtsi") Cc: stable@vger.kernel.org Signed-off-by: Aaro Koskinen Signed-off-by: Dinh Nguyen Signed-off-by: Sasha Levin commit f0ef4cf3d407e35d4d6884df72f0e92568670bc0 Author: Thor Thayer Date: Mon Nov 12 14:50:21 2018 -0600 arm64: dts: stratix10: Support Ethernet Jumbo frame commit a27460c9768ee19949c5b91f3d959ccd88c2a64a upstream Properly specify the RX and TX FIFO size which is important for Jumbo frames. Update the max-frame-size to support Jumbo frames. Signed-off-by: Thor Thayer Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f8c835816ad46ef72cd937cdff51107bed5d3d5b Author: Quinn Tran Date: Wed Sep 26 22:05:11 2018 -0700 scsi: qla2xxx: Fix NVMe session hang on unload commit f7d61c995df74d6bb57bbff6a2b7b1874c4a2baa upstream. Send aborts only when chip is active. Fixes: 623ee824e579 ("scsi: qla2xxx: Fix FC-NVMe IO abort during driver reset") Cc: # 4.14 Signed-off-by: Quinn Tran Reviewed-by: Ewan D. Milne Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 318cb27e7fa033d06311fdff7fc3ec89eae4c687 Author: Quinn Tran Date: Wed Sep 26 22:05:14 2018 -0700 scsi: qla2xxx: Fix re-using LoopID when handle is in use commit 5c6400536481d9ef44ef94e7bf2c7b8e81534db7 upstream. This patch fixes issue where driver clears NPort ID map instead of marking handle in use. Once driver clears NPort ID from the database, it can reuse the same NPort ID resulting in a PLOGI failure. [mkp: fixed Himanshu's SoB] Fixes: a084fd68e1d2 ("scsi: qla2xxx: Fix re-login for Nport Handle in use") Cc: Signed-of-by: Quinn Tran Reviewed-by: Ewan D. Milne Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit aa67028ac42275edd65ed32c798b50249e11012f Author: Amir Goldstein Date: Thu Oct 18 09:45:49 2018 +0300 ovl: fix recursive oi->lock in ovl_link() commit 6cd078702f2f33cb6b19a682de3e9184112f1a46 upstream. linking a non-copied-up file into a non-copied-up parent results in a nested call to mutex_lock_interruptible(&oi->lock). Fix this by copying up target parent before ovl_nlink_start(), same as done in ovl_rename(). ~/unionmount-testsuite$ ./run --ov -s ~/unionmount-testsuite$ ln /mnt/a/foo100 /mnt/a/dir100/ WARNING: possible recursive locking detected -------------------------------------------- ln/1545 is trying to acquire lock: 00000000bcce7c4c (&ovl_i_lock_key[depth]){+.+.}, at: ovl_copy_up_start+0x28/0x7d but task is already holding lock: 0000000026d73d5b (&ovl_i_lock_key[depth]){+.+.}, at: ovl_nlink_start+0x3c/0xc1 [SzM: this seems to be a false positive, but doing the copy-up first is harmless and removes the lockdep splat] Reported-by: syzbot+3ef5c0d1a5cb0b21e6be@syzkaller.appspotmail.com Fixes: 5f8415d6b87e ("ovl: persistent overlay inode nlink for...") Cc: # v4.13 Signed-off-by: Amir Goldstein Signed-off-by: Miklos Szeredi [amir: backport to v4.18] Signed-off-by: Amir Goldstein Signed-off-by: Greg Kroah-Hartman commit 1b8ca12ee8afb4907b6b0a4b49e651c345f28aab Author: Miklos Szeredi Date: Fri Sep 28 16:43:22 2018 +0200 fuse: set FR_SENT while locked commit 4c316f2f3ff315cb48efb7435621e5bfb81df96d upstream. Otherwise fuse_dev_do_write() could come in and finish off the request, and the set_bit(FR_SENT, ...) could trigger the WARN_ON(test_bit(FR_SENT, ...)) in request_end(). Signed-off-by: Miklos Szeredi Reported-by: syzbot+ef054c4d3f64cd7f7cec@syzkaller.appspotmai Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts") Cc: # v4.2 Signed-off-by: Greg Kroah-Hartman commit 10b6b5d193b61c7212ac0a54767dc35276cc9e4d Author: Miklos Szeredi Date: Fri Sep 28 16:43:22 2018 +0200 fuse: fix blocked_waitq wakeup commit 908a572b80f6e9577b45e81b3dfe2e22111286b8 upstream. Using waitqueue_active() is racy. Make sure we issue a wake_up() unconditionally after storing into fc->blocked. After that it's okay to optimize with waitqueue_active() since the first wake up provides the necessary barrier for all waiters, not the just the woken one. Signed-off-by: Miklos Szeredi Fixes: 3c18ef8117f0 ("fuse: optimize wake_up") Cc: # v3.10 Signed-off-by: Greg Kroah-Hartman commit 8b71920c90c3e4a5e06125df1a69d4dc0bc3827d Author: Kirill Tkhai Date: Tue Sep 25 12:52:42 2018 +0300 fuse: Fix use-after-free in fuse_dev_do_write() commit d2d2d4fb1f54eff0f3faa9762d84f6446a4bc5d0 upstream. After we found req in request_find() and released the lock, everything may happen with the req in parallel: cpu0 cpu1 fuse_dev_do_write() fuse_dev_do_write() req = request_find(fpq, ...) ... spin_unlock(&fpq->lock) ... ... req = request_find(fpq, oh.unique) ... spin_unlock(&fpq->lock) queue_interrupt(&fc->iq, req); ... ... ... ... ... request_end(fc, req); fuse_put_request(fc, req); ... queue_interrupt(&fc->iq, req); Signed-off-by: Kirill Tkhai Signed-off-by: Miklos Szeredi Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts") Cc: # v4.2 Signed-off-by: Greg Kroah-Hartman commit f7e709c59e587b996d1ab9d9f2957fc24f613d1b Author: Kirill Tkhai Date: Tue Sep 25 12:28:55 2018 +0300 fuse: Fix use-after-free in fuse_dev_do_read() commit bc78abbd55dd28e2287ec6d6502b842321a17c87 upstream. We may pick freed req in this way: [cpu0] [cpu1] fuse_dev_do_read() fuse_dev_do_write() list_move_tail(&req->list, ...); ... spin_unlock(&fpq->lock); ... ... request_end(fc, req); ... fuse_put_request(fc, req); if (test_bit(FR_INTERRUPTED, ...)) queue_interrupt(fiq, req); Fix that by keeping req alive until we finish all manipulations. Reported-by: syzbot+4e975615ca01f2277bdd@syzkaller.appspotmail.com Signed-off-by: Kirill Tkhai Signed-off-by: Miklos Szeredi Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts") Cc: # v4.2 Signed-off-by: Greg Kroah-Hartman commit 38d3f7b2e20f5401f37d93c6003500702cff39c8 Author: Himanshu Madhani Date: Wed Sep 26 22:05:15 2018 -0700 scsi: qla2xxx: Fix driver hang when FC-NVMe LUNs are configured commit 39553065f77c297239308470ee313841f4e07db4 upstream. This patch fixes multiple call for qla_nvme_unregister_remote_port() as part of qlt_schedule_session_for_deletion(), Do not call it again during qla_nvme_delete() Fixes: e473b3074104 ("scsi: qla2xxx: Add FC-NVMe abort processing") Cc: Reviewed-by: Ewan D. Milne Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit f05744c0277c9933e96832ac6c2d1cd588fd7dc2 Author: Quinn Tran Date: Wed Sep 26 22:05:13 2018 -0700 scsi: qla2xxx: Fix duplicate switch database entries commit 732ee9a912cf2d9a50c5f9c4213cdc2f885d6aa6 upstream. The response data buffer used in switch scan is reused 4 times. (For example, for commands GPN_FT, GNN_FT for FCP and FC-NVME) Before driver reuses this buffer, clear it to prevent duplicate entries in our database. Fixes: a4239945b8ad1 ("scsi: qla2xxx: Add switch command to simplify fabric discovery" Cc: Signed-off-by: Quinn Tran Reviewed-by: Ewan D. Milne Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 792b70b674bf708bc8cf8e5e76936d8bf6cd2d5a Author: Quinn Tran Date: Tue Sep 11 10:18:21 2018 -0700 scsi: qla2xxx: shutdown chip if reset fail commit 1e4ac5d6fe0a4af17e4b6251b884485832bf75a3 upstream. If chip unable to fully initialize, use full shutdown sequence to clear out any stale FW state. Fixes: e315cd28b9ef ("[SCSI] qla2xxx: Code changes for qla data structure refactoring") Cc: stable@vger.kernel.org #4.10 Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 187dc52ddacb5de570dd7a27649959fd38a8ae2d Author: Quinn Tran Date: Tue Sep 11 10:18:24 2018 -0700 scsi: qla2xxx: Remove stale debug trace message from tcm_qla2xxx commit 7c388f91ec1a59b0ed815b07b90536e2d57e1e1f upstream. Remove stale debug trace. Fixes: 1eb42f965ced ("qla2xxx: Make trace flags more readable") Cc: stable@vger.kernel.org #4.10 Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 008bab2eb77dd7e95a2fd73c66f4f10b0f263727 Author: Quinn Tran Date: Fri Aug 31 11:24:26 2018 -0700 scsi: qla2xxx: Fix process response queue for ISP26XX and above commit b86ac8fd4b2f6ec2f9ca9194c56eac12d620096f upstream. This patch improves performance for 16G and above adapter by removing additional call to process_response_queue(). [mkp: typo] Cc: Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 7e9178071fb1af837f6cfd38ceddb8312c00c2cb Author: Himanshu Madhani Date: Fri Aug 31 11:24:27 2018 -0700 scsi: qla2xxx: Fix incorrect port speed being set for FC adapters commit 4c1458df9635c7e3ced155f594d2e7dfd7254e21 upstream. Fixes: 6246b8a1d26c7c ("[SCSI] qla2xxx: Enhancements to support ISP83xx.") Fixes: 1bb395485160d2 ("qla2xxx: Correct iiDMA-update calling conventions.") Cc: Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit c1d44176f2c3a9b97f8da17ace4534c56657f77a Author: Yoshihiro Shimoda Date: Tue Oct 30 15:13:35 2018 +0900 serial: sh-sci: Fix could not remove dev_attr_rx_fifo_timeout commit 641a41dbba217ee5bd26abe6be77f8cead9cd00e upstream. This patch fixes an issue that the sci_remove() could not remove dev_attr_rx_fifo_timeout because uart_remove_one_port() set the port->port.type to PORT_UNKNOWN. Reported-by: Hiromitsu Yamasaki Fixes: 5d23188a473d ("serial: sh-sci: make RX FIFO parameters tunable via sysfs") Cc: # v4.11+ Signed-off-by: Yoshihiro Shimoda Reviewed-by: Ulrich Hecht Signed-off-by: Greg Kroah-Hartman commit 6fcbb25da516421694c80404a496dec2b2ec914b Author: Miklos Szeredi Date: Wed Oct 31 12:15:23 2018 +0100 ovl: check whiteout in ovl_create_over_whiteout() commit 5e1275808630ea3b2c97c776f40e475017535f72 upstream. Kaixuxia repors that it's possible to crash overlayfs by removing the whiteout on the upper layer before creating a directory over it. This is a reproducer: mkdir lower upper work merge touch lower/file mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merge rm merge/file ls -al merge/file rm upper/file ls -al merge/ mkdir merge/file Before commencing with a vfs_rename(..., RENAME_EXCHANGE) verify that the lookup of "upper" is positive and is a whiteout, and return ESTALE otherwise. Reported by: kaixuxia Signed-off-by: Miklos Szeredi Fixes: e9be9d5e76e3 ("overlay filesystem") Cc: # v3.18 Signed-off-by: Greg Kroah-Hartman commit 6915a20df5f87d14b80f967d233d0e56eb693abb Author: Amir Goldstein Date: Wed Oct 10 19:10:06 2018 +0300 ovl: fix error handling in ovl_verify_set_fh() commit babf4770be0adc69e6d2de150f4040f175e24beb upstream. We hit a BUG on kfree of an ERR_PTR()... Reported-by: syzbot+ff03fe05c717b82502d0@syzkaller.appspotmail.com Fixes: 8b88a2e64036 ("ovl: verify upper root dir matches lower root dir") Cc: # v4.13 Signed-off-by: Amir Goldstein Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit 1dffd49a92ddaba101375e3157f90b5510d3cb46 Author: Young_X Date: Wed Oct 3 12:54:29 2018 +0000 cdrom: fix improper type cast, which can leat to information leak. commit e4f3aa2e1e67bb48dfbaaf1cad59013d5a5bc276 upstream. There is another cast from unsigned long to int which causes a bounds check to fail with specially crafted input. The value is then used as an index in the slot array in cdrom_slot_status(). This issue is similar to CVE-2018-16658 and CVE-2018-10940. Signed-off-by: Young_X Signed-off-by: Jens Axboe Cc: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 09901a24c215b686d8afc5603003bbf5d5fa0ed0 Author: Dominique Martinet Date: Tue Aug 28 07:32:35 2018 +0900 9p: clear dangling pointers in p9stat_free [ Upstream commit 62e3941776fea8678bb8120607039410b1b61a65 ] p9stat_free is more of a cleanup function than a 'free' function as it only frees the content of the struct; there are chances of use-after-free if it is improperly used (e.g. p9stat_free called twice as it used to be possible to) Clearing dangling pointers makes the function idempotent and safer to use. Link: http://lkml.kernel.org/r/1535410108-20650-2-git-send-email-asmadeus@codewreck.org Signed-off-by: Dominique Martinet Reported-by: syzbot+d4252148d198410b864f@syzkaller.appspotmail.com Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit ab5b8371449b678545b359d654e6e227300ff114 Author: Sébastien Szymanski Date: Wed Aug 22 13:38:03 2018 +0200 ARM: dts: imx6ull: keep IMX6UL_ prefix for signals on both i.MX6UL and i.MX6ULL [ Upstream commit 31edaa6e7fd8143085a6a60c564447c07e76ed9f ] Signals available on both i.MX6UL and i.MX6ULL should have the same name because it is the case of all others common signals, it avoids to make mistakes (use the wrong ones) and it makes writing device tree files less complicated. For example: imx6ul-imx6ull-board.dtsi: ... pinctrl_uart5: uart5grp { fsl,pins = < MX6UL_PAD_UART5_TX_DATA__UART5_DCE_TX 0x1b0b1 MX6UL_PAD_UART5_RX_DATA__UART5_DCE_RX 0x1b0b1 >; }; imx6ul-board.dts: #include #include ... imx6ull-board.dts: #include #include ... Without this patch, the imx6ull-board.dtb will use MX6UL_PAD_UART5_RX_DATA__UART5_DCE_RX instead of MX6ULL_PAD_UART5_RX_DATA__UART5_DCE_RX and the uart5 will be misconfigured. Signed-off-by: Sébastien Szymanski Reviewed-by: Fabio Estevam Acked-by: Rob Herring Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 8515b9edf7a0c9dcf1ad218ccc783700db217336 Author: Jan Kara Date: Thu Sep 6 15:56:10 2018 +0200 udf: Prevent write-unsupported filesystem to be remounted read-write [ Upstream commit a9ad01bc759df79b0012f43ee52164391e31cd96 ] There are certain filesystem features which we support for reading but not for writing. We properly refuse to mount such filesystems read-write however for some features (such as read-only partitions), we don't check for these features when remounting the filesystem from read-only to read-write. Thus such filesystems could be remounted read-write leading to strange behavior (most likely crashes). Fix the problem by marking in superblock whether the filesystem has some features that are supported in read-only mode and check this flag during remount. Signed-off-by: Jan Kara Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a9dbfcffbfba56b40afbafb63efdce633e560f4a Author: Dominique Martinet Date: Sat Sep 8 01:18:43 2018 +0900 9p locks: fix glock.client_id leak in do_lock [ Upstream commit b4dc44b3cac9e8327e0655f530ed0c46f2e6214c ] the 9p client code overwrites our glock.client_id pointing to a static buffer by an allocated string holding the network provided value which we do not care about; free and reset the value as appropriate. This is almost identical to the leak in v9fs_file_getlock() fixed by Al Viro in commit ce85dd58ad5a6 ("9p: we are leaking glock.client_id in v9fs_file_getlock()"), which was returned as an error by a coverity false positive -- while we are here attempt to make the code slightly more robust to future change of the net/9p/client code and hopefully more clear to coverity that there is no problem. Link: http://lkml.kernel.org/r/1536339057-21974-5-git-send-email-asmadeus@codewreck.org Signed-off-by: Dominique Martinet Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit b710256edba8c0e92a9fff05aa67f6c47910379a Author: Colin Ian King Date: Wed Sep 5 10:46:05 2018 +0100 staging: most: video: fix registration of an empty comp core_component [ Upstream commit 1f447e51c0b9e8beeec0917ea5f51930f55e17c9 ] Currently we have structrues comp (which is empty) and comp_info being used to register and deregister the component. This mismatch in naming occurred from a previous commit that renamed aim_info to comp. Fix this to use consistent component naming in line with most/net, most/sound etc. This fixes the message two issues, one with a null empty name when loading the module: [ 1485.269515] most_core: registered new core component (null) and an Oops when removing the module: [ 1485.277971] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 1485.278648] PGD 0 P4D 0 [ 1485.279253] Oops: 0002 [#2] SMP PTI [ 1485.279847] CPU: 1 PID: 32629 Comm: modprobe Tainted: P D WC OE 4.18.0-8-generic #9 [ 1485.280442] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 [ 1485.281040] RIP: 0010:most_deregister_component+0x3c/0x70 [most_core] .. etc Fixes: 1b10a0316e2d ("staging: most: video: remove aim designators") Signed-off-by: Colin Ian King Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 938e30c5644629feba66ea9646da6b5e3a49e38f Author: Andrey Grodzovsky Date: Mon Sep 10 18:43:58 2018 -0400 drm/amdgpu: Fix SDMA TO after GPU reset v3 [ Upstream commit d8de8260a45aae8f74af77eae9a162bdc0ed48d2 ] After GPU reset amdgpu_vm_clear_bo triggers VM flush but job->vm_pd_addr is not set causing SDMA TO. v2: Per advise by Christian König avoid flushing VM for jobs where job->vm_pd_addr wasn't explicitly set. v3: Shortcut vm_flush_needed early. Fixes cbd5285 drm/amdgpu: move setting the GART addr into TTM. Signed-off-by: Andrey Grodzovsky Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 35c37a531fa087977b7ddb7146a29e24e3ad0df7 Author: Kieran Bingham Date: Fri Aug 31 19:12:57 2018 +0100 drm: rcar-du: Update Gen3 output limitations [ Upstream commit 2a3181d9cfd6d5aa48f8527708d0c32072072cef ] The R-Car Gen3 DU utilises the VSP1 hardware for memory access. The limits on the RPF and WPF in this pipeline are 8190x8190. Update the supported maximum sizes accordingly. Signed-off-by: Kieran Bingham Reviewed-by: Laurent Pinchart Signed-off-by: Laurent Pinchart Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit e6b5db61b2d65dbbf2a6948d57f4fb350b682ad0 Author: Alexandru Ardelean Date: Thu Sep 13 11:44:09 2018 +0300 staging:iio:ad7606: fix voltage scales [ Upstream commit 4ee033301c898dd0835d035d0e0eb768a3d35da1 ] Fixes commit 17be2a2905a6ec9aa27cd59521495e2f490d2af0 ("staging: iio: ad7606: replace range/range_available with corresponding scale"). The AD7606 devices don't have a 2.5V voltage range, they have 5V & 10V voltage range, which is selectable via the `gpio_range` descriptor. The scales also seem to have been miscomputed, because when they were applied to the raw values, the results differ from the expected values. After checking the ADC transfer function in the datasheet, these were re-computed. Signed-off-by: Alexandru Ardelean Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 3136e7a31323e7d82683a1ff83085b33457eaf55 Author: Breno Leitao Date: Tue Jul 31 17:55:57 2018 -0300 powerpc/selftests: Wait all threads to join [ Upstream commit 693b31b2fc1636f0aa7af53136d3b49f6ad9ff39 ] Test tm-tmspr might exit before all threads stop executing, because it just waits for the very last thread to join before proceeding/exiting. This patch makes sure that all threads that were created will join before proceeding/exiting. This patch also guarantees that the amount of threads being created is equal to thread_num. Signed-off-by: Breno Leitao Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit e7bce56063ac578e4f2ed362f2ad3e362dee0c77 Author: Marco Felsch Date: Thu Jun 28 12:20:33 2018 -0400 media: tvp5150: fix width alignment during set_selection() [ Upstream commit bd24db04101f45a9c1d874fe21b0c7eab7bcadec ] The driver ignored the width alignment which exists due to the UYVY colorspace format. Fix the width alignment and make use of the the provided v4l2 helper function to set the width, height and all alignments in one. Fixes: 963ddc63e20d ("[media] media: tvp5150: Add cropping support") Signed-off-by: Marco Felsch Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 2139f23fd23345ef079484a97fa4cf3144b1c3cc Author: Phil Elwell Date: Wed Sep 12 15:31:55 2018 +0100 sc16is7xx: Fix for multi-channel stall [ Upstream commit 8344498721059754e09d30fe255a12dab8fb03ef ] The SC16IS752 is a dual-channel device. The two channels are largely independent, but the IRQ signals are wired together as an open-drain, active low signal which will be driven low while either of the channels requires attention, which can be for significant periods of time until operations complete and the interrupt can be acknowledged. In that respect it is should be treated as a true level-sensitive IRQ. The kernel, however, needs to be able to exit interrupt context in order to use I2C or SPI to access the device registers (which may involve sleeping). Therefore the interrupt needs to be masked out or paused in some way. The usual way to manage sleeping from within an interrupt handler is to use a threaded interrupt handler - a regular interrupt routine does the minimum amount of work needed to triage the interrupt before waking the interrupt service thread. If the threaded IRQ is marked as IRQF_ONESHOT the kernel will automatically mask out the interrupt until the thread runs to completion. The sc16is7xx driver used to use a threaded IRQ, but a patch switched to using a kthread_worker in order to set realtime priorities on the handler thread and for other optimisations. The end result is non-threaded IRQ that schedules some work then returns IRQ_HANDLED, making the kernel think that all IRQ processing has completed. The work-around to prevent a constant stream of interrupts is to mark the interrupt as edge-sensitive rather than level-sensitive, but interpreting an active-low source as a falling-edge source requires care to prevent a total cessation of interrupts. Whereas an edge-triggering source will generate a new edge for every interrupt condition a level-triggering source will keep the signal at the interrupting level until it no longer requires attention; in other words, the host won't see another edge until all interrupt conditions are cleared. It is therefore vital that the interrupt handler does not exit with an outstanding interrupt condition, otherwise the kernel will not receive another interrupt unless some other operation causes the interrupt state on the device to be cleared. The existing sc16is7xx driver has a very simple interrupt "thread" (kthread_work job) that processes interrupts on each channel in turn until there are no more. If both channels are active and the first channel starts interrupting while the handler for the second channel is running then it will not be detected and an IRQ stall ensues. This could be handled easily if there was a shared IRQ status register, or a convenient way to determine if the IRQ had been deasserted for any length of time, but both appear to be lacking. Avoid this problem (or at least make it much less likely to happen) by reducing the granularity of per-channel interrupt processing to one condition per iteration, only exiting the overall loop when both channels are no longer interrupting. Signed-off-by: Phil Elwell Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 809923bfbf77f318f1859819bcb3b6ece11bdb8c Author: Huacai Chen Date: Sat Sep 15 14:01:12 2018 +0800 MIPS/PCI: Call pcie_bus_configure_settings() to set MPS/MRRS [ Upstream commit 2794f688b2c336e0da85e9f91fed33febbd9f54a ] Call pcie_bus_configure_settings() on MIPS, like for other platforms. The function pcie_bus_configure_settings() makes sure the MPS (Max Payload Size) across the bus is uniform and provides the ability to tune the MRSS (Max Read Request Size) and MPS (Max Payload Size) to higher performance values. Some devices will not operate properly if these aren't set correctly because the firmware doesn't always do it. Signed-off-by: Huacai Chen Signed-off-by: Paul Burton Patchwork: https://patchwork.linux-mips.org/patch/20649/ Cc: Ralf Baechle Cc: James Hogan Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang Cc: Zhangjin Wu Cc: Huacai Chen Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 0305be8f74fb24cdded4967aa6db6bce80ab0de5 Author: Rashmica Gupta Date: Fri Aug 17 14:25:01 2018 +1000 powerpc/memtrace: Remove memory in chunks [ Upstream commit 3f7daf3d7582dc6628ac40a9045dd1bbd80c5f35 ] When hot-removing memory release_mem_region_adjustable() splits iomem resources if they are not the exact size of the memory being hot-deleted. Adding this memory back to the kernel adds a new resource. Eg a node has memory 0x0 - 0xfffffffff. Hot-removing 1GB from 0xf40000000 results in the single resource 0x0-0xfffffffff being split into two resources: 0x0-0xf3fffffff and 0xf80000000-0xfffffffff. When we hot-add the memory back we now have three resources: 0x0-0xf3fffffff, 0xf40000000-0xf7fffffff, and 0xf80000000-0xfffffffff. This is an issue if we try to remove some memory that overlaps resources. Eg when trying to remove 2GB at address 0xf40000000, release_mem_region_adjustable() fails as it expects the chunk of memory to be within the boundaries of a single resource. We then get the warning: "Unable to release resource" and attempting to use memtrace again gives us this error: "bash: echo: write error: Resource temporarily unavailable" This patch makes memtrace remove memory in chunks that are always the same size from an address that is always equal to end_of_memory - n*size, for some n. So hotremoving and hotadding memory of different sizes will now not attempt to remove memory that spans multiple resources. Signed-off-by: Rashmica Gupta Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 439f5244a9f4599cde71ec738d7ea4aa094343a6 Author: Joel Stanley Date: Fri Sep 14 13:36:47 2018 +0930 powerpc/boot: Ensure _zimage_start is a weak symbol [ Upstream commit ee9d21b3b3583712029a0db65a4b7c081d08d3b3 ] When building with clang crt0's _zimage_start is not marked weak, which breaks the build when linking the kernel image: $ objdump -t arch/powerpc/boot/crt0.o |grep _zimage_start$ 0000000000000058 g .text 0000000000000000 _zimage_start ld: arch/powerpc/boot/wrapper.a(crt0.o): in function '_zimage_start': (.text+0x58): multiple definition of '_zimage_start'; arch/powerpc/boot/pseries-head.o:(.text+0x0): first defined here Clang requires the .weak directive to appear after the symbol is declared. The binutils manual says: This directive sets the weak attribute on the comma separated list of symbol names. If the symbols do not already exist, they will be created. So it appears this is different with clang. The only reference I could see for this was an OpenBSD mailing list post[1]. Changing it to be after the declaration fixes building with Clang, and still works with GCC. $ objdump -t arch/powerpc/boot/crt0.o |grep _zimage_start$ 0000000000000058 w .text 0000000000000000 _zimage_start Reported to clang as https://bugs.llvm.org/show_bug.cgi?id=38921 [1] https://groups.google.com/forum/#!topic/fa.openbsd.tech/PAgKKen2YCY Signed-off-by: Joel Stanley Reviewed-by: Nick Desaulniers Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c6e2ae7ca034f7e1f5d01ef494b3de0ce3037d75 Author: Dengcheng Zhu Date: Tue Sep 11 14:49:20 2018 -0700 MIPS: kexec: Mark CPU offline before disabling local IRQ [ Upstream commit dc57aaf95a516f70e2d527d8287a0332c481a226 ] After changing CPU online status, it will not be sent any IPIs such as in __flush_cache_all() on software coherency systems. Do this before disabling local IRQ. Signed-off-by: Dengcheng Zhu Signed-off-by: Paul Burton Patchwork: https://patchwork.linux-mips.org/patch/20571/ Cc: pburton@wavecomp.com Cc: ralf@linux-mips.org Cc: linux-mips@linux-mips.org Cc: rachel.mozes@intel.com Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 09249888b1adbd19a3934de4ed0860bc741c8116 Author: Lucas Stach Date: Wed Aug 1 10:18:04 2018 -0400 media: coda: don't overwrite h.264 profile_idc on decoder instance [ Upstream commit 1f32061e843205f6fe8404d5100d5adcec334e75 ] On a decoder instance, after the profile has been parsed from the stream __v4l2_ctrl_s_ctrl() is called to notify userspace about changes in the read-only profile control. This ends up calling back into the CODA driver where a missing check on the s_ctrl caused the profile information that has just been parsed from the stream to be overwritten with the default baseline profile. Later on the driver fails to enable frame reordering, based on the wrong profile information. Fixes: 347de126d1da (media: coda: add read-only h.264 decoder profile/level controls) Signed-off-by: Lucas Stach Reviewed-by: Philipp Zabel Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 56a738841c4997c7951328507c97fe145810c17b Author: Nicholas Mc Guire Date: Sun Sep 9 12:02:32 2018 -0400 media: pci: cx23885: handle adding to list failure [ Upstream commit c5d59528e24ad22500347b199d52b9368e686a42 ] altera_hw_filt_init() which calls append_internal() assumes that the node was successfully linked in while in fact it can silently fail. So the call-site needs to set return to -ENOMEM on append_internal() returning NULL and exit through the err path. Fixes: 349bcf02e361 ("[media] Altera FPGA based CI driver module") Signed-off-by: Nicholas Mc Guire Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit ae2a9a2cebf34322531dbc646908584194b68166 Author: John Garry Date: Sat Sep 22 01:25:25 2018 +0800 drm/hisilicon: hibmc: Do not carry error code in HiBMC framebuffer pointer [ Upstream commit 331d880b35a76b5de0eec8cbcecbf615d758a5f9 ] In hibmc_drm_fb_create(), when the call to hibmc_framebuffer_init() fails with error, do not store the error code in the HiBMC device frame-buffer pointer, as this will be later checked for non-zero value in hibmc_fbdev_destroy() when our intention is to check for a valid function pointer. This fixes the following crash: [ 9.699791] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001a [ 9.708672] Mem abort info: [ 9.711489] ESR = 0x96000004 [ 9.714570] Exception class = DABT (current EL), IL = 32 bits [ 9.720551] SET = 0, FnV = 0 [ 9.723631] EA = 0, S1PTW = 0 [ 9.726799] Data abort info: [ 9.729702] ISV = 0, ISS = 0x00000004 [ 9.733573] CM = 0, WnR = 0 [ 9.736566] [000000000000001a] user address but active_mm is swapper [ 9.742987] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 9.748614] Modules linked in: [ 9.751694] CPU: 16 PID: 293 Comm: kworker/16:1 Tainted: G W 4.19.0-rc4-next-20180920-00001-g9b0012c #322 [ 9.762681] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018 [ 9.771915] Workqueue: events work_for_cpu_fn [ 9.776312] pstate: 60000005 (nZCv daif -PAN -UAO) [ 9.781150] pc : drm_mode_object_put+0x0/0x20 [ 9.785547] lr : hibmc_fbdev_fini+0x40/0x58 [ 9.789767] sp : ffff00000af1bcf0 [ 9.793108] x29: ffff00000af1bcf0 x28: 0000000000000000 [ 9.798473] x27: 0000000000000000 x26: ffff000008f66630 [ 9.803838] x25: 0000000000000000 x24: ffff0000095abb98 [ 9.809203] x23: ffff8017db92fe00 x22: ffff8017d2b13000 [ 9.814568] x21: ffffffffffffffea x20: ffff8017d2f80018 [ 9.819933] x19: ffff8017d28a0018 x18: ffffffffffffffff [ 9.825297] x17: 0000000000000000 x16: 0000000000000000 [ 9.830662] x15: ffff0000092296c8 x14: ffff00008939970f [ 9.836026] x13: ffff00000939971d x12: ffff000009229940 [ 9.841391] x11: ffff0000085f8fc0 x10: ffff00000af1b9a0 [ 9.846756] x9 : 000000000000000d x8 : 6620657a696c6169 [ 9.852121] x7 : ffff8017d3340580 x6 : ffff8017d4168000 [ 9.857486] x5 : 0000000000000000 x4 : ffff8017db92fb20 [ 9.862850] x3 : 0000000000002690 x2 : ffff8017d3340480 [ 9.868214] x1 : 0000000000000028 x0 : 0000000000000002 [ 9.873580] Process kworker/16:1 (pid: 293, stack limit = 0x(____ptrval____)) [ 9.880788] Call trace: [ 9.883252] drm_mode_object_put+0x0/0x20 [ 9.887297] hibmc_unload+0x1c/0x80 [ 9.890815] hibmc_pci_probe+0x170/0x3c8 [ 9.894773] local_pci_probe+0x3c/0xb0 [ 9.898555] work_for_cpu_fn+0x18/0x28 [ 9.902337] process_one_work+0x1e0/0x318 [ 9.906382] worker_thread+0x228/0x450 [ 9.910164] kthread+0x128/0x130 [ 9.913418] ret_from_fork+0x10/0x18 [ 9.917024] Code: a94153f3 a8c27bfd d65f03c0 d503201f (f9400c01) [ 9.923180] ---[ end trace 2695ffa0af5be375 ]--- Fixes: d1667b86795a ("drm/hisilicon/hibmc: Add support for frame buffer") Signed-off-by: John Garry Reviewed-by: Xinliang Liu Signed-off-by: Xinliang Liu Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 7cb3b831fea3bef8cdaa60b51ee840e4e0bde29f Author: SivapiriyanKumarasamy Date: Wed Sep 12 14:15:42 2018 -0400 drm/amd/display: fix gamma not being applied [ Upstream commit 30049754ab7c4b6148dd3cd64af7d54850604582 ] [WHY] Previously night light forced a full update by applying a transfer function update regardless of if it was changed. This logic was removed, Now gamma surface updates are only applied when there is also a plane info update, this does not work in cases such as using the night light slider. [HOW] When moving the night light slider we will perform a full update if the gamma has changed and there is a surface, even when the surface has not changed. Also get stream updates in setgamma prior to update planes and stream. Signed-off-by: SivapiriyanKumarasamy Reviewed-by: Anthony Koo Acked-by: Bhawanpreet Lakha Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 7657b5bd3475249c0d8512e0120cfa694e051322 Author: Tomi Valkeinen Date: Wed Sep 26 12:11:27 2018 +0300 drm/omap: fix memory barrier bug in DMM driver [ Upstream commit 538f66ba204944470a653a4cccc5f8befdf97c22 ] A DMM timeout "timed out waiting for done" has been observed on DRA7 devices. The timeout happens rarely, and only when the system is under heavy load. Debugging showed that the timeout can be made to happen much more frequently by optimizing the DMM driver, so that there's almost no code between writing the last DMM descriptors to RAM, and writing to DMM register which starts the DMM transaction. The current theory is that a wmb() does not properly ensure that the data written to RAM is observable by all the components in the system. This DMM timeout has caused interesting (and rare) bugs as the error handling was not functioning properly (the error handling has been fixed in previous commits): * If a DMM timeout happened when a GEM buffer was being pinned for display on the screen, a timeout error would be shown, but the driver would continue programming DSS HW with broken buffer, leading to SYNCLOST floods and possible crashes. * If a DMM timeout happened when other user (say, video decoder) was pinning a GEM buffer, a timeout would be shown but if the user handled the error properly, no other issues followed. * If a DMM timeout happened when a GEM buffer was being released, the driver does not even notice the error, leading to crashes or hang later. This patch adds wmb() and readl() calls after the last bit is written to RAM, which should ensure that the execution proceeds only after the data is actually in RAM, and thus observable by DMM. The read-back should not be needed. Further study is required to understand if DMM is somehow special case and read-back is ok, or if DRA7's memory barriers do not work correctly. Signed-off-by: Tomi Valkeinen Signed-off-by: Peter Ujfalusi Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 56b14ace2cdab79ef9a6676adb1c7849596ffbd8 Author: Christophe Leroy Date: Mon Aug 13 13:19:52 2018 +0000 powerpc/mm: Don't report hugepage tables as memory leaks when using kmemleak [ Upstream commit 803d690e68f0c5230183f1a42c7d50a41d16e380 ] When a process allocates a hugepage, the following leak is reported by kmemleak. This is a false positive which is due to the pointer to the table being stored in the PGD as physical memory address and not virtual memory pointer. unreferenced object 0xc30f8200 (size 512): comm "mmap", pid 374, jiffies 4872494 (age 627.630s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [] huge_pte_alloc+0xdc/0x1f8 [<9e0df1e1>] hugetlb_fault+0x560/0x8f8 [<7938ec6c>] follow_hugetlb_page+0x14c/0x44c [] __get_user_pages+0x1c4/0x3dc [] __mm_populate+0xac/0x140 [<3215421e>] vm_mmap_pgoff+0xb4/0xb8 [] ksys_mmap_pgoff+0xcc/0x1fc [<4fcd760f>] ret_from_syscall+0x0/0x38 See commit a984506c542e2 ("powerpc/mm: Don't report PUDs as memory leaks when using kmemleak") for detailed explanation. To fix that, this patch tells kmemleak to ignore the allocated hugepage table. Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 905119e2ea85ca9639fde3d088b27617b5374977 Author: Daniel Axtens Date: Mon Oct 1 16:21:51 2018 +1000 powerpc/nohash: fix undefined behaviour when testing page size support [ Upstream commit f5e284803a7206d43e26f9ffcae5de9626d95e37 ] When enumerating page size definitions to check hardware support, we construct a constant which is (1U << (def->shift - 10)). However, the array of page size definitions is only initalised for various MMU_PAGE_* constants, so it contains a number of 0-initialised elements with def->shift == 0. This means we end up shifting by a very large number, which gives the following UBSan splat: ================================================================================ UBSAN: Undefined behaviour in /home/dja/dev/linux/linux/arch/powerpc/mm/tlb_nohash.c:506:21 shift exponent 4294967286 is too large for 32-bit type 'unsigned int' CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc3-00045-ga604f927b012-dirty #6 Call Trace: [c00000000101bc20] [c000000000a13d54] .dump_stack+0xa8/0xec (unreliable) [c00000000101bcb0] [c0000000004f20a8] .ubsan_epilogue+0x18/0x64 [c00000000101bd30] [c0000000004f2b10] .__ubsan_handle_shift_out_of_bounds+0x110/0x1a4 [c00000000101be20] [c000000000d21760] .early_init_mmu+0x1b4/0x5a0 [c00000000101bf10] [c000000000d1ba28] .early_setup+0x100/0x130 [c00000000101bf90] [c000000000000528] start_here_multiplatform+0x68/0x80 ================================================================================ Fix this by first checking if the element exists (shift != 0) before constructing the constant. Signed-off-by: Daniel Axtens Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c89005fa57cc600ea8efb79214bd3b7fe20d0a60 Author: Fabio Estevam Date: Mon Sep 10 14:45:23 2018 -0300 ARM: imx_v6_v7_defconfig: Select CONFIG_TMPFS_POSIX_ACL [ Upstream commit 35d3cbe84544da74e39e1cec01374092467e3119 ] Andreas Müller reports: "Fixes: | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[220]: Failed to apply ACL on /dev/v4l-subdev0: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[224]: Failed to apply ACL on /dev/v4l-subdev1: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[215]: Failed to apply ACL on /dev/v4l-subdev10: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[228]: Failed to apply ACL on /dev/v4l-subdev2: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[232]: Failed to apply ACL on /dev/v4l-subdev5: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[217]: Failed to apply ACL on /dev/v4l-subdev11: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[214]: Failed to apply ACL on /dev/dri/card1: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[216]: Failed to apply ACL on /dev/v4l-subdev8: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[226]: Failed to apply ACL on /dev/v4l-subdev9: Operation not supported and nasty follow-ups: Starting weston from sddm as unpriviledged user fails with some hints on missing access rights." Select the CONFIG_TMPFS_POSIX_ACL option to fix these issues. Reported-by: Andreas Müller Signed-off-by: Fabio Estevam Acked-by: Otavio Salvador Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit fab609de31de2d7d2318b59e5824c6db04946b8c Author: Colin Ian King Date: Mon Oct 8 17:22:28 2018 +0100 drm/amdgpu/powerplay: fix missing break in switch statements [ Upstream commit 14b284832e7dea6f54f0adfd7bed105548b94e57 ] There are several switch statements that are missing break statements. Add missing breaks to handle any fall-throughs corner cases. Detected by CoverityScan, CID#1457175 ("Missing break in switch") Fixes: 18aafc59b106 ("drm/amd/powerplay: implement fw related smu interface for iceland.") Acked-by: Huang Rui Signed-off-by: Colin Ian King Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 50513ecaf5f8336e7ff9698f683a46399cde863d Author: Masami Hiramatsu Date: Wed Aug 29 01:18:15 2018 +0900 tracing/kprobes: Check the probe on unloaded module correctly [ Upstream commit 59158ec4aef7d44be51a6f3e7e17fc64c32604eb ] Current kprobe event doesn't checks correctly whether the given event is on unloaded module or not. It just checks the event has ":" in the name. That is not enough because if we define a probe on non-exist symbol on loaded module, it allows to define that (with warning message) To ensure it correctly, this searches the module name on loaded module list and only if there is not, it allows to define it. (this event will be available when the target module is loaded) Link: http://lkml.kernel.org/r/153547309528.26502.8300278470528281328.stgit@devbox Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 39abc57fe589ab9dd59555fceafed218cc26ebaa Author: Miles Chen Date: Mon Oct 8 10:39:17 2018 +0800 tty: check name length in tty_find_polling_driver() [ Upstream commit 33a1a7be198657c8ca26ad406c4d2a89b7162bcc ] The issue is found by a fuzzing test. If tty_find_polling_driver() recevies an incorrect input such as ',,' or '0b', the len becomes 0 and strncmp() always return 0. In this case, a null p->ops->poll_init() is called and it causes a kernel panic. Fix this by checking name length against zero in tty_find_polling_driver(). $echo ,, > /sys/module/kgdboc/parameters/kgdboc [ 20.804451] WARNING: CPU: 1 PID: 104 at drivers/tty/serial/serial_core.c:457 uart_get_baud_rate+0xe8/0x190 [ 20.804917] Modules linked in: [ 20.805317] CPU: 1 PID: 104 Comm: sh Not tainted 4.19.0-rc7ajb #8 [ 20.805469] Hardware name: linux,dummy-virt (DT) [ 20.805732] pstate: 20000005 (nzCv daif -PAN -UAO) [ 20.805895] pc : uart_get_baud_rate+0xe8/0x190 [ 20.806042] lr : uart_get_baud_rate+0xc0/0x190 [ 20.806476] sp : ffffffc06acff940 [ 20.806676] x29: ffffffc06acff940 x28: 0000000000002580 [ 20.806977] x27: 0000000000009600 x26: 0000000000009600 [ 20.807231] x25: ffffffc06acffad0 x24: 00000000ffffeff0 [ 20.807576] x23: 0000000000000001 x22: 0000000000000000 [ 20.807807] x21: 0000000000000001 x20: 0000000000000000 [ 20.808049] x19: ffffffc06acffac8 x18: 0000000000000000 [ 20.808277] x17: 0000000000000000 x16: 0000000000000000 [ 20.808520] x15: ffffffffffffffff x14: ffffffff00000000 [ 20.808757] x13: ffffffffffffffff x12: 0000000000000001 [ 20.809011] x11: 0101010101010101 x10: ffffff880d59ff5f [ 20.809292] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3 [ 20.809549] x7 : 0000000000000000 x6 : ffffff880d59ff5f [ 20.809803] x5 : 0000000080008001 x4 : 0000000000000003 [ 20.810056] x3 : ffffff900853e6b4 x2 : dfffff9000000000 [ 20.810693] x1 : ffffffc06acffad0 x0 : 0000000000000cb0 [ 20.811005] Call trace: [ 20.811214] uart_get_baud_rate+0xe8/0x190 [ 20.811479] serial8250_do_set_termios+0xe0/0x6f4 [ 20.811719] serial8250_set_termios+0x48/0x54 [ 20.811928] uart_set_options+0x138/0x1bc [ 20.812129] uart_poll_init+0x114/0x16c [ 20.812330] tty_find_polling_driver+0x158/0x200 [ 20.812545] configure_kgdboc+0xbc/0x1bc [ 20.812745] param_set_kgdboc_var+0xb8/0x150 [ 20.812960] param_attr_store+0xbc/0x150 [ 20.813160] module_attr_store+0x40/0x58 [ 20.813364] sysfs_kf_write+0x8c/0xa8 [ 20.813563] kernfs_fop_write+0x154/0x290 [ 20.813764] vfs_write+0xf0/0x278 [ 20.813951] __arm64_sys_write+0x84/0xf4 [ 20.814400] el0_svc_common+0xf4/0x1dc [ 20.814616] el0_svc_handler+0x98/0xbc [ 20.814804] el0_svc+0x8/0xc [ 20.822005] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 20.826913] Mem abort info: [ 20.827103] ESR = 0x84000006 [ 20.827352] Exception class = IABT (current EL), IL = 16 bits [ 20.827655] SET = 0, FnV = 0 [ 20.827855] EA = 0, S1PTW = 0 [ 20.828135] user pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____) [ 20.828484] [0000000000000000] pgd=00000000aadee003, pud=00000000aadee003, pmd=0000000000000000 [ 20.829195] Internal error: Oops: 84000006 [#1] SMP [ 20.829564] Modules linked in: [ 20.829890] CPU: 1 PID: 104 Comm: sh Tainted: G W 4.19.0-rc7ajb #8 [ 20.830545] Hardware name: linux,dummy-virt (DT) [ 20.830829] pstate: 60000085 (nZCv daIf -PAN -UAO) [ 20.831174] pc : (null) [ 20.831457] lr : serial8250_do_set_termios+0x358/0x6f4 [ 20.831727] sp : ffffffc06acff9b0 [ 20.831936] x29: ffffffc06acff9b0 x28: ffffff9008d7c000 [ 20.832267] x27: ffffff900969e16f x26: 0000000000000000 [ 20.832589] x25: ffffff900969dfb0 x24: 0000000000000000 [ 20.832906] x23: ffffffc06acffad0 x22: ffffff900969e160 [ 20.833232] x21: 0000000000000000 x20: ffffffc06acffac8 [ 20.833559] x19: ffffff900969df90 x18: 0000000000000000 [ 20.833878] x17: 0000000000000000 x16: 0000000000000000 [ 20.834491] x15: ffffffffffffffff x14: ffffffff00000000 [ 20.834821] x13: ffffffffffffffff x12: 0000000000000001 [ 20.835143] x11: 0101010101010101 x10: ffffff880d59ff5f [ 20.835467] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3 [ 20.835790] x7 : 0000000000000000 x6 : ffffff880d59ff5f [ 20.836111] x5 : c06419717c314100 x4 : 0000000000000007 [ 20.836419] x3 : 0000000000000000 x2 : 0000000000000000 [ 20.836732] x1 : 0000000000000001 x0 : ffffff900969df90 [ 20.837100] Process sh (pid: 104, stack limit = 0x(____ptrval____)) [ 20.837396] Call trace: [ 20.837566] (null) [ 20.837816] serial8250_set_termios+0x48/0x54 [ 20.838089] uart_set_options+0x138/0x1bc [ 20.838570] uart_poll_init+0x114/0x16c [ 20.838834] tty_find_polling_driver+0x158/0x200 [ 20.839119] configure_kgdboc+0xbc/0x1bc [ 20.839380] param_set_kgdboc_var+0xb8/0x150 [ 20.839658] param_attr_store+0xbc/0x150 [ 20.839920] module_attr_store+0x40/0x58 [ 20.840183] sysfs_kf_write+0x8c/0xa8 [ 20.840183] sysfs_kf_write+0x8c/0xa8 [ 20.840440] kernfs_fop_write+0x154/0x290 [ 20.840702] vfs_write+0xf0/0x278 [ 20.840942] __arm64_sys_write+0x84/0xf4 [ 20.841209] el0_svc_common+0xf4/0x1dc [ 20.841471] el0_svc_handler+0x98/0xbc [ 20.841713] el0_svc+0x8/0xc [ 20.842057] Code: bad PC value [ 20.842764] ---[ end trace a8835d7de79aaadf ]--- [ 20.843134] Kernel panic - not syncing: Fatal exception [ 20.843515] SMP: stopping secondary CPUs [ 20.844289] Kernel Offset: disabled [ 20.844634] CPU features: 0x0,21806002 [ 20.844857] Memory Limit: none [ 20.845172] ---[ end Kernel panic - not syncing: Fatal exception ]--- Signed-off-by: Miles Chen Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 45394cc9a3c448ddf22c2271f4f328cc7a2f2823 Author: Sam Bobroff Date: Wed Sep 12 11:23:20 2018 +1000 powerpc/eeh: Fix possible null deref in eeh_dump_dev_log() [ Upstream commit f9bc28aedfb5bbd572d2d365f3095c1becd7209b ] If an error occurs during an unplug operation, it's possible for eeh_dump_dev_log() to be called when edev->pdn is null, which currently leads to dereferencing a null pointer. Handle this by skipping the error log for those devices. Signed-off-by: Sam Bobroff Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 6fc2ea8572973dd9d05f91d6417786cb42cb97ab Author: Joel Stanley Date: Thu Oct 11 13:13:03 2018 +1030 powerpc/Makefile: Fix PPC_BOOK3S_64 ASFLAGS [ Upstream commit 960e30029863db95ec79a71009272d4661db5991 ] Ever since commit 15a3204d24a3 ("powerpc/64s: Set assembler machine type to POWER4") we force -mpower4 to be passed to the assembler irrespective of the CFLAGS used (for Book3s 64). When building a powerpc64 kernel with clang, clang will not add -many to the assembler flags, so any instructions that the compiler has generated that are not available on power4 will cause an error: /usr/bin/as -a64 -mppc64 -mlittle-endian -mpower8 \ -I ./arch/powerpc/include -I ./arch/powerpc/include/generated \ -I ./include -I ./arch/powerpc/include/uapi \ -I ./arch/powerpc/include/generated/uapi -I ./include/uapi \ -I ./include/generated/uapi -I arch/powerpc -I arch/powerpc \ -maltivec -mpower4 -o init/do_mounts.o /tmp/do_mounts-3b0a3d.s /tmp/do_mounts-51ce54.s:748: Error: unrecognized opcode: `isel' GCC does include -many, so the GCC driven gas call will succeed: as -v -I ./arch/powerpc/include -I ./arch/powerpc/include/generated -I ./include -I ./arch/powerpc/include/uapi -I ./arch/powerpc/include/generated/uapi -I ./include/uapi -I ./include/generated/uapi -I arch/powerpc -I arch/powerpc -a64 -mpower8 -many -mlittle -maltivec -mpower4 -o init/do_mounts.o Note that isel is power7 and above for IBM CPUs. GCC only generates it for Power9 and above, but the above test was run against the clang generated assembly. Peter Bergner explains: When using -many -mpower4, gas will first try and find a matching power4 mnemonic and failing that, it will then allow any valid mnemonic that gas knows about. GCC's use of -many predates me though. IIRC, Alan looked at trying to remove it, but I forget why he didn't. Could be either a gcc or gas issue at the time. I'm not sure whether issue still exists or not. He and I have modified how gas works internally a fair amount since he tried removing gcc use of -many. I will also note that when using -many, gas will choose the first mnemonic that matches in the mnemonic table and we have (mostly) sorted the table so that server mnemonics show up earlier in the table than other mnemonics, so they'll be seen/chosen first. By explicitly setting -many we can build with Clang and GCC while retaining the -mpower4 option. Signed-off-by: Joel Stanley Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 9cfe21c38e5bec9d5b68a15b3ba7394d84356b9d Author: Randy Dunlap Date: Mon Oct 15 11:16:58 2018 -0700 Input: wm97xx-ts - fix exit path [ Upstream commit a3f7c3fcf60868c1e90671df5d0cf9be5900a09b ] Loading then unloading wm97xx-ts.ko when CONFIG_AC97_BUS=m causes a WARNING: from drivers/base/driver.c: Unexpected driver unregister! WARNING: CPU: 0 PID: 1709 at ../drivers/base/driver.c:193 driver_unregister+0x30/0x40 Fix this by only calling driver_unregister() with the same condition that driver_register() is called. Fixes: ae9d1b5fbd7b ("Input: wm97xx: add new AC97 bus support") Signed-off-by: Randy Dunlap Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f813b2981fc39472889f10ae82f06162e40e7486 Author: Su Sung Chung Date: Thu Sep 20 15:03:27 2018 -0400 drm/amd/display: fix bug of accessing invalid memory [ Upstream commit 43c3ff27a47d83d153c4adc088243ba594582bf5 ] [Why] A loop inside of build_evenly_distributed_points function that traverse through the array of points become an infinite loop when m_GammaUpdates does not get assigned to any value. [How] In DMColor, clear m_gammaIsValid bit just before writting all Zeromem for m_GammaUpdates, to prevent calling build_evenly_distributed_points before m_GammaUpdates gets assigned to some value. Signed-off-by: Su Sung Chung Reviewed-by: Aric Cyr Acked-by: Bhawanpreet Lakha Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit d74680a91a1589e55322dcdbe142ca061872d276 Author: Christophe Leroy Date: Thu Aug 2 09:25:55 2018 +0000 powerpc/mm: fix always true/false warning in slice.c [ Upstream commit 37e9c674e7e6f445e12cb1151017bd4bacdd1e2d ] This patch fixes the following warnings (obtained with make W=1). arch/powerpc/mm/slice.c: In function 'slice_range_to_mask': arch/powerpc/mm/slice.c:73:12: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (start < SLICE_LOW_TOP) { ^ arch/powerpc/mm/slice.c:81:20: error: comparison is always false due to limited range of data type [-Werror=type-limits] if ((start + len) > SLICE_LOW_TOP) { ^ arch/powerpc/mm/slice.c: In function 'slice_mask_for_free': arch/powerpc/mm/slice.c:136:17: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (high_limit <= SLICE_LOW_TOP) ^ arch/powerpc/mm/slice.c: In function 'slice_check_range_fits': arch/powerpc/mm/slice.c:185:12: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (start < SLICE_LOW_TOP) { ^ arch/powerpc/mm/slice.c:195:39: error: comparison is always false due to limited range of data type [-Werror=type-limits] if (SLICE_NUM_HIGH && ((start + len) > SLICE_LOW_TOP)) { ^ arch/powerpc/mm/slice.c: In function 'slice_scan_available': arch/powerpc/mm/slice.c:306:11: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (addr < SLICE_LOW_TOP) { ^ arch/powerpc/mm/slice.c: In function 'get_slice_psize': arch/powerpc/mm/slice.c:709:11: error: comparison is always true due to limited range of data type [-Werror=type-limits] if (addr < SLICE_LOW_TOP) { ^ Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 7bd05ac862e29bbf5eee657b11adfd465c29fe61 Author: Michael Ellerman Date: Wed Aug 15 21:29:45 2018 +1000 powerpc/mm: Fix page table dump to work on Radix [ Upstream commit 0d923962ab69c27cca664a2d535e90ef655110ca ] When we're running on Book3S with the Radix MMU enabled the page table dump currently prints the wrong addresses because it uses the wrong start address. Fix it to use PAGE_OFFSET rather than KERN_VIRT_START. Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c467bb652d448ed6f8e9c0e0e7047e6dec5c9e22 Author: Nicholas Piggin Date: Wed Aug 29 21:56:56 2018 +1000 powerpc/64/module: REL32 relocation range check [ Upstream commit b851ba02a6f3075f0f99c60c4bc30a4af80cf428 ] The recent module relocation overflow crash demonstrated that we have no range checking on REL32 relative relocations. This patch implements a basic check, the same kernel that previously oopsed and rebooted now continues with some of these errors when loading the module: module_64: x_tables: REL32 527703503449812 out of range! Possibly other relocations (ADDR32, REL16, TOC16, etc.) should also have overflow checks. Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 8d16dd049428072519333e4ab6ed5df8084d1435 Author: Christophe Leroy Date: Sat Oct 13 09:16:22 2018 +0000 powerpc/traps: restore recoverability of machine_check interrupts [ Upstream commit daf00ae71dad8aa05965713c62558aeebf2df48e ] commit b96672dd840f ("powerpc: Machine check interrupt is a non- maskable interrupt") added a call to nmi_enter() at the beginning of machine check restart exception handler. Due to that, in_interrupt() always returns true regardless of the state before entering the exception, and die() panics even when the system was not already in interrupt. This patch calls nmi_exit() before calling die() in order to restore the interrupt state we had before calling nmi_enter() Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt") Signed-off-by: Christophe Leroy Reviewed-by: Nicholas Piggin Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman