Re: WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/kfence.h:46 kfence_protect

From: Marco Elver
Date: Thu Nov 17 2022 - 08:58:55 EST


On Thu, Nov 17, 2022 at 05:01PM +0530, Naresh Kamboju wrote:
> Kunit test cases failed and found warnings while booting Linux next
> version 6.1.0-rc5-next-20221117 on qemu-x86_64 [1].
>
> It was working on Linux next-20221116 tag.
>
> [ 0.663761] WARNING: CPU: 0 PID: 0 at
> arch/x86/include/asm/kfence.h:46 kfence_protect+0x7b/0x120
> [ 0.664033] WARNING: CPU: 0 PID: 0 at mm/kfence/core.c:234
> kfence_protect+0x7d/0x120
> [ 0.664465] kfence: kfence_init failed
>
> Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
[...]
> [ 0.663758] ------------[ cut here ]------------
> [ 0.663761] WARNING: CPU: 0 PID: 0 at
> arch/x86/include/asm/kfence.h:46 kfence_protect+0x7b/0x120
[...]
> [ 0.664465] kfence: kfence_init failed
>
> metadata:
> git_ref: master
> git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
> git_sha: af37ad1e01c72483c4ee8453d9d9bac95d35f023
> git_describe: next-20221117
> kernel_version: 6.1.0-rc5
> kernel-config: https://builds.tuxbuild.com/2Hfb6n1z0frt4iBlIvqUzjMHiLm/config
> build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/697483979
> artifact-location: https://builds.tuxbuild.com/2Hfb6n1z0frt4iBlIvqUzjMHiLm
> toolchain: gcc-11

I bisected this to:

commit 127960a05548ea699a95791669e8112552eb2452
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Thu Nov 10 13:33:57 2022 +0100

x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()

There is a cludge in change_page_attr_set_clr() that inhibits
propagating NX changes to the aliases (directmap and highmap) -- this
is a cludge twofold:

- it also inhibits the primary checks in __change_page_attr();
- it hard depends on single bit changes.

The introduction of set_memory_rox() triggered this last issue for
clearing both _PAGE_RW and _PAGE_NX.

Explicitly ignore _PAGE_NX in cpa_process_alias() instead.

Fixes: b38994948567 ("x86/mm: Implement native set_memory_rox()")
Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
Debugged-by: Dave Hansen <dave.hansen@xxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: https://lkml.kernel.org/r/20221110125544.594991716%40infradead.org

A simple revert of this commit fixes the issue.

Since all this seems to be about set_memory_rox(), and this is a fix
commit, the fix itself missed something?

Thanks,
-- Marco