Re: [PATCH] cpuidle: riscv-sbi: Stop using non-retentive suspend
From: Samuel Holland
Date: Mon Nov 21 2022 - 19:47:07 EST
Hi Palmer,
On 11/21/22 14:56, Palmer Dabbelt wrote:
> From: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
>
> As per [1], whether or not the core can wake up from non-retentive
> suspend is a platform-specific detail. We don't have any way to encode
> that, so just stop using them until we've sorted that out.
We do have *exactly* a way to encode that. Specifically, the existence
or non-existence of a non-retentive CPU suspend state in the DT.
If a hart has no way of resuming from non-retentive suspend (i.e. a
complete lack of interrupt delivery in non-retentive suspend), then it
makes zero sense to advertise such a suspend state in the DT. Therefore,
if the state exists in the DT, you can assume there is *some* interrupt
that can wake up the hart. And I would extend that to assume at least
one of those wakeup-capable interrupts is a timer interrupt, although
not necessarily the architectural timer interrupt.
> Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98#issuecomment-1288564687
This comment refers specifically to the architectural timer interrupt,
not interrupts more generally.
> Fixes: 6abf32f1d9c5 ("cpuidle: Add RISC-V SBI CPU idle driver")
> Signed-off-by: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
>
> ---
>
> This should allow us to revert 232ccac1bd9b ("clocksource/drivers/riscv:
> Events are stopped during CPU suspend"), which fixes suspend on the D1
> but breaks timers everywhere.
I understand that reverting 232ccac1bd9b is the easiest way to fix the
issues you and others are seeing. I do not have any functioning RISC-V
hardware with SMP, so it is hard for me to help debug the root issue in
the Linux timer code. I do not know why turning on the C3STOP flag
breaks RCU stall detection or clock_nanosleep(), but I agree that
breakage should not happen.
So while I still think 232ccac1bd9b is the right change to make from a
"following the spec" standpoint, I am okay with reverting it for
pragmatic reasons. Since the D1 has another timer driver that is
currently used in preference to the RISC-V/SBI timer triver, reverting
232ccac1bd9b does not break non-retentive suspend for the D1.
But please do not make the change below. It is unnecessarily broad, and
will break something that works fine right now. If the DT advertises a
CPU suspend state that cannot be woken up from at all, then that is a
bug in the DT. Linux should not try to work around that.
So revert 232ccac1bd9b for now, and we can figure out what to do about
the DT property, but please do not merge this.
Regards,
Samuel
> ---
> drivers/cpuidle/cpuidle-riscv-sbi.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> index 05fe2902df9a..9d1063a54495 100644
> --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> @@ -214,6 +214,17 @@ static bool sbi_suspend_state_is_valid(u32 state)
> if (state > SBI_HSM_SUSPEND_NON_RET_DEFAULT &&
> state < SBI_HSM_SUSPEND_NON_RET_PLATFORM)
> return false;
> +
> + /*
> + * Whether or not RISC-V systems deliver interrupts to harts in a
> + * non-retentive suspend state is a platform-specific detail. This can
> + * leave the hart unable to wake up, so just mark these states as
> + * unsupported until we have a mechanism to expose these
> + * platform-specific details to Linux.
> + */
> + if (state & SBI_HSM_SUSP_NON_RET_BIT)
> + return false;
> +
> return true;
> }
>