Re: [PATCH] cpuidle: riscv-sbi: Stop using non-retentive suspend
From: Anup Patel
Date: Tue Nov 22 2022 - 23:39:13 EST
On Tue, Nov 22, 2022 at 8:58 PM Palmer Dabbelt <palmer@xxxxxxxxxxxx> wrote:
>
> On Tue, 22 Nov 2022 03:19:43 PST (-0800), conor.dooley@xxxxxxxxxxxxx wrote:
> > On Tue, Nov 22, 2022 at 11:06:15AM +0530, Anup Patel wrote:
> >> On Tue, Nov 22, 2022 at 10:46 AM Palmer Dabbelt <palmer@xxxxxxxxxxxx> wrote:
> >> >
> >> > On Mon, 21 Nov 2022 19:45:07 PST (-0800), anup@xxxxxxxxxxxxxx wrote:
> >> > > On Tue, Nov 22, 2022 at 2:27 AM Palmer Dabbelt <palmer@xxxxxxxxxxxx> wrote:
> >> > >>
> >> > >> From: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
> >> > >>
> >> > >> As per [1], whether or not the core can wake up from non-retentive
> >> > >> suspend is a platform-specific detail. We don't have any way to encode
> >> > >> that, so just stop using them until we've sorted that out.
> >> > >>
> >> > >> Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98#issuecomment-1288564687
> >> > >> Fixes: 6abf32f1d9c5 ("cpuidle: Add RISC-V SBI CPU idle driver")
> >> > >> Signed-off-by: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
> >> > >
> >> > > This is just unnecessary maintenance churn and it's not the
> >> > > right way to go. Better to fix this the right way instead of having
> >> > > a temporary fix.
> >> > >
> >> > > I had already sent-out a patch series 5 months back to describe
> >> > > this in DT:
> >> > > https://lore.kernel.org/lkml/20220727114302.302201-1-apatel@xxxxxxxxxxxxxxxx/
> >> > >
> >> > > No one has commented/suggested anything (except Samuel
> >> > > Holland and Sudeep Holla).
> >> >
> >> > I see some comments from Krzysztof here
> >> > <https://lore.kernel.org/lkml/7a0477a0-9f0f-87d6-4070-30321745f4cc@xxxxxxxxxx/>
> >> > as well. Looks like everyone is pointing out that having our CPU nodes
> >> > encode timers is a bad idea, my guess is that they're probably right.
> >>
> >> Adding a separate timer DT node, creates a new set of compatibility
> >> issues for existing platforms. I am fine updating my series to have
> >> separate timer DT node but do we want to go in this direction ?
> >
> > I don't really follow. How is there a compatibility issue created by
> > adding a new node that is not added for a new property? Both will
> > require changes to the device tree. (You need not reply here, I am going
> > to review the other thread, it's been on my todo list for too long. Been
> > caught up with non-coherent stuff & our sw release cycle..)
> >
> >> Even if ARM has a separate timer DT node, the timers are still part
> >> of the CPU. It depends on how we see the DT bindings aligning with
> >> actual HW.
> >>
> >> >
> >> > > Please review this series. I can quickly address comments to
> >> > > make this available for Linux-6.2. Until this series is merged,
> >> > > the affected platforms can simply remove non-retentive suspend
> >> > > states from their DT.
> >> >
> >> > That leaves us with a dependency between kernel versions and DT
> >> > bindings: kernels with the current driver will result in broken systems
> >> > with the non-retentive suspend states in the DT they boot with when
> >> > those states can't wake up the CPU.
> >
> > Can someone point me at a (non D1 or virt) system that has suspend
> > states in the DT that would need fixing?
> >
> >> This is not a new problem we are facing. Even in the ARM world,
> >> the DT bindings grew organically over time based on newer platform
> >> requirements.
> >>
> >> Now that we have a platform which does not want the time
> >> C3STOP feature, we need to first come-up with DT bindings
> >> to support this platform instead of temporarily disabling
> >> features which don't work on this platform.
> >
> > It's the opposite surely? It should be "now that we have a platform that
> > *does want* the C3STOP feature", right?
>
> IMO we just shouldn't be turning on C3STOP at all. Sure it's making the
> problem here go away, but it's trying to emulate a bunch of Intel timer
> features we don't have on RISC-V and ending up in a bunch of fallback
> paths.
>
> If we need some workaround in the timer subsystem to sort out the
> non-retentive states then we can sort that out, but my guess is that
> vendors are just going to 3
Actually, it will be better if we set C3STOP in the RISC-V timer driver
only for D1 (and future platforms that might need it). We can easily
do this based on the D1 compatible string from the root DT node.
>
> >> > > With all due respect, NACK to this patch from my side.
> >
> > As Samuel pointed out that the D1 doesn't actually use the timer in
> > question, I think we are okay here?
>
> IIUC it just should use the sunxi timer driver, but that requires some
> priority changes (and presumably breaks
>
> That said, I guess I'm confused about what's actually broken here. My
> understanding of the problem is: The D1's firmware disables some
> interrupts during non-retentive suspend, which results in SBI timers
> failing to wake up the core. The patch to add C3STOP makes the core
> come back from sleep, but that results in a bunch of other timer-related
> issues.
>
> So IMO we should revert the C3STOP patch as it's causing regressions
> (ie, old workloads break in order to make new ones work). Seems like
> folks mostly agree on that one?
Yes, I agree.
We can go a step further and add quirks in the RISC-V timer driver which
only sets C3STOP for certain platforms (e.g. D1).
>
> I also think we should stop entering non-retentive suspend until we can
> sort out how reliably wake up from it, as the SBI makes that a
> platform-specific detail. If the answer there is "non-retentive suspend
> is fine on the D1 as long as we don't use the SBI timers" then that
> seems fine, we just need some way to describe that in Linux -- that
> doesn't fix other platforms and other interrupts, but at least it's a
> start.
>
> Sorry if I've just misunderstood something here?
Rather than "stop entering non-retentive suspend", we should improve
the relevant drivers.
>
> >
> >> > >>
> >> > >> ---
> >> > >>
> >> > >> This should allow us to revert 232ccac1bd9b ("clocksource/drivers/riscv:
> >> > >> Events are stopped during CPU suspend"), which fixes suspend on the D1
> >> > >> but breaks timers everywhere.
> >> > >> ---
> >> > >> drivers/cpuidle/cpuidle-riscv-sbi.c | 11 +++++++++++
> >> > >> 1 file changed, 11 insertions(+)
> >> > >>
> >> > >> diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> >> > >> index 05fe2902df9a..9d1063a54495 100644
> >> > >> --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> >> > >> +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> >> > >> @@ -214,6 +214,17 @@ static bool sbi_suspend_state_is_valid(u32 state)
> >> > >> if (state > SBI_HSM_SUSPEND_NON_RET_DEFAULT &&
> >> > >> state < SBI_HSM_SUSPEND_NON_RET_PLATFORM)
> >> > >> return false;
> >> > >> +
> >> > >> + /*
> >> > >> + * Whether or not RISC-V systems deliver interrupts to harts in a
> >> > >> + * non-retentive suspend state is a platform-specific detail. This can
> >> > >> + * leave the hart unable to wake up, so just mark these states as
> >> > >> + * unsupported until we have a mechanism to expose these
> >> > >> + * platform-specific details to Linux.
> >> > >> + */
> >> > >> + if (state & SBI_HSM_SUSP_NON_RET_BIT)
> >> > >> + return false;
> >> > >> +
> >> > >> return true;
> >> > >> }
> >> > >>
> >> > >> --
> >> > >> 2.38.1
> >> > >>
Regards,
Anup