Re: [PATCH v2] Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend"

From: Samuel Holland
Date: Wed Nov 23 2022 - 00:50:00 EST


On 11/22/22 06:16, Conor Dooley wrote:
> This reverts commit 232ccac1bd9b5bfe73895f527c08623e7fa0752d.
>
> On the subject of suspend, the RISC-V SBI spec states:
>> Request the SBI implementation to put the calling hart in a platform
>> specific suspend (or low power) state specified by the suspend_type
>> parameter. The hart will automatically come out of suspended state and
>> resume normal execution when it receives an interrupt or platform
>> specific hardware event.
>
> This does not cover whether any given events actually reach the hart or
> not, just what the hart will do if it receives an event. On PolarFire
> SoC, and potentially other SiFive based implementations, events from the
> RISC-V timer do reach a hart during suspend. This is not the case for
> the implementation on the Allwinner D1 - there timer events are not
> received during suspend.
>
> To fix this, the x86 C3STOP feature was enabled for the timer driver -

C3STOP isn't inherently x86-specific.

> but this has broken both RCU stall detection and timers generally on
> PolarFire SoC (and potentially other SiFive based implementations).
>
> If an AXI read to the PCIe controller on PolarFire SoC times out, the
> system will stall, however, with this patch applied, the system just
> locks up without RCU stalling:
> io scheduler mq-deadline registered
> io scheduler kyber registered
> microchip-pcie 2000000000.pcie: host bridge /soc/pcie@2000000000 ranges:
> microchip-pcie 2000000000.pcie: MEM 0x2008000000..0x2087ffffff -> 0x0008000000
> microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
> microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
> microchip-pcie 2000000000.pcie: axi read request error
> microchip-pcie 2000000000.pcie: axi read timeout
> microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
> microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
> microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
> microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
> microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
> microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
> Freeing initrd memory: 7332K
>
> Similarly issues were reported with clock_nanosleep() - with a test app
> that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250 & the blamed
> commit in place, the sleep times are rounded up to the next jiffy:
>
> == CPU: 1 == == CPU: 2 == == CPU: 3 == == CPU: 4 ==
> Mean: 7.974992 Mean: 7.976534 Mean: 7.962591 Mean: 3.952179
> Std Dev: 0.154374 Std Dev: 0.156082 Std Dev: 0.171018 Std Dev: 0.076193
> Hi: 9.472000 Hi: 10.495000 Hi: 8.864000 Hi: 4.736000
> Lo: 6.087000 Lo: 6.380000 Lo: 4.872000 Lo: 3.403000
> Samples: 521 Samples: 521 Samples: 521 Samples: 521
>
> Fortunately, the D1 has a second timer, which is "currently used in
> preference to the RISC-V/SBI timer driver" so a revert here does not
> hurt operation of D1 in it's current form.

typo: its

> Ultimately, a DeviceTree property (or node) will be added to encode the
> behaviour of the timers, but until then revert the addition of
> CLOCK_EVT_FEAT_C3STOP.
>
> Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/
> Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98/
> Link: https://lore.kernel.org/linux-riscv/bf6d3b1f-f703-4a25-833e-972a44a04114@xxxxxxxxxxxx/
> Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend")
> CC: Samuel Holland <samuel@xxxxxxxxxxxx>
> CC: Anup Patel <anup@xxxxxxxxxxxxxx>
> CC: Palmer Dabbelt <palmer@xxxxxxxxxxx>
> Reviewed-by: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
> Acked-by: Palmer Dabbelt <palmer@xxxxxxxxxxxx>
> Signed-off-by: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
> ---
>
> For v2, I have re-worked the commit message to (hopefully) add improved
> context.
>
> CC: aou@xxxxxxxxxxxxxxxxx
> CC: atishp@xxxxxxxxxxxxxx
> CC: daniel.lezcano@xxxxxxxxxx
> CC: dmitriy@xxxxxxxxxxxx
> CC: paul.walmsley@xxxxxxxxxx
> CC: tglx@xxxxxxxxxxxxx
> CC: linux-kernel@xxxxxxxxxxxxxxx
> CC: linux-riscv@xxxxxxxxxxxxxxxxxxx
> ---
> drivers/clocksource/timer-riscv.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
> index 969a552da8d2..a0d66fabf073 100644
> --- a/drivers/clocksource/timer-riscv.c
> +++ b/drivers/clocksource/timer-riscv.c
> @@ -51,7 +51,7 @@ static int riscv_clock_next_event(unsigned long delta,
> static unsigned int riscv_clock_event_irq;
> static DEFINE_PER_CPU(struct clock_event_device, riscv_clock_event) = {
> .name = "riscv_timer_clockevent",
> - .features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_C3STOP,
> + .features = CLOCK_EVT_FEAT_ONESHOT,
> .rating = 100,
> .set_next_event = riscv_clock_next_event,
> };

Acked-by: Samuel Holland <samuel@xxxxxxxxxxxx>