Re: [RFC PATCH 2/3] cpuidle: ladder: Tune promotion/demotion threshold

From: Rafael J. Wysocki
Date: Wed Nov 23 2022 - 12:50:47 EST


On Sat, Nov 5, 2022 at 6:40 PM Zhang Rui <rui.zhang@xxxxxxxxx> wrote:
>
> After fixing the bogus comparison between u64 and s64, the ladder
> governor stops making promotion decisions errornously.
>
> However, after this, it is found that the ladder governor demotes much
> easier than promotes.

"After fixing an error related to using signed and unsigned integers
in the ladder governor in a previous patch, that governor turns out to
demote much easier than promote"

> Below is captured using turbostat after a 30 seconds runtime idle,
>
> Without previous patch,
> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> 0.30 2373 0 0 0 4 9 25 122 326 2857 0.36 0.04 0.57 98.73 1.48

Why is the above relevant?

> With previous patch,
> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> 0.42 3071 0 771 838 447 327 336 382 299 344 34.18 16.21 17.69 31.51 2.00
>
> And this is caused by the imbalanced PROMOTION_COUNT/DEMOTION_COUNT.

I would explain why/how the imbalanced PROMOTION_COUNT/DEMOTION_COUNT
imbalance causes this.

I guess more residency in the deeper idle state is expected? Or desired??

> With this patch,
> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
> 0.39 2436 0 1 72 177 51 194 243 799 1883 0.50 0.32 0.35 98.45 1.53
>
> Note that this is an experimental patch to illustrate the problem,
> and it is checked with idle scenario only for now.
> I will try to evaluate with more scenarios, and if someone can help
> evaluate with more scenarios at the same time and provide data for the
> benefit with different PROMOTION_COUNT/DEMOTION_COUNT values, that
> would be great.

So yes, this requires more work.

Overall, I think that you are concerned that the previous change might
be regarded as a regression and are trying to compensate for it with a
PROMOTION_COUNT/DEMOTION_COUNT change.

I'm not sure I can agree with that approach, because the shallower
idle states might be preferred by the original ladder design
intentionally, for performance reasons.

> Signed-off-by: Zhang Rui <rui.zhang@xxxxxxxxx>
> ---
> drivers/cpuidle/governors/ladder.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpuidle/governors/ladder.c b/drivers/cpuidle/governors/ladder.c
> index fb61118aef37..4b47aa0a4da9 100644
> --- a/drivers/cpuidle/governors/ladder.c
> +++ b/drivers/cpuidle/governors/ladder.c
> @@ -20,8 +20,8 @@
> #include <asm/io.h>
> #include <linux/uaccess.h>
>
> -#define PROMOTION_COUNT 4
> -#define DEMOTION_COUNT 1
> +#define PROMOTION_COUNT 2
> +#define DEMOTION_COUNT 4
>
> struct ladder_device_state {
> struct {
> --