RE: [RFC PATCH 2/3] cpuidle: ladder: Tune promotion/demotion threshold
From: Doug Smythies
Date: Wed Nov 23 2022 - 18:53:34 EST
On 2022.11.23 09:50 Rafael wrote:
> On Sat, Nov 5, 2022 at 6:40 PM Zhang Rui <rui.zhang@xxxxxxxxx> wrote:
>>
>> After fixing the bogus comparison between u64 and s64, the ladder
>> governor stops making promotion decisions errornously.
>>
>> However, after this, it is found that the ladder governor demotes much
>> easier than promotes.
>
> "After fixing an error related to using signed and unsigned integers
> in the ladder governor in a previous patch, that governor turns out to
> demote much easier than promote"
>
>> Below is captured using turbostat after a 30 seconds runtime idle,
>>
>> Without previous patch,
>> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
>> 0.30 2373 0 0 0 4 9 25 122 326 2857 0.36 0.04 0.57 98.73 1.48
>
> Why is the above relevant?
>
>> With previous patch,
>> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
>> 0.42 3071 0 771 838 447 327 336 382 299 344 34.18 16.21 17.69 31.51 2.00
>>
>> And this is caused by the imbalanced PROMOTION_COUNT/DEMOTION_COUNT.
>
> I would explain why/how the imbalanced PROMOTION_COUNT/DEMOTION_COUNT
> imbalance causes this.
>
> I guess more residency in the deeper idle state is expected? Or desired??
>
>> With this patch,
>> Busy% IRQ POLL C1 C1E C3 C6 C7s C8 C9 C10 CPU%c1 CPU%c3 CPU%c6 CPU%c7 PkgWatt
>> 0.39 2436 0 1 72 177 51 194 243 799 1883 0.50 0.32 0.35 98.45 1.53
>>
>> Note that this is an experimental patch to illustrate the problem,
>> and it is checked with idle scenario only for now.
>> I will try to evaluate with more scenarios, and if someone can help
>> evaluate with more scenarios at the same time and provide data for the
>> benefit with different PROMOTION_COUNT/DEMOTION_COUNT values, that
>> would be great.
>
> So yes, this requires more work.
>
> Overall, I think that you are concerned that the previous change might
> be regarded as a regression and are trying to compensate for it with a
> PROMOTION_COUNT/DEMOTION_COUNT change.
>
> I'm not sure I can agree with that approach, because the shallower
> idle states might be preferred by the original ladder design
> intentionally, for performance reasons.
Hi All,
Because I was continuing to test the teo governor with
the util patch version 4, it was fairly easy for me to test
this patch set also. However, I have had difficulties having
enough time to write up my results.
The best improvement was for a slow speed ping-pong
(I did 3 speeds, fast, medium, and slow)
2 pairs of ping pongs, not forced CPU affinity,
schedutil CPU scaling governor,
intel_cpufreq CPU scaling driver,
HWP disabled.
The menu governor was considered the master reference:
Old ladder was 44% slower.
New ladder was 5.9% slower.
Just for reference:
Old teo was 29% slower.
teo util V4 was 13% slower.
The worst degradation was for a fast speed ping-pong
2 pairs of ping pongs, not forced CPU affinity,
schedutil CPU scaling governor,
intel_cpufreq CPU scaling driver,
HWP disabled.
The menu governor was considered the master reference:
Old ladder was 64% slower.
New ladder was 71% slower.
Interestingly, the old ladder governor outperformed
the menu governor on the slow 2 pair ping-pong
with the performance governor:
Old ladder was 0.56% faster.
New ladder was 0.81% slower.
Disclaimer: Test results using the schedutil
CPU scaling governor are noisy, with
questionable repeatability.
I'll try to get all the test results written up soon.
... Doug