Re: [PATCH] timers: fix LVL_START macro
From: Thomas Gleixner
Date: Wed Nov 16 2022 - 18:48:14 EST

On Tue, Nov 15 2022 at 23:40, Frederic Weisbecker wrote:
> On Tue, Nov 15, 2022 at 01:15:11PM +0000, Zhou, Yun wrote:
>> Hi Frederic,
>>
>> The issue now is that a timer may be placed in a bucket one level too high. For example, with expires = 4090 and HZ = 1000, it should be in level 2, but it is now placed in level 3. Is this expected?
>>
>> * HZ 1000 steps
>> * Level Offset  Granularity            Range
>> *  0      0         1 ms                0 ms -         63 ms
>> *  1     64         8 ms               64 ms -        511 ms
>> *  2    128        64 ms              512 ms -       4095 ms (512ms - ~4s)
>> *  3    192       512 ms             4096 ms -      32767 ms (~4s - ~32s)
>> *  4    256      4096 ms (~4s)      32768 ms -     262143 ms (~32s - ~4m)
>
> The rule is that a timer is not allowed to expire too early. But it can expire
> a bit late. Hence why it is always rounded up. So in the case of 4090, we have
> the choice between:
>
> 1) expiring at bucket 2 after 4096 - 64 = 4032 ms
> 2) expiring at bucket 3 after 4096 ms
>
> The 1) rounds down and expires too early. The 2) rounds up and expires a bit
> late. So the second solution is preferred.

It's not only preferred, it's required, simply because the timer wheel
has only one guarantee: Not to expire early.
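
The 4090 case above can be checked with a small userspace sketch. It is
only modelled on the level macros in kernel/time/timer.c for HZ = 1000,
not the kernel code itself: wheel_level(), bucket_expiry() and
bucket_floor() are hypothetical helper names, the wheel clock is assumed
to be 0 so that the timeout and the absolute expiry are the same number,
and the kernel's handling of already expired or overlong timeouts is
left out.

```c
#include <assert.h>

/* Constants modelled on kernel/time/timer.c for HZ = 1000 (1 jiffy = 1 ms). */
#define LVL_CLK_SHIFT	3
#define LVL_BITS	6
#define LVL_SIZE	(1UL << LVL_BITS)
#define LVL_DEPTH	9
#define LVL_SHIFT(n)	((n) * LVL_CLK_SHIFT)
#define LVL_START(n)	((LVL_SIZE - 1) << (((n) - 1) * LVL_CLK_SHIFT))

/* Hypothetical helper: pick the wheel level for a timeout of delta jiffies. */
static unsigned int wheel_level(unsigned long delta)
{
	unsigned int lvl = 0;

	/* Move up while the timeout reaches the start of the next level. */
	while (lvl < LVL_DEPTH - 1 && delta >= LVL_START(lvl + 1))
		lvl++;
	return lvl;
}

/* Hypothetical helper: round expires up to the next bucket boundary of lvl. */
static unsigned long bucket_expiry(unsigned long expires, unsigned int lvl)
{
	unsigned long bucket = ((expires >> LVL_SHIFT(lvl)) + 1) << LVL_SHIFT(lvl);

	/* The one guarantee: never earlier than requested. */
	assert(bucket >= expires);
	return bucket;
}

/* Hypothetical helper: what rounding down to the bucket boundary would give. */
static unsigned long bucket_floor(unsigned long expires, unsigned int lvl)
{
	return (expires >> LVL_SHIFT(lvl)) << LVL_SHIFT(lvl);
}
```

With these helpers wheel_level(4090) is 3 and bucket_expiry(4090, 3) is
4096, i.e. 6 ms late. bucket_floor(4090, 2) is 4032, i.e. 58 ms early,
which is exactly what the wheel must never do.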

Timer wheel based timers are fundamentally not precise unless the timeout
is short and hits the first level.

But even hrtimers, which are designed to be precise, have only one real
guarantee: Not to expire early.

hrtimers do not have the side effect of batching on long timeouts like
timer wheel based timers have, but that's it.

Timers in the kernel come with a choice:

 - Imprecise and inexpensive to arm and cancel (timer_list)

 - Precise and expensive to arm and cancel (hrtimer)

You can't have both. That's well documented.

Thanks,
tglx