[PATCH v2 0/7] x86/sched: Avoid unnecessary migrations within SMT domains
From: Ricardo Neri
Date: Tue Nov 22 2022 - 15:28:25 EST
Hi,
This is v2 of this patchset. V1 can be found here [1]. In this version I took
the suggestion of Peter to teach arch_asym_cpu_priority() the CPU state.
Also, I reworded the cover letter to better explain the intent.
---
asym_packing load balancing is used to balance load among physical cores
with SMT (e.g., Intel processors that support Intel Turbo Boost Max 3.0 and
hybrid processors) and among SMT siblings of a physical core (e.g.,
Power7).
The current implementation is sufficient for the latter case as it favors
higher-priority SMT siblings. In the former case, however, we must consider
the fact that the throughput of a CPU degrades if one or more of its SMT
siblings are busy. Hence, a lower-priority CPU that is fully idle is more
desirable than a high-priority CPU with busy SMT siblings.
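
As a rough illustration of the preference described above, here is a minimal
toy model. It is only a sketch: the struct, the table of CPUs, the SMT2 layout,
and the toy_* helpers are hypothetical names invented for this example and are
not the code in the patches.

```c
#include <stdbool.h>

#define NR_CPUS   4
#define SMT_WIDTH 2	/* assumption: SMT2, for this toy model only */

/* Hypothetical per-CPU state; none of these names come from the kernel. */
struct toy_cpu {
	int prio;	/* core priority, higher is better */
	bool busy;	/* true if a task is running on this CPU */
};

/* CPUs 0-1 form the high-priority core, CPUs 2-3 the low-priority core. */
struct toy_cpu cpus[NR_CPUS] = {
	{ .prio = 60, .busy = false },
	{ .prio = 60, .busy = true  },
	{ .prio = 40, .busy = false },
	{ .prio = 40, .busy = false },
};

/* True if every SMT sibling of @cpu, other than @cpu itself, is idle. */
bool toy_smt_siblings_idle(int cpu)
{
	int first = cpu - (cpu % SMT_WIDTH);

	for (int i = first; i < first + SMT_WIDTH; i++) {
		if (i != cpu && cpus[i].busy)
			return false;
	}
	return true;
}

/*
 * Should @a be preferred over @b? A CPU whose siblings are all idle wins
 * over one with busy siblings; priority only breaks ties.
 */
bool toy_prefer(int a, int b)
{
	bool a_idle = toy_smt_siblings_idle(a);
	bool b_idle = toy_smt_siblings_idle(b);

	if (a_idle != b_idle)
		return a_idle;
	return cpus[a].prio > cpus[b].prio;
}
```

With the table above, toy_prefer(2, 0) is true: CPU2 has lower priority than
CPU0, but its core is fully idle while CPU0 shares a core with a busy sibling.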
To fit the current implementation of asym_packing, x86 artificially assigns
a lower priority to the higher-numbered SMT siblings. In reality, there is
no difference between any of the SMT siblings of a core.
Do not use different priorities for each SMT sibling. Instead, tweak the
asym_packing load balancing logic to consider the idle state of the SMT
siblings of a CPU.
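
To picture the difference between per-sibling and per-core priorities, consider
the sketch below. The scaling formula is an assumption made up for this
example; it is not copied from arch/x86/kernel/itmt.c, and the toy_* names are
hypothetical.

```c
/*
 * Toy model of "artificial" priorities: the further down the sibling list,
 * the lower the priority. @sibling_idx is 0 for the first SMT sibling of a
 * core, 1 for the second, and so on. Hypothetical formula.
 */
int toy_artificial_prio(int core_prio, int sibling_idx)
{
	return core_prio / (sibling_idx + 1);
}

/* Toy model of uniform priorities: every sibling reports the core priority. */
int toy_uniform_prio(int core_prio, int sibling_idx)
{
	(void)sibling_idx;	/* the sibling index no longer matters */
	return core_prio;
}
```

With core priorities 60 and 40, the artificial scheme ranks the second sibling
of the better core (60 / 2 = 30) below the first sibling of the worse core (40),
even when both cores are otherwise in the same state. With uniform priorities
the ranking depends only on the core, and the idle state of the siblings has to
be checked separately.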
Removing these artificial priorities avoids superfluous migrations and lets
lower-priority cores inspect all SMT siblings for the busiest queue. The
latter is also necessary to support IPC classes of tasks [2], as the
destination CPU will need to inspect the tasks running on CPUs of equal
priority.
This patchset should not break Power7 SMT8. Functionality does not change
for architectures that do not implement the new check_smt parameter of
sched_asym_prefer().
These patches apply cleanly on today's tip tree.
Changes since v1:
 * Tweaked arch_asym_cpu_priority() and sched_asym_prefer() to handle
   the idle state of the SMT siblings of a CPU. (PeterZ)
 * Expose functionality of the scheduler that determines the idle state
   of the SMT siblings of a CPU.
 * Addressed concerns from Peter about SMT2 assumptions and breaking
   Power7.
 * Removed the SD_ASYM_PACKING flag from the "SMT" domain in x86.
 * Reworked x86's arch_asym_cpu_priority() to consider the idle state
   of the SMT siblings of a CPU.
[1]. https://lore.kernel.org/lkml/20220825225529.26465-1-ricardo.neri-calderon@xxxxxxxxxxxxxxx/
[2]. https://lore.kernel.org/lkml/20220909231205.14009-1-ricardo.neri-calderon@xxxxxxxxxxxxxxx/
Ricardo Neri (7):
  sched/fair: Generalize asym_packing logic for SMT local sched group
  sched: Prepare sched_asym_prefer() to handle idle state of SMT siblings
  sched: Teach arch_asym_cpu_priority() the idle state of SMT siblings
  sched/fair: Introduce sched_smt_siblings_idle()
  x86/sched: Remove SD_ASYM_PACKING from the "SMT" domain
  x86/sched/itmt: Give all SMT siblings of a core the same priority
  x86/sched/itmt: Consider the idle state of SMT siblings
arch/x86/kernel/itmt.c | 30 ++++--------
arch/x86/kernel/smpboot.c | 2 +-
include/linux/sched.h | 2 +
include/linux/sched/topology.h | 2 +-
kernel/sched/fair.c | 90 +++++++++++++++++-----------------
kernel/sched/sched.h | 11 +++--
kernel/sched/topology.c | 6 ++-
7 files changed, 72 insertions(+), 71 deletions(-)
--
2.25.1