Re: [PATCH v3 1/2] x86/resctrl: IPI all CPUs for group updates
From: Peter Newman
Date: Wed Nov 23 2022 - 06:12:31 EST
Hi Reinette,
On Mon, Nov 21, 2022 at 10:53 PM Reinette Chatre
<reinette.chatre@xxxxxxxxx> wrote:
> On 11/15/2022 6:19 AM, Peter Newman wrote:
> > To rule out needing to update a CPU when deleting an rdtgroup, we must
>
> Please do not impersonate code in changelog and comments (do not
> use "we"). This is required for resctrl changes to be considered for
> inclusion because resctrl patches are routed via the "tip" repo
> and is thus required to follow the "tip tree handbook"
> (Documentation/process/maintainer-tip.rst). Please also
> stick to a clear "context-problem-solution" changelog as is the custom
> in this area.
Thanks. I had forgotten about the "tip tree handbook" and was trying to
work out where the wording pointers in your reply to the other patch
came from. Unfortunately, I didn't read your replies in FIFO order.
>
> > search the entire tasklist for group members which could be running on
> > that CPU. This needs to be done while blocking updates to the tasklist
> > to avoid leaving newly-created child tasks assigned to the old
> > CLOSID/RMID.
>
> This is not clear to me. rdt_move_group_tasks() obtains a read lock,
> read_lock(&tasklist_lock), so concurrent modifications to the tasklist
> are indeed possible. Should this perhaps be write_lock() instead?
> It sounds like the scenario you are describing may be a concern. That is,
> if a task belonging to a group that is being removed happens to
> call fork()/clone() during the move then the child may end up being
> created with old closid.
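Shouldn't holding read_lock(&tasklist_lock) cause
write_lock(&tasklist_lock) to block? As far as I can tell, fork() takes
the write lock in copy_process() before linking the child into the
tasklist, so a child cannot be added while rdt_move_group_tasks() is
walking the list.
Maybe I paraphrased too much in the explanation.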
> > The cost of reliably propagating a CLOSID or RMID update to a single
> > task is higher than originally thought. The present understanding is
> > that we must obtain the task_rq_lock() on each task to ensure that it
> > observes CLOSID/RMID updates in the case that it migrates away from its
> > current CPU before the update IPI reaches it.
>
> I find this confusing since it describes why a potential solution does
> not solve a problem, neither problem nor solution is well described at this
> point.
>
> What if you switch the order of the two patches? Patch #2 provides
> the potential solution mentioned here so that may be helpful to have as
> reference in this changelog.
Yes, I will try that. The single-task and multi-task cases should be
independent enough that I can handle them in either order.
> > For now, just notify all the CPUs after updating the closid/rmid fields
>
> For now? If you anticipate changes then there should be a plan for that,
> otherwise this is the fix without further speculation.
It seems like I have to either do something to acknowledge the tradeoff,
or make a case that the negatively affected usage isn't important.
Should I assert that deleting groups while a realtime, core-isolated
application is running isn't a legitimate use case?
>
> > in impacted tasks task_structs rather than paying the cost of obtaining
> > a more precise cpu mask.
>
> s/cpu/CPU/
> It may be helpful to add that an accurate CPU mask cannot be guaranteed and
> the more tasks moved the less accurate it could be (if I understand correctly).
Ok.
> > diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > index e5a48f05e787..049971efea2f 100644
> > --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > @@ -2385,12 +2385,10 @@ static int reset_all_ctrls(struct rdt_resource *r)
> > * Move tasks from one to the other group. If @from is NULL, then all tasks
> > * in the systems are moved unconditionally (used for teardown).
> > *
> > - * If @mask is not NULL the cpus on which moved tasks are running are set
> > - * in that mask so the update smp function call is restricted to affected
> > - * cpus.
> > + * Following this operation, the caller is required to update the MSRs on all
> > + * CPUs.
> > */
>
> On x86 only one MSR needs updating, the PQR_ASSOC MSR. The above could be
> summarized as:
> "Caller should update per CPU storage and PQR_ASSOC."
Sounds good.
> [...]
>
> The fix looks good to me. I do think its motivation and description
> needs to improve to make it palatable to folks not familiar with this area.
>
> Reinette
Once again, thanks for your review. I will work on clarifying the
comments. The explanation is usually most of the work with changes like
these, which is tough to do without some feedback from an actual reader.
I've been looking at this section of code too long to be able to judge
my own explanation anymore.
-Peter