Re: [PATCH] srcu: Add detection of boot CPU srcu_data structure's->srcu_cblist

From: Paul E. McKenney
Date: Wed Nov 23 2022 - 14:05:05 EST


On Wed, Nov 23, 2022 at 02:01:50AM +0000, Zhang, Qiang1 wrote:
>
> >On Thu, Nov 17, 2022 at 07:20:50PM +0800, Zqiang wrote:
> > Before SRCU_SIZE_WAIT_CALL srcu_size_state is reached, the srcu
> > callback only insert to boot-CPU srcu_data structure's->srcu_cblist
> > and other CPU srcu_data structure's->srcu_cblist is always empty. so
> > before SRCU_SIZE_WAIT_CALL is reached, need to correctly check boot-CPU
> > pend cbs in srcu_might_be_idle().
> >
> > Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
> > ---
> > kernel/rcu/srcutree.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index 6af031200580..6d9af9901765 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -1098,8 +1098,11 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp)
> > unsigned long tlast;
> >
> > check_init_srcu_struct(ssp);
> > - /* If the local srcu_data structure has callbacks, not idle. */
> > - sdp = raw_cpu_ptr(ssp->sda);
> > + /* If the boot CPU or local srcu_data structure has callbacks, not idle. */
> > + if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL)
> > + sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
> > + else
> > + sdp = raw_cpu_ptr(ssp->sda);
> >
>
> Hi Paul,
>
> For the convert_to_big default configuration(SRCU_SIZING_AUTO), I found that the srcu is in
> SRCU_SIZE_SMALL state most of the time in the system, I think this change is meaningful.
>
> Thoughts ?

You are right that this change might make srcu_might_be_idle() return a
more accurate value in the common case. As things are now, some other
CPU might just now have added a callback, but might not yet have started
that new grace period. In that case, we might expedite an SRCU grace
period when we would not have otherwise done so. However, this change
would also increase contention on the get_boot_cpu_id() CPU's srcu_data
structure's ->lock.

So the result of that inaccurate return value is that the first two SRCU
grace periods in a burst might be expedited instead of only the first one,
and even then only if we hit a very narrow race window.

Besides, this same thing can happen if two CPUs do synchronize_srcu()
at the same time after a long-enough pause between grace periods.
Both see no callbacks, so both ask for an expedited grace period.
So again, the first two grace periods are expedited.

As a result, I am having a very hard time justifying the increased
lock contention.

Am I missing something here?

Thanx, Paul

> Thanks
> Zqiang
>
>
> >While at it if someone is interested in documenting/commenting on the meaning of
> >all these SRCU_SIZE_* things, it would be much appreciated!
> >
> >In the initial stage(SRCU_SIZE_SMALL), there is no srcu_node structures, only initialized
> >per-CPU srcu_data structures, at this time, we only use boot-cpu srcu_data structures's ->srcu_cblist
> >to store srcu callbacks.
> >if the contention of srcu_struct and srcu_data structure's->lock become busy,
> >transition to SRCU_SIZE_ALLOC. allocated memory for srcu_node structure at end of the SRCU
> >grace period.
> >if allocated success, transition to SRCU_SIZE_WAIT_BARRIER, in this state, invoke
> >srcu_barrier() will iterate over all possible CPUs, but we still only use boot-CPU srcu_cblist to store callbacks.
> >the task calling call_srcu() may have access to a new srcu_node structure or may not,
> >because call_srcu() is protected by SRCU, we may not transition to SRCU_SIZE_WAIT_CALL quickly,
> >need to wait in this state for a SRCU grace period to end.
> >After transition to SRCU_SIZE_WAIT_CALL, we enqueue srcu callbacks to own srcu_data structures's ->srcu_cblist.
> >
> >Does my description make my commit more detailed?
> >
> >Thanks
> >Zqiang
> >
> >
>
>
> >
> >Thanks.
> >
> > spin_lock_irqsave_rcu_node(sdp, flags);
> > if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) {
> > spin_unlock_irqrestore_rcu_node(sdp, flags);
> > --
> > 2.25.1
> >