Re: [PATCH bpf-next v7 0/3] Support storing struct task_struct objects as kptrs

From: David Vernet
Date: Thu Nov 17 2022 - 20:41:34 EST


On Thu, Nov 17, 2022 at 02:36:50PM -0800, John Fastabend wrote:
> David Vernet wrote:
> > On Thu, Nov 17, 2022 at 01:03:45PM -0800, John Fastabend wrote:
> > > David Vernet wrote:
> > > > Now that BPF supports adding new kernel functions with kfuncs, and
> > > > storing kernel objects in maps with kptrs, we can add a set of kfuncs
> > > > which allow struct task_struct objects to be stored in maps as
> > > > referenced kptrs.
> > > >
> > > > The possible use cases for doing this are plentiful. During tracing,
> > > > for example, it would be useful to be able to collect some tasks that
> > > > performed a certain operation, and then periodically summarize who they
> > > > are, which cgroup they're in, how much CPU time they've utilized, etc.
> > > > Doing this now would require storing the tasks' pids along with some
> > > > relevant data to be exported to user space, and later associating the
> > > > pids to tasks in other event handlers where the data is recorded.
> > > > Another useful by-product of this is that it allows a program to pin a
> > > > task in a BPF program, and by proxy therefore also e.g. pin its task
> > > > local storage.
> > >
> > > Sorry, this wasn't obvious to me (I'm late to the party, so apologies
> > > if it was described in some other v*). Can we say something about the
> > > life cycle of these acquired task_structs? Because they increment the
> > > ref cnt on the task struct, they have the potential to impact the system.
> >
> > We should probably add an entire docs page which describes how kptrs
> > work, and I am happy to do that (ideally in a follow-on patch set if
> > that's OK with you). In general I think it would be useful to include
> > docs for any general-purpose kfuncs like the ones proposed in this set.
>
> Sure, I wouldn't require that for your series though fwiw.

Sounds good to me.

[...]

> > > quick question. If you put acquired task struct in a map what
> > > happens if user side deletes the entry? Presumably this causes the
> > > release to happen and the task_struct is good to go. Did I miss
> > > the logic? I was thinking you would have something in bpf_map_free_kptrs
> > > and a type callback to release() the refcnt?
> >
> > Someone else can chime in here to correct me if I'm wrong, but AFAIU
> > this is handled by the map implementations calling out to
> > bpf_obj_free_fields() to invoke the kptr destructor when the element is
> > destroyed. See [3] and [4] for examples of where they're called from the
> > arraymap and hashmap logic respectively. This is how the destructors are
> > similarly invoked when the maps are destroyed.
>
> Yep I found the dtor() gets populated in btf.c and apparently needed
> to repull my local tree because I missed it. Thanks for the detailed
> response.
>
> And last thing I was checking is because KF_SLEEPABLE is not set
> this should be blocked from running on sleepable progs which would
> break the call_rcu in the destructor. Maybe small nit, not sure
> it's worth it, but might be nice to annotate the helper description
> with a note, "will not work on sleepable progs" or something to
> that effect.

KF_SLEEPABLE is used to indicate whether the kfunc _itself_ may sleep,
not whether the calling program can be sleepable. call_rcu() doesn't
block, so no need to mark the kfunc as KF_SLEEPABLE. The key is that if
a kfunc is sleepable, non-sleepable programs are not able to call it
(and this is enforced in the verifier).
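
To make the direction of that check concrete, here is a minimal userspace
sketch of the rule as I understand it. It is a model, not the verifier
code: every name below (MODEL_KF_SLEEPABLE, struct model_kfunc, struct
model_prog, model_check_kfunc_call) is hypothetical, and the flag value is
illustrative rather than the kernel's actual bit.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for KF_SLEEPABLE; the bit value is illustrative. */
#define MODEL_KF_SLEEPABLE (1u << 0)

/* Hypothetical description of a kfunc: a name plus its flag bits. */
struct model_kfunc {
	const char *name;
	uint32_t flags;
};

/* Hypothetical description of a BPF program: only its sleepable property. */
struct model_prog {
	bool sleepable;
};

/*
 * Model of the rule described above: the flag says the kfunc itself may
 * sleep, so only a non-sleepable program calling a sleepable kfunc is
 * rejected. A kfunc without the flag is callable from either program type.
 */
static int model_check_kfunc_call(const struct model_prog *prog,
				  const struct model_kfunc *kfunc)
{
	if ((kfunc->flags & MODEL_KF_SLEEPABLE) && !prog->sleepable)
		return -EACCES;

	return 0;
}
```

In this model, a kfunc whose only deferred work goes through call_rcu()
would carry no sleepable flag, so the check passes for both sleepable and
non-sleepable callers. Only the combination of a flagged kfunc and a
non-sleepable program is refused.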