Re: [PATCH V2] PCI/DOE: Detect on stack work items automatically
From: Ira Weiny
Date: Fri Nov 18 2022 - 13:43:49 EST
On Fri, Nov 18, 2022 at 09:20:38AM +0000, David Laight wrote:
> From: ira.weiny@xxxxxxxxx
> > Sent: 18 November 2022 00:05
> >
> > Work item initialization needs to be done with either
> > INIT_WORK_ONSTACK() or INIT_WORK() depending on how the work item is
> > allocated.
> >
> > The callers of pci_doe_submit_task() allocate struct pci_doe_task on the
> > stack and pci_doe_submit_task() incorrectly used INIT_WORK().
> >
> > Jonathan suggested creating doe task allocation macros such as
> > DECLARE_CDAT_DOE_TASK_ONSTACK().[1] The issue with this is the work
> > function is not known to the callers and must be initialized correctly.
> >
> > A follow up suggestion was to have an internal 'pci_doe_work' item
> > allocated by pci_doe_submit_task().[2] This requires an allocation which
> > could restrict the context where tasks are used.
> >
> > Another idea was to have an intermediate step to initialize the task
> > struct with a new call.[3] This added a lot of complexity.
> >
> > Lukas pointed out that object_is_on_stack() is available to detect this
> > automatically.
> >
> > Use object_is_on_stack() to determine the correct init work function to
> > call.
>
> This is all a bit strange.
> The 'onstack' flag is needed for the diagnostic check:
> 	is_on_stack = object_is_on_stack(addr);
> 	if (is_on_stack == onstack)
> 		return;
> 	pr_warn(...);
> 	WARN_ON(1);
>
:-(
> So setting the flag to the location of the buffer just subverts the check.
> If that is sane there ought to be a proper way to do it.
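
A minimal userspace sketch of why deriving the flag from the address defeats the quoted check. Everything in it is a hypothetical stand-in: `fake_stack`, `fake_object_is_on_stack()`, `check_would_warn()` and `auto_detect_would_warn()` only model object_is_on_stack() and the comparison above with a plain address range. None of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a task stack: just an address range. */
struct fake_stack {
	uintptr_t base;
	size_t size;
};

/* Stand-in for object_is_on_stack(): is addr inside the modeled range? */
static bool fake_object_is_on_stack(const struct fake_stack *s, const void *addr)
{
	uintptr_t a = (uintptr_t)addr;

	assert(s != NULL);
	return a >= s->base && a - s->base < s->size;
}

/*
 * Models the quoted diagnostic: it warns when the caller's annotation
 * (the 'onstack' flag) disagrees with where the object really lives.
 * Returns true where the real check would pr_warn() and WARN_ON(1).
 */
static bool check_would_warn(const struct fake_stack *s, const void *addr,
			     bool onstack)
{
	return fake_object_is_on_stack(s, addr) != onstack;
}

/*
 * Models auto-detection: the flag is computed from the address itself,
 * so both sides of the comparison are the same value and the check can
 * never fire.
 */
static bool auto_detect_would_warn(const struct fake_stack *s, const void *addr)
{
	bool onstack = fake_object_is_on_stack(s, addr);

	return check_would_warn(s, addr, onstack);
}
```

With a caller-supplied flag, a mismatch (an on-stack object initialized as if it were not, or the reverse) is reported. With the auto-detected flag the result is always "no warning", which is the sense in which the check is subverted.
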
Ok, this brings me back to my previous point and suggested patch.[*] The
fundamental bug is that the work item is allocated in different code from
the code which uses it. The suggested patch fixes that by separating the work
item from the task.
[*] https://lore.kernel.org/linux-cxl/20221014151045.24781-1-Jonathan.Cameron@xxxxxxxxxx/T/#m63c636c5135f304480370924f4d03c00357be667
Bjorn, would this solution be acceptable if it just uses GFP_KERNEL and
documents the required context for pci_doe_submit_task()?
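
A rough userspace sketch of the shape of that approach, not the patch itself. Here `malloc()` stands in for a GFP_KERNEL allocation, a one-slot `pending` pointer stands in for the workqueue, and the names `doe_task`, `doe_work`, `doe_submit_task()` and `doe_run_pending()` are hypothetical. The point is only that the work item is allocated and initialized by the same code that queues it, so the task itself can live anywhere, including on the caller's stack.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical task: owned by the caller, may live on the caller's stack. */
struct doe_task {
	int request;
	int response;
	bool done;
};

/* Hypothetical work item: owned entirely by the submit path. */
struct doe_work {
	struct doe_task *task;
};

/* One-slot stand-in for a workqueue. */
static struct doe_work *pending;

/*
 * Allocates the work item internally. malloc() models GFP_KERNEL, so this
 * may only be called from a context where such an allocation is allowed.
 */
static int doe_submit_task(struct doe_task *task)
{
	struct doe_work *work;

	assert(task != NULL);
	if (pending)
		return -EBUSY;

	work = malloc(sizeof(*work));
	if (!work)
		return -ENOMEM;

	work->task = task;
	task->done = false;
	pending = work;
	return 0;
}

/* Stand-in for the worker: process the queued item, then free it. */
static void doe_run_pending(void)
{
	struct doe_work *work = pending;

	if (!work)
		return;

	pending = NULL;
	work->task->response = work->task->request + 1; /* placeholder exchange */
	work->task->done = true;
	free(work);
}
```

The cost is the one mentioned in the commit message: the submit path can now fail with -ENOMEM and cannot be called from a context that forbids the allocation.
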
> OTOH using an on-stack structure for INIT_WORK seems rather strange.
> Since the kernel thread must sleep waiting for the 'work' to complete
> why not just run the required code there?
It is not strange if some task submitters want to wait while others do not. It
was suggested that all submit task operations be async and the callers who
wanted to be synchronous would wait like this.
As Dan said, there is a difference between submit_bio() and submit_bio_wait().
We have simply left the wait part up to the users, all of whom wait right now.
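
The split between an asynchronous submit and a caller that chooses to wait can be sketched in userspace as follows. This is a single-threaded model under stated assumptions: `task_submit()`, `task_submit_wait()`, `run_one()`, `signal_done()` and the `complete`/`private` fields are hypothetical names, and the "wait" simply drives the queue until the callback fires, where kernel code would sleep on something like a completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task {
	int request;
	int response;
	void (*complete)(struct task *task); /* called when the work finishes */
	void *private;                        /* owned by the submitter */
};

#define QUEUE_LEN 4
static struct task *queue[QUEUE_LEN];
static size_t q_head, q_tail; /* q_tail - q_head == number queued */

/* Asynchronous submit: queue the task and return immediately. */
static int task_submit(struct task *task)
{
	assert(task && task->complete);
	if (q_tail - q_head == QUEUE_LEN)
		return -1; /* queue full */
	queue[q_tail++ % QUEUE_LEN] = task;
	return 0;
}

/* Stand-in for the worker: process one queued task, if any. */
static bool run_one(void)
{
	struct task *task;

	if (q_head == q_tail)
		return false;
	task = queue[q_head++ % QUEUE_LEN];
	task->response = task->request * 2; /* placeholder work */
	task->complete(task);
	return true;
}

/* Completion callback used by the synchronous wrapper. */
static void signal_done(struct task *task)
{
	*(bool *)task->private = true;
}

/* Synchronous wrapper: submit, then wait until the callback has run. */
static int task_submit_wait(struct task *task)
{
	bool done = false;
	int rc;

	task->complete = signal_done;
	task->private = &done;
	rc = task_submit(task);
	if (rc)
		return rc;
	while (!done)
		run_one(); /* kernel code would sleep here instead of polling */
	return 0;
}
```

A submitter that does not want to wait calls `task_submit()` with its own callback and returns. One that does want to wait uses the wrapper, and its on-stack `done` flag stays valid because the wrapper does not return until the callback has run.
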
>
> Also you really don't want to OOPS with anything from the stack
> linked into global kernel data structures.
I'm not following what you mean here. I'm not seeing anything like this in the
current code or in any of the suggested solutions.
Ira
> While wait queues are pretty limited in scope and probably ok,
> this looks like a big accident waiting to happen.
>
> David