Re: [PATCH 3/3] compiler: inline does not imply notrace
From: Nadav Amit
Date: Tue Nov 22 2022 - 15:51:38 EST
On Nov 22, 2022, at 12:09 PM, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> !! External Email
>
> On Tue, Nov 22, 2022, at 20:53, Nadav Amit wrote:
>> From: Nadav Amit <namit@xxxxxxxxxx>
>>
>> Functions that are marked as "inline" are currently also not tracable.
>> Apparently, this has been done to prevent differences between different
>> configs that caused different functions to be tracable on different
>> platforms.
>>
>> Anyhow, this consideration is not very strong, and tying "inline" and
>> "notrace" does not seem very beneficial. The "inline" keyword is just a
>> hint, and many functions are currently not tracable due to this reason.
>
> The original reason was listed in 93b3cca1ccd3 ("ftrace: Make all
> inline tags also include notrace"), which describes
>
> Commit 5963e317b1e9d2a ("ftrace/x86: Do not change stacks in DEBUG when
> calling lockdep") prevented lockdep calls from the int3 breakpoint handler
> from reseting the stack if a function that was called was in the process
> of being converted for tracing and had a breakpoint on it. The idea is,
> before calling the lockdep code, do a load_idt() to the special IDT that
> kept the breakpoint stack from reseting. This worked well as a quick fix
> for this kernel release, until a certain config caused a lockup in the
> function tracer start up tests.
>
> Investigating it, I found that the load_idt that was used to prevent
> the int3 from changing stacks was itself being traced!
>
> and this sounds like a much stronger reason than what you describe,
> and I would expect your change to cause regressions in similar places.
I had no intention of misrepresenting. That was my understanding from my
previous discussion with Steven.
I assume that this patch might cause some regressions. The first patch in
this series was intended to prevents some regressions that I encountered.
This patch also marks a function that was missing “notrace” before. And I
did get kernel hangs due to the missing “notrace”.
Anyhow, I believe that the alternative - of leaving things as they are
(“inline”->”notrace”) - is even worse. Obviously it prevents proper tracing,
as there are even system calls that use inline, for instance
__do_sys_process_madvise() and __do_sys_mremap().
But more importantly, the current “inline”->”notrace” solution just papers
over missing “notrace” annotations. Anyone can remove the “inline” at any
given moment since there is no direct (or indirect) relationship between
“inline” and “notrace”. It seems to me all random and bound to fail at some
point.
> It's possible that the right answer is that the affected functions
> should be marked as __always_inline.
I think that it is probably better to mark them as notrace. People might
remove __always_inline. If we want two versions - one traceable and one not
traceable - we can also do that. But I am not sure how many people are aware
of the relationships between inline/__always_inline and tracing.