Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output

From: Adrian Hunter
Date: Wed Nov 16 2022 - 01:27:37 EST


On 15/11/22 21:46, Andi Kleen wrote:
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
>
>> On Mon, Nov 14, 2022 at 01:10:38PM +0200, Adrian Hunter wrote:
>>> On 14/11/22 12:51, Peter Zijlstra wrote:
>>>> On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
>>>>> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
>>>>> Data When Configured With Single Range Output Larger Than 4KB" by
>>>>> disabling single range output whenever larger than 4KB.
>>>>>
>>>>> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
>>>>> Cc: stable@xxxxxxxxxxxxxxx
>>>>> Signed-off-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>
>>>>> ---
>>>>> arch/x86/events/intel/pt.c | 9 +++++++++
>>>>> 1 file changed, 9 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
>>>>> index 82ef87e9a897..42a55794004a 100644
>>>>> --- a/arch/x86/events/intel/pt.c
>>>>> +++ b/arch/x86/events/intel/pt.c
>>>>> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
>>>>>  	if (1 << order != nr_pages)
>>>>>  		goto out;
>>>>>
>>>>> +	/*
>>>>> +	 * Some processors cannot always support single range for more than
>>>>> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
>>>>> +	 * also be affected, so for now rather than trying to keep track of
>>>>> +	 * which ones, just disable it for all.
>>>>> +	 */
>>>>> +	if (nr_pages > 1)
>>>>> +		goto out;
>>>>
>>>> This effectively declares single-output-mode dead? Because I don't think
>>>> anybody uses PT with a single 4K buffer.
>>>
>>> 4K is the default size for "sample mode" i.e. stuffing 4KB of Intel PT trace
>>> data into a PERF_RECORD_SAMPLE record that has sample_type bit PERF_SAMPLE_AUX
>>>
>>> e.g.
>>>
>>> $ perf record -vv --aux-sample -e '{intel_pt//u,cycles:u}' uname 2>err.txt
>>> Linux
>>> $ grep aux_sample_size err.txt
>>> aux_sample_size 4096
>>
>> Ah, ok. Not as bad then. Anyway, I'll go queue it for perf/urgent I
>> suppose.
>
> It would be better to only limit on the CPUs with the bug because
> switching buffers causes some extra latencies. So this patch may regress
> PT overhead or tail latencies.

I could whitelist CPUs that do not have the issue, because a blacklist
would keep expanding, which would be a bit of a pain to maintain.
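
For illustration only, here is a rough userspace sketch of what an
allowlist check might look like. It is not part of the patch and not
kernel code: the struct, the table, the function name
pt_single_range_allowed() and the family/model values are all
hypothetical placeholders, since which CPUs are actually unaffected is
not established in this thread. Unknown CPUs fall back to the safe
behaviour of the patch above (no single range output beyond one page).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical allowlist entry: a family/model pair assumed to be unaffected. */
struct pt_cpu_id {
	unsigned int family;
	unsigned int model;
};

/* Placeholder values only; these do NOT name real unaffected CPUs. */
static const struct pt_cpu_id pt_single_range_allowlist[] = {
	{ 6, 0x01 },
	{ 6, 0x02 },
};

/*
 * Return true if single range output may be used for a buffer of
 * nr_pages pages on the given (hypothetical) CPU family/model.
 */
static bool pt_single_range_allowed(unsigned int family, unsigned int model,
				    int nr_pages)
{
	size_t i;
	size_t n = sizeof(pt_single_range_allowlist) /
		   sizeof(pt_single_range_allowlist[0]);

	/* The errata concern output larger than 4KB; one page stays allowed. */
	if (nr_pages <= 1)
		return true;

	for (i = 0; i < n; i++) {
		if (pt_single_range_allowlist[i].family == family &&
		    pt_single_range_allowlist[i].model == model)
			return true;
	}

	/* Not on the allowlist: treat the CPU as possibly affected. */
	return false;
}
```

With a table like this, a new CPU would default to the restricted
behaviour until someone confirms it is unaffected and adds an entry.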