Re: [PATCH v8 10/13] x86/resctrl: Add sysfs interface to write mbm_total_bytes_config
From: Reinette Chatre
Date: Wed Nov 23 2022 - 17:23:18 EST
Hi Babu,
On 11/23/2022 1:44 PM, Moger, Babu wrote:
> [AMD Official Use Only - General]
>
> Hi Reinette,
>
>> -----Original Message-----
>> From: Reinette Chatre <reinette.chatre@xxxxxxxxx>
>> Sent: Tuesday, November 22, 2022 5:43 PM
>> To: Moger, Babu <Babu.Moger@xxxxxxx>; Peter Newman
>> <peternewman@xxxxxxxxxx>
>> Cc: akpm@xxxxxxxxxxxxxxxxxxxx; bagasdotme@xxxxxxxxx; bp@xxxxxxxxx;
>> chang.seok.bae@xxxxxxxxx; corbet@xxxxxxx;
>> damien.lemoal@xxxxxxxxxxxxxxxxxx; daniel.sneddon@xxxxxxxxxxxxxxx;
>> dave.hansen@xxxxxxxxxxxxxxx; eranian@xxxxxxxxxx; fenghua.yu@xxxxxxxxx;
>> hpa@xxxxxxxxx; james.morse@xxxxxxx; jmattson@xxxxxxxxxx;
>> jpoimboe@xxxxxxxxxx; linux-doc@xxxxxxxxxxxxxxx; linux-
>> kernel@xxxxxxxxxxxxxxx; mingo@xxxxxxxxxx; paulmck@xxxxxxxxxx;
>> pawan.kumar.gupta@xxxxxxxxxxxxxxx; pbonzini@xxxxxxxxxx;
>> peterz@xxxxxxxxxxxxx; quic_neeraju@xxxxxxxxxxx; rdunlap@xxxxxxxxxxxxx;
>> Das1, Sandipan <Sandipan.Das@xxxxxxx>; songmuchun@xxxxxxxxxxxxx;
>> tglx@xxxxxxxxxxxxx; tony.luck@xxxxxxxxx; x86@xxxxxxxxxx
>> Subject: Re: [PATCH v8 10/13] x86/resctrl: Add sysfs interface to write
>> mbm_total_bytes_config
>>
>> Hi Babu,
>>
>> On 11/7/2022 11:00 AM, Moger, Babu wrote:
>>>
>>> On 11/7/22 04:21, Peter Newman wrote:
>>>> Hi Babu,
>>>>
>>>> On Fri, Nov 04, 2022 at 03:01:09PM -0500, Babu Moger wrote:
>>>>> + /*
>>>>> + * When an Event Configuration is changed, the bandwidth counters
>>>>> + * for all RMIDs and Events will be cleared by the hardware. The
>>>>> + * hardware also sets MSR_IA32_QM_CTR.Unavailable (bit 62) for
>>>>> + * every RMID on the next read to any event for every RMID.
>>>>> + * Subsequent reads will have MSR_IA32_QM_CTR.Unavailable (bit 62)
>>>>> + * cleared while it is tracked by the hardware. Clear the
>>>>> + * mbm_local and mbm_total counts for all the RMIDs.
>>>>> + */
>>>>> + memset(d->mbm_local, 0, sizeof(struct mbm_state) * r->num_rmid);
>>>>> + memset(d->mbm_total, 0, sizeof(struct mbm_state) * r->num_rmid);
>>>> Looking around, I can't find a reader for mbm_total anymore. It looks
>>>> like the last place it was used went away in James's recent change:
>>>>
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flor
>>>> e.kernel.org%2Fall%2F20220902154829.30399-19-
>> james.morse%40arm.com&am
>>>>
>> p;data=05%7C01%7Cbabu.moger%40amd.com%7Ccb4a2daf65b84b45aeac08da
>> cce35
>>>>
>> 66d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63804757402544
>> 6241%7
>>>>
>> CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI
>> 6Ik
>>>>
>> 1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=QZUVrpdr0YQFSJ
>> BbS0BHSu
>>>> q%2BhMwZHAA06MUqx98hD0U%3D&reserved=0
>>>>
>>>> Are we supposed to be clearing arch_mbm_total now?
>>>>
>>> Patch got garbled in previous response.
>>>
>>> Here is it now.
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index 6b222f8e58ae..28d9d99a639e 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -1517,7 +1517,7 @@ static int mbm_config_write(struct rdt_resource
>>> *r, struct rdt_domain *d,
>>> u32 evtid, u32 val)
>>> {
>>> struct mon_config_info mon_info = {0};
>>> - int ret = 0;
>>> + int ret = 0, i;
>>>
>>> rdt_last_cmd_clear();
>>>
>>> @@ -1557,8 +1557,10 @@ static int mbm_config_write(struct rdt_resource
>>> *r, struct rdt_domain *d,
>>> * cleared while it is tracked by the hardware. Clear the
>>> * mbm_local and mbm_total counts for all the RMIDs.
>>> */
>>> - memset(d->mbm_local, 0, sizeof(struct mbm_state) *
>>> r->num_rmid);
>>> - memset(d->mbm_total, 0, sizeof(struct mbm_state) *
>>> r->num_rmid);
>>> + for (i = 0; i < r->num_rmid; i++) {
>>> + resctrl_arch_reset_rmid(r, d, i,
>>> +QOS_L3_MBM_TOTAL_EVENT_ID);
>>> + resctrl_arch_reset_rmid(r, d, i,
>>> +QOS_L3_MBM_LOCAL_EVENT_ID);
>>> + }
>>>
>>> write_exit:
>>> return ret;
>>
>> Resetting each member of an array individually seems unnecessary when the
>> array could just be reset as a unit. How about instead a new
>> resctrl_arch_reset_rmid_all() that can do so?
>
> Yes. We can do something like this below.
>
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index a188dacab6c8..2e67de911222 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -176,6 +176,14 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
> memset(am, 0, sizeof(*am));
> }
>
> +void resctrl_arch_reset_rmid_all(struct rdt_domain *d)
> +{
> + struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
> +
> + memset(hw_dom->arch_mbm_total, 0, sizeof(*hw_dom->arch_mbm_total));
> + memset(hw_dom->arch_mbm_local, 0, sizeof(*hw_dom->arch_mbm_local));
> +}
> +
> static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
> {
> u64 shift = 64 - width, chunks;
It looks like the above would only reset the first entry of the array.
I expect that the resource should also be provided as parameter so that
the num_rmid can be obtained to be able to clear the entire array.
Also, what is the likelihood of this being called when the array does not
exist? It may be safer to wrap each memset() with an is_mbm_total_enabled()
or is_mbm_local_enabled().
Reinette