Re: [PATCH v7 09/20] x86/virt/tdx: Get information about TDX module and TDX-capable memory
From: Dave Hansen
Date: Wed Nov 23 2022 - 11:44:49 EST
On 11/23/22 03:40, Huang, Kai wrote:
> On Tue, 2022-11-22 at 15:39 -0800, Dave Hansen wrote:
>> That last sentence is kinda goofy. I think there's a way to distill this
>> whole thing down more effectively.
>>
>> CMRs tell the kernel which memory is TDX compatible. The kernel
>> takes CMRs and constructs "TD Memory Regions" (TDMRs). TDMRs
>> let the kernel grant TDX protections to some or all of the CMR
>> areas.
>
> Will do.
>
> But it seems we should still mention "Constructing TDMRs requires information of
> both the TDX module (TDSYSINFO_STRUCT) and the CMRs"? The reason is to justify
> "use static to avoid having to pass them as function arguments when constructing
> TDMRs" below.
In a changelog, no. You do *NOT* use super technical language in
changelogs if not super necessary. Mentioning "TDSYSINFO_STRUCT" here
is useless. The *MOST* you would do for a good changelog is:
The kernel takes CMRs (plus a little more metadata) and
constructs "TD Memory Regions" (TDMRs).
You just need to talk about things at a high level in mostly
non-technical language so that folks know the structure of the code
below. It's not a replacement for the code, the comments, *OR* the TDX
module specification.
I'm also not quite sure that this justifies the static variables anyway.
They could be dynamically allocated and passed around, for instance.
>>> Use static variables for both TDSYSINFO_STRUCT and CMR array to avoid
>>
>> I find it very useful to be precise when referring to code. Your code
>> says 'tdsysinfo_struct', yet this says 'TDSYSINFO_STRUCT'. Why the
>> difference?
>
> Here I actually didn't intend to refer to any code. In the above paragraph
> (that is going to be replaced with yours), I mentioned "TDSYSINFO_STRUCT" to
> explain what does "information of the TDX module" actually refer to, since
> TDSYSINFO_STRUCT is used in the spec.
>
> What's your preference?
Kill all mentions of TDSYSINFO_STRUCT whatsoever in the changelog.
Write comprehensible English.
>>> having to pass them as function arguments when constructing the TDMR
>>> array. And they are too big to be put to the stack anyway. Also, KVM
>>> needs to use the TDSYSINFO_STRUCT to create TDX guests.
>>
>> This is also a great place to mention that the tdsysinfo_struct contains
>> a *lot* of gunk which will not be used for a bit or that may never get
>> used.
>
> Perhaps below?
>
> "Note many members in tdsysinfo_struct are not used by the kernel".
>
> Btw, may I ask why does it matter?
Because you're adding a massive structure with all kinds of fields.
Those fields mostly aren't used. That could be from an error in this
series, or because they will be used later or because they will *never*
be used.
>>> + cmr = &cmr_array[0];
>>> + /* There must be at least one valid CMR */
>>> + if (WARN_ON_ONCE(is_cmr_empty(cmr) || !is_cmr_ok(cmr)))
>>> + goto err;
>>> +
>>> + cmr_num = *actual_cmr_num;
>>> + for (i = 1; i < cmr_num; i++) {
>>> + struct cmr_info *cmr = &cmr_array[i];
>>> + struct cmr_info *prev_cmr = NULL;
>>> +
>>> + /* Skip further empty CMRs */
>>> + if (is_cmr_empty(cmr))
>>> + break;
>>> +
>>> + /*
>>> + * Do sanity check anyway to make sure CMRs:
>>> + * - are 4K aligned
>>> + * - don't overlap
>>> + * - are in address ascending order.
>>> + */
>>> + if (WARN_ON_ONCE(!is_cmr_ok(cmr)))
>>> + goto err;
>>
>> Why does cmr_array[0] get a pass on the empty and sanity checks?
>
> TDX MCHECK verifies CMRs before enabling TDX, so there must be at least one
> valid CMR.
>
> And cmr_array[0] is checked before this loop.
I think you're confusing two separate things. MCHECK ensures that there
is convertible memory. The CMRs that this code looks at are software
(TDX module) defined and created structures that the OS and the module share.
This cmr_array[] structure is not created by MCHECK.
Go look at your code. Consider what will happen if cmr_array[0] is
empty or !is_cmr_ok(). Then consider what will happen if cmr_array[1]
has the same happen.
Does that end result really justify having separate code for
cmr_array[0] and cmr_array[>0]?
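To make the question concrete, here is a minimal userspace sketch of a
single loop that treats every entry the same way. It is not the patch's
code: the layout of struct cmr_info, the bodies of is_cmr_empty() and
is_cmr_ok(), and the check_cmrs() name with its -1 error return are all
assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's struct cmr_info. */
struct cmr_info {
	uint64_t base;
	uint64_t size;
};

#define CMR_ALIGN 4096ULL	/* assumed: CMRs must be 4K aligned */

/* Assumed definition: a CMR with zero size is empty. */
static bool is_cmr_empty(const struct cmr_info *cmr)
{
	return !cmr->size;
}

/* Assumed definition: only the alignment check lives here. */
static bool is_cmr_ok(const struct cmr_info *cmr)
{
	return !(cmr->base % CMR_ALIGN) && !(cmr->size % CMR_ALIGN);
}

/*
 * Walk the array once, starting at index 0. Return the number of
 * leading valid CMRs, or -1 if an entry is malformed, overlaps its
 * predecessor, is out of order, or if there is no valid CMR at all.
 */
static int check_cmrs(const struct cmr_info *cmr_array, int cmr_num)
{
	uint64_t prev_end = 0;
	int i;

	for (i = 0; i < cmr_num; i++) {
		const struct cmr_info *cmr = &cmr_array[i];

		/* The first empty entry ends the list. */
		if (is_cmr_empty(cmr))
			break;

		if (!is_cmr_ok(cmr) || cmr->base < prev_end)
			return -1;

		prev_end = cmr->base + cmr->size;
	}

	/* An empty cmr_array[0] leaves i == 0, which is also an error. */
	return i ? i : -1;
}
```

With prev_end starting at 0, entry 0 goes through exactly the same
checks as every later entry, so the special case before the loop
disappears.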
>>> + prev_cmr = &cmr_array[i - 1];
>>> + if (WARN_ON_ONCE((prev_cmr->base + prev_cmr->size) >
>>> + cmr->base))
>>> + goto err;
>>> + }
>>> +
>>> + /* Update the actual number of CMRs */
>>> + *actual_cmr_num = i;
>>
>> That comment is not helpful. Yes, this is literally updating the number
>> of CMRs. Literally. That's the "what". But, the "why" is important.
>> Why is it doing this?
>
> When building the list of "TDX-usable" memory regions, the kernel verifies those
> regions against CMRs to see whether they are truly convertible memory.
>
> How about adding a comment like below:
>
> /*
> * When the kernel builds the TDX-usable memory regions, it verifies
> * they are truly convertible memory by checking them against CMRs.
> * Update the actual number of CMRs to skip those empty CMRs.
> */
>
> Also, I think printing CMRs in the dmesg is helpful. Printing empty (zero) CMRs
> will put meaningless log to the dmesg.
So it's just about printing them?
Then put a dang switch to the print function that says "print them all"
or not.
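Something along these lines, as a rough userspace sketch. The
print_cmrs() name, the 'print_empty' flag, the struct layout and the
output format are all made up here, and printf() stands in for whatever
pr_*() call the kernel code would use.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the patch's struct cmr_info. */
struct cmr_info {
	uint64_t base;
	uint64_t size;
};

/*
 * Print the CMRs. Empty (zero-size) entries are skipped unless
 * 'print_empty' is set. Returns the number of entries printed.
 */
static int print_cmrs(const struct cmr_info *cmr_array, int cmr_num,
		      bool print_empty)
{
	int i, printed = 0;

	for (i = 0; i < cmr_num; i++) {
		const struct cmr_info *cmr = &cmr_array[i];

		if (!cmr->size && !print_empty)
			continue;

		printf("CMR[%d]: [0x%llx, 0x%llx)\n", i,
		       (unsigned long long)cmr->base,
		       (unsigned long long)(cmr->base + cmr->size));
		printed++;
	}

	return printed;
}
```

That keeps the array and its count untouched and makes the filtering a
property of the printing alone.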
...
>> Also, I saw the loop above check 'cmr_num' CMRs for is_cmr_ok(). Now,
>> it'll print an 'actual_cmr_num=1' number of CMRs as being
>> "kernel-checked". Why? That makes zero sense.
>
> The loop quits when it sees an empty CMR. I think there's no need to check
> further CMRs as they must be empty (TDX MCHECK verifies CMRs).
OK, so you're going to get some more homework here. Please explain to
me how MCHECK and the CMR array that comes out of the TDX module are
related. How does the output from MCHECK get turned into the in-memory
cmr_array[], step by step?
At this point, I fear that you're offering up MCHECK like it's a bag of
magic beans rather than really truly thinking about the cmr_array[] data
structure. How is it generated? How might it be broken? Who might
break it? If so, what should the kernel do about it?
>>> +
>>> + /*
>>> + * trim_empty_cmrs() updates the actual number of CMRs by
>>> + * dropping all tail empty CMRs.
>>> + */
>>> + return trim_empty_cmrs(tdx_cmr_array, &tdx_cmr_num);
>>> +}
>>
>> Why does this both need to respect the "tdx_cmr_num = out.r9" value
>> *and* trim the empty ones? Couldn't it just ignore the "tdx_cmr_num =
>> out.r9" value and just trim the empty ones either way? It's not like
>> there is a billion of them. It would simplify the code for sure.
>
> OK. Since spec says MAX_CMRs is 32, so I can use 32 instead of reading out from
> R9.
But then you still have the "trimming" code. Why not just trust "r9"
and then axe all the trimming code? Heck, and most of the sanity checks.
This code could be a *lot* smaller.
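As a sketch of how little could be left, assuming the only thing still
worth checking is that the reported count can index the array safely.
The cmr_num_from_r9() helper and its -1 return are hypothetical;
MAX_CMRS of 32 is the value from the spec mentioned above.

```c
#include <stdint.h>

#define MAX_CMRS 32	/* maximum number of CMRs per the spec */

/*
 * Hypothetical helper: take the CMR count reported in R9 as-is and
 * only refuse values that could not index the CMR array safely.
 */
static int cmr_num_from_r9(uint64_t r9)
{
	if (!r9 || r9 > MAX_CMRS)
		return -1;	/* an errno value in real kernel code */

	return (int)r9;
}
```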