Re: [PATCH v7 10/20] x86/virt/tdx: Use all system memory when initializing TDX module as TDX memory
From: Huang, Kai
Date: Tue Nov 22 2022 - 04:17:11 EST
> > > > +/*
> > > > + * Add all memblock memory regions to the @tdx_memlist as TDX memory.
> > > > + * Must be called when get_online_mems() is called by the caller.
> > > > + */
> > > > +static int build_tdx_memory(void)
> > > > +{
> > > > +	unsigned long start_pfn, end_pfn;
> > > > +	int i, nid, ret;
> > > > +
> > > > +	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
> > > > +		/*
> > > > +		 * The first 1MB may not be reported as TDX convertible
> > > > +		 * memory. Manually exclude them as TDX memory.
> > > > +		 *
> > > > +		 * This is fine as the first 1MB is already reserved in
> > > > +		 * reserve_real_mode() and won't end up to ZONE_DMA as
> > > > +		 * free page anyway.
> > > > +		 */
> > > > +		start_pfn = max(start_pfn, (unsigned long)SZ_1M >> PAGE_SHIFT);
> > > > +		if (start_pfn >= end_pfn)
> > > > +			continue;
> > >
> > > How about checking whether the first 1MB is reserved instead of depending on
> > > the corresponding code not being changed? Via for_each_reserved_mem_range()?
> >
> > IIUC, some reserved memory can be freed to the page allocator directly, e.g. kernel
> > init code/data. I feel it's not safe to assume that reserved memory will never
> > be in the page allocator. Otherwise we could use for_each_free_mem_range().
>
> Yes. memblock reserve information isn't perfect. But I still think
> that checking whether the first 1MB is reserved in memblock is better
> than just assuming it. Or, we can check whether the pages of the
> first 1MB are reserved by checking struct page directly?
>
Sorry, I am a little confused about what you want to achieve here. Do you want to
add a sanity check to make sure the first 1MB is indeed not in the page
allocator?
IIUC, that is indeed the case. Please see the comment above the call to
reserve_real_mode() in setup_arch(). Also please see efi_free_boot_services(),
which doesn't free boot services memory if it is below 1MB.
Also, my understanding is that the kernel's intention is to always reserve the first 1MB:
	/*
	 * Don't free memory under 1M for two reasons:
	 * - BIOS might clobber it
	 * - Crash kernel needs it to be reserved
	 */
So if any page in the first 1MB ended up in the page allocator, it would be a
kernel bug that is not related to TDX, correct?
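
To make the two options concrete, here is a minimal userspace sketch (not
kernel code, and not part of the patch) that models both the clamping done in
build_tdx_memory() and the kind of "is the first 1MB fully reserved" check
being suggested. It assumes 4K pages and a sorted, non-overlapping list of
reserved ranges; struct pfn_range, clamp_above_1m() and
low_1m_fully_reserved() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT	12		/* assumption: 4K pages, as on x86 */
#define SZ_1M		0x100000UL
#define LOW_1M_PFN	(SZ_1M >> PAGE_SHIFT)

/* Hypothetical stand-in for one memory region, in PFNs: [start_pfn, end_pfn). */
struct pfn_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/*
 * Model of the clamping in build_tdx_memory(): drop everything below 1MB.
 * Returns false if nothing of the range is left afterwards.
 */
static bool clamp_above_1m(struct pfn_range *r)
{
	if (r->start_pfn < LOW_1M_PFN)
		r->start_pfn = LOW_1M_PFN;

	return r->start_pfn < r->end_pfn;
}

/*
 * Model of the suggested sanity check: return true only if every PFN below
 * 1MB is covered by some reserved range.
 *
 * Assumes @res is sorted by start_pfn and the ranges do not overlap.
 */
static bool low_1m_fully_reserved(const struct pfn_range *res, size_t n)
{
	unsigned long next = 0;	/* first PFN not yet known to be reserved */
	size_t i;

	for (i = 0; i < n && next < LOW_1M_PFN; i++) {
		if (res[i].start_pfn > next)
			return false;	/* gap below 1MB */
		if (res[i].end_pfn > next)
			next = res[i].end_pfn;
	}

	return next >= LOW_1M_PFN;
}
```

With 4K pages LOW_1M_PFN is 256, so a region [0, 512) is clamped to
[256, 512) and a region [0, 256) is skipped entirely. In the kernel the check
would presumably walk for_each_reserved_mem_range() or test struct page
instead of a plain array, which is exactly the part I am not sure is worth
adding.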