Re: [PATCH 2/2] arm64/mm: fix incorrect file_map_count for invalid pmd/pud
From: Anshuman Khandual
Date: Wed Nov 16 2022 - 23:24:19 EST
On 11/16/22 21:16, Mark Rutland wrote:
> On Wed, Nov 16, 2022 at 10:08:27AM +0100, David Hildenbrand wrote:
>> On 16.11.22 09:38, Liu Shixin wrote:
>>> The page table check triggers BUG_ON() unexpectedly when splitting a hugepage:
>>>
>>> ------------[ cut here ]------------
>>> kernel BUG at mm/page_table_check.c:119!
>>> Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
>>> Dumping ftrace buffer:
>>> (ftrace buffer empty)
>>> Modules linked in:
>>> CPU: 7 PID: 210 Comm: transhuge-stres Not tainted 6.1.0-rc3+ #748
>>> Hardware name: linux,dummy-virt (DT)
>>> pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> pc : page_table_check_set.isra.0+0x398/0x468
>>> lr : page_table_check_set.isra.0+0x1c0/0x468
>>> [...]
>>> Call trace:
>>> page_table_check_set.isra.0+0x398/0x468
>>> __page_table_check_pte_set+0x160/0x1c0
>>> __split_huge_pmd_locked+0x900/0x1648
>>> __split_huge_pmd+0x28c/0x3b8
>>> unmap_page_range+0x428/0x858
>>> unmap_single_vma+0xf4/0x1c8
>>> zap_page_range+0x2b0/0x410
>>> madvise_vma_behavior+0xc44/0xe78
>>> do_madvise+0x280/0x698
>>> __arm64_sys_madvise+0x90/0xe8
>>> invoke_syscall.constprop.0+0xdc/0x1d8
>>> do_el0_svc+0xf4/0x3f8
>>> el0_svc+0x58/0x120
>>> el0t_64_sync_handler+0xb8/0xc0
>>> el0t_64_sync+0x19c/0x1a0
>>> [...]
>>>
>>> On arm64, pmd_present() will return true even if the pmd is invalid.
>>
>> I assume that's because of the pmd_present_invalid() check.
>>
>> ... I wonder why that behavior was chosen. Sounds error-prone to me.
>
> That seems to be down to commit:
>
> b65399f6111b03df ("arm64/mm: Change THP helpers to comply with generic MM semantics")
>
> ... apparently because Andrea Arcangelli said this was necessary in:
>
> https://lore.kernel.org/lkml/20181017020930.GN30832@xxxxxxxxxx/
>
> ... but that does seem to contradict what's said in:
>
> Documentation/mm/arch_pgtable_helpers.rst
>
> ... which just says:
>
> pmd_present Tests a valid mapped PMD

It should read as follows instead; I will update it. I am not sure about the
PUD level though, where anon THP is not supported (AFAIK).

+---------------------------+--------------------------------------------------+
| pmd_present               | Tests if pmd_page() points to valid memory page  |
+---------------------------+--------------------------------------------------+

>
> ... and it's not clear to me why this *only* applies to the PMD level.
>
> Anshuman?

Because THP is supported at the PMD level. As Andrea explained earlier,
pmd_present() should return true if pmd_page() on the entry points to valid
memory, irrespective of whether the entry itself is valid/mapped or not. Those
are the semantics generic THP code expects during PMD split, collapse,
migration etc., and so does other memory code walking past such PMD entries.
That was my understanding.
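
To make the distinction above concrete, here is a minimal userspace sketch of
the "present but invalid" state. It is a simplified model, not the arm64
kernel code: the model_* names are hypothetical, and the bit positions are
assumptions chosen only for illustration. It shows an entry that stops being
valid after being marked invalid, yet still tests as present.

```c
/*
 * Simplified userspace model of the semantics discussed above.
 * NOT the kernel implementation: all model_* names are hypothetical
 * and the bit positions below are assumptions for illustration only.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t model_pmdval_t;

#define MODEL_PMD_VALID           (UINT64_C(1) << 0)   /* assumed hardware valid bit */
#define MODEL_PMD_PROT_NONE       (UINT64_C(1) << 58)  /* assumed software bit */
#define MODEL_PMD_PRESENT_INVALID (UINT64_C(1) << 59)  /* assumed software bit */

/* True only when the entry is marked valid (i.e. actually mapped). */
static inline bool model_pmd_valid(model_pmdval_t pmd)
{
	return (pmd & MODEL_PMD_VALID) != 0;
}

/* True when the entry was deliberately invalidated but still tracks a page. */
static inline bool model_pmd_present_invalid(model_pmdval_t pmd)
{
	return (pmd & MODEL_PMD_PRESENT_INVALID) != 0;
}

/* "Present" covers valid, PROT_NONE, and invalidated-but-tracked entries. */
static inline bool model_pmd_present(model_pmdval_t pmd)
{
	return (pmd & (MODEL_PMD_VALID | MODEL_PMD_PROT_NONE)) != 0 ||
	       model_pmd_present_invalid(pmd);
}

/* Clear the valid bit while remembering that the entry still refers to memory. */
static inline model_pmdval_t model_pmd_mkinvalid(model_pmdval_t pmd)
{
	pmd |= MODEL_PMD_PRESENT_INVALID;
	pmd &= ~MODEL_PMD_VALID;
	return pmd;
}

/* Walk one entry through the states a THP split would pass through. */
void model_selfcheck(void)
{
	/* Arbitrary address bits plus the valid bit. */
	model_pmdval_t pmd = UINT64_C(0x40000000) | MODEL_PMD_VALID;

	assert(model_pmd_valid(pmd));
	assert(model_pmd_present(pmd));

	pmd = model_pmd_mkinvalid(pmd);

	assert(!model_pmd_valid(pmd));   /* no longer mapped */
	assert(model_pmd_present(pmd));  /* but still "present" */

	assert(!model_pmd_present(0));   /* an empty entry is not present */
}
```

In this model, any caller that treats "present" as "currently mapped" would
miscount an entry in the invalidated state, which is the kind of mismatch the
report at the top of this thread describes.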