Re: [PATCH] Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"
From: Manivannan Sadhasivam
Date: Fri Nov 18 2022 - 05:54:16 EST
On Mon, Nov 14, 2022 at 05:38:00PM +0000, Catalin Marinas wrote:
> On Mon, Nov 14, 2022 at 03:14:21PM +0000, Robin Murphy wrote:
> > On 2022-11-14 14:11, Will Deacon wrote:
> > > On Mon, Nov 14, 2022 at 04:33:29PM +0530, Manivannan Sadhasivam wrote:
> > > > This reverts commit c44094eee32f32f175aadc0efcac449d99b1bbf7.
> > > >
> > > > As reported by Amit [1], dropping cache invalidation from
> > > > arch_dma_prep_coherent() triggers a crash on the Qualcomm SM8250 platform
> > > > (most probably on other Qcom platforms too). The reason is that the Qcom
> > > > qcom_q6v5_mss driver copies the firmware metadata and shares it with the
> > > > modem for validation. The modem has a secure block (XPU) that will trigger
> > > > a whole system crash if the shared memory is accessed by the CPU while the
> > > > modem is poking at it.
> > > >
> > > > To avoid this issue, the qcom_q6v5_mss driver allocates a chunk of memory
> > > > with no kernel mapping, vmap's it, copies the firmware metadata and
> > > > vunmap's it. Finally, the address is shared with the modem for metadata
> > > > validation [2].
> > > >
> > > > Now because of the removal of cache invalidation from
> > > > arch_dma_prep_coherent(), there will be cache lines associated with this
> > > > memory even after sharing with modem. So when the CPU accesses it, the XPU
> > > > violation gets triggered.
> > >
> > > This last part is a non-sequitur: the buffer is no longer mapped on the CPU
> > > side, so how would the CPU access it?
> >
> > Right, for the previous change to have made a difference the offending part
> > of this buffer must be present in some cache somewhere *before* the DMA
> > buffer allocation completes.
> >
> > Clearly that driver is completely broken though. If the DMA allocation came
> > from a no-map carveout via dma_alloc_from_dev_coherent() then the vmap()
> > shenanigans wouldn't work, so if it is backed by struct pages then the whole
> > dance is still pointless because *a cacheable linear mapping exists*, and
> > it's just relying on the reduced chance that anything's going to re-fetch
> > the linear map address after those pages have been allocated, exactly as I
> > called out previously[1].
>
> So I guess a DMA pool that's not mapped in the linear map, together with
> memremap() instead of vmap(), would work around the issue. But the
> driver needs fixing, not the arch code.
>

Okay, thanks for the hint. Can you share how to allocate the dma-pool that's
not part of the kernel's linear map? I looked into it but couldn't find a way.

Thanks,
Mani

> --
> Catalin
--
மணிவண்ணன் சதாசிவம்