Re: [patch 21/33] genirq/msi: Provide msi_domain_alloc_irq_at()
From: Thomas Gleixner
Date: Thu Nov 17 2022 - 19:58:40 EST
On Thu, Nov 17 2022 at 15:33, Reinette Chatre wrote:
> I am trying all three parts of this work out with some experimental
> code within the IDXD driver that attempts to use IMS on the host.
>
> In the test, pci_ims_alloc_irq() always encounters -EBUSY and it
> seems that there is an attempt to insert the struct msi_desc into
> the xarray twice, the second attempt encountering the -EBUSY.
>
> While trying to understand what is going on I found myself looking
> at this code area and I'll annotate this patch with what I learned.
Ok.
> When calling pci_ims_alloc_irq(), msi_insert_desc() ends up being
> called twice, first with index = MSI_ANY_INDEX, second with index = 0.
> (domid = 1 both times)
How so?
>> }
>>
>> hwsize = msi_domain_get_hwsize(dev, domid);
>> - if (index >= hwsize) {
>> - ret = -ERANGE;
>> - goto fail;
>> - }
>>
>> - desc->msi_index = index;
>> - index += baseidx;
>> - ret = xa_insert(&md->__store, index, desc, GFP_KERNEL);
>> - if (ret)
>> - goto fail;
>> - return 0;
>> + if (index == MSI_ANY_INDEX) {
>> + struct xa_limit limit;
>> + unsigned int index;
>> +
>> + limit.min = baseidx;
>> + limit.max = baseidx + hwsize - 1;
>>
>> + /* Let the xarray allocate a free index within the limits */
>> + ret = xa_alloc(&md->__store, &index, desc, limit, GFP_KERNEL);
>> + if (ret)
>> + goto fail;
>> +
>
> This path (index == MSI_ANY_INDEX) is followed when msi_insert_desc()
> is called the first time and the xa_alloc() succeeds at index 65536.
>
>> + desc->msi_index = index;
>
> This is problematic with desc->msi_index being a u16, assigning
> 65536 to it becomes 0.
You are partially right. I need to fix that and make it explicit as it's
a "works by chance or maybe not" construct right now.
But desc->msi_index is correct to be truncated because it's the index
within the domain space which is zero based.
>> + return 0;
>> + } else {
>> + if (index >= hwsize) {
>> + ret = -ERANGE;
>> + goto fail;
>> + }
>> +
>> + desc->msi_index = index;
>> + index += baseidx;
>> + ret = xa_insert(&md->__store, index, desc, GFP_KERNEL);
>> + if (ret)
>> + goto fail;
>
> This "else" path is followed when msi_insert_desc() is called the second
> time with "index = 0". The xa_insert() above fails at index 65536
> (baseidx = 65536) with -EBUSY, trickling up as the return code to
> pci_ims_alloc_irq().
Why is it called with index=0 the second time?
>> + desc = msi_alloc_desc(dev, 1, affdesc);
>> + if (!desc) {
>> + map.index = -ENOMEM;
>> + goto unlock;
>> + }
>> +
>> + if (cookie)
>> + desc->data.cookie = *cookie;
>> +
>> + ret = msi_insert_desc(dev, desc, domid, index);
>> + if (ret) {
>> + map.index = ret;
>> + goto unlock;
>> + }
>
> Above is the first call to msi_insert_desc(/* index = MSI_ANY_INDEX */)
>
>> +
>> + map.index = desc->msi_index;
>
> msi_insert_desc() did attempt to set desc->msi_index to 65536 but map.index ends
> up being 0.
which is kinda correct.
>> + ret = msi_domain_alloc_irqs_range_locked(dev, domid, map.index, map.index);
>
> Here is where the second call to msi_insert_desc() originates:
>
> msi_domain_alloc_irqs_range_locked() -> msi_domain_alloc_locked() -> \
> __msi_domain_alloc_locked() -> msi_domain_alloc_simple_msi_descs() -> \
> msi_domain_add_simple_msi_descs() -> msi_insert_desc()
but yes, that's bogus because it tries to allocate what is allocated already.
Too tired to decode this circular dependency right now. Will stare at it
with brain awake in the morning. Duh!
Thanks,
tglx