Re: [PATCH 04/13] KVM: nSVM: clean up copying of int_ctl fields back to vmcb01/vmcb12
From: Maxim Levitsky
Date: Mon Nov 21 2022 - 06:16:41 EST
On Thu, 2022-11-17 at 20:15 +0000, Sean Christopherson wrote:
> On Thu, Nov 17, 2022, Maxim Levitsky wrote:
> > Clean up the nested_sync_int_ctl_from_vmcb02:
> >
> > 1. The comment about preservation of V_IRQ is wrong: when the L2 doesn't
> > use virtual interrupt masking, then the field just doesn't exist in
> > vmcb12 thus it should not be touched at all.
> > Since it is unused in this case, touching it doesn't matter that much,
> > so the bug is theoretical.
> >
> > 2. When the L2 doesn't use virtual interrupt masking, then in the *theory*
> > if KVM uses the feature, it should copy the changes to V_IRQ* bits from
> > vmcb02 to vmcb01.
> >
> > In practise, KVM only uses it for detection of the interrupt window,
> > and it happens to re-open it on each nested VM exit because
> > kvm_set_rflags happens to raise the KVM_REQ_EVENT.
> > Do this explicitly.
> >
> > 3. Add comment on why we don't need to copy V_GIF from vmcb02 to vmcb01
> > when nested guest doesn't use nested V_GIF (and thus L1's GIF is in
> > vmcb02 while nested), even though it can in theory affect L1's GIF.
> >
> > 4. Add support code to also copy some bits of int_ctl from
> > vmcb02 to vmcb01.
> > Currently there are none.
>
> Unless it's impossible for whatever reason, this patch should be split into
> multiple patches. IIUC, there are at least 2 different functional changes being
> made, they just happen to not have any actual impact on things.
No objection to this.
>
> > No (visible) functional change is intended.
> >
> > Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> > ---
> > arch/x86/kvm/svm/nested.c | 47 ++++++++++++++++++++++++++-------------
> > 1 file changed, 32 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index 54eb152e2b60b6..1f2b8492c8782f 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -410,28 +410,45 @@ void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
> > static void nested_sync_int_ctl_from_vmcb02(struct vcpu_svm *svm,
> > struct vmcb *vmcb12)
> > {
> > - u32 mask;
> > + struct vmcb *vmcb02 = svm->nested.vmcb02.ptr;
> > + struct vmcb *vmcb01 = svm->vmcb01.ptr;
> > +
> > + /* bitmask of bits of int_ctl that we copy from vmcb02 to vmcb12*/
> > + u32 l2_to_l1_mask = 0;
> > + /* bitmask of bits of int_ctl that we copy from vmcb02 to vmcb01*/
> > + u32 l2_to_l0_mask = 0;
> >
> > - /* Only a few fields of int_ctl are written by the processor. */
>
> Can this comment be kept in some form? I found it super useful when reading this
> code just now.
No problem.
>
> > - mask = V_IRQ_MASK | V_TPR_MASK;
> > - if (!(svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK) &&
> > - svm_is_intercept(svm, INTERCEPT_VINTR)) {
> > + if (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK)
> > + l2_to_l1_mask |= V_IRQ_MASK | V_TPR_MASK;
> > + else {
> > /*
> > - * In order to request an interrupt window, L0 is usurping
> > - * svm->vmcb->control.int_ctl and possibly setting V_IRQ
> > - * even if it was clear in L1's VMCB. Restoring it would be
> > - * wrong. However, in this case V_IRQ will remain true until
> > - * interrupt_window_interception calls svm_clear_vintr and
> > - * restores int_ctl. We can just leave it aside.
> > + * If IRQ window was opened while in L2, it must be reopened
> > + * after the VM exit
> > + *
> > + * vTPR value doesn't need to be copied from vmcb02 to vmcb01
> > + * because it is synced from/to apic registers on each VM exit
> > */
> > - mask &= ~V_IRQ_MASK;
> > + if (vmcb02->control.int_ctl & V_IRQ_MASK)
> > + kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
> > }
> >
> > if (nested_vgif_enabled(svm))
> > - mask |= V_GIF_MASK;
> > + l2_to_l1_mask |= V_GIF_MASK;
> > + else
> > + /* There is no need to sync V_GIF from vmcb02 to vmcb01
> > + * because GIF is cleared on VMexit, thus even though
> > + * nested guest can control host's GIF, on VM exit
> > + * its set value is lost
> > + */
> > + ;
>
> The "else ... ;" is unnecessary, just throw the block comment above the nested
> vGIF if-statment, e.g. if I'm understanding everything, this?
Yes.
>
> /*
> * If nested vGIF is not enabled, L2 has access to L1's "real" GIF. In
> * this case, there's no need to sync V_GIF from vmcb02 to vmcb01
> * because GIF is cleared on VM-Exit, thus any changes made by L2 are
> * overwritten on VM-Exit to L1.
> */
> if (nested_vgif_enabled(svm))
> l2_to_l1_mask |= V_GIF_MASK;
>
> > +
> > + vmcb12->control.int_ctl =
> > + (svm->nested.ctl.int_ctl & ~l2_to_l1_mask) |
> > + (vmcb02->control.int_ctl & l2_to_l1_mask);
> >
> > - vmcb12->control.int_ctl &= ~mask;
> > - vmcb12->control.int_ctl |= svm->vmcb->control.int_ctl & mask;
> > + vmcb01->control.int_ctl =
> > + (vmcb01->control.int_ctl & ~l2_to_l0_mask) |
> > + (vmcb02->control.int_ctl & l2_to_l0_mask);
>
> No need for wrapping immediately after the "=", these all fit under the soft limit:
>
> vmcb12->control.int_ctl = (svm->nested.ctl.int_ctl & ~l2_to_l1_mask) |
> (vmcb02->control.int_ctl & l2_to_l1_mask);
>
> vmcb01->control.int_ctl = (vmcb01->control.int_ctl & ~l2_to_l0_mask) |
> (vmcb02->control.int_ctl & l2_to_l0_mask);
OK.
Best regards,
Maxim Levitsky
>