Re: [PATCH] s390: cmpxchg: Make loop condition for 1,2 byte cases precise

From: Heiko Carstens
Date: Thu Nov 17 2022 - 13:19:47 EST


On Wed, Nov 16, 2022 at 03:47:11PM +0100, Janis Schoetterl-Glausch wrote:
> The cmpxchg implementation for 1 and 2 bytes consists of a 4 byte
> cmpxchg loop. Currently, the decision to retry is imprecise, looping if
> bits outside the target byte(s) change instead of retrying until the
> target byte(s) differ from the old value.
> E.g. if an attempt to exchange (prev_left_0 old_bytes prev_right_0) is
> made and it fails because the word at the address is
> (prev_left_1 x prev_right_1) where both x != old_bytes and one of the
> prev_*_1 values differs from the respective prev_*_0 value, the cmpxchg
> is retried, even if by a semantic equivalent to a normal cmpxchg, the
> exchange would fail.
> Instead exit the loop if x != old_bytes and retry otherwise.
>
> Signed-off-by: Janis Schoetterl-Glausch <scgl@xxxxxxxxxxxxx>
> ---
>
>
> Unfortunately the diff got blown up quite a bit, even tho the asm
> changes are not that complex. This is mostly because of in arguments
> becoming (in)out arguments.
>
> I don't think all the '&' constraints are necessary, but I don't see how
> they could affect code generation.

For cmpxchg() it wouldn't make any difference. For cmpxchg_user_key()
it might lead to a small improvement, since the register that is
allocated for the key variable might be reused. But I haven't looked
into that in detail. None of the early clobbers is necessary anymore
after your changes, but let's leave it as it is.

> I don't see why we would need the memory clobber, however.

The memory clobber (aka memory barrier) is necessary because it may be
used to implement e.g. some custom locking. For that it is required
the compiler does reorder read or write accesses behind/before the
inline assembly.

> I tested the cmpxchg_user_key changes via the kvm memop selftest that is
> part of the KVM cmpxchg memop series.
> I looked for an existing way to test the cmpxchg changes, but didn't
> find anything.

Yeah, guess having a test for this would be a nice to have :)

> arch/s390/include/asm/cmpxchg.h | 60 ++++++++++++++-----------
> arch/s390/include/asm/uaccess.h | 80 ++++++++++++++++++---------------
> 2 files changed, 78 insertions(+), 62 deletions(-)

The patch looks good - applied to the wip/cmpxchg_user_key branch.