Yet another spinlock implementation

📆 June 3, 2025 | ⌛ 4 minutes read

Another Spinlock Question
Implementation
Explanation
Limitations & Discussion

Another Spinlock Question

This question is adapted from my year’s Advanced Operating Systems finals. I didn’t check the answer scheme / results for this, but I’m fairly confident this is somewhat correct. I’m writing this to scratch that concurrency itch.

Consider this atomic instruction: lock_xadd(), which has the same effect as the following C code:


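// lock_xadd executes atomically, as if the whole body ran in one indivisible step.
void lock_xadd(int *src, int *dst)
{
    int temp = *src + *dst;
    *src = *dst;   // *src receives the old value of *dst
    *dst = temp;   // *dst receives the sum
}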

Question: Implement spin_lock() and spin_unlock() in C using the lock_xadd() function.
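For experimenting on a real machine, lock_xadd behaves like a fetch-and-add, so one possible stand-in is C11's atomic_fetch_add. This is only a sketch under assumptions, not part of the exam question: the operands are widened to int64_t, *dst is declared _Atomic, and *src is assumed to be private to the calling thread.

```c
#include <stdatomic.h>
#include <stdint.h>

// Hypothetical stand-in for the exam's lock_xadd, built on C11 fetch-and-add.
// Assumption: *src is private to the caller; only *dst is shared between threads.
void lock_xadd(int64_t *src, _Atomic int64_t *dst) {
    // atomic_fetch_add adds *src to *dst and returns the value *dst held before.
    *src = atomic_fetch_add(dst, *src);
}
```

After the call, *src holds the old value of *dst and *dst holds the sum, matching the C code in the question.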

Implementation

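The insight: after the call, *src holds the value *dst had before the addition, so every lock_xadd also reports the ‘old’ state of the lock. This means that we can implement the spinlock with a compare-and-swap-like pattern: inspect the old value, then decide whether we own the lock.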


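#include <stdint.h>

// Same semantics as in the question. The operands are widened to int64_t
// here (an assumption, the question uses int) to match the lock's type.
void lock_xadd(int64_t *src, int64_t *dst);
void panic(void); // assumed to be provided elsewhere; never returns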

// Lock state: 
//   0 is unlocked
// > 0 is locked
// < 0 is undefined (should never happen)
int64_t lock = 0; // i am an evil global lock :)

void spin_lock() {
    while(1){
        // attempt to acquire the lock 
        int64_t add = 1;
        lock_xadd(&add, &lock);
        if(add == 0){
            // we acquired the lock! 
            return;
        }
        // we did not acquire the lock if add != 0.
        // means that the lock was already acquired.
        if(add < 0) panic(); // something terrible happened. 

        // poor man's "compare and swap": probe the lock by adding 0
        while(1){
            // wait until the probe reads 0
            // (implies that spin_unlock has released the lock)
            int64_t tmp = 0; 
            lock_xadd(&tmp, &lock);

            if(tmp == 0) break;
            if(tmp < 0) panic();
            // lock state > 0, so just keep spinning~
        }
        // we still need to actually acquire the lock.
        // so we re-try the acquire (add 1) operation!
    }
}

void spin_unlock(){
    while(1){
        int64_t dec = -1;
        lock_xadd(&dec, &lock);
        // == 1 implies that this atomic op
        // decremented lock from 1 to 0, hence unlocked!
        if(dec == 1)
            break;
        // <= 0 means the lock was not held when we decremented it
        if(dec <= 0) panic();
    }
}
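One way to sanity-check the implementation above is a small stress test: several threads bump a plain, non-atomic counter under the lock, and the final count should equal threads × iterations. The sketch below is self-contained and makes several assumptions that are not from the exam: lock_xadd is emulated with C11 atomic_fetch_add, panic() is replaced by abort(), POSIX threads are available, and run_stress is a hypothetical helper name.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

// Assumed stand-in for lock_xadd (C11 fetch-and-add); not from the exam.
void lock_xadd(int64_t *src, _Atomic int64_t *dst) {
    *src = atomic_fetch_add(dst, *src);
}

_Atomic int64_t lock = 0;
int64_t counter = 0; // deliberately non-atomic: protected only by the lock

void spin_lock(void) {
    for (;;) {
        int64_t add = 1;
        lock_xadd(&add, &lock);
        if (add == 0) return;      // 0 -> 1: we own the lock
        if (add < 0) abort();
        for (;;) {                 // probe with +0 until the lock reads 0
            int64_t tmp = 0;
            lock_xadd(&tmp, &lock);
            if (tmp == 0) break;
            if (tmp < 0) abort();
        }
    }
}

void spin_unlock(void) {
    for (;;) {
        int64_t dec = -1;
        lock_xadd(&dec, &lock);
        if (dec == 1) break;       // 1 -> 0: released
    }
}

void *worker(void *arg) {
    int iters = *(int *)arg;
    for (int i = 0; i < iters; i++) {
        spin_lock();
        counter++;                 // critical section
        spin_unlock();
    }
    return NULL;
}

// Hypothetical helper: runs up to 16 workers and returns the final counter.
int64_t run_stress(int threads, int iters) {
    pthread_t tids[16];
    if (threads > 16) threads = 16;
    counter = 0;
    for (int i = 0; i < threads; i++)
        pthread_create(&tids[i], NULL, worker, &iters);
    for (int i = 0; i < threads; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

A passing run is evidence, not proof: a stress test can miss rare interleavings, so it complements the argument in the explanation rather than replacing it.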

Explanation

Remember that an atomic instruction executes indivisibly: no other thread can observe or modify lock partway through a lock_xadd. We are guaranteed that each thread's operation on lock happens as a single step.

For spin_lock, we add 1 to the lock to try and acquire it. Since the operation is atomic, the value left in add afterwards tells us the previous state of lock.

If add==0, we know for sure the operation incremented lock from 0 -> 1. This means that we acquired the lock successfully, as we are the only lock acquirer that accessed the state of 0 for lock.

Else, the operation worked on a positive nonzero value of lock (meaning it was locked by someone else already), so we spin.
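The two cases above can be traced with a single-threaded model. This is a sketch: the two "threads" are simulated one after another, so lock_xadd is written as plain C, and try_acquire is a hypothetical helper name.

```c
#include <stdint.h>

// Single-threaded model of lock_xadd. Atomicity is irrelevant here because
// the competing acquires are simulated sequentially.
void lock_xadd(int64_t *src, int64_t *dst) {
    int64_t temp = *src + *dst;
    *src = *dst;
    *dst = temp;
}

// Hypothetical helper: one acquire attempt; returns 1 if the lock was taken.
int try_acquire(int64_t *lock) {
    int64_t add = 1;
    lock_xadd(&add, lock);
    return add == 0; // add now holds the value of *lock before the +1
}
```

Starting from lock = 0, the first try_acquire sees 0 and wins (lock becomes 1). A second try_acquire sees 1 and fails, leaving lock at 2.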

Coming up with the spin loop was the fun part. Typically we would use a more powerful atomic compare-exchange instruction to do compare-and-swap. Here we cannot do the compare and the set in one atomic step, so the idea is to separate the two operations.

We can exploit the fact that 0 is the identity for addition: adding 0 leaves lock unchanged. So tmp = 0; lock_xadd(&tmp, &lock); is an atomic read of the lock state.

Only a no-op addition like this will work here. Why?

If we spin by incrementing, every failed probe inflates lock, so the unlocker may never catch up (starvation) or lock overflows (acquirers keep adding, unlock keeps decrementing). We cannot spin by decrementing either, as that would conflict with the unlock operation.

Once we are sure that the lock state is 0, then we try to acquire. This should keep all additions to lock finitely bounded.
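To make the difference concrete, here is a single-threaded sketch comparing a probe that adds 0 with a probe that adds 1. The helper names probe and probe_with_increment are hypothetical, and lock_xadd is modelled as plain C since nothing runs concurrently.

```c
#include <stdint.h>

// Single-threaded model of lock_xadd (no concurrency in this sketch).
void lock_xadd(int64_t *src, int64_t *dst) {
    int64_t temp = *src + *dst;
    *src = *dst;
    *dst = temp;
}

// Hypothetical helper: reads *lock by adding 0, so the stored value is unchanged.
int64_t probe(int64_t *lock) {
    int64_t tmp = 0;
    lock_xadd(&tmp, lock);
    return tmp;
}

// Hypothetical helper for the rejected alternative: spinning with +1 grows
// *lock by one on every call, which the unlocker would later have to undo.
int64_t probe_with_increment(int64_t *lock) {
    int64_t add = 1;
    lock_xadd(&add, lock);
    return add;
}
```

With lock = 1, a thousand calls to probe leave lock at 1, while a thousand calls to probe_with_increment leave it at 1001.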

For spin_unlock, we just keep decrementing until lock is 0 (unlocked) 👍
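As a rough illustration of how much work the unlocker does, the sketch below counts the decrements needed to release a lock that has absorbed some failed acquires. It is a single-threaded model, and unlock_steps is a hypothetical helper, not part of the solution.

```c
#include <stdint.h>

// Single-threaded model of lock_xadd (no concurrency in this sketch).
void lock_xadd(int64_t *src, int64_t *dst) {
    int64_t temp = *src + *dst;
    *src = *dst;
    *dst = temp;
}

// Hypothetical helper: same loop as spin_unlock, but returns how many
// decrements it took. Assumes *lock >= 1 on entry (the caller holds the lock).
int unlock_steps(int64_t *lock) {
    int steps = 0;
    for (;;) {
        int64_t dec = -1;
        lock_xadd(&dec, lock);
        steps++;
        if (dec == 1) return steps; // this decrement took the lock from 1 to 0
    }
}
```

If the holder plus three failed acquirers have pushed lock to 4, unlock_steps performs 4 decrements and leaves lock at 0.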

Limitations & Discussion

We follow the pthread model, where spin_unlock may only be called by the thread holding the lock. See the docs on pthread_spin_unlock. If this invariant does not hold, the unlocker no longer has exclusive access to the decrement, and we get the funky lock < 0 situation.
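The lock < 0 situation is easy to reproduce in a single-threaded model: decrement a lock that nobody holds. The helper names below are hypothetical, and only one decrement is shown, because the real spin_unlock loop would keep going and drive lock further negative.

```c
#include <stdint.h>

// Single-threaded model of lock_xadd (no concurrency in this sketch).
void lock_xadd(int64_t *src, int64_t *dst) {
    int64_t temp = *src + *dst;
    *src = *dst;
    *dst = temp;
}

// Hypothetical helper: a single unlock decrement; returns the previous value.
int64_t unlock_step(int64_t *lock) {
    int64_t dec = -1;
    lock_xadd(&dec, lock);
    return dec;
}

// Hypothetical helper: a single acquire attempt; returns the previous value.
int64_t acquire_step(int64_t *lock) {
    int64_t add = 1;
    lock_xadd(&add, lock);
    return add;
}
```

Calling unlock_step on a free lock returns 0 and leaves lock at -1. The next acquire_step then returns -1, which is exactly the value that makes spin_lock call panic().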

If there is an unbounded number of lock acquirers, it is theoretically possible to starve everyone, since the unlocking thread can only decrement by one at a time. One does not simply set lock to 0 with lock_xadd(). That shouldn’t be possible (left as an exercise).

Practically, unbounded acquires should not happen. Progress will eventually be made, since for a finite number of acquires N, it will just take O(N)¹ spins by the unlocker to eventually decrement lock to 0. No starvation!

There is also a data representation problem: if there are more outstanding acquires than fit in the positive range of int64_t, we get an overflow. This is unsolvable given the constraints, since all we can do is kick the can down the road by using uint64_t or a larger integer representation (whatever this may be).

Lastly, note that the implementation I provided is not as fast as a real spinlock, due to the limitations of the one atomic instruction we are given.

¹ In the actual semantics of big O; I mean the worst case.