Liam White
Incorrect uses of volatile

Chances are, if you are using volatile in your C or C++ code, you are using it incorrectly. Let me explain.

The intended use case

volatile int *dma_control;

void kickoff_command(int command) {
    // Write the command to a memory-mapped device register.
    *dma_control = command;
}

volatile is an escape hatch added by the designers of the C programming language to support memory-mapped IO. It guarantees that reads and writes to a given memory address actually happen, and that the compiler neither elides them nor reorders them with respect to other volatile accesses. But as you will see later, this may actually be insufficient!
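To see why the escape hatch matters, consider polling a device status register. Without volatile, the compiler is entitled to assume that ordinary memory does not change behind its back, so it may hoist the load out of the loop and spin forever on a stale value (a minimal sketch; dma_status is a hypothetical register, not part of the example above):

// Broken: the compiler may load *dma_status_plain once and loop forever.
int *dma_status_plain;

void wait_for_completion_broken() {
    while (*dma_status_plain == 0) {
    }
}

// With volatile, every iteration performs a real load from the register.
volatile int *dma_status;

void wait_for_completion() {
    while (*dma_status == 0) {
    }
}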

volatile incorrectly used for atomics

volatile int a;
volatile int b;

void thread_1() {
    a++; // read-modify-write: load a, add 1, store a
}

void thread_2() {
    a++; // races with thread_1's increment
    b++;
}

This is incorrect and can result in lost updates, as illustrated below. To guarantee atomic access, you need to use atomic operations; volatile is neither necessary nor sufficient for race-free semantics.
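To see where the update goes, break a++ into the load, increment, and store the compiler actually emits, and consider one possible interleaving starting from a == 0 (an illustrative trace, not the output of any particular compiler):

// a++ on a volatile int is three separate steps:
//   int tmp = a;    // load
//   tmp = tmp + 1;  // increment
//   a = tmp;        // store
//
// One possible interleaving:
//   thread_1: load a  -> 0
//   thread_2: load a  -> 0
//   thread_1: store 1
//   thread_2: store 1  // thread_1's increment is lost; a == 1, not 2

With std::atomic, each increment becomes a single indivisible read-modify-write: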

#include <atomic>

std::atomic<int> a{0};
std::atomic<int> b{0};

void thread_1() {
    a.fetch_add(1);
}

void thread_2() {
    a.fetch_add(1);
    b.fetch_add(1);
}
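As a quick sanity check, a harness like this (illustrative, reusing the definitions above) always prints a = 2 and b = 1, because fetch_add performs the whole read-modify-write indivisibly:

#include <cstdio>
#include <thread>

int main() {
    std::thread t1(thread_1);
    std::thread t2(thread_2);
    t1.join();
    t2.join();
    std::printf("a = %d, b = %d\n", a.load(), b.load()); // always "a = 2, b = 1"
}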

volatile incorrectly used for spinlocks

volatile int is_locked;

void lock() {
    while (is_locked) {
        // spin until the lock looks free
    }
    // Another thread can acquire the lock between the read above
    // and the write below.
    is_locked = 1;
}

void unlock() {
    is_locked = 0;
}

This is incorrect for three reasons.

  1. The test and set of is_locked in lock are not a single atomic operation, so two threads can both acquire the lock (see the trace below).
  2. There is no memory barrier guarding the lock.
  3. Don't use spinlocks.
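To make issue 1 concrete, here is one interleaving in which both threads pass the while loop and both believe they hold the lock (an illustrative trace, starting from is_locked == 0):

//   thread_1: while (is_locked) -> reads 0, exits the loop
//   thread_2: while (is_locked) -> reads 0, exits the loop
//   thread_1: is_locked = 1;  // thread_1 "holds" the lock
//   thread_2: is_locked = 1;  // thread_2 now also "holds" it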

We can fix issue 1 with a relatively contrived example using relaxed atomics:

std::atomic<int> is_locked;

void lock() {
    int expected = 0;
    while (!is_locked.compare_exchange_weak(expected, 1, std::memory_order::relaxed)) {
        // On failure, compare_exchange_weak writes the observed value
        // into expected, so it must be reset before retrying.
        expected = 0;
    }
}

void unlock() {
    is_locked.store(0, std::memory_order::relaxed);
}
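The weak variant of compare-and-exchange is the right choice here: it is allowed to fail spuriously on some architectures, but the loop retries anyway, and compare_exchange_strong would typically just cost more.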

But that still leaves issue 2, so what are we missing?

On processor architectures with weak memory models (like ARM), memory accesses are allowed to be reordered almost arbitrarily, as long as the reordering is not observable within a single thread. This means that a memory access that occurs inside the critical section could be moved outside it, which would break the mutual exclusion the lock is supposed to provide.
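As a concrete example (shared_data and update are hypothetical, not part of the lock itself), with the relaxed spinlock above both the compiler and the hardware are free to move accesses out of the critical section:

int shared_data;

void update() {
    lock();            // relaxed CAS: no ordering with surrounding accesses
    shared_data = 42;  // may become visible to other cores before they
                       // observe is_locked == 1, or after is_locked == 0
    unlock();          // relaxed store: same problem in the other direction
}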

To fix this, we need to add memory barriers. Fortunately, with std::atomic, this is an easy fix.

std::atomic<int> is_locked;

void lock() {
    int expected = 0;
    while (!is_locked.compare_exchange_weak(expected, 1, std::memory_order::acquire)) {
        // With acquire semantics, memory accesses which appear after the atomic must also occur after it.
        expected = 0;
    }
}

void unlock() {
    is_locked.store(0, std::memory_order::release);
    // With release semantics, memory accesses which appear before the atomic must also occur before it.
}

Issue 3 can be solved by using std::mutex, or any mutex of your choosing, instead of a spinlock. Unless you can show me how you are disabling interrupts for the duration of the lock, I am going to flag this immediately in any code review.
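For reference, the same interface on top of std::mutex is a one-liner each way, and the RAII guard form is harder to misuse (a minimal sketch; shared_data is illustrative):

#include <mutex>

std::mutex m;
int shared_data;

// Drop-in replacements for the spinlock interface above.
void lock()   { m.lock(); }
void unlock() { m.unlock(); }

// More idiomatically, let RAII pair the lock and unlock:
void update() {
    std::lock_guard<std::mutex> guard(m); // unlocks when guard leaves scope
    shared_data++;
}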

Redux: volatile incorrectly used for MMIO

volatile int *dma_control;
volatile int *gpu_control;

void kickoff_command() {
    *dma_control = DMA_KICKOFF;
    *gpu_control = GPU_KICKOFF;
}

It may be important to the programmer that these commands execute in the same order that they appear in the source code. And on a preliminary reading, it seems like they should, since the compiler is not allowed to reorder the two volatile stores. Yet due to the processor memory model, they may not! volatile constrains only the compiler; the hardware is still free to reorder the stores as they leave the CPU. If ordering is required, memory barriers are necessary here as well.

volatile int *dma_control;
volatile int *gpu_control;

void kickoff_command() {
    *dma_control = DMA_KICKOFF;
    // Full barrier: the DMA store is ordered before the GPU store below.
    std::atomic_thread_fence(std::memory_order_seq_cst);

    *gpu_control = GPU_KICKOFF;
    std::atomic_thread_fence(std::memory_order_seq_cst);
}
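Strictly speaking, std::atomic_thread_fence is only specified to order atomic operations, but mainstream compilers lower a seq_cst fence to a full compiler and hardware barrier (a dmb instruction on ARM, for example), which also orders the volatile stores here. Freestanding and kernel code typically reaches for the platform's own barrier primitive instead.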