Chances are, if you are using volatile in your C or C++ code, you are using it incorrectly. Let me explain.
The intended use case
volatile int *dma_control;

void kickoff_command(int command) {
    *dma_control = command;
}
volatile is an escape hatch added by the designers of the C programming language to support memory-mapped IO. It guarantees that reads and writes through a volatile lvalue are never elided and are not reordered relative to other volatile accesses by the compiler. But as you will see later, this may actually be insufficient!
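The classic illustration is a device status poll. A minimal sketch, assuming a hypothetical memory-mapped device_status register:

volatile int *device_status;

void wait_until_ready(void) {
    // Each iteration performs a real load from the register. Without the
    // volatile qualifier, the compiler could hoist the load out of the
    // loop (it cannot see the device changing the value) and spin forever.
    while (*device_status == 0) {
    }
}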
volatile incorrectly used for atomics
volatile int a;
volatile int b;

void thread_1() {
    a++;
}

void thread_2() {
    a++;
    b++;
}
This is incorrect and can result in lost updates: a++ is a read-modify-write, so both threads can read the same value, both increment it, and one of the updates is lost. In order to guarantee atomic access, you need to use atomic operations; volatile is neither sufficient for that, nor necessary to guarantee race-free semantics.
#include <atomic>

std::atomic<int> a;
std::atomic<int> b;

void thread_1() {
    a.fetch_add(1);
}

void thread_2() {
    a.fetch_add(1);
    b.fetch_add(1);
}
volatile incorrectly used for spinlocks
volatile int is_locked;

void lock() {
    while (is_locked) {
    }
    is_locked = 1;
}

void unlock() {
    is_locked = 0;
}
This is incorrect for three reasons.
- The test-and-set in lock is not atomic: two threads can both observe is_locked == 0 before either writes 1, and both enter the critical section.
- There is no memory barrier guarding the lock.
- Don't use spinlocks.
We can fix issue 1 with a relatively contrived example using relaxed atomics:
std::atomic<int> is_locked;

void lock() {
    int expected = 0;
    while (!is_locked.compare_exchange_weak(expected, 1, std::memory_order::relaxed)) {
        // On failure, compare_exchange_weak overwrites expected with the
        // value it observed, so reset it before retrying.
        expected = 0;
    }
}

void unlock() {
    is_locked.store(0, std::memory_order::relaxed);
}
but that still leaves issue 2, so what are we missing?
With relaxed ordering, neither the compiler nor the processor has to keep surrounding memory accesses in place. On processor architectures with weak memory models (like ARM), memory accesses really are reordered almost anywhere as long as there are no conflicts within a single thread. This means that a memory access that occurs inside the critical section could be moved outside it, which would break the mutual exclusion the lock is supposed to provide.
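To sketch the hazard (shared_data is a hypothetical variable the lock is meant to guard):

int shared_data;

void writer() {
    lock();
    // With the relaxed spinlock above, neither the compiler nor the CPU
    // is obliged to keep this store between the acquire and the release;
    // it can drift before lock() or after unlock().
    shared_data = 42;
    unlock();
}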
To fix this, we need to add memory barriers. Fortunately, with std::atomic, this is an easy fix.
std::atomic<int> is_locked;

void lock() {
    int expected = 0;
    while (!is_locked.compare_exchange_weak(expected, 1, std::memory_order::acquire)) {
        // With acquire semantics, memory accesses that appear after the
        // atomic cannot be reordered before it.
        expected = 0;  // reset after a failed compare_exchange_weak
    }
}

void unlock() {
    is_locked.store(0, std::memory_order::release);
    // With release semantics, memory accesses that appear before the
    // atomic cannot be reordered after it.
}
Issue 3 can be solved by using std::mutex, or any mutex structure of your choosing, instead of spinlocks. Unless you can show me how you are disabling interrupts for the duration of the lock, I am going to immediately flag this issue in any code review.
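For completeness, a minimal sketch of the mutex version (shared_data is again a hypothetical guarded variable):

#include <mutex>

std::mutex m;
int shared_data;

void update() {
    // Blocks the thread under contention instead of burning CPU in a spin loop.
    std::lock_guard<std::mutex> guard(m);
    shared_data++;
}   // guard's destructor releases the mutex, even on exceptions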
Redux: volatile incorrectly used for MMIO
volatile int *dma_control;
volatile int *gpu_control;

void kickoff_command() {
    *dma_control = DMA_KICKOFF;
    *gpu_control = GPU_KICKOFF;
}
It may be important to the programmer that these commands are executed in the same order that they appear in the source code. And based on a preliminary reading, it seems like they should be, since the compiler is not allowed to reorder the two volatile accesses. Yet due to the processor memory model, they may not be! If that ordering is required, memory barriers are actually necessary here as well to prevent the reordering.
volatile int *dma_control;
volatile int *gpu_control;

void kickoff_command() {
    *dma_control = DMA_KICKOFF;
    // Prevent the processor from reordering the GPU write before the DMA write.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    *gpu_control = GPU_KICKOFF;
    // Prevent the GPU write from being reordered past whatever follows.
    std::atomic_thread_fence(std::memory_order_seq_cst);
}