"Threads cannot be implemented as a library"

We often take multi-threading for granted, given its prevalence in modern systems. Data races are often ridiculously tricky, and designing language specs around it even more so.

I was recently reading a paper(title is the same as this post's[1]) which goes into depth about why optimizing compilers find dealing with multi-threading exceptionally hard - especially arguing that creation of thread functionality as a library (i.e. uninvolved with the compiler implementation) is impossible.

Multithreading as a feature must be baked into the language, the earlier in its life-cycle, the better (which tells us why C/C++ data race specs pre-2011 were just UB, Java's are still ridiculously complex, and within safe Rust they're impossible to create).

DRF => SC

Any program that is free from data races must be sequentially consistent (i.e. the instructions for each thread must come together in an interleaved but causal fashion).

Optimization fail example

y = 0, x = 0

# Thread 1
if (y == 1)
    x += 1

# Thread 2
if (x == 1)
    y += 1

This code should ideally do nothing, x and y will never be 1.
If we assume no compiler involvement in the threading, i.e. the compiler sees the two thread instructions in isolation and tries to optimize, a perfectly valid optimization might be the following -

y = 0, x = 0

# Thread 1
x += 1
if (y != 1)
    x -= 1

# Thread 2
y += 1
if (x != 1)
    y -= 1

This program is still identical in its behaviour for each thread. Now however, we can see that interleaving the two thread instructions can cause x and y both to be 1 -

y = 0, x = 0

# Thread 1
x += 1        # x == 1

# Thread 2
y += 1        # y == 1
if (x != 1)
    y -= 1    # y does not get decremented

# Thread 1 contd.
if (y != 1)
    x -= 1    # x does not get decremented either

This is invalid.

Bibliography

[1] Threads cannot be implemented as a library