Author: Michael Wong
Document: N5036
Date: 2026-02-22
Target: SG5
Link: https://wg21.link/n5036
The Transactional Memory TS is back with a version 2. N5036 replaces the original TM TS (ISO/IEC TS 19841:2015) with a slimmed-down design centered on atomic do { ... } blocks - compound statements that execute atomically with respect to other transactions. No more transaction_safe annotations on every function in the call tree.
The trade-off for that simplicity is a long list of things you can't do inside an atomic block: no I/O, no thread operations, no std::atomic, no shared_ptr, no virtual calls (implementation-defined), no coroutines, and throwing an exception is undefined behavior. The paper is non-normative - it's meant to build up existing practice, not ship in C++26.
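For orientation, a minimal sketch of the proposed syntax (not compilable with today's shipping compilers; the block contents are illustrative, not taken from the paper):

```cpp
int shared_counter = 0;  // ordinary shared data: no mutex, no std::atomic

void bump() {
    atomic do {            // N5036 atomic block: executes atomically with
        ++shared_counter;  // respect to every other atomic block
    }                      // arithmetic on plain memory is allowed; I/O,
}                          // std::atomic, throwing, coroutines are not
```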
Michael Wong and the SG5 group have been at this for over a decade. The hardware landscape has shifted since the effort started - Intel TSX is disabled on most consumer CPUs and GCC's libitm falls back to a global lock.
N5036 - Extensions to C++ for Transactional Memory Version 2
So let me get this straight: you can't use I/O, you can't use threads, you can't use atomics, you can't call virtual functions, you can't use shared_ptr, you can't throw exceptions, and you can't use coroutines.

But other than that, how was the play, Mrs. Lincoln?
You forgot: calling a non-inline function is implementation-defined. So basically you can do arithmetic in a loop. Transactionally.
constexpr, but at runtime.

Transactional memory: the fusion power of computer science. Always ten years away. Every decade.
SG5 has been working on this since at least 2013. The original TM TS shipped in 2015. We're on version 2. The hardware that was supposed to make it fast (Intel TSX) got disabled for security reasons. And the paper is still non-normative.
We're past ten years.
Wait, is atomic becoming a keyword now? What happens to all my std::atomic code?

atomic is already in the table of identifiers with special meaning (Table 4). It's not a keyword - same bucket as override and final. The syntax is atomic do { ... }, so it only triggers when followed by do.

And no, it has nothing to do with std::atomic. In fact you explicitly cannot use std::atomic operations inside an atomic do block. Section 16.4.6.17, exclusion item (1.17): "atomic operations (Clause 31)."

Everyone focusing on the restriction list is missing why TM matters.
The real problem TM solves is composability. You have two correct critical sections protected by their own mutexes. You need to compose them into one atomic operation. With mutexes, you either take both locks (hello deadlock if anyone does it in a different order) or you refactor everything into one big lock (hello contention). There is no correct composition of two correct mutex-protected operations without imposing a global lock ordering - which doesn't scale and breaks modularity.
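The two-lock hazard above can be sketched in plain C++ (Account, deposit, and transfer are illustrative names, not from the paper):

```cpp
#include <mutex>

// Two modules, each correctly protecting its own state with its own mutex.
struct Account {
    std::mutex m;
    long balance = 0;
};

// Correct in isolation: one lock, no ordering concern.
void deposit(Account& a, long amount) {
    std::lock_guard<std::mutex> lk(a.m);
    a.balance += amount;
}

// Composing the two: naive back-to-back lock_guards in caller-chosen order
// can deadlock when transfer(a, b) and transfer(b, a) run concurrently.
// std::scoped_lock avoids that, but only by running a deadlock-avoidance
// protocol internally - exactly the imposed ordering the comment describes.
void transfer(Account& from, Account& to, long amount) {
    std::scoped_lock lk(from.m, to.m);
    from.balance -= amount;
    to.balance   += amount;
}
```

An atomic block would let transfer be written without naming either lock, which is the composability claim.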
atomic do gives you that. The restrictions exist because rollback semantics require them - you can't un-do I/O, you can't un-do a thread creation, you can't un-do an atomic store that another thread already observed. The restriction list is the list of things that aren't reversible. That's not a limitation of the design. That's the definition of what "atomic" means when you need rollback.

Whether this particular TS is the right vehicle is debatable. Whether the problem exists is not.
Edit: to be clear, I'm not advocating for TM over other concurrency primitives in general. I'm saying the composability problem is real and the restriction list follows logically from the atomicity guarantee.
I keep hearing the composability argument but I've genuinely never hit a case in 15 years of C++ where I needed to atomically compose two independently-locked critical sections and couldn't restructure the code to use a single lock or a lock hierarchy. The textbook examples are always bank transfers.
The bank transfer is the canonical example because it's the simplest one that demonstrates the property. In practice it shows up anywhere you have two modules with internal locking that need a cross-module atomic operation. Databases solved this with MVCC decades ago. The question is whether language-level transactions can do for in-memory data structures what database transactions did for persistent storage.
You can always restructure. The question is whether you should have to.
I worked on TSX-accelerated code for a lock elision library circa 2018-2019. Intel disabled TSX on virtually all consumer Skylake-and-later CPUs via microcode update due to the TAA side-channel vulnerability. Server parts kept it longer but even there it's been phased out on newer cores.
GCC's libitm runtime exists but falls back to a global lock when HTM isn't available. Which on most hardware today, it isn't.
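A hedged sketch of what that fallback amounts to: without HTM, a libitm-style runtime serializes every transaction on one process-wide lock, so an atomic block degrades to roughly this (global_tm_lock and run_transaction are illustrative names, not libitm's real API):

```cpp
#include <mutex>

// One global lock shared by all "transactions" in the process.
std::mutex global_tm_lock;

// Under the fallback, running a transaction is just: take the global
// lock, run the body, release. Full mutual exclusion means no rollback
// machinery is needed - and no scalability either.
template <class F>
void run_transaction(F&& body) {
    std::lock_guard<std::mutex> lk(global_tm_lock);
    body();
}
```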
So the performance model is: hope for HTM, fall back to a global lock. For small critical sections. Where a regular mutex would also work fine and you'd have predictable performance. What's the sell?
IBM POWER still has HTM. But the capacity limits are tight and you fall back on abort anyway. The problem isn't just that Intel killed TSX - it's that hardware TM has fundamental tension with cache coherence protocols. Every time a cache line gets evicted during a transaction, you abort. The hardware window for "quick and little data" is genuinely small, and it keeps getting smaller as core counts go up and cache pressure increases.
Two things bother me about this paper.
1. It's based on C++20. The normative reference is ISO/IEC 14882:2020. We have C++23 published and C++26 nearing completion. If this TS is supposed to build "widespread existing practice," shouldn't it reference the standard people are actually targeting?
2. Section 8.8 paragraph 6 is a minefield of implementation-defined behavior.
So calling a non-inline function is implementation-defined. Virtual calls are implementation-defined. Dynamic initialization of block-scope statics is implementation-defined. The usable subset of atomic do blocks is going to be wildly compiler-specific. Portable code inside atomic blocks will be limited to arithmetic and direct member access on local data.

Good catch on the C++20 base. Combined with the exception UB and the implementation-defined function calls, you'd basically need a dedicated lint pass to tell you whether your atomic block is even valid on your compiler.
Section 8.8 paragraph 4 makes it undefined behavior for an exception to escape an atomic block. Paragraph 5 then says the recommended practice is to terminate without invoking the terminate handler. So the committee knows exceptions will escape atomic blocks in practice - the recommended practice acknowledges it. Why UB instead of defined behavior?
An STM implementation needs to detect transaction abort conditions anyway. If it detects an exception propagating out of the block, it could roll back and then rethrow, or roll back and terminate. Either would be defined. UB means the optimizer can assume it never happens and remove code paths that lead to it.
You can't statically prove that no expression inside the block will throw. Making it ill-formed would require noexcept on every function call, which makes the feature unusable with any existing codebase.

UB is the pragmatic middle ground: the programmer promises not to throw, the implementation is free to abort, retry, or terminate as it sees fit. The "recommended practice" note is guidance to implementations, not a semantic guarantee. An implementation that terminates on exception-in-transaction is conforming and useful - it just isn't required to do it that way.
The alternative - mandating rollback+rethrow as defined behavior - would force every STM implementation to support full exception-safe rollback, which is a significantly harder implementation requirement for something the programmer isn't supposed to do in the first place.
I take the implementation cost point. But I'm not asking for rollback+rethrow. I'd settle for implementation-defined or unspecified. Both give implementations the same freedom. Neither permits time travel.
UB for something the paper itself expects to happen at runtime - that's the part that doesn't sit right. If you're writing a "recommended practice" for how to handle it, you've already conceded it's a realistic scenario.
Meanwhile in Rust, Send + Sync and the borrow checker make data races a compile-time error. No UB needed, no eleven-page exclusion list of things you can't call.

Transactional memory and Rust's ownership model solve different problems. TM gives you atomic composition of arbitrary memory operations. Rust prevents data races at compile time. Neither subsumes the other. Rust doesn't have TM either.
every thread. every single thread.
N5036. Five thousand and thirty-six papers. And we still don't have std::networking. Priorities.

[removed by moderator]
Spam.