Author: Michael Wong
Document: N5036
Date: 2026-02-22
Target: SG5
Link: https://wg21.link/n5036
The Transactional Memory TS is back with a version 2. N5036 replaces the original TM TS (ISO/IEC TS 19841:2015) with a slimmed-down design centered on atomic do { ... } blocks - compound statements that execute atomically with respect to other transactions. No more transaction_safe annotations on every function in the call tree.
The trade-off for that simplicity is a long list of things you can't do inside an atomic block: no I/O, no thread operations, no std::atomic, no shared_ptr, no virtual calls (implementation-defined), no coroutines, and throwing an exception is undefined behavior. The paper is non-normative - it's meant to build up existing practice, not ship in C++26.
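For orientation, a minimal sketch of the proposed syntax (not compilable with today's shipping compilers; the block contents are illustrative, not taken from the paper):

```cpp
int shared_counter = 0;  // ordinary shared data: no mutex, no std::atomic

void bump() {
    atomic do {            // N5036 atomic block: executes atomically with
        ++shared_counter;  // respect to every other atomic block
    }                      // arithmetic on plain memory is allowed; I/O,
}                          // std::atomic, throwing, coroutines are not
```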
Michael Wong and the SG5 group have been at this for over a decade. The hardware landscape has shifted since the effort started - Intel TSX is disabled on most consumer CPUs and GCC's libitm falls back to a global lock.
N5036 - Extensions to C++ for Transactional Memory Version 2
So let me get this straight: you can't use I/O, you can't use threads, you can't use atomics, you can't call virtual functions, you can't use shared_ptr, you can't throw exceptions, and you can't use coroutines.

But other than that, how was the play, Mrs. Lincoln?
You forgot: calling a non-inline function is implementation-defined. So basically you can do arithmetic in a loop. Transactionally.
constexpr, but at runtime.

Transactional memory: the fusion power of computer science. Always ten years away. Every decade.
SG5 has been working on this since at least 2013. The original TM TS shipped in 2015. We're on version 2. The hardware that was supposed to make it fast (Intel TSX) got disabled for security reasons. And the paper is still non-normative.
We're past ten years.
Wait, is atomic becoming a keyword now? What happens to all my std::atomic code?

atomic is already in the table of identifiers with special meaning (Table 4). It's not a keyword - same bucket as override and final. The syntax is atomic do { ... }, so it only triggers when followed by do.

And no, it has nothing to do with std::atomic. In fact you explicitly cannot use std::atomic operations inside an atomic do block. Section 16.4.6.17, exclusion item (1.17): "atomic operations (Clause 31)."

Everyone focusing on the restriction list is missing why TM matters.
The real problem TM solves is composability. You have two correct critical sections protected by their own mutexes. You need to compose them into one atomic operation. With mutexes, you either take both locks (hello deadlock if anyone does it in a different order) or you refactor everything into one big lock (hello contention). There is no correct composition of two correct mutex-protected operations without imposing a global lock ordering - which doesn't scale and breaks modularity.
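The two-lock hazard above can be sketched in plain C++ (Account, deposit, and transfer are illustrative names, not from the paper):

```cpp
#include <mutex>

// Two modules, each correctly protecting its own state with its own mutex.
struct Account {
    std::mutex m;
    long balance = 0;
};

// Correct in isolation: one lock, no ordering concern.
void deposit(Account& a, long amount) {
    std::lock_guard<std::mutex> lk(a.m);
    a.balance += amount;
}

// Composing the two: naive back-to-back lock_guards in caller-chosen order
// can deadlock when transfer(a, b) and transfer(b, a) run concurrently.
// std::scoped_lock avoids that, but only by running a deadlock-avoidance
// protocol internally - exactly the imposed ordering the comment describes.
void transfer(Account& from, Account& to, long amount) {
    std::scoped_lock lk(from.m, to.m);
    from.balance -= amount;
    to.balance   += amount;
}
```

An atomic block would let transfer be written without naming either lock, which is the composability claim.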
atomic do gives you that. The restrictions exist because rollback semantics require them - you can't un-do I/O, you can't un-do a thread creation, you can't un-do an atomic store that another thread already observed. The restriction list is the list of things that aren't reversible. That's not a limitation of the design. That's the definition of what "atomic" means when you need rollback.

Whether this particular TS is the right vehicle is debatable. Whether the problem exists is not.
Edit: to be clear, I'm not advocating for TM over other concurrency primitives in general. I'm saying the composability problem is real and the restriction list follows logically from the atomicity guarantee.
I keep hearing the composability argument but I've genuinely never hit a case in 15 years of C++ where I needed to atomically compose two independently-locked critical sections and couldn't restructure the code to use a single lock or a lock hierarchy. The textbook examples are always bank transfers.
The bank transfer is the canonical example because it's the simplest one that demonstrates the property. In practice it shows up anywhere you have two modules with internal locking that need a cross-module atomic operation. Databases solved this with MVCC decades ago. The question is whether language-level transactions can do for in-memory data structures what database transactions did for persistent storage.
You can always restructure. The question is whether you should have to.
I worked on TSX-accelerated code for a lock elision library circa 2018-2019. Intel disabled TSX on virtually all consumer Skylake-and-later CPUs via microcode update due to the TAA side-channel vulnerability. Server parts kept it longer but even there it's been phased out on newer cores.
GCC's libitm runtime exists but falls back to a global lock when HTM isn't available. Which on most hardware today, it isn't.
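A hedged sketch of what that fallback amounts to: without HTM, a libitm-style runtime serializes every transaction on one process-wide lock, so an atomic block degrades to roughly this (global_tm_lock and run_transaction are illustrative names, not libitm's real API):

```cpp
#include <mutex>

// One global lock shared by all "transactions" in the process.
std::mutex global_tm_lock;

// Under the fallback, running a transaction is just: take the global
// lock, run the body, release. Full mutual exclusion means no rollback
// machinery is needed - and no scalability either.
template <class F>
void run_transaction(F&& body) {
    std::lock_guard<std::mutex> lk(global_tm_lock);
    body();
}
```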
So the performance model is: hope for HTM, fall back to a global lock. For small critical sections. Where a regular mutex would also work fine and you'd have predictable performance. What's the sell?
IBM POWER still has HTM. But the capacity limits are tight and you fall back on abort anyway. The problem isn't just that Intel killed TSX - it's that hardware TM has fundamental tension with cache coherence protocols. Every time a cache line gets evicted during a transaction, you abort. The hardware window for "quick and little data" is genuinely small, and it keeps getting smaller as core counts go up and cache pressure increases.
Two things bother me about this paper.
1. It's based on C++20. The normative reference is ISO/IEC 14882:2020. We have C++23 published and C++26 nearing completion. If this TS is supposed to build "widespread existing practice," shouldn't it reference the standard people are actually targeting?
2. Section 8.8 paragraph 6 is a minefield of implementation-defined behavior.
So calling a non-inline function is implementation-defined. Virtual calls are implementation-defined. Dynamic initialization of block-scope statics is implementation-defined. The usable subset of atomic do blocks is going to be wildly compiler-specific. Portable code inside atomic blocks will be limited to arithmetic and direct member access on local data.

Good catch on the C++20 base. Combined with the exception UB and the implementation-defined function calls, you'd basically need a dedicated lint pass to tell you whether your atomic block is even valid on your compiler.
Section 8.8 paragraph 4 makes it undefined behavior for an exception to escape an atomic block. Paragraph 5 then says the recommended practice is to terminate without invoking the terminate handler. So the committee knows exceptions will escape atomic blocks in practice - the recommended practice acknowledges it. Why UB instead of defined behavior?
An STM implementation needs to detect transaction abort conditions anyway. If it detects an exception propagating out of the block, it could roll back and then rethrow, or roll back and terminate. Either would be defined. UB means the optimizer can assume it never happens and remove code paths that lead to it.
You can't statically prove that no expression inside the block will throw. Making it ill-formed would require noexcept on every function call, which makes the feature unusable with any existing codebase.

UB is the pragmatic middle ground: the programmer promises not to throw, the implementation is free to abort, retry, or terminate as it sees fit. The "recommended practice" note is guidance to implementations, not a semantic guarantee. An implementation that terminates on exception-in-transaction is conforming and useful - it just isn't required to do it that way.
The alternative - mandating rollback+rethrow as defined behavior - would force every STM implementation to support full exception-safe rollback, which is a significantly harder implementation requirement for something the programmer isn't supposed to do in the first place.
I take the implementation cost point. But I'm not asking for rollback+rethrow. I'd settle for implementation-defined or unspecified. Both give implementations the same freedom. Neither permits time travel.
UB for something the paper itself expects to happen at runtime - that's the part that doesn't sit right. If you're writing a "recommended practice" for how to handle it, you've already conceded it's a realistic scenario.
Meanwhile in Rust, Send + Sync and the borrow checker make data races a compile-time error. No UB needed, no eleven-page exclusion list of things you can't call.

Transactional memory and Rust's ownership model solve different problems. TM gives you atomic composition of arbitrary memory operations. Rust prevents data races at compile time. Neither subsumes the other. Rust doesn't have TM either.
every thread. every single thread.
N5036. Five thousand and thirty-six papers. And we still don't have std::networking. Priorities.

[removed by moderator]
Spam.