r/wg21
P2964R2 - User-defined element types in std::simd through trait-based vectorizable definition
Posted by u/simd_paper_watcher · 8 hr. ago

Document: P2964R2
Authors: Daniel Towner, Ruslan Arutyunyan (Intel)
Date: 2026-02-19
Audience: LEWG
Revision: R2

The current std::simd restricts element types to arithmetic types and std::complex. This paper proposes replacing that closed list with trait-based constraints - if your type is trivially copyable, the right size (1, 2, 4, 8, or 16 bytes), and not over-aligned, it's vectorizable. That's it. Change the gatekeeper, keep everything else the same.

This means you get simd<Meters> with a strong typedef, simd<Color> with a scoped enum, simd<std::byte> for packet processing, and simd<fixed_point_16s8> for DSP - all without modifying operation semantics. The key bet is that compilers can auto-vectorize element-wise operations on user-defined types well enough that you don't need explicit customization for most cases.

The paper includes proposed wording, implementation evidence from Intel showing Clang and oneAPI generate assembly identical to built-in types for common operations, and an optional ADL-based customization mechanism (simd_operator / simd_convert) for the cases where compilers need a hand.

▲ 32 points (89% upvoted) · 9 comments
u/still_on_gcc12 22 points 5 hr. ago

They're extending std::simd with user-defined types and most of us still can't use plain std::simd in production. Great priorities.

u/reads_the_appendix 5 points 4 hr. ago

std::simd was voted into the C++26 working draft via P1928. libstdc++ has had experimental support for years. This paper is targeting C++29.

u/template_curious_2025 2 points 3 hr. ago

Fair enough. The C++29 timeline makes more sense; I keep mixing up which papers landed where.

u/template_curious_2025 18 points 6 hr. ago

Nice. The enum and std::byte support alone makes this worth it. Getting strong typedefs into simd without losing type safety is a solid quality-of-life improvement.

u/opt_in_advocate 11 points 3 hr. ago

The opt-out design gives me pause. From section 2.3:

A type T is now vectorizable if: is_trivially_copyable_v<T> is true; sizeof(T) is 1, 2, 4, 8, or 16; alignof(T) <= sizeof(T); disable_vectorization<T> is false

So every trivially-copyable struct of the right size is vectorizable by default. That's a lot of types. Any struct { int x; } anywhere in your codebase now silently qualifies for simd<T>. You'll get a compile error if you try to use operator+ and it doesn't exist, sure. But the concept of "vectorizable" now covers way more types than anyone intended to vectorize.
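
To make that concrete, here is a minimal sketch of the quoted rule written as a plain C++17 trait. The names disable_vectorization (as spelled here) and is_vectorizable_sketch are placeholders for illustration; the paper's wording decides the real spelling and namespace. The sizes assume a typical platform where int is 4 bytes.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the paper's opt-out trait (assumed spelling).
template <class T>
inline constexpr bool disable_vectorization = false;

// The four quoted conditions combined into one predicate (illustrative name).
template <class T>
inline constexpr bool is_vectorizable_sketch =
    std::is_trivially_copyable_v<T> &&
    (sizeof(T) == 1 || sizeof(T) == 2 || sizeof(T) == 4 ||
     sizeof(T) == 8 || sizeof(T) == 16) &&
    alignof(T) <= sizeof(T) &&
    !disable_vectorization<T>;

struct Wrapper { int x; };              // never written with SIMD in mind
struct Rgb { unsigned char r, g, b; };  // 3 bytes: fails the size rule
struct Tracked {                        // user-provided copy: not trivially copyable
    int id;
    Tracked(const Tracked& other) : id(other.id) {}
};
struct OptedOut { float v; };           // would qualify, but explicitly opts out

template <>
inline constexpr bool disable_vectorization<OptedOut> = true;

static_assert(is_vectorizable_sketch<Wrapper>);    // qualifies without asking
static_assert(!is_vectorizable_sketch<Rgb>);
static_assert(!is_vectorizable_sketch<Tracked>);
static_assert(!is_vectorizable_sketch<OptedOut>);
```

Wrapper passes without its author ever touching a SIMD header, which is the part that bothers me.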

Why not opt-in? Specializing enable_vectorization<T> to true feels more intentional than a banned-types list patching holes in the opt-out approach. The list in section 4.6 is already getting long and they say "implementations may provide additional specializations." That's the kind of sentence that leads to portability headaches.
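
Roughly what I have in mind for opt-in, as a sketch only. Nothing here is from the paper: enable_vectorization and is_vectorizable_opt_in are names I'm inventing, and defaulting arithmetic types to enabled is my own assumption about how the existing element types would be kept.

```cpp
#include <type_traits>

// Hypothetical opt-in trait: only arithmetic types are enabled by default.
template <class T>
inline constexpr bool enable_vectorization = std::is_arithmetic_v<T>;

// Same layout checks as the quoted rule, but gated on an explicit opt-in.
template <class T>
inline constexpr bool is_vectorizable_opt_in =
    std::is_trivially_copyable_v<T> &&
    (sizeof(T) == 1 || sizeof(T) == 2 || sizeof(T) == 4 ||
     sizeof(T) == 8 || sizeof(T) == 16) &&
    enable_vectorization<T>;

struct Meters { float value; };  // author wants this in simd
struct Config { int flags; };    // author never thought about simd

// The one line the type's author writes to opt in.
template <>
inline constexpr bool enable_vectorization<Meters> = true;

static_assert(is_vectorizable_opt_in<float>);
static_assert(is_vectorizable_opt_in<Meters>);
static_assert(!is_vectorizable_opt_in<Config>);  // same layout, no opt-in
```

The cost is one specialization per type; the benefit is that Config stays out unless someone asks for it.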

u/former_sse_intrinsics_dev 9 points 4 hr. ago

The assembly results in the appendix are impressive. vec<Meters> addition compiles to the same vaddps as vec<float>. Zero overhead for the strong typedef. Saturating add goes straight to vpaddsw. The permutation example optimizes to a single vprold.
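
For anyone who wants to try this on their own compiler, the shape of code involved is roughly the following. This is my own sketch, not the paper's test case: Meters and add_lanes are hypothetical, and it uses std::array rather than the real simd type. Whether a given compiler turns the loop into one packed add is exactly the QoI question.

```cpp
#include <array>
#include <cstddef>

// Hypothetical strong typedef over float; the paper's Meters may differ.
struct Meters {
    float value;
    friend constexpr Meters operator+(Meters a, Meters b) {
        return Meters{a.value + b.value};
    }
};

// Element-wise add over a fixed-width pack of lanes. The loop calls the
// user-defined operator+ once per lane; an auto-vectorizer has to see
// through that call to emit a single packed instruction.
template <class T, std::size_t N>
constexpr std::array<T, N> add_lanes(const std::array<T, N>& a,
                                     const std::array<T, N>& b) {
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = a[i] + b[i];
    }
    return out;
}

// The wrapper adds no storage beyond the float it holds.
static_assert(sizeof(Meters) == sizeof(float));
```

Instantiating add_lanes for std::array<Meters, 8> and std::array<float, 8> and diffing the output of -O2 -S is a quick way to check your own toolchain.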

But the paper is very upfront about the compiler coverage:

Testing was performed with Clang 20 and Intel oneAPI 2025.0 targeting Intel Sapphire Rapids.

That's two compilers, one architecture, and both from vendors with a strong interest in this landing. The reduction case in section 6.3 already shows the approach falling apart - the saturating add reduction degenerates to scalar after two vector instructions, even on a good compiler.
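
One thing that makes saturating reductions awkward in general, and this is my own observation rather than the paper's diagnosis of that codegen: signed saturating add is not associative, so a compiler cannot freely reorder an in-order sum into a tree without changing the result. A small sketch with a hypothetical Sat16 type (not the paper's):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 16-bit saturating integer.
struct Sat16 {
    std::int16_t v;
    friend constexpr Sat16 operator+(Sat16 a, Sat16 b) {
        int wide = int{a.v} + int{b.v};  // widen so the raw sum cannot overflow
        if (wide > INT16_MAX) wide = INT16_MAX;
        if (wide < INT16_MIN) wide = INT16_MIN;
        return Sat16{static_cast<std::int16_t>(wide)};
    }
};

// Left-to-right reduction: every step depends on the previous accumulator.
template <std::size_t N>
constexpr Sat16 reduce_in_order(const std::array<Sat16, N>& xs) {
    Sat16 acc{0};
    for (std::size_t i = 0; i < N; ++i) {
        acc = acc + xs[i];
    }
    return acc;
}

// Tree reduction: fold the upper half onto the lower half until one lane is
// left, which is the order a vector-friendly reduction would typically use.
template <std::size_t N>
constexpr Sat16 reduce_pairwise(std::array<Sat16, N> xs) {
    static_assert(N > 0 && (N & (N - 1)) == 0, "sketch assumes power-of-two N");
    for (std::size_t stride = N / 2; stride >= 1; stride /= 2) {
        for (std::size_t i = 0; i < stride; ++i) {
            xs[i] = xs[i] + xs[i + stride];
        }
    }
    return xs[0];
}
```

With the input {32767, 1, -1, 0}, the in-order sum clamps at 32767 and then steps down to 32766, while the pairwise order gives 32767. Whichever order the library specifies, the compiler has to preserve it.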

The customization points in section 7 are the escape hatch, but if most users end up needing them to get reasonable codegen, the "just change the gatekeeper" pitch starts to look like a simple story papering over a complicated reality.

Edit: to be fair, the paper explicitly frames compiler variance as QoI and says compilers will improve. That's probably right long-term. I'd just like to see GCC and MSVC numbers before this advances.

u/dimensional_analysis_fan 7 points 2 hr. ago

Good first step, and the trait-based approach is clean. But I keep coming back to section 2.4.2:

Heterogeneous type operations, where simd<T> op simd<U> -> simd<V>, are explicitly excluded from this proposal.

The most compelling use case for putting user-defined types in simd is dimensional analysis. simd<Meters> / simd<Seconds> should give you simd<MetersPerSecond>. That's what mp-units users would expect. Instead, you get a compile error.
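
To be concrete about what I mean, here is the scalar version of the type algebra, applied lane by lane over std::array instead of the real simd type. Everything here is a hypothetical sketch: the three unit structs and divide_lanes are my own names, and nothing like this is in the proposal.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical unit wrappers, each a strong typedef over float.
struct Meters { float value; };
struct Seconds { float value; };
struct MetersPerSecond { float value; };

// The heterogeneous scalar operation: T op U -> V.
constexpr MetersPerSecond operator/(Meters d, Seconds t) {
    return MetersPerSecond{d.value / t.value};
}

// Lane-wise division where the result element type is computed from the
// scalar operator rather than assumed to match either operand.
template <class T, class U, std::size_t N>
constexpr auto divide_lanes(const std::array<T, N>& a,
                            const std::array<U, N>& b) {
    using V = decltype(std::declval<T>() / std::declval<U>());
    std::array<V, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = a[i] / b[i];
    }
    return out;
}

// The result pack has the derived unit, not Meters and not Seconds.
static_assert(std::is_same_v<
    decltype(divide_lanes(std::array<Meters, 4>{}, std::array<Seconds, 4>{})),
    std::array<MetersPerSecond, 4>>);
```

The decltype line is the "result type computation" in miniature; the hard part is doing it inside the simd operators without breaking the homogeneous case.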

I understand why they deferred it - the type-level computation for result types is genuinely harder. But shipping the strong typedef wrapper without the type algebra feels like delivering half the motivation. The paper's forward-compatibility argument in 2.4.3 is reassuring at least.

u/just_ship_it_already 14 points 1 hr. ago

The paper literally says "no concrete use cases for allowing unit-like operations in simd have been presented." Let them ship what works and iterate. Perfect is the enemy of shipped.

u/xmm_register_enjoyer 3 points 7 hr. ago

simd<std::byte> for packet processing is the sleeper hit of this paper. Everything else is nice-to-have.