Authors: Jan Schultke, Corentin Jabot
Document: P3688R6
Date: 2026-02-21
Target: LEWG
Link: wg21.link/p3688r6
The <cctype> functions have been a quiet source of pain for decades - locale-dependent behavior, no constexpr support, UB traps with signed char, and zero support for Unicode character types. P3688R6 proposes a new <ascii> header with lightweight, locale-independent, constexpr alternatives for all the character classification and transformation functions you actually need when parsing ASCII text.
The paper covers 18 functions total - the usual suspects like ascii_is_digit, ascii_is_alphabetic, ascii_to_lower, plus additions like ascii_is_bit, ascii_is_any, and case-insensitive comparison helpers. Everything is constexpr, noexcept (mostly), and works with char, wchar_t, char8_t, char16_t, and char32_t.
SG16 has been iterating on the naming across revisions. R6 uses the ascii_is_* prefix convention after feedback from earlier revisions that used is_ascii_*. Six revisions and a handful of SG16 polls later, this is heading to LEWG.
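For anyone who has not opened the paper, here is a rough sketch of what locale-independent, constexpr classification and case mapping could look like. The `demo` namespace, the function bodies, and the `ascii_iequals` name are my own illustration and assumptions, not the paper's wording or reference implementation. Only the general shape (constexpr, usable with several character types) comes from the summary above.

```cpp
#include <cstddef>
#include <string_view>

namespace demo {  // hypothetical namespace, not part of P3688R6

// True only for the ASCII digits 0x30..0x39, for any character type.
template <class CharT>
constexpr bool ascii_is_digit(CharT c) noexcept {
    return c >= CharT(0x30) && c <= CharT(0x39);
}

// True only for the ASCII uppercase letters 0x41..0x5A.
template <class CharT>
constexpr bool ascii_is_upper(CharT c) noexcept {
    return c >= CharT(0x41) && c <= CharT(0x5A);
}

// Maps ASCII uppercase to lowercase and leaves everything else untouched.
template <class CharT>
constexpr CharT ascii_to_lower(CharT c) noexcept {
    return ascii_is_upper(c) ? static_cast<CharT>(c + 0x20) : c;
}

// Hypothetical case-insensitive equality helper built on ascii_to_lower.
constexpr bool ascii_iequals(std::string_view a, std::string_view b) noexcept {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (ascii_to_lower(a[i]) != ascii_to_lower(b[i])) return false;
    return true;
}

}  // namespace demo

// Everything is usable at compile time and with Unicode character types.
static_assert(demo::ascii_is_digit(u'7'));
static_assert(!demo::ascii_is_digit(U'x'));
static_assert(demo::ascii_to_lower(U'Q') == U'q');
// Narrow literals: this check assumes an ASCII-compatible literal encoding.
static_assert(demo::ascii_iequals("Content-Length", "content-length"));
```

None of this depends on the global locale, which is the property the `<cctype>` functions cannot offer.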
Abseil has had `absl::ascii_is*` for years. Nice that the standard is finally catching up. At least the naming ended up in the same ballpark.
The `constexpr` angle is what makes this worth standardizing over just using Abseil. None of the existing solutions give you that. There is a godbolt demo in the paper showing the whole API. Also the `char8_t`/`char16_t` support - try passing those to `std::isdigit` and see what happens.
Committee gonna committee. Six revisions to standardize `c >= '0' && c <= '9'`.
This is literally the simplest kind of proposal the committee has seen in months. If LEWG cannot fast-track something this straightforward, we have bigger problems.
The design choice in section 3.7.3 is worth reading carefully. They considered three options for non-ASCII-compatible encodings and landed on "treat the input as ASCII regardless of the literal encoding." Which means a character literal is only classified the way you expect when the literal encoding is ASCII-compatible.
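A sketch of that consequence, under stated assumptions. The two helpers below are hypothetical stand-ins, not the paper's implementation: one compares against fixed ASCII code points, which is the behavior the quoted design choice describes, and the other compares against character literals, which is what hand-written checks usually do. `0xF0` is the EBCDIC code for the digit zero.

```cpp
// Hypothetical stand-in for the proposed behavior: compares against fixed
// ASCII code points, independent of the platform's literal encoding.
constexpr bool is_digit_by_value(unsigned char c) noexcept {
    return c >= 0x30 && c <= 0x39;
}

// The hand-rolled check: compares against whatever '0' and '9' mean in the
// ordinary literal encoding of the platform being compiled for.
constexpr bool is_digit_by_literal(unsigned char c) noexcept {
    return c >= static_cast<unsigned char>('0') &&
           c <= static_cast<unsigned char>('9');
}

constexpr unsigned char ascii_zero  = 0x30;  // '0' on ASCII platforms
constexpr unsigned char ebcdic_zero = 0xF0;  // '0' on EBCDIC platforms

static_assert(is_digit_by_value(ascii_zero));
static_assert(!is_digit_by_value(ebcdic_zero));  // not an ASCII digit
static_assert(is_digit_by_literal('5'));         // holds on every platform
```

On an ASCII host the two helpers agree for every input. On EBCDIC, `is_digit_by_literal('0')` is true while `is_digit_by_value('0')` is false, because the literal has the value `0xF0` there.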
This is the right call for protocol parsing - JSON, HTTP, XML are all ASCII/UTF-8 regardless of the host encoding. But it does mean the functions are not a drop-in replacement for `<cctype>` on every platform. On EBCDIC, `'0'` is `0xf0`, not `0x30`. The functions work on the numeric value, not the character you typed.
If your code already does `c >= '0' && c <= '9'` and works on EBCDIC, switching to `ascii_is_digit` will break you. Narrow use case, but the paper is honest about the tradeoff.
Wait, `ascii_is_digit('0')` can be false? On what planet?
EBCDIC. Some of us still ship to mainframes. `'0'` is `0xf0` there, not `0x30`. The paper calls these functions "ASCII utilities," not "literal encoding utilities" - it means it.
Section 3.13 dismisses `namespace ascii` because of hypothetical SIMD overloads. That is not a convincing argument. We can cross the SIMD bridge when we get there, and nested namespaces are not that hard. A dedicated namespace would let users write `using std::ascii::is_lower;` instead of the mouthful `std::ascii_is_lower`. The `ascii_is_*` prefix works fine but `std::ascii::` would have been cleaner.
Not a dealbreaker. But I expect this bikeshed to reopen in LEWG.
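To make the ergonomic difference concrete, here is a small sketch that uses a made-up `demo` namespace in place of `std`. Neither `demo::ascii::is_lower` nor `demo::ascii_is_lower` is a real function. They only stand in for the two naming schemes being compared, and the bodies are my own assumption.

```cpp
namespace demo {  // hypothetical stand-in for std

// Prefix style, mirroring the paper's std::ascii_is_lower.
constexpr bool ascii_is_lower(char32_t c) noexcept {
    return c >= 0x61 && c <= 0x7A;
}

// Namespace style, mirroring the suggested std::ascii::is_lower.
namespace ascii {
constexpr bool is_lower(char32_t c) noexcept {
    return c >= 0x61 && c <= 0x7A;
}
}  // namespace ascii

}  // namespace demo

// With a nested namespace, a using-declaration leaves a short name behind.
using demo::ascii::is_lower;

static_assert(is_lower(U'q'));
static_assert(!is_lower(U'Q'));

// With the prefix style, the prefix stays part of the name even after a
// using-declaration.
using demo::ascii_is_lower;
static_assert(ascii_is_lower(U'q'));
```

The trade is purely about spelling at the call site. Both variants do the same work.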
From the design discussion on function objects:
I get the argument but this is still going to be painful in practice. Every single time you want to use one of these in `ranges::find_if` you are wrapping it in a lambda.
Yes, the general LIFT problem exists. Section 3.6 punts to P3312R1 (Overload Set Types) as a potential general solution, which is not exactly around the corner. Knowing that does not make the boilerplate less annoying.
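For reference, the lambda wrapping being complained about looks roughly like this. `demo::ascii_is_digit` is a hypothetical stand-in for the proposed function, written as a template purely for illustration, and `first_digit_index` is a made-up helper. I use `std::find_if` so the sketch builds as C++17. `std::ranges::find_if` accepts the same lambda.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

namespace demo {  // hypothetical stand-in for the proposed std::ascii_is_digit

template <class CharT>
constexpr bool ascii_is_digit(CharT c) noexcept {
    return c >= CharT(0x30) && c <= CharT(0x39);
}

}  // namespace demo

// Index of the first ASCII digit in s, or s.size() if there is none.
inline std::size_t first_digit_index(std::string_view s) {
    // Passing demo::ascii_is_digit by name would not compile here: the name
    // of a function template (or overload set) cannot be deduced as the
    // predicate type, so it has to be wrapped.
    auto it = std::find_if(s.begin(), s.end(),
                           [](char c) { return demo::ascii_is_digit(c); });
    return static_cast<std::size_t>(it - s.begin());
}

// Usage; the narrow literals assume an ASCII-compatible literal encoding.
inline void usage_example() {
    assert(first_digit_index("abc123") == 3);
    assert(first_digit_index("no digits") == 9);
}
```

A function object would be passable by name, which is the convenience this sub-thread is arguing about.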