April 23, 2025

Undefined behavior in C++ is a well-known source of headaches for developers, but surprisingly, even the lexing process contained cases of it, until now. Thanks to P2621R3 by Corentin Jabot, unterminated strings, macro-generated universal character names (UCNs), and spliced UCNs are now formally defined, aligning the standard with real-world compiler behavior.

C++26: No More UB in Lexing
by Sandor Dargo

From the article:

If you ever used C++, for sure you had to face undefined behaviour. Even though it gives extra freedom for implementers, it's dreaded by developers as it may cause havoc in your systems and it's better to avoid it if possible.

Surprisingly, even the lexing process in C++ can result in undefined behaviour. Thanks to Corentin Jabot's work and his P2621R3, that won't be the case anymore. As it was accepted as a defect report starting from C++98, you in fact benefit from this already if you use a new enough compiler.

Truth be told, compilers didn't do anything dangerous. They handled the cases below safely and deterministically. So this change is really about updating the standard to match implementers' work. Let's quickly see the three cases.

Unterminated strings

    // unterminated string used to be UB
    const char * foo = "

Who would have thought that an unterminated string or character literal was UB?! Despite the permissive standard, all major compilers identified it as ill-formed. From now on, even the standard says so.
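
To make the unterminated-string case concrete, here is a minimal sketch that contrasts it with two literals that are properly terminated. It is not taken from the article: the variable names and the helpers `same_text` and `check_literals` are hypothetical, chosen only for illustration. The second literal relies on ordinary line splicing (a backslash immediately followed by a newline), which joins two physical lines before tokens are formed, so the closing quote still belongs to the same logical line.

```cpp
#include <cassert>
#include <cstring>

// Well-formed: the closing quote appears on the same line.
const char* terminated = "no more UB";

// Also well-formed: the backslash-newline pair is removed early in
// translation, so both physical lines form one logical line that
// ends with a closing quote.
const char* spliced = "no more \
UB";

// Ill-formed (formerly undefined behaviour): no closing quote.
// Left commented out so that this file still compiles.
// const char* foo = "

// Hypothetical helper: true when both literals hold the same text.
bool same_text() {
    return std::strcmp(terminated, spliced) == 0;
}

// Hypothetical check, meant to be called from main().
void check_literals() {
    assert(same_text());
}
```

Uncommenting the `foo` line should make a conforming compiler reject the file with a diagnostic, which is what the article says all major compilers already did.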