How to Use std::u8string as Default
The Erbsland Regular Expression Library supports using either std::string or std::u8string as its default
string type.
This page explains how to enable the std::u8string configuration when you integrate the library as a submodule.
For the generic submodule integration steps (project layout, linking, etc.), see Integrate the Engine as Submodule.
Enable std::u8string in CMake
The library selects the string type at build time using the CMake option ERBSLAND_RE_U8STRING.
In your top-level CMakeLists.txt, enable the option before you call add_subdirectory(erbsland-re):
cmake_minimum_required(VERSION 3.25)
project(ExampleProject)
set(ERBSLAND_RE_U8STRING ON CACHE BOOL "Use std::u8string for all strings" FORCE)
add_subdirectory(erbsland-re)
add_subdirectory(example)
This option causes the library target to export the public compile definition ERBSLAND_RE_USE_U8STRING=1.
All targets that link against erbsland-re will then compile with the same string configuration.
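A consuming target can check which configuration it was compiled with by testing the exported definition. The following is a minimal sketch, not part of the library: the `demo` namespace, `u8StringEnabled`, and `activeStringType` are hypothetical names introduced here, and the only thing taken from the text above is the definition name `ERBSLAND_RE_USE_U8STRING`.

```cpp
#include <cassert>
#include <string_view>

namespace demo {

// Assumption: ERBSLAND_RE_USE_U8STRING is visible here only because this
// target links against erbsland-re with the option enabled. Compiled on its
// own, this file takes the #else branch.
#if defined(ERBSLAND_RE_USE_U8STRING)
inline constexpr bool u8StringEnabled = true;
#else
inline constexpr bool u8StringEnabled = false;
#endif

// Name of the active string configuration, e.g. for a startup log line.
constexpr std::string_view activeStringType() noexcept {
    return u8StringEnabled ? std::string_view{"std::u8string"}
                           : std::string_view{"std::string"};
}

} // namespace demo
```

A target that only works with the UTF-8 configuration could add `static_assert(demo::u8StringEnabled, "...")` so that a misconfigured build fails early instead of producing type errors later.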
What Changes in the API
The public API exposes the configured string types via the aliases in the namespace erbsland::re:
String becomes std::u8string
StringView becomes std::u8string_view
Using these aliases instead of naming std::string or std::u8string directly is the recommended way to write code that works with both configurations.
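To illustrate what alias-based code looks like, here is a small sketch of a helper written only against `String`-style aliases. The aliases in the `demo` namespace are hypothetical stand-ins so the snippet compiles without the library; in a real project they would come from the library headers. The helper `countDigits` is also invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

namespace demo {

// Hypothetical stand-ins for the library's aliases (assumption: the real
// aliases switch on the same compile definition).
#if defined(ERBSLAND_RE_USE_U8STRING)
using String = std::u8string;
using StringView = std::u8string_view;
#else
using String = std::string;
using StringView = std::string_view;
#endif

// Character type of the active configuration: char8_t or char.
using Char = StringView::value_type;

// Count ASCII digits. The function never names std::string or
// std::u8string, so it compiles unchanged in both configurations.
constexpr std::size_t countDigits(StringView text) noexcept {
    std::size_t count = 0;
    for (const Char c : text) {
        if (c >= static_cast<Char>('0') && c <= static_cast<Char>('9')) {
            ++count;
        }
    }
    return count;
}

} // namespace demo
```

Because the digit comparison casts through `Char`, the same source works whether the code unit type is `char` or `char8_t`.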
String and Character Literals
In std::u8string mode, the library expects UTF-8 string literals of the form u8"...", which have the type const char8_t[N].
Prefer using the provided macros when you need a literal that should compile in both modes:
using namespace el::re;
auto pattern = ERBSLAND_RE_STRING_LITERAL("\\d+");
auto text = String{ERBSLAND_RE_STRING_LITERAL("abc 123 xyz")};
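For readers curious how a dual-mode literal macro can work, the sketch below shows one common technique: pasting the `u8` prefix onto the literal with the preprocessor. This is an illustration under assumptions, not the library's actual implementation; `DEMO_STRING_LITERAL`, the `demo` aliases, and both helper functions are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical macro: prefixes the literal with u8 in UTF-8 mode and leaves
// it untouched otherwise. The library's own macro may be defined differently.
#if defined(ERBSLAND_RE_USE_U8STRING)
#define DEMO_STRING_LITERAL(text) u8##text
namespace demo {
using String = std::u8string;
using StringView = std::u8string_view;
} // namespace demo
#else
#define DEMO_STRING_LITERAL(text) text
namespace demo {
using String = std::string;
using StringView = std::string_view;
} // namespace demo
#endif

namespace demo {

// Build an owning string from a literal that compiles in both modes.
inline String makeSampleText() {
    return String{DEMO_STRING_LITERAL("abc 123 xyz")};
}

// A view over the literal; no copy is made.
constexpr StringView samplePattern() noexcept {
    return StringView{DEMO_STRING_LITERAL("\\d+")};
}

} // namespace demo
```

The token paste `u8##"..."` yields the single token `u8"..."`, so the literal has type `const char8_t[N]` in UTF-8 mode and `const char[N]` otherwise.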
If you only target the std::u8string configuration, use the C++20 u8"..." literals directly:
auto text = std::u8string{u8"abc 123 xyz"};