Goals

Minimal Dependencies

Requires only C++20 and the standard library, nothing else.

Secure

Timeouts, limits, and strict checking of patterns and input make this library suitable for safely working with patterns and text of unknown origin.

Overview

The primary goals of this modern C++ regular expression engine are security, robustness, and a clean, understandable codebase.

Secure by Design

Security was a core focus from the start:

  • The engine is implemented in plain, explicit C++, avoiding code generation or macro-based lexers.

  • The code is written defensively, with comprehensive error and limit checks.

  • It is designed to safely handle data from untrusted sources, making it suitable for embedded, backend, or toolchain usage.

This approach ensures predictable behavior and minimizes the risk of memory or parsing vulnerabilities. This, of course, comes at the cost of performance. We did our best to build a fast and modern engine, but not at the expense of security. Therefore, if security and minimal dependencies are your main focus, this is the right engine for you. If regular expression matching is the main bottleneck in your application, this is definitely not your first choice.

Dependency-Free

The regular expression engine is intentionally designed with zero external dependencies:

  • No third-party libraries are required.

  • Safe UTF-8 string handling, with support for Unicode character properties and simple case folding, without depending on any Unicode library.

  • Easy to integrate into any C++ project.
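
To illustrate what strict UTF-8 handling without a Unicode library can look like, here is a minimal sketch that uses only the standard library. The names decode_utf8 and is_valid_utf8 are hypothetical and are not part of this engine's API. The sketch rejects truncated sequences, stray continuation bytes, overlong encodings, surrogates, and code points above U+10FFFF.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper for illustration only (not this library's API).
// Decodes one code point starting at `pos` and advances `pos` on success.
// Returns std::nullopt on any encoding error and leaves `pos` unchanged.
std::optional<char32_t> decode_utf8(std::string_view s, std::size_t& pos) {
    if (pos >= s.size()) return std::nullopt;
    const auto b0 = static_cast<unsigned char>(s[pos]);
    std::size_t len = 0;   // total bytes in this sequence
    char32_t cp = 0;       // code point being assembled
    char32_t min = 0;      // smallest value allowed for this length
    if (b0 < 0x80)                { len = 1; cp = b0;        min = 0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
    else return std::nullopt;  // stray continuation byte or invalid lead byte

    if (len > s.size() - pos) return std::nullopt;  // truncated sequence
    for (std::size_t i = 1; i < len; ++i) {
        const auto b = static_cast<unsigned char>(s[pos + i]);
        if ((b & 0xC0) != 0x80) return std::nullopt;  // not a continuation byte
        cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min) return std::nullopt;                      // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return std::nullopt;  // surrogate
    if (cp > 0x10FFFF) return std::nullopt;                 // beyond Unicode
    pos += len;
    return cp;
}

// Returns true only if the whole string is well-formed UTF-8.
bool is_valid_utf8(std::string_view s) {
    std::size_t pos = 0;
    while (pos < s.size()) {
        if (!decode_utf8(s, pos)) return false;
    }
    return true;
}
```

Reporting an error instead of substituting a replacement character keeps malformed input from ever reaching the matcher, which is in line with the strict checking described here.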

Modern, Readable C++

Written in modern C++ (up to C++20), the codebase is:

  • Compatible with a wide range of C++ compilers

  • Cleanly structured for readability and extensibility

  • Explicit rather than clever, emphasizing clarity over abstraction

We want you to be able to fully understand every part of this library's code. The code is also kept well structured, so you can include it in code reviews for security certifications.

Security Over Performance

Traditionally, regular expression engines are optimized for maximum performance. For that reason, many engines assume that all input comes from trusted sources. In contrast, our engine does not trust any input at all.

  • The UTF-8 encoding of patterns and matched input is checked for encoding errors.

  • Unusual or ambiguous pattern constructs cause errors (e.g. unknown escape sequences).

  • The engine has customizable, compiled-in memory limits and timeouts to prevent denial-of-service (DoS) attacks.
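
The idea behind limits and timeouts can be sketched as a budget that every matching step has to draw from. The example below is an assumption-laden illustration, not this engine's implementation: MatchBudget, Result, and glob_match are hypothetical names, and the matcher is a toy wildcard matcher (literal characters plus `*`) chosen only because naive backtracking makes it easy to trigger the limit.

```cpp
#include <chrono>
#include <cstddef>
#include <string_view>

// Hypothetical budget type for illustration (not this library's API).
// Combines a hard step limit with a wall-clock deadline.
class MatchBudget {
public:
    using Clock = std::chrono::steady_clock;

    MatchBudget(std::size_t max_steps, std::chrono::milliseconds timeout)
        : max_steps_(max_steps), deadline_(Clock::now() + timeout) {}

    // Call once per unit of work. Returns false once the budget is spent.
    bool consume_step() {
        if (exhausted_) return false;
        if (steps_used_ >= max_steps_) { exhausted_ = true; return false; }
        ++steps_used_;
        // Reading the clock is comparatively slow, so check it only periodically.
        if (steps_used_ % kClockCheckInterval == 0 && Clock::now() >= deadline_) {
            exhausted_ = true;
            return false;
        }
        return true;
    }

private:
    static constexpr std::size_t kClockCheckInterval = 256;
    std::size_t max_steps_;
    std::size_t steps_used_ = 0;
    Clock::time_point deadline_;
    bool exhausted_ = false;
};

enum class Result { Match, NoMatch, LimitExceeded };

// Toy backtracking matcher: literal characters plus '*' (any run of characters).
// Every recursive call draws one step from the budget.
Result glob_match(std::string_view pattern, std::string_view text, MatchBudget& budget) {
    if (!budget.consume_step()) return Result::LimitExceeded;
    if (pattern.empty()) return text.empty() ? Result::Match : Result::NoMatch;
    if (pattern.front() == '*') {
        // Try every possible length for the run matched by '*'.
        for (std::size_t i = 0; i <= text.size(); ++i) {
            const Result r = glob_match(pattern.substr(1), text.substr(i), budget);
            if (r != Result::NoMatch) return r;  // propagate Match and LimitExceeded
        }
        return Result::NoMatch;
    }
    if (!text.empty() && pattern.front() == text.front())
        return glob_match(pattern.substr(1), text.substr(1), budget);
    return Result::NoMatch;
}
```

With a generous budget, glob_match("a*c", "abbbc", budget) finds a match after a handful of steps. A pathological pattern such as "*a*a*a*a*a*a*a*a*b" against a long run of 'a' characters returns Result::LimitExceeded instead of running for an unbounded time. Returning a distinct result for an exceeded limit lets the caller tell "no match" apart from "gave up".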