Modes and Flags
Flags control how the regex engine parses and matches a pattern. You can set flags either
programmatically via erbsland::re::Flags or inline as part of the pattern.
Inline flags can apply in two different scopes:
Global flags apply to the entire pattern and are only accepted at the very start.
Local flags apply only within a specific group and must use group syntax.
This strict separation avoids ambiguous flag lifetime and makes the effective matching rules explicit at every position in the pattern.
Standard Flags
(?i)
Enables case-insensitive matching.
Applies to literal characters and character classes.
Unicode categories such as \p{Uppercase_Letter} are not affected; they always keep their literal meaning (see Character Types).
This form is accepted only at the start of the pattern.
(?i)error:\s+file\s+not\s+found
Use the grouped form (?i:…) to limit case-insensitive matching to a specific part of the pattern.
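To illustrate the difference between the global and the grouped form, the following sketch uses Python's built-in re module as a stand-in, because it accepts the same (?i) and (?i:…) syntax for these patterns. It is not this engine's API, and details such as case-folding rules may differ. The helper name matches is hypothetical.

```python
import re

# Stand-in engine: Python's re module, not erbsland::re.
# Global form: the whole pattern ignores case.
GLOBAL_CI = re.compile(r"(?i)error:\s+file\s+not\s+found")
# Grouped form: only the word "error" ignores case.
LOCAL_CI = re.compile(r"(?i:error):\s+file\s+not\s+found")


def matches(regex: re.Pattern, text: str) -> bool:
    """Return True if the entire text matches the pattern."""
    return regex.fullmatch(text) is not None
```

With these definitions, GLOBAL_CI accepts "ERROR: File Not Found", while LOCAL_CI accepts "Error: file not found" but rejects "Error: FILE not found".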
(?m)
Enables multi-line mode.
With this flag, ^ and $ match the start and end of each line instead of the entire input.
Line boundaries are based on line feeds. If you need Windows-style CRLF handling, enable
Flag::CRLF programmatically.
This form is accepted only at the start of the pattern.
(?m)^ERROR:.*$
Use (?m:…) for a local override.
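The effect of multi-line mode on ^ and $ can be sketched with Python's built-in re module, used here only as a stand-in that handles this pattern the same way for line-feed separated input. It is not this engine's API, and the sample log text is made up for illustration.

```python
import re

# Stand-in engine: Python's re module, not erbsland::re.
LOG = "INFO: started\nERROR: disk full\nINFO: retrying\nERROR: timeout"

# With (?m), ^ and $ anchor at every line, so each ERROR line is found.
multi_line = re.findall(r"(?m)^ERROR:.*$", LOG)

# Without the flag, ^ only matches at the very start of the input.
single_line = re.findall(r"^ERROR:.*$", LOG)
```

Here multi_line holds the two ERROR lines, while single_line stays empty because the input does not begin with ERROR.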
(?s)
Enables dot-all mode.
The dot (.) also matches line feeds (\n).
The whitespace class \s also includes line feeds in this mode, keeping whitespace handling consistent with the dot.
This form is accepted only at the start of the pattern.
(?s)<tag>.*?</tag>
Use (?s:…) for a local override.
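A short sketch of the dot behavior, again using Python's built-in re module as a stand-in rather than this engine's API. It only demonstrates the dot; the \s behavior described above is specific to this engine and is not shown.

```python
import re

# Stand-in engine: Python's re module, not erbsland::re.
TEXT = "<tag>first line\nsecond line</tag>"

# With (?s), the dot also matches the line feed inside the element.
with_dot_all = re.search(r"(?s)<tag>.*?</tag>", TEXT)

# Without it, the dot stops at the line feed and the search fails.
without_dot_all = re.search(r"<tag>.*?</tag>", TEXT)
```

In this sketch with_dot_all spans the whole text, and without_dot_all is None.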
(?x)
Enables verbose mode.
Whitespace is ignored while parsing the pattern.
A # character starts a comment that runs until the end of the line.
Inside character classes, whitespace and # retain their literal meaning.
Verbose mode is especially useful for long or complex patterns.
(?x)
^\s* # optional leading whitespace
ERROR # literal keyword
\s*:\s*
(?i:file) # case-insensitive match for "file"
\s+not\s+found
$
To match literal whitespace in verbose mode, place it inside a character class (for example [ ]) or escape it.
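The verbose pattern above can be compared with its compact equivalent. The sketch uses Python's built-in re module as a stand-in, since it treats whitespace and # comments the same way for this pattern. It is not this engine's API, and the sample strings are invented for illustration.

```python
import re

# Stand-in engine: Python's re module, not erbsland::re.
# The (?x) flag must be the first thing in the pattern string.
VERBOSE = re.compile(r"""(?x)
    ^\s*            # optional leading whitespace
    ERROR           # literal keyword
    \s*:\s*
    (?i:file)       # case-insensitive match for "file"
    \s+not\s+found
    $
""")

# The same pattern written without verbose mode.
COMPACT = re.compile(r"^\s*ERROR\s*:\s*(?i:file)\s+not\s+found$")

SAMPLES = [
    "  ERROR: FILE not found",   # only "file" may vary in case
    "ERROR : file  not found",   # extra whitespace is allowed by \s* and \s+
    "ERROR: file NOT found",     # "not" is still case-sensitive
]
verbose_results = [VERBOSE.search(s) is not None for s in SAMPLES]
compact_results = [COMPACT.search(s) is not None for s in SAMPLES]
```

Both forms give the same result for every sample: the first two match and the third does not.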
Inline Flag Sets
(?flags)
Enables multiple flags at once (global only).
Allowed flags: i, m, s, x, a, u.
This form is accepted only at the very start of the pattern.
A maximum of 10 flag characters is allowed.
Examples:
(?ims) enables case-insensitive, multi-line, and dot-all mode.
(?u) explicitly switches back to Unicode mode.
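The rules for a global flag set can be sketched as a small checker. This is a hypothetical helper written in Python, not the engine's parser: the name leading_global_flags and the error messages are assumptions. It models only the rules listed in this section, namely the allowed flag characters and the limit of 10.

```python
import re

ALLOWED_GLOBAL_FLAGS = set("imsxau")
MAX_FLAG_CHARS = 10

# Matches a "(?letters)" group; used with match(), so only at position 0.
_LEADING_FLAGS = re.compile(r"\(\?([A-Za-z]+)\)")


def leading_global_flags(pattern: str) -> str:
    """Return the letters of a leading (?flags) group, or "" if there is none.

    Raises ValueError when the group breaks one of the documented rules.
    """
    found = _LEADING_FLAGS.match(pattern)
    if found is None:
        return ""
    flags = found.group(1)
    if len(flags) > MAX_FLAG_CHARS:
        raise ValueError(f"more than {MAX_FLAG_CHARS} flag characters: {flags!r}")
    unknown = sorted(set(flags) - ALLOWED_GLOBAL_FLAGS)
    if unknown:
        raise ValueError(f"unsupported flag(s): {''.join(unknown)}")
    return flags
```

For example, leading_global_flags("(?ims)^a.b$") returns "ims", a pattern without a leading flag group returns an empty string, and "(?iz)abc" raises ValueError.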
Local Flag Overrides
(?flags:…)
Applies or removes flags only within the group.
Flags not listed are inherited from the surrounding context.
A single minus sign - switches from enabling to disabling flags.
Only i, m, s, and x may be disabled.
ASCII and Unicode mode are switched explicitly using a and u.
Examples:
(?i:error:\s+(?-i:FILE)\s+not\s+found)
header:(?m:^.+$)
(?a:\w+)
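The rules for the flag part of a local override can be sketched as follows. This is a hypothetical Python helper, not the engine's parser: the name split_local_flags and the error messages are assumptions, and only the rules stated above are modelled.

```python
SWITCHABLE = set("imsx")   # may be enabled or disabled
MODE_FLAGS = set("au")     # ASCII / Unicode mode, can only be selected


def split_local_flags(spec: str) -> tuple:
    """Split the flag part of (?flags:…) into (enabled, disabled) sets.

    `spec` is the text between "(?" and ":", for example "i", "-i" or "im-s".
    """
    if spec.count("-") > 1:
        raise ValueError("only a single minus sign is allowed")
    # Everything before the minus is enabled, everything after is disabled.
    enable_part, _, disable_part = spec.partition("-")
    enabled, disabled = set(enable_part), set(disable_part)
    unknown = (enabled | disabled) - SWITCHABLE - MODE_FLAGS
    if unknown:
        raise ValueError(f"unsupported flag(s): {''.join(sorted(unknown))}")
    if disabled & MODE_FLAGS:
        raise ValueError("a and u cannot be disabled; select the mode with a or u")
    return enabled, disabled
```

With this sketch, "im-sx" yields the enabled set {i, m} and the disabled set {s, x}, while "i-a" and "i-m-s" are rejected. Flags that appear in neither set are the ones inherited from the surrounding context.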
Unsupported Standalone Flag Removal
Standalone flag removal without a group is not supported.
The following form is rejected:
(?-i)
This is intentional. Without a group boundary, it would be unclear where the disabled flag should stop applying.
Always use the grouped form instead:
(?-i:…) disables a flag only within that group.
This design ensures that every flag change has a clear and explicit scope.
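As a point of comparison, Python's built-in re module also rejects standalone removal and accepts the grouped form. The sketch below uses it as a stand-in to show the grouped form in action. It is not this engine's API, and the helper name compiles is hypothetical.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# Standalone removal inside a case-insensitive group is rejected.
standalone_ok = compiles(r"(?i:error:\s+(?-i)FILE\s+not\s+found)")

# The grouped form states exactly where case sensitivity is restored.
SCOPED = re.compile(r"(?i:error:\s+(?-i:FILE)\s+not\s+found)")
scoped_results = [
    SCOPED.fullmatch(text) is not None
    for text in ("Error: FILE Not Found", "Error: file Not Found")
]
```

Here standalone_ok is False, and only the first sample matches, because FILE must appear in upper case while the rest of the text ignores case.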
Legacy and Compatibility Flags
These flag toggles exist primarily for compatibility with engines that default to ASCII or allow implicit mode switches. This engine is Unicode-first.
ASCII Mode
(?a) enables ASCII mode and restricts character escapes such as \d, \s, and \w (and their negations) to ASCII semantics.
See Character Types for the exact definitions.
Unicode Mode
(?u) disables ASCII mode and returns to Unicode semantics.
This is the only supported way to leave ASCII mode. The parser rejects -a and -u in flag lists to avoid ambiguous transitions.
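The practical effect of ASCII mode can be sketched with Python's built-in re module, which is also Unicode-first for text patterns and accepts the same (?a) syntax. It serves only as a stand-in here. The exact Unicode definitions of \d and \w in this engine are given in Character Types and may differ from Python's.

```python
import re

# Stand-in engine: Python's re module, not erbsland::re.
# ASCII digits followed by the Arabic-Indic digits four and two.
DIGITS = "42 \u0664\u0662"

unicode_digits = re.findall(r"\d+", DIGITS)      # default Unicode semantics
ascii_digits = re.findall(r"(?a)\d+", DIGITS)    # ASCII semantics, whole pattern

# The grouped form limits ASCII semantics to one part of the pattern.
WORD = "caf\u00e9"
unicode_word = re.fullmatch(r"\w+", WORD)
ascii_word = re.fullmatch(r"(?a:\w+)", WORD)
```

In this sketch unicode_digits contains both digit runs and ascii_digits only "42". The word "café" matches \w+ under Unicode semantics but not inside (?a:…).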
Differences from Other Regex Engines
Global inline flags are only accepted at the start of the pattern. Many engines allow them anywhere and apply them from that point onward.
Flag changes must always be scoped explicitly using groups. Standalone removal like (?-i) is not accepted.
ASCII and Unicode mode are switched explicitly using (?a) and (?u), rather than by negating a flag as in -a or -u.
These rules favor explicit scope, readability, and safe refactoring over implicit or positional flag behavior.