Quoting

Quoting escape sequences let you insert literal characters that would otherwise have a special meaning in the pattern syntax.

They are used whenever you want to match a character as itself rather than invoking its syntactic function (for example, matching . instead of “any character”).
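As a minimal sketch of the difference, the example below uses Python's built-in re module purely as a stand-in. It is not this library's API, but the unquoted and quoted dot behave the same way there.

```python
import re

# Python's re module is only a stand-in here, not this library's API.
unquoted = re.compile(r"a.c")   # . matches any single character
quoted = re.compile(r"a\.c")    # \. matches only a literal dot

candidates = ("a.c", "abc")
unquoted_hits = [s for s in candidates if unquoted.fullmatch(s)]
quoted_hits = [s for s in candidates if quoted.fullmatch(s)]

print(unquoted_hits)  # ['a.c', 'abc']
print(quoted_hits)    # ['a.c']
```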

Strict Escaping

This library uses strict escaping rules:

  • Only defined escape sequences are accepted.

  • Unknown, incomplete, or malformed escape sequences raise a parse error.

This differs from many other regex engines, which silently accept unknown escapes and either drop the backslash or treat the escaped character as a literal.

Strict escaping is intentional and has important benefits:

  • Typos are caught immediately instead of silently changing pattern behavior.

  • Patterns remain readable and reviewable because every escape has a defined meaning.

  • There is no ambiguity between “quoted literal” and “unknown syntax”.

In short, a pattern either means exactly what it says or it fails fast.
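The fail-fast rule can be sketched as a small checker. This is a hypothetical illustration, not this library's parser: the names PatternSyntaxError and check_quoting_escapes are invented here, and the checker only knows the literal quoting escapes listed in this section, whereas the real library presumably defines further escapes elsewhere.

```python
# Hypothetical sketch of strict escaping; not this library's parser.
# Only the literal quoting escapes from this section are treated as defined.
QUOTING_ESCAPES = set('\\."\'#<>+*?{}$^()|[]')


class PatternSyntaxError(ValueError):
    """Raised for an unknown or incomplete escape sequence (invented name)."""


def check_quoting_escapes(pattern: str) -> None:
    """Raise PatternSyntaxError on the first escape that is not defined."""
    i = 0
    while i < len(pattern):
        if pattern[i] != "\\":
            i += 1
            continue
        if i + 1 >= len(pattern):
            raise PatternSyntaxError(f"incomplete escape at position {i}")
        if pattern[i + 1] not in QUOTING_ESCAPES:
            raise PatternSyntaxError(
                f"unknown escape \\{pattern[i + 1]} at position {i}"
            )
        i += 2  # skip the backslash and the quoted character


for candidate in (r"a\.b", r"a\qb", "abc\\"):
    try:
        check_quoting_escapes(candidate)
        print(f"{candidate!r}: accepted")
    except PatternSyntaxError as exc:
        print(f"{candidate!r}: rejected ({exc})")
```

The point of the design is that a typo such as \q surfaces as an error at parse time instead of quietly matching a literal q.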

Literal Quoting Escapes

The following escape sequences insert literal characters that would otherwise be interpreted as operators, anchors, or structural syntax.

Expression    Description

\\            Inserts a literal backslash character.
\.            Inserts a literal dot character. Without quoting, . matches any character.
\"            Inserts a literal double-quote character.
\'            Inserts a literal single-quote character.
\#            Inserts a literal # character. Without quoting, # starts a comment in verbose mode (see Special Characters).
\< / \>       Inserts a literal < or > character.
\+            Inserts a literal plus character. Without quoting, + is a quantifier.
\*            Inserts a literal asterisk character. Without quoting, * is a quantifier.
\?            Inserts a literal question mark character. Without quoting, ? is a quantifier and part of special group syntax.
\{            Inserts a literal opening brace. Without quoting, { starts a counted quantifier.
\}            Inserts a literal closing brace.
\$            Inserts a literal dollar sign. Without quoting, $ is an anchor.
\^            Inserts a literal caret character. Without quoting, ^ is an anchor.
\(            Inserts a literal opening parenthesis. Without quoting, ( starts a group.
\)            Inserts a literal closing parenthesis.
\|            Inserts a literal pipe character. Without quoting, | is the alternation operator.
\[            Inserts a literal opening bracket. Without quoting, [ starts a character class.
\]            Inserts a literal closing bracket.

As a general rule: if a character has a syntactic meaning outside a character class, it must be quoted to be matched literally.
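When literal text comes from outside the pattern (a file name, a user-supplied search term), explicit quoting can be applied mechanically. The helper below is a minimal sketch under two assumptions: quote_literal is a hypothetical name, not a function this library provides, and it quotes every character from the table above, on the premise that a defined quoting escape is always safe to use outside a character class.

```python
# Hypothetical helper; not part of this library.
# Characters that have a literal quoting escape in the table above.
QUOTABLE = set('\\."\'#<>+*?{}$^()|[]')


def quote_literal(text: str) -> str:
    """Return text with every quotable character prefixed by a backslash."""
    return "".join("\\" + ch if ch in QUOTABLE else ch for ch in text)


print(quote_literal("1+1=2?"))      # 1\+1=2\?
print(quote_literal("report.txt"))  # report\.txt
```

Each quoted character stays visible in the resulting pattern, which keeps the output as reviewable as a hand-written pattern.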

Quoting Inside Character Classes

Inside character classes ([...]), many characters lose their special meaning.

However, some characters still require care:

  • ] must be escaped to be used literally.

  • ^ is literal only when it is not the first character in the class.

  • - must be escaped or placed carefully to avoid creating a range.

See Character Classes for the exact rules.
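The three cases can be illustrated with Python's built-in re module, used here only as a stand-in. It happens to follow the same conventions for these particular patterns, but this library's exact rules are the ones given in Character Classes and may differ in detail.

```python
import re

# Python's re module is only a stand-in here, not this library's API.
def matches(pattern: str, text: str) -> bool:
    return re.fullmatch(pattern, text) is not None


escaped_bracket = matches(r"[\]]", "]")   # \] is a literal ] inside the class
caret_not_first = matches(r"[a^]", "^")   # ^ after the first position is literal
caret_first = matches(r"[^a]", "a")       # a leading ^ negates the class
escaped_hyphen = matches(r"[a\-z]", "m")  # only a, - and z are in the class
range_hyphen = matches(r"[a-z]", "m")     # an unescaped - forms the range a..z

print(escaped_bracket, caret_not_first, caret_first, escaped_hyphen, range_hyphen)
# True True False False True
```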

Legacy and Compatibility Expressions

Expression    Description

\Q ... \E     Legacy quoting mode found in other engines. Accepted for compatibility but not recommended; prefer explicit quoting as described above.

Some regex engines (notably PCRE and Java) provide a quoting mechanism delimited by \Q and \E. The \Q sequence switches the parser into a mode where almost everything is treated as literal text until \E is encountered.

This library deliberately discourages this style in new patterns.

Why \Q\E is discouraged

While convenient at first glance, broad quoting modes have several drawbacks:

  • They are less explicit than quoting individual characters, which makes patterns harder to review and audit.

  • They can introduce surprising edge cases when the quoted text itself needs to contain \E, because that sequence ends the quoted span.

  • They encourage treating patterns as opaque strings instead of structured syntax, which conflicts with the goal of a clear and predictable pattern language.
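For patterns that still use the legacy form, one way to migrate is to rewrite each quoted span into explicit per-character quoting. The following is a minimal sketch, not a tool shipped with this library: expand_quoted_spans and quote_literal are hypothetical names, the set of quotable characters is taken from the literal quoting table above, and rejecting an unterminated \Q is a choice made by this sketch.

```python
# Hypothetical migration helper; not part of this library.
QUOTABLE = set('\\."\'#<>+*?{}$^()|[]')


def quote_literal(text: str) -> str:
    """Prefix every quotable character with a backslash."""
    return "".join("\\" + ch if ch in QUOTABLE else ch for ch in text)


def expand_quoted_spans(pattern: str) -> str:
    """Rewrite each \\Q...\\E span as explicitly quoted characters."""
    out = []
    i = 0
    n = len(pattern)
    while i < n:
        ch = pattern[i]
        if ch != "\\" or i + 1 >= n:
            out.append(ch)  # ordinary character or trailing backslash
            i += 1
        elif pattern[i + 1] == "Q":
            end = pattern.find("\\E", i + 2)
            if end == -1:
                # Sketch-level choice: fail fast on an unterminated span.
                raise ValueError(f"\\Q at position {i} has no matching \\E")
            out.append(quote_literal(pattern[i + 2:end]))
            i = end + 2
        else:
            out.append(pattern[i:i + 2])  # keep any other escape untouched
            i += 2
    return "".join(out)


print(expand_quoted_spans(r"^\Qa.b(c)\E$"))  # ^a\.b\(c\)$
```

The \E edge case shows up directly: for the input \Qa\Eb\E the first \E already closes the span, so the sketch returns ab\E with a stray \E left over, which is not what the author of such a pattern most likely intended.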