The Match Interface
A successful regular expression match is represented by a match object.
To make match handling explicit and safe, the library intentionally splits match results into two dedicated interfaces:
Match – An owning match. All captured text is owned by the match object and remains valid for its entire lifetime.
MatchView – A view-based match. All captured text is only referenced and depends on the lifetime of the original input string.
Both interfaces share the same positional and group-based API via the common base interface MatchBase, but they differ intentionally in how match content can be accessed.
Match provides content methods.
MatchView provides contentView methods.
This design prevents a common and subtle pitfall: switching matching code from an owning API to a view-based API without noticing the lifetime implications.
For example, if code is changed from match to matchView and both returned the same interface, the code would still compile, but it might suddenly introduce undefined behavior if the input string does not outlive the match result.
By requiring two distinct interfaces, switching between owning and view-based matches becomes a conscious and visible decision.
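The effect of using two differently named accessors can be sketched with plain standard-library types. The types `OwningResult` and `ViewResult`, the trait `HasContent` and the helper `copyContent` below are hypothetical stand-ins invented for this illustration; they are not the library's classes and only mimic the naming split between `content` and `contentView`.

```cpp
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins, NOT the library's classes.
struct OwningResult {
    std::string text; // owns a copy of the captured text
    std::string_view content() const { return text; }
};

struct ViewResult {
    std::string_view text; // only refers to the caller's input
    std::string_view contentView() const { return text; }
};

// Detects at compile time whether T offers a content() method.
template<typename T, typename = void>
struct HasContent : std::false_type {};

template<typename T>
struct HasContent<T, std::void_t<decltype(std::declval<const T &>().content())>>
    : std::true_type {};

// Code written against the owning result. Passing a ViewResult here fails
// to compile, so the switch to a view cannot happen silently.
template<typename T>
std::string copyContent(const T &result) {
    static_assert(HasContent<T>::value, "a view-based result needs an explicit contentView() call");
    return std::string{result.content()};
}
```

In this sketch, calling `copyContent` with a `ViewResult` is rejected by the compiler, which is the kind of visible decision point the paragraph above describes.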
Obtaining a Match
You typically obtain match objects from RegEx:
match, fullMatch, findFirst, findAll, collectAll return Match.
matchView, fullMatchView, findFirstView, findAllView, collectAllView return MatchView.
replaceAll uses MatchView when invoking the replacement callback.
Match Lifetime and Ownership
All matching functions return shared pointers to match objects. This allows match results to be copied and passed around cheaply while clearly expressing shared ownership.
The lifetime rules depend on how the match was created:
Match objects own their matched text internally. They are fully self-contained and safe to use independently.
MatchView objects only reference the original input text. You must ensure that this text remains alive for as long as the match object is used.
Violating these lifetime rules results in undefined behavior. If you are unsure which variant to use, prefer the owning APIs.
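The lifetime rules can be illustrated with a small sketch built only on the standard library. `ToyOwningMatch`, `ToyViewMatch` and the factory functions are hypothetical names made up for this example; they assume nothing about how the library creates its match objects and only model "owns a copy" versus "refers to the input".

```cpp
#include <memory>
#include <string>
#include <string_view>

// Hypothetical stand-ins, NOT the library's classes.
struct ToyOwningMatch {
    std::string text; // private copy of the matched text
    std::string_view content() const { return text; }
};

struct ToyViewMatch {
    std::string_view text; // refers to memory owned by the caller
    std::string_view contentView() const { return text; }
};

std::shared_ptr<const ToyOwningMatch> makeOwning(const std::string &input) {
    return std::make_shared<ToyOwningMatch>(ToyOwningMatch{input});
}

std::shared_ptr<const ToyViewMatch> makeView(const std::string &input) {
    return std::make_shared<ToyViewMatch>(ToyViewMatch{std::string_view{input}});
}

// Safe: the owning result keeps its own copy, so it may outlive the input.
std::string owningOutlivesInput() {
    std::shared_ptr<const ToyOwningMatch> result;
    {
        const std::string input = "temporary input";
        result = makeOwning(input);
    } // `input` is destroyed here; `result` is still self-contained.
    return std::string{result->content()};
}

// Safe only because `input` stays in scope for every use of the view.
std::string viewWhileInputAlive() {
    const std::string input = "long-lived input";
    const auto result = makeView(input);
    const auto copy = result; // copying the shared pointer is cheap
    return std::string{copy->contentView()};
}
```

Moving the `makeView` call into the inner scope of `owningOutlivesInput` would leave the view dangling, which is the undefined behavior the rules above warn about.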
Accessing the Match Data
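The shared base interface MatchBase provides access to positional and structural information:
begin, end and range return positions in the input text.
group returns the capture group definition.
groupCount, hasGroupIndex and hasGroupName describe which groups exist.
Access to the matched content depends on the concrete type:
Match provides content.
MatchView provides contentView.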
Parameters
Most access methods are available in three variants:
No parameter – Addresses the entire match. This is equivalent to accessing capture group zero.
Group Index – Accesses a capture group by its numeric index. Group zero refers to the whole match; group indices starting from one refer to individual capture groups.
Group Name – If the pattern defines named capture groups, they can be accessed using their names.
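The three-variant pattern can be sketched as a set of overloads. `ToyGroups` and `makeDateGroups` are hypothetical names used only for illustration; the class stores start positions in a vector and is not how the library implements its match objects. The out-of-range behavior follows the `std::out_of_range` exceptions listed in the reference further down this page.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical toy class, NOT the library's MatchBase.
class ToyGroups {
public:
    using NameMap = std::map<std::string, std::size_t, std::less<>>;

    ToyGroups(std::vector<std::size_t> begins, NameMap names)
        : _begins{std::move(begins)}, _names{std::move(names)} {}

    // No parameter: same as group zero, the entire match.
    std::size_t begin() const { return begin(std::size_t{0}); }

    // Group index: vector::at throws std::out_of_range for invalid indices.
    std::size_t begin(std::size_t groupIndex) const { return _begins.at(groupIndex); }

    // Group name: resolved to an index first.
    std::size_t begin(std::string_view groupName) const {
        const auto it = _names.find(groupName);
        if (it == _names.end()) {
            throw std::out_of_range{"unknown group name"};
        }
        return begin(it->second);
    }

private:
    std::vector<std::size_t> _begins;
    NameMap _names;
};

// Models a date pattern with groups "year" and "month" matched in
// "on 2024-05": the full match and "year" start at 3, "month" starts at 8.
ToyGroups makeDateGroups() {
    return ToyGroups{{3, 3, 8}, {{"year", 1}, {"month", 2}}};
}
```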
Begin, End and Range
The begin, end and range methods return positions into the input text.
The unit used for these positions depends on the type of the matched string.
For UTF-8 encoded strings, positions are expressed in bytes. This allows
you to use them directly with standard library facilities such as substr or
file seek operations.
The end position always points to the first byte after the captured range.
If a capture group did not participate in the match, begin and end return zero.
For UTF-16 and UTF-32 strings, see the section at the end of this page.
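To make the byte-based positions concrete, here is a minimal sketch that uses `std::string::find` in place of a regular expression match. The helper names `findByteRange`, `sliceByRange` and `sampleUtf8Text` are assumptions made for this example, and returning `{0, 0}` for a missing needle is merely a convention of the sketch.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Returns the byte offsets (begin inclusive, end exclusive) of the first
// occurrence of `needle`, or {0, 0} if it is not found.
std::pair<std::size_t, std::size_t> findByteRange(const std::string &text, const std::string &needle) {
    const auto begin = text.find(needle);
    if (begin == std::string::npos) {
        return {0, 0};
    }
    return {begin, begin + needle.size()};
}

// Byte positions can be passed straight to substr.
std::string sliceByRange(const std::string &text, std::pair<std::size_t, std::size_t> range) {
    return text.substr(range.first, range.second - range.first);
}

// "Grüße, Welt" encoded as UTF-8: 11 characters, but 13 bytes,
// because "ü" and "ß" take two bytes each.
std::string sampleUtf8Text() {
    return "Gr\xC3\xBC\xC3\x9F" "e, Welt";
}
```

In this sample, "Welt" is the eighth character but starts at byte 9, so a position counted in characters would slice the wrong part of the string.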
Accessing the Content
Use content (owning) or contentView (view-based) to access the captured text.
If a capture group was not matched, an empty string or empty view is returned.
Thread Safety
Match objects are immutable after creation. As a result, it is safe to read from a match concurrently from multiple threads, provided the documented lifetime rules are respected.
UTF-16 and UTF-32 Strings
When matching UTF-16 or UTF-32 strings, the corresponding overloads in RegEx return Match16, Match16View, Match32 and Match32View objects.
These types behave exactly like their UTF-8 counterparts, with the key
difference that all returned positions are expressed as indices into
char16_t or char32_t sequences.
For UTF-16 strings containing surrogate pairs, the index refers to the first
char16_t code unit of the character. This mirrors the behaviour of the
UTF-8 variants, where positions always point to the first byte of a
multi-byte character.
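The code-unit indexing for UTF-16 can be sketched with `std::u16string_view::find` as a stand-in for a match. The helper names below are hypothetical and chosen for this example only.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// True if `unit` is the first half (high surrogate) of a surrogate pair.
constexpr bool isHighSurrogate(char16_t unit) {
    return unit >= 0xD800 && unit <= 0xDBFF;
}

// Index, counted in char16_t code units, of the first occurrence of `needle`.
std::size_t codeUnitIndex(std::u16string_view text, std::u16string_view needle) {
    return text.find(needle);
}

// 'a', U+1F600 (stored as the surrogate pair D83D DE00), 'b':
// three characters, but four char16_t code units.
std::u16string sampleUtf16Text() {
    return u"a\U0001F600b";
}
```

Here the character U+1F600 is found at index 1, which is its high surrogate, and the following "b" sits at index 3 rather than 2.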
Interfaces and Types
-
class MatchBase
The abstract base class for regular expression matches.
Subclassed by erbsland::re::Match, erbsland::re::Match16, erbsland::re::Match16View, erbsland::re::Match32, erbsland::re::Match32View, erbsland::re::MatchView
Public Functions
-
virtual InputPosition begin() const
Get the start position of the match.
-
virtual InputPosition begin(std::size_t groupIndex) const
Get the start position of the given group.
- Parameters:
groupIndex – The index of the group.
- Throws:
std::out_of_range – if the group index is invalid.
-
virtual InputPosition begin(StringView groupName) const
Get the start position of the given group.
- Parameters:
groupName – The name of the group.
- Throws:
std::out_of_range – if the group name is invalid.
-
virtual InputPosition end() const
Get the end position of the match. The end position points after the last character of the match.
-
virtual InputPosition end(std::size_t groupIndex) const
Get the end position of the given group. The end position points after the last character of the group.
- Parameters:
groupIndex – The index of the group.
- Throws:
std::out_of_range – if the group index is invalid.
-
virtual InputPosition end(StringView groupName) const
Get the end position of the given group. The end position points after the last character of the group.
- Parameters:
groupName – The name of the group.
- Throws:
std::out_of_range – if the group name is invalid.
-
virtual CaptureRange range() const
Get the range of the match. The range is defined as [begin, end) where begin is inclusive and end is exclusive.
-
virtual CaptureRange range(std::size_t groupIndex) const
Get the range of the given group. The range is defined as [begin, end) where begin is inclusive and end is exclusive.
- Parameters:
groupIndex – The index of the group.
- Throws:
std::out_of_range – if the group index is invalid.
-
virtual CaptureRange range(StringView groupName) const
Get the range of the given group. The range is defined as [begin, end) where begin is inclusive and end is exclusive.
- Parameters:
groupName – The group name.
- Throws:
std::out_of_range – if the group name is invalid.
-
virtual const CaptureGroup &group() const
Get the capture group of the match.
- Returns:
A reference to the capture group instance.
-
virtual const CaptureGroup &group(std::size_t groupIndex) const
Get the capture group for the given group index.
- Parameters:
groupIndex – The index of the group.
- Throws:
std::out_of_range – if the group index is invalid.
- Returns:
A reference to the capture group instance.
-
virtual const CaptureGroup &group(StringView groupName) const
Get the capture group for the given group name.
- Parameters:
groupName – The name of the group.
- Throws:
std::out_of_range – if the group name is invalid.
- Returns:
A reference to the capture group instance.
-
virtual std::size_t groupCount() const noexcept
Get the number of capture groups, including the full match. If there are two capture groups (a)(b), this function will return 3, as the full match counts as capture group zero: ((a)(b)).
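The same counting convention can be reproduced with `std::regex` from the C++ standard library, used here purely as a stand-in for comparison. Its `mark_count()` reports the capture groups without the full match, so one is added. The function name is an assumption made for this sketch.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Number of groups including the full match, computed with std::regex
// (a standard-library stand-in, not this library's RegEx class).
std::size_t groupCountIncludingFullMatch(const std::string &pattern) {
    const std::regex expression{pattern};
    return static_cast<std::size_t>(expression.mark_count()) + 1;
}
```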
-
virtual bool hasGroupIndex(std::size_t groupIndex) const noexcept
Test if the given group index exists.
-
virtual bool hasGroupName(StringView groupName) const noexcept
Test if the given group name exists.
-
-
class Match : public erbsland::re::MatchBase
An owning match result.
The returned views from content() are valid for the lifetime of this object.
Public Functions
-
StringView content() const
Get the full content of the match.
-
StringView content(std::size_t groupIndex) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group index is invalid.
-
StringView content(StringView groupName) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group name is invalid.
-
-
class MatchView : public erbsland::re::MatchBase
A view-based match result.
The returned views from contentView() are only valid while the underlying input data is still alive.
Public Functions
-
StringView contentView() const
Get the full content of the match.
-
StringView contentView(std::size_t groupIndex) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group index is invalid.
-
StringView contentView(StringView groupName) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group name is invalid.
-
-
class Match16 : public erbsland::re::MatchBase
An owning UTF-16 match result.
The returned views from content() are valid for the lifetime of this object.
Public Functions
-
std::u16string_view content() const
Get the full content of the match.
-
std::u16string_view content(std::size_t groupIndex) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group index is invalid.
-
std::u16string_view content(StringView groupName) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group name is invalid.
-
-
class Match16View : public erbsland::re::MatchBase
A UTF-16 view-based match result.
The returned views from contentView() are only valid while the underlying input data is still alive.
Public Functions
-
std::u16string_view contentView() const
Get the full content of the match.
-
std::u16string_view contentView(std::size_t groupIndex) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group index is invalid.
-
std::u16string_view contentView(StringView groupName) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group name is invalid.
-
-
class Match32 : public erbsland::re::MatchBase
An owning UTF-32 match result.
The returned views from content() are valid for the lifetime of this object.
Public Functions
-
std::u32string_view content() const
Get the full content of the match.
-
std::u32string_view content(std::size_t groupIndex) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group index is invalid.
-
std::u32string_view content(StringView groupName) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group name is invalid.
-
-
class Match32View : public erbsland::re::MatchBase
A UTF-32 view-based match result.
The returned views from contentView() are only valid while the underlying input data is still alive.
Public Functions
-
std::u32string_view contentView() const
Get the full content of the match.
-
std::u32string_view contentView(std::size_t groupIndex) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group index is invalid.
-
std::u32string_view contentView(StringView groupName) const
Get the content of the specified group.
- Throws:
std::out_of_range – if the group name is invalid.
-
-
class CaptureRange
Represents a range in the input, defined by begin and end positions.
This class is used to represent captured content ranges in regular expression matches. The range is defined as [begin, end), where begin is inclusive and end is exclusive.
Public Functions
-
inline constexpr CaptureRange(const InputPosition begin, const InputPosition end) noexcept
Create a new capture range with the given begin and end positions.
- Parameters:
begin – The start position of the range (inclusive).
end – The end position of the range (exclusive).
-
inline constexpr bool isEmpty() const noexcept
Test if the range is empty.
- Returns:
True if begin equals end, false otherwise.
-
inline constexpr std::size_t size() const noexcept
Get the size of the range.
- Returns:
The number of positions between begin and end.
-
inline constexpr InputPosition begin() const noexcept
Get the start position of the range.
- Returns:
The begin position (inclusive).
-
inline constexpr InputPosition end() const noexcept
Get the end position of the range.
- Returns:
The end position (exclusive).
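The half-open convention used by this class can be sketched as follows. `HalfOpenRange` is a hypothetical stand-in that uses plain `std::size_t` positions instead of `InputPosition`, and the `contains` helper is an addition for illustration that the class documented here does not list.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a half-open range, NOT the library's CaptureRange.
struct HalfOpenRange {
    std::size_t begin{0}; // inclusive
    std::size_t end{0};   // exclusive

    constexpr bool isEmpty() const noexcept { return begin == end; }
    constexpr std::size_t size() const noexcept { return end - begin; }

    // Illustration only: `end` itself is never part of the range.
    constexpr bool contains(std::size_t position) const noexcept {
        return position >= begin && position < end;
    }

    std::string toString() const {
        return std::to_string(begin) + "-" + std::to_string(end);
    }
};
```

With an exclusive end, the size is simply `end - begin`, and an empty range needs no special value because `begin == end` already expresses it.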
-
inline void setBegin(const InputPosition begin) noexcept
Set the start position of the range.
- Parameters:
begin – The new begin position (inclusive).
-
inline void setEnd(const InputPosition end) noexcept
Set the end position of the range.
- Parameters:
end – The new end position (exclusive).
-
inline std::string toString() const noexcept
Convert the range into a short, human-readable string.
- Returns:
The formatted range as "begin-end".
-
-
class CaptureGroup
The definition of a match group.
Public Functions
-
constexpr CaptureGroup() noexcept = default
Create an empty capture group.
-
inline constexpr CaptureGroup(const CaptureGroupIndex index, const CaptureRange range, const StringView name) noexcept
Create a new match group with the given index, capture range, and name.
- Parameters:
index – The index of the group. 0 = complete match, 1 = first capture group.
range – The character range for the group.
name – The name of the group.
-
inline constexpr CaptureGroupIndex index() const noexcept
Get the index of this capture group.
-
inline constexpr bool isEmpty() const noexcept
Test if the group is empty.
- Returns:
True if begin equals end, false otherwise.
-
inline constexpr std::size_t size() const noexcept
Get the size of the group.
- Returns:
The number of positions between begin and end.
-
inline constexpr InputPosition begin() const noexcept
Get the start position for the match.
- Returns:
The begin position (inclusive).
-
inline constexpr InputPosition end() const noexcept
Get the end position for the match.
- Returns:
The end position (exclusive).
-
inline constexpr CaptureRange range() const noexcept
Get the range of the match.
- Returns:
The range.
-
inline constexpr StringView name() const noexcept
Get the name of the match group.
-
inline void setIndex(const CaptureGroupIndex index) noexcept
Set the index for the group.
- Parameters:
index – The group index.
-
inline void setBegin(const InputPosition begin) noexcept
Set the start position for the match.
- Parameters:
begin – The begin position (inclusive).
-
inline void setEnd(const InputPosition end) noexcept
Set the end position for the match.
- Parameters:
end – The end position (exclusive).
-
inline void setName(const StringView name) noexcept
Set the name of the match group.
- Parameters:
name – The name of the group.
-
-
using erbsland::re::MatchViewPtr = std::shared_ptr<MatchView>
A shared pointer to a view-based match result.
-
using erbsland::re::Match16Ptr = std::shared_ptr<Match16>
A shared pointer to a UTF-16 match result.
-
using erbsland::re::Match16ViewPtr = std::shared_ptr<Match16View>
A shared pointer to a UTF-16 view-based match result.
-
using erbsland::re::Match32Ptr = std::shared_ptr<Match32>
A shared pointer to a UTF-32 match result.
-
using erbsland::re::Match32ViewPtr = std::shared_ptr<Match32View>
A shared pointer to a UTF-32 view-based match result.