The Match Interface

A successful regular expression match is represented by a match object.

To make match handling explicit and safe, the library intentionally splits match results into two dedicated interfaces:

  • Match: An owning match. All captured text is owned by the match object and remains valid for its entire lifetime.

  • MatchView: A view-based match. All captured text is only referenced and depends on the lifetime of the original input string.

Both interfaces share the same positional and group-based API via the common base interface MatchBase, but they differ intentionally in how match content can be accessed.

This design prevents a common and subtle pitfall: switching matching code from an owning API to a view-based API without noticing the lifetime implications.

For example, if code were changed from match to matchView and both returned the same interface, the code would still compile, but it could silently introduce undefined behavior whenever the input string does not outlive the match result.

By requiring two distinct interfaces, switching between owning and view-based matches becomes a conscious and visible decision.

Obtaining a Match

You typically obtain match objects from the matching functions of RegEx.

Match Lifetime and Ownership

All matching functions return shared pointers to match objects. This allows match results to be copied and passed around cheaply while clearly expressing shared ownership.

The lifetime rules depend on how the match was created:

  • Match objects own their matched text internally. They are fully self-contained and safe to use independently.

  • MatchView objects only reference the original input text. You must ensure that this text remains alive for as long as the match object is used.

Violating these lifetime rules results in undefined behavior. If you are unsure which variant to use, prefer the owning APIs.
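The difference between the two lifetime rules can be sketched with plain standard library types. The classes OwnedResult and ViewResult and the functions captureOwned and captureView below are hypothetical stand-ins invented for this illustration; they are not the erbsland::re classes, and only the accessor names content and contentView are borrowed from this page.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>

// Hypothetical stand-in for an owning result: it copies the captured text.
class OwnedResult {
public:
    explicit OwnedResult(std::string_view captured) : _text(captured) {}
    // The returned view points into storage owned by this object.
    std::string_view content() const noexcept { return _text; }
private:
    std::string _text;
};

// Hypothetical stand-in for a view-based result: it only references the input.
class ViewResult {
public:
    explicit ViewResult(std::string_view captured) : _view(captured) {}
    // The returned view points into the caller's input string.
    std::string_view contentView() const noexcept { return _view; }
private:
    std::string_view _view;
};

// The result stays valid even after `input` is changed or destroyed.
std::shared_ptr<const OwnedResult> captureOwned(
    const std::string &input, std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= input.size());
    return std::make_shared<OwnedResult>(
        std::string_view(input).substr(begin, end - begin));
}

// The result is only usable while `input` stays alive and unmodified.
std::shared_ptr<const ViewResult> captureView(
    const std::string &input, std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= input.size());
    return std::make_shared<ViewResult>(
        std::string_view(input).substr(begin, end - begin));
}
```

Because the two stand-ins expose differently named accessors, code written against content() does not compile against a ViewResult. This is the same kind of visible switch that the split between Match and MatchView is meant to enforce.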

Accessing the Match Data

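The shared base interface MatchBase provides access to positional and structural information through begin, end, range, group, groupCount, hasGroupIndex and hasGroupName.

Access to the matched content depends on the concrete type: Match provides content, while MatchView provides contentView.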

Parameters

Most access methods are available in three variants:

  • No parameter: Addresses the entire match. This is equivalent to accessing capture group zero.

  • Group Index: Accesses a capture group by its numeric index. Group zero refers to the whole match; group indices starting from one refer to individual capture groups.

  • Group Name: If the pattern defines named capture groups, they can be accessed using their names.
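The three variants map naturally onto a set of overloads. The sketch below uses a hypothetical stand-in class, SketchMatch, with a hypothetical Group record and std::size_t positions; it only mirrors the overload shape described above and makes no claim about how the library stores its groups.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical record for one capture group (not a library type).
struct Group {
    std::string name;     // empty for unnamed groups
    std::size_t begin{};  // inclusive start position
    std::size_t end{};    // exclusive end position
};

// Hypothetical stand-in that mirrors the three access variants.
class SketchMatch {
public:
    explicit SketchMatch(std::vector<Group> groups) : _groups(std::move(groups)) {
        assert(!_groups.empty());  // group zero (the whole match) must exist
    }

    // No parameter: the whole match, identical to group zero.
    std::size_t begin() const { return begin(std::size_t{0}); }

    // Group index: zero is the whole match, one is the first capture group.
    // std::vector::at throws std::out_of_range for an invalid index.
    std::size_t begin(std::size_t groupIndex) const {
        return _groups.at(groupIndex).begin;
    }

    // Group name: look up a named capture group.
    std::size_t begin(std::string_view groupName) const {
        for (const auto &group : _groups) {
            if (!group.name.empty() && group.name == groupName) {
                return group.begin;
            }
        }
        throw std::out_of_range{"unknown group name"};
    }

private:
    std::vector<Group> _groups;
};
```

For an input such as "on 2024-05" with a whole match at [3, 10), a group named "year" at [3, 7) and a group named "month" at [8, 10), the calls begin(), begin(0) and begin("year") all yield 3, while begin(2) and begin("month") yield 8.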

Begin, End and Range

The begin, end and range methods return positions into the input text.

The unit used for these positions depends on the type of the matched string. For UTF-8 encoded strings, positions are expressed in bytes. This allows you to use them directly with standard library facilities such as substr or file seek operations.

The end position always points to the first byte after the captured range.

If a capture group did not participate in the match, begin and end return zero.

For UTF-16 and UTF-32 strings, see the section at the end of this page.
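As a small illustration of why byte positions are convenient, the helper below slices a UTF-8 std::string with a begin and end pair. The function name extractBytes is an assumption made for this sketch, and its two position arguments stand in for the values that begin and end would report.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Extract the text of the half-open byte range [begin, end) from UTF-8 input.
// `begin` and `end` stand in for the positions reported by a match.
std::string extractBytes(const std::string &utf8Text, std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= utf8Text.size());
    // substr takes a start offset and a length, so the length is end - begin.
    return utf8Text.substr(begin, end - begin);
}
```

In the UTF-8 text "Grüße 42", the letters ü and ß occupy two bytes each, so the digits start at byte 8 although they are preceded by only six characters. Passing 8 and 10 to extractBytes returns "42" without any conversion between characters and bytes.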

Accessing the Content

Use content (owning) or contentView (view-based) to access the captured text.

If a capture group was not matched, an empty string or empty view is returned.

Thread Safety

Match objects are immutable after creation. As a result, it is safe to read from a match concurrently from multiple threads, provided the documented lifetime rules are respected.

UTF-16 and UTF-32 Strings

When matching UTF-16 or UTF-32 strings, the corresponding overloads in RegEx return Match16, Match16View, Match32 and Match32View objects.

These types behave exactly like their UTF-8 counterparts, with the key difference that all returned positions are expressed as indices into char16_t or char32_t sequences.

For UTF-16 strings containing surrogate pairs, the index refers to the first char16_t code unit of the character. This mirrors the behavior of the UTF-8 variants, where positions always point to the first byte of a multi-byte character.
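The difference between code unit indices and character counts can be shown with the standard library alone. The helpers isHighSurrogate and codePointsBefore below are hypothetical names for this sketch, and the code assumes well-formed UTF-16 and an index that points at the start of a character.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// True if the code unit is the first half (high surrogate) of a surrogate pair.
constexpr bool isHighSurrogate(char16_t unit) noexcept {
    return unit >= 0xD800 && unit <= 0xDBFF;
}

// Count the characters (code points) that precede a code unit index.
// Assumes well-formed UTF-16 and an index that starts a character.
std::size_t codePointsBefore(std::u16string_view text, std::size_t unitIndex) {
    assert(unitIndex <= text.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < unitIndex; ++count) {
        // A surrogate pair spans two code units but is a single character.
        i += isHighSurrogate(text[i]) ? 2 : 1;
    }
    return count;
}
```

In the string u"a\U0001F600b", the emoji occupies code units 1 and 2, so the letter b sits at code unit index 3 even though only two characters precede it. A position of 3 is therefore what you would pass to std::u16string::substr.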


Interfaces and Types

class MatchBase

The abstract base class for regular expression matches.

Subclassed by erbsland::re::Match, erbsland::re::Match16, erbsland::re::Match16View, erbsland::re::Match32, erbsland::re::Match32View, erbsland::re::MatchView

Public Functions

virtual InputPosition begin() const

Get the start position of the match.

virtual InputPosition begin(std::size_t groupIndex) const

Get the start position of the given group.

Parameters:

groupIndex – The index of the group.

Throws:

std::out_of_range – if the group index is invalid.

virtual InputPosition begin(StringView groupName) const

Get the start position of the given group.

Parameters:

groupName – The name of the group.

Throws:

std::out_of_range – if the group name is invalid.

virtual InputPosition end() const

Get the end position of the match. The end position points after the last character of the match.

virtual InputPosition end(std::size_t groupIndex) const

Get the end position of the given group. The end position points after the last character of the group.

Parameters:

groupIndex – The index of the group.

Throws:

std::out_of_range – if the group index is invalid.

virtual InputPosition end(StringView groupName) const

Get the end position of the given group. The end position points after the last character of the group.

Parameters:

groupName – The name of the group.

Throws:

std::out_of_range – if the group name is invalid.

virtual CaptureRange range() const

Get the range of the match. The range is the half-open interval [begin, end), where begin is inclusive and end is exclusive.

virtual CaptureRange range(std::size_t groupIndex) const

Get the range of the given group. The range is the half-open interval [begin, end), where begin is inclusive and end is exclusive.

Parameters:

groupIndex – The index of the group.

Throws:

std::out_of_range – if the group index is invalid.

virtual CaptureRange range(StringView groupName) const

Get the range of the given group. The range is the half-open interval [begin, end), where begin is inclusive and end is exclusive.

Parameters:

groupName – The group name.

Throws:

std::out_of_range – if the group name is invalid.

virtual const CaptureGroup &group() const

Get the capture group of the match.

Returns:

A reference to the capture group instance.

virtual const CaptureGroup &group(std::size_t groupIndex) const

Get the capture group of the given group.

Parameters:

groupIndex – The index of the group.

Throws:

std::out_of_range – if the group index is invalid.

Returns:

A reference to the capture group instance.

virtual const CaptureGroup &group(StringView groupName) const

Get the capture group of the given group.

Parameters:

groupName – The name of the group.

Throws:

std::out_of_range – if the group name is invalid.

Returns:

A reference to the capture group instance.

virtual std::size_t groupCount() const noexcept

Get the number of capture groups, including the full match. For a pattern with two capture groups, such as (a)(b), this function returns 3, because the full match counts as capture group zero.
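The same counting convention can be observed with the standard library, which is used here purely as a stand-in and not as a description of erbsland::re. In the sketch below, the function name groupCountIncludingFullMatch is an assumption; std::smatch::size reports the full match plus all capture groups.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Number of groups a successful match reports, counting the full match as
// group zero. Uses std::regex as a stand-in, not erbsland::re.
std::size_t groupCountIncludingFullMatch(const std::string &pattern, const std::string &text) {
    const std::regex regex(pattern);
    std::smatch match;
    const bool matched = std::regex_match(text, match, regex);
    assert(matched && "this sketch expects an input that matches");
    (void)matched;  // keeps release builds free of unused-variable warnings
    return match.size();
}
```

For the pattern (a)(b) and the input "ab", the function returns 3, while std::regex::mark_count reports 2 because it counts only the explicit capture groups.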

virtual bool hasGroupIndex(std::size_t groupIndex) const noexcept

Test if the given group index exists.

virtual bool hasGroupName(StringView groupName) const noexcept

Test if the given group name exists.
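Since the accessors throw std::out_of_range for an unknown group, the query functions let you check first. The sketch below shows this pattern with a hypothetical stand-in class, SketchPositions, and a hypothetical helper, beginIfPresent; only the member names groupCount, hasGroupIndex and begin are taken from this page.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in exposing only what the sketch needs. The member
// names mirror MatchBase; the implementation does not.
class SketchPositions {
public:
    explicit SketchPositions(std::vector<std::size_t> begins) : _begins(std::move(begins)) {
        assert(!_begins.empty());  // group zero (the whole match) must exist
    }
    std::size_t groupCount() const noexcept { return _begins.size(); }
    bool hasGroupIndex(std::size_t groupIndex) const noexcept {
        return groupIndex < _begins.size();
    }
    std::size_t begin(std::size_t groupIndex) const {
        if (!hasGroupIndex(groupIndex)) {
            throw std::out_of_range{"invalid group index"};
        }
        return _begins[groupIndex];
    }
private:
    std::vector<std::size_t> _begins;
};

// Query first, so an unknown index becomes std::nullopt instead of an exception.
std::optional<std::size_t> beginIfPresent(const SketchPositions &match, std::size_t groupIndex) {
    if (!match.hasGroupIndex(groupIndex)) {
        return std::nullopt;
    }
    return match.begin(groupIndex);
}
```

Whether to check up front or to catch the exception is a matter of taste. Checking tends to read better when a missing group is an expected case rather than a programming error.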

class Match : public erbsland::re::MatchBase

An owning match result.

The returned views from content() are valid for the lifetime of this object.

Public Functions

StringView content() const

Get the full content of the match.

StringView content(std::size_t groupIndex) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group index is invalid.

StringView content(StringView groupName) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group name is invalid.

class MatchView : public erbsland::re::MatchBase

A view-based match result.

The returned views from contentView() are only valid while the underlying input data is still alive.

Public Functions

StringView contentView() const

Get the full content of the match.

StringView contentView(std::size_t groupIndex) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group index is invalid.

StringView contentView(StringView groupName) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group name is invalid.

class Match16 : public erbsland::re::MatchBase

An owning UTF-16 match result.

The returned views from content() are valid for the lifetime of this object.

Public Functions

std::u16string_view content() const

Get the full content of the match.

std::u16string_view content(std::size_t groupIndex) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group index is invalid.

std::u16string_view content(StringView groupName) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group name is invalid.

class Match16View : public erbsland::re::MatchBase

A UTF-16 view-based match result.

The returned views from contentView() are only valid while the underlying input data is still alive.

Public Functions

std::u16string_view contentView() const

Get the full content of the match.

std::u16string_view contentView(std::size_t groupIndex) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group index is invalid.

std::u16string_view contentView(StringView groupName) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group name is invalid.

class Match32 : public erbsland::re::MatchBase

An owning UTF-32 match result.

The returned views from content() are valid for the lifetime of this object.

Public Functions

std::u32string_view content() const

Get the full content of the match.

std::u32string_view content(std::size_t groupIndex) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group index is invalid.

std::u32string_view content(StringView groupName) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group name is invalid.

class Match32View : public erbsland::re::MatchBase

A UTF-32 view-based match result.

The returned views from contentView() are only valid while the underlying input data is still alive.

Public Functions

std::u32string_view contentView() const

Get the full content of the match.

std::u32string_view contentView(std::size_t groupIndex) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group index is invalid.

std::u32string_view contentView(StringView groupName) const

Get the content of the specified group.

Throws:

std::out_of_range – if the group name is invalid.

class CaptureRange

Represents a range in the input, defined by begin and end positions.

This class is used to represent captured content ranges in regular expression matches. The range is the half-open interval [begin, end), where begin is inclusive and end is exclusive.

Public Functions

inline constexpr CaptureRange(const InputPosition begin, const InputPosition end) noexcept

Create a new capture range with the given begin and end positions.

Parameters:
  • begin – The start position of the range (inclusive).

  • end – The end position of the range (exclusive).

inline constexpr bool isEmpty() const noexcept

Test if the range is empty.

Returns:

True if begin equals end, false otherwise.

inline constexpr std::size_t size() const noexcept

Get the size of the range.

Returns:

The number of positions between begin and end.

inline constexpr InputPosition begin() const noexcept

Get the start position of the range.

Returns:

The begin position (inclusive).

inline constexpr InputPosition end() const noexcept

Get the end position of the range.

Returns:

The end position (exclusive).

inline void setBegin(const InputPosition begin) noexcept

Set the start position of the range.

Parameters:

begin – The new begin position (inclusive).

inline void setEnd(const InputPosition end) noexcept

Set the end position of the range.

Parameters:

end – The new end position (exclusive).

inline std::string toString() const noexcept

Convert the range into a short, human-readable string.

Returns:

The formatted range as "begin-end".
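The half-open convention is easiest to see in a reduced model. The class SketchRange and the function slice below are hypothetical stand-ins that use std::size_t in place of InputPosition, whose definition is not shown on this page; they copy only the documented semantics of isEmpty, size and toString.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a half-open range [begin, end).
class SketchRange {
public:
    constexpr SketchRange(std::size_t begin, std::size_t end) noexcept
        : _begin{begin}, _end{end} {}
    // Empty means begin equals end.
    constexpr bool isEmpty() const noexcept { return _begin == _end; }
    // The number of positions between begin and end.
    constexpr std::size_t size() const noexcept { return _end - _begin; }
    constexpr std::size_t begin() const noexcept { return _begin; }
    constexpr std::size_t end() const noexcept { return _end; }
    // Formats the range as "begin-end".
    std::string toString() const {
        return std::to_string(_begin) + "-" + std::to_string(_end);
    }
private:
    std::size_t _begin;
    std::size_t _end;
};

// Slice the input with a range; the position `end` is never part of the result.
std::string slice(const std::string &input, SketchRange range) {
    assert(range.begin() <= range.end() && range.end() <= input.size());
    return input.substr(range.begin(), range.size());
}
```

With the input "on 2024-05" and the range 3 to 7, slice returns "2024": position 3 is included, position 7 is not, and the size is 4.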

class CaptureGroup

The definition of a match group.

Public Functions

constexpr CaptureGroup() noexcept = default

Create an empty capture group.

inline constexpr CaptureGroup(const CaptureGroupIndex index, const CaptureRange range, const StringView name) noexcept

Create a new match group with the given index, capture range, and name.

Parameters:
  • index – The index of the group. 0 = complete match, 1 = first capture group.

  • range – The character range for the group.

  • name – The name of the group.

inline constexpr CaptureGroupIndex index() const noexcept

Get the index of this capture group.

inline constexpr bool isEmpty() const noexcept

Test if the group is empty.

Returns:

True if begin equals end, false otherwise.

inline constexpr std::size_t size() const noexcept

Get the size of the group.

Returns:

The number of positions between begin and end.

inline constexpr InputPosition begin() const noexcept

Get the start position for the match.

Returns:

The begin position (inclusive).

inline constexpr InputPosition end() const noexcept

Get the end position for the match.

Returns:

The end position (exclusive).

inline constexpr CaptureRange range() const noexcept

Get the range for the match.

Returns:

The range.

inline constexpr StringView name() const noexcept

Get the name of the match group.

inline void setIndex(const CaptureGroupIndex index) noexcept

Set the index for the group.

Parameters:

index – The group index.

inline void setBegin(const InputPosition begin) noexcept

Set the start position for the match.

Parameters:

begin – The begin position (inclusive).

inline void setEnd(const InputPosition end) noexcept

Set the end position for the match.

Parameters:

end – The end position (exclusive).

inline void setName(const StringView name) noexcept

Set the name of the match group.

Parameters:

name – The name of the group.

using erbsland::re::MatchPtr = std::shared_ptr<Match>

A shared pointer to a match result.

using erbsland::re::MatchViewPtr = std::shared_ptr<MatchView>

A shared pointer to a view-based match result.

using erbsland::re::Match16Ptr = std::shared_ptr<Match16>

A shared pointer to a UTF-16 match result.

using erbsland::re::Match16ViewPtr = std::shared_ptr<Match16View>

A shared pointer to a UTF-16 view-based match result.

using erbsland::re::Match32Ptr = std::shared_ptr<Match32>

A shared pointer to a UTF-32 match result.

using erbsland::re::Match32ViewPtr = std::shared_ptr<Match32View>

A shared pointer to a UTF-32 view-based match result.