Mask Type: Extended Regular Expressions

This topic covers the RegEx mask mode, in which you create masks using Extended Regular Expressions.

Extended Regular Expressions

Extended Regular Expressions have rich functionality to create input masks. The syntax used by masks in this mode is similar to the syntax defined by the POSIX ERE specification. Back referencing is not supported.

If the editor accepts a numeric or date/time value in a specific format, the Numeric and DateTime masks can be used instead of regular expressions. In these modes, masks use simplified syntaxes and there are a number of predefined masks that correspond to common numeric and date/time formats. Refer to the Mask Type: Numeric and Mask Type: Date-time topics for more details.

For general information on available masked modes, see the Mask Types document.

Compared with other masked modes, regular expressions offer several significant advantages when creating masks. You can create a mask that allows an end user to:

  • enter a value of indeterminate length;
  • enter a value using one of multiple alternative forms;
  • enter characters only from a specific range at a specific position.
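
The three capabilities listed above can be illustrated with a minimal sketch. It uses Python's re module as a stand-in for the mask engine, which is an assumption: the two engines share the syntax used here but are not identical. The accepts helper and the sample masks are hypothetical and exist only for illustration.

```python
import re

def accepts(mask: str, text: str) -> bool:
    """Return True when the whole text matches the mask (illustrative helper)."""
    return re.fullmatch(mask, text) is not None

# 1. A value of indeterminate length: one or more digits.
print(accepts(r"\d+", "7"), accepts(r"\d+", "20240131"))            # True True

# 2. One of several alternative forms: five digits, or five digits,
#    a hyphen, and four more digits.
zip_mask = r"\d{5}|\d{5}-\d{4}"
print(accepts(zip_mask, "12345"), accepts(zip_mask, "12345-6789"))  # True True

# 3. Characters from a specific range at a specific position:
#    a letter from A to F followed by exactly two digits.
code_mask = r"[A-F]\d{2}"
print(accepts(code_mask, "C42"), accepts(code_mask, "G42"))         # True False
```

The helper checks the complete string because a mask describes the whole editor value, not a substring of it.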

Moreover, when Extended Regular Expressions are used, the autocomplete feature can be enabled. The editor tries to complete the text entered by an end user according to the current mask.

To enable Extended Regular Expressions, set the editor’s TextEdit.MaskType (or TextEditSettings.MaskType for the in-place editors) property to MaskType.RegEx. The mask itself should be specified by using the TextEdit.Mask (or TextEditSettings.Mask) property.

A mask is a string that can consist of meta-characters, quantifiers, and special characters.

Meta-Characters

Meta-characters are used to represent a range of symbols. An end user can enter text only in the positions that correspond to meta-characters. When a meta-character is found in a specific position in the mask, an end user can enter any character from the related range for this position in the edit box. The following table lists available meta-characters.

Character

Description

.

Matches any character.

[aeiou]

Matches any single character included in the specified set of characters.

Note

This character cannot be used with special characters (for example, \w, \d).

[^aeiou]

Matches any single character, which is not included in the specified set of characters.

Note

This character cannot be used with special characters (for example, \w, \d).

[0-9a-fA-F]

Matches any single character from the specified ranges. A hyphen (-) specifies a contiguous character range: 0-9 covers digits, a-f covers lowercase letters, and A-F covers uppercase letters.

Note

This character cannot be used with special characters (for example, \w, \d).

\w

Matches any word character.

\W

Matches any non-word character.

\d

Matches any decimal digit. Same as [0-9].

\D

Matches any non-digit. Same as [^0-9].

\s

Matches any white-space character (space, tab, etc.).

\S

Matches any non-white-space character.

\x20

Matches an ASCII character using hexadecimal representation (exactly two digits).

\u0020

Matches a Unicode character using hexadecimal representation (exactly four digits).

\p{unicodeCategoryName}

Matches any character from the specified Unicode character category. The full and short names of common Unicode categories are listed below. For information on other categories, refer to the System.Globalization.UnicodeCategory enumeration description in MSDN.

Letter (L) - any letter.

UppercaseLetter (Lu) - an uppercase letter. Entered characters are converted to uppercase.

LowercaseLetter (Ll) - a lowercase letter. Entered characters are converted to lowercase.

Number (N) - any number.

Symbol (S) - a mathematical symbol, currency symbol, or a modifier symbol.

Punctuation (P) - any punctuation mark.

Separator (Z) - any separator.

\P{unicodeCategoryName}

Matches any character that is not included in the specified Unicode character category. This is the inversion of the “\p{unicodeCategoryName}” specifier.

\R.

Matches the decimal separator specified by the NumberDecimalSeparator property of the current culture.

\R:

Matches the time separator specified by the TimeSeparator property of the current culture.

\R/

Matches the date separator specified by the DateSeparator property of the current culture.

\R{name}

where name is one of the following:

DateSeparator - Matches the date separator (the same as “\R/”)

TimeSeparator - Matches the time separator (the same as “\R:”)

AbbreviatedDayNames - Matches one of the abbreviated names of the days according to the current culture.

AbbreviatedMonthNames - Matches one of the abbreviated names of the months according to the current culture.

AMDesignator - Matches the string designator for hours that are “ante meridiem” (before noon).

DayNames - Matches one of the full names of the days according to the current culture.

MonthNames - Matches one of the full names of the months according to the current culture.

PMDesignator - Matches the string designator for hours that are “post meridiem” (after noon).

NumberDecimalSeparator - Matches the string used as the decimal separator in numeric values (the same as “\R.”).

CurrencyDecimalSeparator - Matches the string used as the decimal separator in currency values.

CurrencySymbol - Matches the string used as the currency symbol.

NumberPattern - Matches any numeric value in the format specified by the current culture. If the number of decimal digits to use in numeric values is set to 0, the mask matches only integer values. Otherwise, the mask matches real values.

CurrencyPattern - Matches any currency value in the format specified by the current culture (without the currency symbol). If the number of decimal digits to use in currency values is set to 0, the mask matches only integer values. Otherwise, the mask matches real values.

\AnyChar

Matches the specified character. For instance, the \* mask string can be used to insert the * character as a literal, the \\ string inserts the \ symbol as a literal, etc.
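
A few of the meta-characters above can be tried out with the short sketch below. It relies on Python's re module, which is an assumption made for illustration only: Python supports the bracket expressions, \w, \d, \s, \x20, and \u0020 forms, but it has no \p{...} or \R{...} syntax. The in_category helper is hypothetical and only approximates \p{name} by checking the Unicode general category with the standard unicodedata module; it does not model the case conversion the editor performs for Lu and Ll.

```python
import re
import unicodedata

def accepts(mask: str, text: str) -> bool:
    """Return True when the whole text matches the mask (illustrative helper)."""
    return re.fullmatch(mask, text) is not None

def in_category(ch: str, short_name: str) -> bool:
    """Approximate \\p{short_name}: compare the Unicode general category prefix."""
    return unicodedata.category(ch).startswith(short_name)

print(accepts(r".", "#"))             # True: any character
print(accepts(r"[0-9a-fA-F]", "b"))   # True: a hexadecimal digit
print(accepts(r"[^aeiou]", "e"))      # False: e is in the excluded set
print(accepts(r"\d\D", "4x"))         # True: a digit, then a non-digit
print(accepts(r"\w\W", "a!"))         # True: a word character, then a non-word character
print(accepts(r"\s\S", " x"))         # True: white space, then non-white space
print(accepts(r"\x20", " "))          # True: space given as two hex digits
print(accepts(r"\u0020", " "))        # True: space given as four hex digits
print(accepts(r"\*", "*"))            # True: an escaped literal asterisk

print(in_category("Ä", "Lu"))         # True: uppercase letter
print(in_category("ä", "L"))          # True: any letter
print(in_category("5", "N"))          # True: number
print(in_category("€", "S"))          # True: currency symbols belong to Symbol
```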

Quantifiers

Quantifiers are special characters that denote the number of repetitions of the preceding element. See the table below for the list of quantifiers and their descriptions.

Quantifier Description Samples
* Specifies zero or more matches. The \w* mask matches a string consisting of zero or more word characters. It’s equivalent to the \w{0,} mask.
+ Specifies one or more matches. The \w+ mask matches a string consisting of one or more word characters. It’s equivalent to the \w{1,} mask.
? Specifies zero or one matches. The \w? mask matches zero or one word character. It’s equivalent to the \w{0,1} mask.
{n} Specifies exactly n matches. The \d{4} mask matches exactly four digits.
{n,} Specifies at least n matches. The \d{2,} mask matches two or more digits.
{n,m} Specifies at least n, but no more than m matches. The \d{1,3} mask matches either one, or two, or three digits.
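
The equivalences stated in the table can be checked with a small sketch. It assumes Python's re module behaves like the mask engine for these quantifiers; the accepts helper and the sample strings are hypothetical.

```python
import re

def accepts(mask: str, text: str) -> bool:
    """Return True when the whole text matches the mask (illustrative helper)."""
    return re.fullmatch(mask, text) is not None

samples = ["", "a", "ab", "abc"]

# Each shorthand quantifier paired with its explicit {n,m} form.
pairs = [(r"\w*", r"\w{0,}"), (r"\w+", r"\w{1,}"), (r"\w?", r"\w{0,1}")]
for shorthand, explicit in pairs:
    same = all(accepts(shorthand, s) == accepts(explicit, s) for s in samples)
    print(shorthand, explicit, same)   # the last column is True for every pair

print(accepts(r"\d{4}", "2024"), accepts(r"\d{4}", "202"))    # True False
print(accepts(r"\d{2,}", "7"), accepts(r"\d{2,}", "2024"))    # False True

# {1,3} accepts one, two, or three digits, but not four.
three_max = [s for s in ["1", "12", "123", "1234"] if accepts(r"\d{1,3}", s)]
print(three_max)                                              # ['1', '12', '123']
```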

Special Characters

The following table lists the available special characters that implement the grouping feature and alternation.

Character

Description

Samples

|

Alternation symbol. This can be used to implement a choice between two or more alternatives.

The 1|2|3 mask matches either 1 or 2 or 3.

The abc|123 mask matches either abc or 123.

The \d{2}|\p{L}{2} mask matches either two digits or two letters.

()

Grouping. You can use parentheses to create sub-expressions or to limit the scope of the alternation.

The (an|ba)t mask matches either ant or bat.

The (net)+ mask matches the net, netnet, netnetnet, ... strings. Compare with the net+ mask, which matches the net, nett, nettt, ... strings.

The (0|1)+ mask matches a string of indeterminate length consisting of 0 and 1.
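
The alternation and grouping samples above can be reproduced with the following sketch. It assumes Python's re module as a stand-in for the mask engine; the accepts helper is hypothetical. The \d{2}|\p{L}{2} sample is omitted because Python's re does not support \p{...}.

```python
import re

def accepts(mask: str, text: str) -> bool:
    """Return True when the whole text matches the mask (illustrative helper)."""
    return re.fullmatch(mask, text) is not None

# Alternation between single characters and between whole strings.
print(accepts(r"1|2|3", "2"), accepts(r"1|2|3", "4"))          # True False
print(accepts(r"abc|123", "123"), accepts(r"abc|123", "ab3"))  # True False

# Parentheses limit the scope of the alternation.
print(accepts(r"(an|ba)t", "ant"), accepts(r"(an|ba)t", "bat"))  # True True
print(accepts(r"(an|ba)t", "at"))                                # False

# A quantifier after a group repeats the group; without the group it
# repeats only the last character.
print(accepts(r"(net)+", "netnet"), accepts(r"net+", "netnet"))  # True False
print(accepts(r"net+", "nettt"))                                 # True

# A string of indeterminate length consisting of 0 and 1.
print(accepts(r"(0|1)+", "010011"), accepts(r"(0|1)+", "0120"))  # True False
```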

Any Other Characters

Any other character (one that is not a meta-character, quantifier, or special character) matches itself. For instance, if the a character appears in the mask, it matches the a character.

As noted above, if any character (even a meta-character, quantifier, or special character) is preceded by a backslash (for example, \[, \*), the expression matches that character literally ([ and * in these examples).
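
The effect of the backslash can be seen in the sketch below, which assumes Python's re module treats these escapes the same way as the mask engine. The accepts helper is hypothetical. re.escape is a Python convenience function; nothing here implies that the mask engine offers an equivalent.

```python
import re

def accepts(mask: str, text: str) -> bool:
    """Return True when the whole text matches the mask (illustrative helper)."""
    return re.fullmatch(mask, text) is not None

# Escaped brackets are literals, so the mask describes digits in square brackets.
print(accepts(r"\[\d+\]", "[42]"))    # True

# An escaped asterisk is a literal; an unescaped one is a quantifier.
print(accepts(r"a\*b", "a*b"))        # True
print(accepts(r"a*b", "a*b"))         # False: here * means "zero or more a"
print(accepts(r"a*b", "aaab"))        # True

# Python's re.escape adds the backslashes needed to treat a string literally.
literal = "1+1=2?"
print(accepts(re.escape(literal), literal))   # True
```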

Precedence Rules

Below is a list of the operators in a decreasing order of precedence.

  1. Escaped Characters \AnyCharacter; Bracket Expressions [...]
  2. Grouping (...)
  3. Quantifiers ...*, ...+, ...?, {...}
  4. Concatenation
  5. Alternation ...|...
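
The practical consequences of this ordering can be checked with a short sketch. It assumes Python's re module, which applies the same relative precedence to these operators; the accepts helper is hypothetical.

```python
import re

def accepts(mask: str, text: str) -> bool:
    """Return True when the whole text matches the mask (illustrative helper)."""
    return re.fullmatch(mask, text) is not None

# An escape binds first: \+ is a literal plus, and the second + repeats it.
print(accepts(r"\++", "+++"))                               # True

# A quantifier binds tighter than concatenation: ab+ means a(b+), not (ab)+.
print(accepts(r"ab+", "abb"), accepts(r"ab+", "abab"))      # True False
print(accepts(r"(ab)+", "abab"))                            # True

# Concatenation binds tighter than alternation: ab|cd means (ab)|(cd).
print(accepts(r"ab|cd", "ab"), accepts(r"ab|cd", "acd"))    # True False

# Grouping overrides the default and narrows the alternation to b or c.
print(accepts(r"a(b|c)d", "acd"), accepts(r"a(b|c)d", "ab"))  # True False
```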

Examples

  1. The mask for entering numeric values with and without the fractional part: \d+(\R.\d{0,2})?

    Below are examples of valid values.

    [Images: CD_Mask_RegEx_example1_1, CD_Mask_RegEx_example1_2]

    The \d+ expression requires one or more decimal digits before the decimal separator. The \R. meta-character represents the decimal separator of the current culture. The \d{0,2} expression matches zero, one, or two digits of the fractional part. The (...)? construct indicates that the expression in parentheses can appear once or not at all.

  2. The mask that accepts numbers only in the range of 1-24: (1?[1-9])|([12][0-4])

    This mask consists of two parts: (1?[1-9]) and ([12][0-4]), separated by the alternation symbol. The first part matches numbers in the ranges of 1-9 and 11-19. The second part matches numbers in the ranges of 10-14 and 20-24.
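
Both example masks can be checked with the sketch below, which assumes Python's re module as a stand-in for the mask engine. Python has no \R. meta-character, so the hypothetical expand_decimal_separator helper substitutes a literal separator passed in by the caller; in the editor, the separator comes from the current culture. The replacement is deliberately naive and does not handle a mask in which the backslash before R is itself escaped.

```python
import re

def accepts(mask: str, text: str) -> bool:
    """Return True when the whole text matches the mask (illustrative helper)."""
    return re.fullmatch(mask, text) is not None

def expand_decimal_separator(mask: str, separator: str) -> str:
    """Replace every \\R. in the mask with the escaped literal separator."""
    return mask.replace(r"\R.", re.escape(separator))

# Example 1: numeric values with an optional fractional part.
numeric_mask = r"\d+(\R.\d{0,2})?"
period_mask = expand_decimal_separator(numeric_mask, ".")
comma_mask = expand_decimal_separator(numeric_mask, ",")

print(accepts(period_mask, "12"), accepts(period_mask, "12.50"))   # True True
print(accepts(period_mask, "12.505"), accepts(period_mask, ".5"))  # False False
print(accepts(comma_mask, "12,5"), accepts(comma_mask, "12.5"))    # True False

# Example 2: numbers in the range 1-24.
range_mask = r"(1?[1-9])|([12][0-4])"
accepted = [n for n in range(0, 100) if accepts(range_mask, str(n))]
print(accepted == list(range(1, 25)))                              # True
```

Testing every integer from 0 to 99 confirms that the two alternatives together cover 1 through 24 and nothing else in that interval.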