Mask Type: Extended Regular Expressions
This topic covers the RegEx mask mode, in which you create masks using Extended Regular Expressions.
Extended Regular Expressions
Extended Regular Expressions have rich functionality to create input masks. The syntax used by masks in this mode is similar to the syntax defined by the POSIX ERE specification. Back referencing is not supported.
If the editor accepts a numeric or date/time value in a specific format, the Numeric and DateTime masks can be used instead of regular expressions. In these modes, masks use simplified syntaxes and there are a number of predefined masks that correspond to common numeric and date/time formats. Refer to the Mask Type: Numeric and Mask Type: Date-time topics for more details.
For general information on available masked modes, see the Mask Types document.
Regular expressions give you a number of significant advantages when creating masks as compared with other masked modes. You can create a mask that allows an end user to:
- enter a value of indeterminate length;
- enter a value using one of multiple alternative forms;
- enter characters only from a specific range at a specific position.
Moreover, when Extended Regular Expressions are used, the autocomplete feature can be enabled. The editor tries to complete the text entered by an end user according to the current mask.
To enable Extended Regular Expressions, set the editor’s TextEdit.MaskType (or TextEditSettings.MaskType for the in-place editors) property to MaskType.RegEx. The mask itself should be specified by using the TextEdit.Mask (or TextEditSettings.Mask) property.
A mask is a string that can consist of meta-characters, quantifiers, and special characters.
Meta-Characters
Meta-characters are used to represent a range of symbols. An end user can enter text only in the positions that correspond to meta-characters. When a meta-character is found in a specific position in the mask, an end user can enter any character from the related range for this position in the edit box. The following table lists available meta-characters.
Character | Description |
---|---|
. | Matches any character. |
[aeiou] | Matches any single character included in the specified set of characters. Note: This character cannot be used with special characters (for example, ‘\w’, ‘\d’). |
[^aeiou] | Matches any single character that is not included in the specified set of characters. Note: This character cannot be used with special characters (for example, ‘\w’, ‘\d’). |
[0-9a-fA-F] | A hyphen (-) specifies a contiguous range of characters. Note: This character cannot be used with special characters (for example, ‘\w’, ‘\d’). |
\w | Matches any word character. |
\W | Matches any non-word character. |
\d | Matches any decimal digit. Same as [0-9]. |
\D | Matches any non-digit. Same as [^0-9]. |
\s | Matches any white-space character (space, tab, etc.). |
\S | Matches any non-white-space character. |
\xnn | Matches an ASCII character using hexadecimal representation (exactly two digits). |
\unnnn | Matches a Unicode character using hexadecimal representation (exactly four digits). |
\p{unicodeCategoryName} | Matches any character from the specified Unicode character category. The full and short names of common Unicode categories are: Letter (L) - any letter; UppercaseLetter (Lu) - an uppercase letter (entered characters are converted to uppercase); LowercaseLetter (Ll) - a lowercase letter (entered characters are converted to lowercase); Number (N) - any number; Symbol (S) - a mathematical symbol, currency symbol, or modifier symbol; Punctuation (P) - any punctuation mark; Separator (Z) - any separator. For information on other categories, refer to the System.Globalization.UnicodeCategory enumeration description in MSDN. |
\P{unicodeCategoryName} | Matches any character that is not included in the specified Unicode character category. This is the inversion of the “\p{unicodeCategoryName}” specifier. |
\R. | Matches the decimal separator specified by the NumberDecimalSeparator property of the current culture. |
\R: | Matches the time separator specified by the TimeSeparator property of the current culture. |
\R/ | Matches the date separator specified by the DateSeparator property of the current culture. |
\R{name} | Matches a culture-specific string, where name is one of the following: DateSeparator - matches the date separator (the same as “\R/”); TimeSeparator - matches the time separator (the same as “\R:”); AbbreviatedDayNames - matches one of the abbreviated names of the days according to the current culture; AbbreviatedMonthNames - matches one of the abbreviated names of the months according to the current culture; AMDesignator - matches the string designator for hours that are “ante meridiem” (before noon); DayNames - matches one of the full names of the days according to the current culture; MonthNames - matches one of the full names of the months according to the current culture; PMDesignator - matches the string designator for hours that are “post meridiem” (after noon); NumberDecimalSeparator - matches the string used as the decimal separator in numeric values (the same as “\R.”); CurrencyDecimalSeparator - matches the string used as the decimal separator in currency values; CurrencySymbol - matches the string used as the currency symbol; NumberPattern - matches any numeric value in the format specified by the current culture (if the number of decimal digits to use in numeric values is set to 0, the mask matches only integer values; otherwise, it matches real values); CurrencyPattern - matches any currency value in the format specified by the current culture, without the currency symbol (if the number of decimal digits to use in currency values is set to 0, the mask matches only integer values; otherwise, it matches real values). |
\AnyChar | Matches the specified character. For instance, the \[ specifier matches the [ character. |
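The character classes in the table can be tried out outside the editor. The following is a minimal sketch that uses Python's re module as a stand-in for the mask engine; this is an assumption made for illustration and not the editor's implementation. The accepts helper is hypothetical, the re.ASCII flag restricts \w, \d, and \s to ASCII characters, and the \p{...} and \R specifiers are not covered because Python's re module does not support them.

```python
import re


def accepts(mask: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the mask."""
    # re.ASCII limits \w, \d, and \s to ASCII characters.
    return re.fullmatch(mask, text, flags=re.ASCII) is not None


# Each entry: (mask, candidate text).
checks = [
    (r".", "x"),             # any single character
    (r"[aeiou]", "e"),       # character from a set
    (r"[^aeiou]", "e"),      # character outside a set (rejected here)
    (r"[0-9a-fA-F]", "c"),   # character from a range
    (r"\w\W", "a-"),         # word character, then non-word character
    (r"\d\D", "7x"),         # digit, then non-digit
    (r"\s\S", " y"),         # white space, then non-white space
    (r"\x41\u0042", "AB"),   # hexadecimal escapes for 'A' and 'B'
]

for mask, text in checks:
    print(f"{mask!r} accepts {text!r}: {accepts(mask, text)}")
```

Only the negated set rejects its candidate, because e belongs to the excluded characters.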
Quantifiers
Quantifiers are special characters that denote the number of repetitions for the preceding character or expression. See the table below for the list of quantifiers and their descriptions.
Quantifier | Description | Samples |
---|---|---|
* | Specifies zero or more matches. | The \w* mask matches a string consisting of zero or more word characters. It’s equivalent to the \w{0,} mask. |
+ | Specifies one or more matches. | The \w+ mask matches a string consisting of one or more word characters. It’s equivalent to the \w{1,} mask. |
? | Specifies zero or one match. | The \w? mask matches zero or one word character. It’s equivalent to the \w{0,1} mask. |
{n} | Specifies exactly n matches. | The \d{4} mask matches exactly four digits. |
{n,} | Specifies at least n matches. | The \d{2,} mask matches two or more digits. |
{n,m} | Specifies at least n, but no more than m matches. | The \d{1,3} mask matches either one, or two, or three digits. |
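The equivalences listed in the table can be checked with a short experiment. The sketch below again assumes Python's re module as an approximation of the mask syntax; accepts and same_language are hypothetical helper names, and the comparison only covers the listed sample strings, so it illustrates the equivalences rather than proving them.

```python
import re


def accepts(mask: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the mask."""
    return re.fullmatch(mask, text, flags=re.ASCII) is not None


def same_language(mask_a: str, mask_b: str, samples) -> bool:
    """True if both masks accept exactly the same strings from samples."""
    return all(accepts(mask_a, s) == accepts(mask_b, s) for s in samples)


words = ["", "a", "ab", "abc_1", "a b", "-"]
print(same_language(r"\w*", r"\w{0,}", words))
print(same_language(r"\w+", r"\w{1,}", words))
print(same_language(r"\w?", r"\w{0,1}", words))

digits = ["", "1", "12", "123", "1234", "12345"]
for mask in (r"\d{4}", r"\d{2,}", r"\d{1,3}"):
    # List the digit strings that each counted quantifier accepts.
    print(mask, [d for d in digits if accepts(mask, d)])
```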
Special Characters
The following table lists the available special characters that implement the grouping feature and alternation.
Character | Description | Samples |
---|---|---|
Vertical bar | Alternation symbol. This can be used to implement a choice between two or more alternatives. | |
(...) | Grouping. You can use parentheses to create sub-expressions or to limit the scope of the alternation. | |
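To illustrate how parentheses limit the scope of an alternation, here is a small sketch. It assumes that Python's re module treats alternation and grouping the same way the mask syntax does; the masks, the candidate strings, and the accepts helper are made up for this example.

```python
import re


def accepts(mask: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the mask."""
    return re.fullmatch(mask, text, flags=re.ASCII) is not None


candidates = ["gray", "grey", "gra", "ey", "griy"]

# Without parentheses, the alternation spans the whole mask: "gra" or "ey".
alternation_only = r"gra|ey"
# With parentheses, the choice is limited to a single position.
grouped = r"gr(a|e)y"

print([c for c in candidates if accepts(alternation_only, c)])
print([c for c in candidates if accepts(grouped, c)])
```

The first mask accepts only gra and ey, while the grouped mask accepts gray and grey.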
Any Other Characters
Any other character (one that is not a meta-character, a quantifier, or a special character) matches itself. For instance, if the a character appears in the mask, it matches the a character.
In addition, if a specific character (even a meta-character, quantifier, or special character) is preceded by a backslash (for example, \[ or \*), the expression matches the specified character ([ or *).
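The effect of the backslash can be seen by comparing an escaped and an unescaped asterisk. This sketch assumes Python's re module handles escaped characters the same way as the mask syntax described here; accepts is a hypothetical helper.

```python
import re


def accepts(mask: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the mask."""
    return re.fullmatch(mask, text, flags=re.ASCII) is not None


print(accepts(r"a", "a"))          # an ordinary character matches itself
print(accepts(r"\[\d\]", "[7]"))   # escaped brackets are literal characters
print(accepts(r"1\*1", "1*1"))     # an escaped asterisk is a literal '*'
print(accepts(r"1*1", "1*1"))      # unescaped, '*' is a quantifier, so '*' is rejected
print(accepts(r"1*1", "111"))      # the same mask accepts a run of '1' characters
```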
Precedence Rules
Below is a list of the operators in decreasing order of precedence.
- Escaped characters “\AnyCharacter”; bracket expressions [...]
- Grouping (...)
- Quantifiers ...*, ...+, ...?, {...}
- Concatenation
- Alternation ...|...
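One way to see these rules in action is to compare a mask without parentheses against explicitly grouped variants. The sketch below assumes Python's re module applies the same precedence as the mask syntax; the masks, the sample strings, and the helper names are invented for the illustration.

```python
import re


def accepts(mask: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the mask."""
    return re.fullmatch(mask, text, flags=re.ASCII) is not None


samples = ["a", "bc", "bcc", "ac", "abc", "bcbc", "aa"]


def accepted(mask: str):
    """Return the sample strings that the mask accepts."""
    return [s for s in samples if accepts(mask, s)]


# The quantifier binds to 'c' only, 'b' and 'c+' are then concatenated,
# and the alternation is applied last.
print(accepted(r"a|bc+"))
print(accepted(r"(a)|(b(c+))"))   # explicit grouping with the same meaning
print(accepted(r"(a|b)c+"))       # different: alternation applied before concatenation
print(accepted(r"(a|bc)+"))       # different: quantifier applied to the whole choice
```

The first two masks accept the same strings; the last two show what the mask would mean if alternation or the quantifier were applied in a different order.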
Examples
The mask for entering numeric values with and without the fractional part:
\d+(\R.\d{0,2})?
For example, 12, 12.5, and 12.34 are valid values when the decimal separator of the current culture is a period.
The \d+ expression indicates that one or more decimal digits must be entered before the decimal separator. The \R. meta-character represents the decimal separator. The \d{0,2} expression matches 0, 1, or 2 digits of the fractional part. The (...)? expression indicates that the expression in the parentheses can appear once or not at all.
The mask that accepts numbers only in the range of 1-24:
(1?[1-9])|([12][0-4])
This mask consists of two parts, (1?[1-9]) and ([12][0-4]), separated by the alternation symbol. The first part matches numbers in the ranges 1-9 and 11-19. The second part matches numbers in the ranges 10-14 and 20-24.
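Both example masks can be checked against sample input. The following sketch assumes Python's re module as a stand-in for the mask engine. Because Python has no counterpart for the \R. specifier, the hypothetical with_decimal_separator helper substitutes a literal separator that the caller supplies; in the editor, the separator comes from the current culture.

```python
import re


def accepts(mask: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the mask."""
    return re.fullmatch(mask, text, flags=re.ASCII) is not None


def with_decimal_separator(mask: str, separator: str) -> str:
    """Hypothetical helper: replace the culture-dependent decimal
    separator specifier with an escaped literal separator."""
    return mask.replace(r"\R.", re.escape(separator))


NUMERIC_MASK = r"\d+(\R.\d{0,2})?"
RANGE_MASK = r"(1?[1-9])|([12][0-4])"

# Assume a culture whose decimal separator is a period.
period_mask = with_decimal_separator(NUMERIC_MASK, ".")
for text in ["12", "12.", "12.5", "12.34", "12.345", ".5"]:
    print(f"{text!r}: {accepts(period_mask, text)}")

# Collect every number from 0 to 99 that the range mask accepts.
in_range = [n for n in range(0, 100) if accepts(RANGE_MASK, str(n))]
print(in_range == list(range(1, 25)))
```

Note that the numeric mask also accepts a value such as 12. with no fractional digits, because \d{0,2} allows zero digits after the separator.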