A regular expression is a sequence of characters combined according to a specific pattern (syntax) that you can use for advanced find and replace operations. The regular expressions implementation uses the ECMAScript regular expression pattern syntax.
Watch the status bar for messages about invalid regular expressions or about search strings that result in no matches. Use the Expression builder to create valid patterns.
Activate the Use regular expressions icon in the Find dialog.
Special pattern characters are characters (or sequences of characters) that have a special meaning when they appear in a regular expression, either to represent a character that is difficult to express in a string or to represent a category of characters. Each of these special pattern characters is matched in the target sequence against a single character (unless a quantifier specifies otherwise).
| Syntax | Description | Example |
|---|---|---|
| \w | Any alphanumeric or underscore character | |
| \W | Any character that is not an alphanumeric or underscore character | |
| \d | Any numeric character | |
| \D | Any character that is not a numeric character | |
| \s | Any space character, tab character, or end-of-line character (CR)(LF) | |
| \S | Any character that is not a space character, tab character, or end-of-line character (CR)(LF) | |
| \t | Tabulator | |
| \r\n | End-of-line characters. In the ST editor, lines end in "\r\n" (a carriage return followed by a line feed). These characters are not visible, but they are present in the ST editor and can be searched. See option: Show line endings | |
| | Any single character, including space characters, tab characters, and end-of-line characters (CR)(LF) | |
| \ | The following special characters that are used in regular expressions must be escaped with \ (backslash) if you want to search for them as text: ^ $ \ . * + ? ( ) [ ] { } \| It is not possible to escape characters that form one of the special character sequences listed above. | |
| [class] | The target character is part of the character class (see character classes below) | |
| [^class] | The target character is not part of the character class (see character classes below) | |
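
The special pattern characters listed above can be tried out in a short script. The following is a minimal sketch, assuming Python's `re` module as a stand-in for the editor's ECMAScript engine (for the ASCII input used here, the constructs shown behave the same way). The sample line and all variable names are hypothetical.

```python
import re

# Hypothetical line of ST code, used only for illustration.
line = "iCount3 := 42;\t(* note *)"

# \w+ : runs of alphanumeric or underscore characters (re.ASCII keeps \w ASCII-only).
words = re.findall(r"\w+", line, flags=re.ASCII)

# \d : single numeric characters.
digits = re.findall(r"\d", line, flags=re.ASCII)

# \s : space and tab characters; count how many occur.
whitespace_count = len(re.findall(r"\s", line))

# \t : the tabulator on its own.
tab_positions = [m.start() for m in re.finditer(r"\t", line)]

# \( and \* : escaped special characters are searched as plain text.
comment_openers = re.findall(r"\(\*", line)

# [class] and [^class] : membership and non-membership in a set of characters.
separators = re.findall(r"[:;]", line)
non_word_non_space = re.findall(r"[^\w\s]", line, flags=re.ASCII)

print(words)             # ['iCount3', '42', 'note']
print(digits)            # ['3', '4', '2']
print(whitespace_count)  # 5
print(tab_positions)     # [14]
```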
Quantifiers follow a character or a special pattern character. They can modify the number of times that a character is repeated in the match:
| Quantifier | Number of matches | Example |
|---|---|---|
| + | 1 or more | |
| * | 0 or more | |
| ? | 0 or 1 | |
| {int} | Exactly the number defined by int | |
| {int,} | The number defined by int or more | |
| {min,max} | The number between min (minimum) and max (maximum) | |
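
To make the repetition counts concrete, here is a small sketch that checks which of a few sample tokens each quantifier accepts. It assumes Python's `re` module as a stand-in for the ECMAScript engine; the sample text and the helper `full_matches` are hypothetical.

```python
import re

# Hypothetical sample tokens: "a" followed by zero to four "b" characters.
tokens = "a ab abb abbb abbbb".split()

def full_matches(pattern):
    """Return the tokens that the pattern matches in their entirety."""
    return [tok for tok in tokens if re.fullmatch(pattern, tok)]

one_or_more = full_matches(r"ab+")       # b repeated 1 or more times
zero_or_more = full_matches(r"ab*")      # b repeated 0 or more times
zero_or_one = full_matches(r"ab?")       # b repeated 0 or 1 times
exactly_two = full_matches(r"ab{2}")     # b repeated exactly 2 times
two_or_more = full_matches(r"ab{2,}")    # b repeated 2 or more times
two_to_three = full_matches(r"ab{2,3}")  # b repeated between 2 and 3 times

print(one_or_more)   # ['ab', 'abb', 'abbb', 'abbbb']
print(zero_or_one)   # ['a', 'ab']
print(two_to_three)  # ['abb', 'abbb']
```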
By default, all these quantifiers are greedy, i.e. they match as many characters that meet the condition as possible. To make a quantifier ungreedy (i.e. match as few characters that meet the condition as possible), add a question mark (?) after the quantifier.
| Quantifier | Number of matches | Example |
|---|---|---|
| +? | 1 or more characters, ungreedy search | Search pattern: Example text: sVar1 := 'abc'; sVar2 := 'ABC'; sVar3 := ''; Search result: 'abc' and 'ABC'. The match contains as few characters as possible between the first single quotation mark and the next single quotation mark. |
| + | 1 or more characters, greedy search | Search pattern: Example text: sVar1 := 'abc'; sVar2 := 'ABC'; sVar3 := ''; Search result: 'abc'; sVar2 := 'ABC'; sVar3 := ''. The match contains as many characters as possible between the first single quotation mark and the last single quotation mark. |
| *? | 0 or more characters, ungreedy search | Search pattern: Example text: sVar1 := 'abc'; sVar2 := 'ABC'; sVar3 := ''; Search result: 'abc' and 'ABC' and '' |
| * | 0 or more characters, greedy search | Search pattern: Example text: sVar1 := 'abc'; sVar2 := 'ABC'; sVar3 := ''; Search result: 'abc'; sVar2 := 'ABC'; sVar3 := '' |
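
The difference between greedy and ungreedy matching can be reproduced on the example text from the table. This sketch assumes Python's `re` module as a stand-in for the ECMAScript engine. The four search patterns are assumptions chosen to produce the search results described in the table; they are not taken from the original.

```python
import re

# Example text from the table.
text = "sVar1 := 'abc'; sVar2 := 'ABC'; sVar3 := '';"

# Assumed patterns: a quotation mark, any characters, a quotation mark.
ungreedy_plus = re.findall(r"'.+?'", text)  # 1 or more, as few as possible
greedy_plus = re.findall(r"'.+'", text)     # 1 or more, as many as possible
ungreedy_star = re.findall(r"'.*?'", text)  # 0 or more, as few as possible
greedy_star = re.findall(r"'.*'", text)     # 0 or more, as many as possible

print(ungreedy_plus)  # ["'abc'", "'ABC'"]
print(greedy_plus)    # ["'abc'; sVar2 := 'ABC'; sVar3 := ''"]
print(ungreedy_star)  # ["'abc'", "'ABC'", "''"]
print(greedy_star)    # ["'abc'; sVar2 := 'ABC'; sVar3 := ''"]
```

The empty string `''` is found only by the `*?` variant, because `+?` requires at least one character between the quotation marks.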
Groups allow you to apply quantifiers to a sequence of characters (instead of a single character).
| Group of characters | Description | Example |
|---|---|---|
| (subexpression) | Captures the characters that match the subexpression in the target sequence and stores them as a submatch. Submatches are numbered in the order in which their opening parentheses appear (the first submatch is backreference number 1, the second is backreference number 2, and so on). This backreference number can be reused in the search field or in the replace field. | (\d) is the subexpression of the first digit found that creates the first backreference (N=1). This backreference is used with \1 to identify the same content as found in the backreference. |
| (?:subexpression) | Defines a non-capturing group. | In this case you refer with |
| \number | Defines a backreference that allows a previously matched subexpression to be identified subsequently in the same regular expression. | Example 1: Search pattern: Explanation of the search pattern: |
| | Backreferences can also be used in the replacement string. | Example 2: Search pattern: Replacement pattern: Result with backreference is e.g. 3: iCount3 is replaced by iValue3 |
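
Capturing groups, non-capturing groups, and backreferences can be illustrated as follows. This is a sketch under stated assumptions: Python's `re` module stands in for the ECMAScript engine, the pattern `iCount(\d)` is a guess that reproduces the iCount3 to iValue3 replacement mentioned above, and the `\1` notation in the replacement string is Python's, which may differ from what the editor's replace field expects.

```python
import re

# (\d)\1 : a digit captured as submatch 1, followed by the same digit again.
doubled = [m.group(0) for m in re.finditer(r"(\d)\1", "1223 4556 789")]

# (?:ab)+ groups "ab" for the quantifier without creating a submatch,
# so the only numbered submatch is the digit captured by (\d).
m = re.fullmatch(r"(?:ab)+(\d)", "ababab7")
group_count = m.re.groups
captured_digit = m.group(1)

# Backreference in the replacement string (written as \1 in Python).
replaced = re.sub(r"iCount(\d)", r"iValue\1", "iCount3 := iCount3 + 1;")

print(doubled)         # ['22', '55']
print(group_count)     # 1
print(captured_digit)  # 7
print(replaced)        # iValue3 := iValue3 + 1;
```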
Assertions specify a position in the string where a match must occur. When you use an assertion in your search expression, the regular expression engine does not advance through the target sequence or consume characters; it looks for a match in the specified position only.
| Assertion | Description | Example |
|---|---|---|
| \b | Word boundary | |
| \B | Not a word boundary | |
| ^ | Beginning of line | Tip: When you press the caret key <^> followed by a vowel (a, e, i, o, u), the caret is interpreted as a circumflex, i.e. â, ê, î, ô, û. Press <^> followed by space to avoid this. |
| $ | End of line | |
| (?=subexpression) | Positive lookahead. The characters following the assertion must match subexpression, but no characters are consumed. | |
| (?!subexpression) | Negative lookahead. The characters following the assertion must not match subexpression, but no characters are consumed. | |
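
The following sketch shows each assertion on a two-line sample. It assumes Python's `re` module as a stand-in for the ECMAScript engine; note that Python needs the `re.MULTILINE` flag for ^ and $ to match at every line rather than only at the start and end of the whole text. The sample text and variable names are hypothetical.

```python
import re

# Hypothetical two-line ST snippet.
text = "iVar := 1;\niVarMax := iVar + 10;"

# \b : "iVar" only as a whole word (not the start of "iVarMax").
whole_word = [m.start() for m in re.finditer(r"\biVar\b", text)]

# \B : "iVar" only where it is NOT followed by a word boundary.
inside_word = [m.start() for m in re.finditer(r"iVar\B", text)]

# ^ : the first word of each line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# $ : semicolons that sit at the end of a line.
line_end_semicolons = len(re.findall(r";$", text, flags=re.MULTILINE))

# (?=...) : words followed by " :=", without including " :=" in the match.
assigned = re.findall(r"\w+(?= :=)", text)

# (?!...) : "iVar" not followed by "Max".
not_max = [m.start() for m in re.finditer(r"iVar(?!Max)", text)]

print(whole_word)   # [0, 22]
print(inside_word)  # [11]
print(line_starts)  # ['iVar', 'iVarMax']
print(assigned)     # ['iVar', 'iVarMax']
```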
A pattern can include different alternatives:
| Character | Description | Example |
|---|---|---|
| \| | Separator that separates two alternative patterns or subexpressions. | |
To use multiple alternative patterns in a regular expression, separate them with the separator operator (|). The regular expression matches if any of the alternatives matches, and as soon as one does.
Subexpressions (in groups or assertions) can also use the separator operator to separate different alternatives.
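
As an illustration of alternatives at the top level and inside a group, consider this sketch. It assumes Python's `re` module as a stand-in for the ECMAScript engine; the declarations in the sample text are hypothetical.

```python
import re

# Hypothetical variable declarations.
text = "bReady : BOOL; iCount : INT; rValue : REAL;"

# Top-level alternatives: any of the three type names.
type_names = re.findall(r"BOOL|INT|REAL", text)

# Alternatives inside a (non-capturing) group: only BOOL or INT declarations.
bool_or_int_decls = re.findall(r"\w+ : (?:BOOL|INT)", text)

# Alternatives are tried from left to right and the first one that matches
# is used, so "Var" is found even though "Variable" would also match.
first_alternative = re.search(r"Var|Variable", "Variable").group(0)

print(type_names)         # ['BOOL', 'INT', 'REAL']
print(bool_or_int_decls)  # ['bReady : BOOL', 'iCount : INT']
print(first_alternative)  # Var
```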
A character class defines a set of characters which is enclosed in square brackets [ ].
The regular expression object attempts to match the entire character class against a single character in the target sequence (unless a quantifier specifies otherwise).
The character class can contain any combination of:
| Type | Description | Example |
|---|---|---|
| Individual characters | Any character specified is considered part of the class, except the special characters - (hyphen), [ (left square bracket), and ] (right square bracket). All special pattern characters such as \t, \d, etc. can also be used within a character class specification. | |
| Range | Place the - (hyphen) between two valid characters to specify a range. | |
| Escaped characters | The characters - ] \ ^ have special meaning in character class definitions. To treat these characters as normal characters, add a backslash before them to suppress their special meaning. | |
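
The three kinds of character class content can be combined freely, as this sketch shows. It assumes Python's `re` module as a stand-in for the ECMAScript engine; the sample text is hypothetical.

```python
import re

# Hypothetical sample containing letters, digits, and special characters.
text = "x1-y2]z3^"

# Individual characters: x, y, or z.
individual = re.findall(r"[xyz]", text)

# Range: any digit from 0 to 9.
ranged = re.findall(r"[0-9]", text)

# Escaped characters: a literal hyphen, right square bracket, or caret.
escaped = re.findall(r"[\-\]\^]", text)

# Combination: the special pattern character \d, the letter x, and an escaped hyphen.
mixed = re.findall(r"[\dx\-]", text)

print(individual)  # ['x', 'y', 'z']
print(ranged)      # ['1', '2', '3']
print(escaped)     # ['-', ']', '^']
print(mixed)       # ['x', '1', '-', '2', '3']
```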