Use regular expressions

Regular expressions are sequences of characters combined according to a specific pattern (syntax) that you can use for advanced find and replace operations. The implementation uses the ECMAScript regular expression pattern syntax.

Tip

Watch the status bar for messages about invalid regular expressions or search strings that return no matches. Use the Expression builder to create valid patterns.

Activate the Use regular expressions icon in the Find dialog.

Special pattern characters

Special pattern characters are characters (or sequences of characters) that have a special meaning when they appear in a regular expression, either to represent a character that is difficult to express in a string or to represent a category of characters. Each of these special pattern characters is matched in the target sequence against a single character (unless a quantifier specifies otherwise).

Syntax

Description

Example

\w

Any alphanumeric or underscore character

test\wnew finds test2new or test_new

\W

Any character that is not an alphanumeric or underscore character

test\Wnew finds test.new or test-new

\d

Any numeric character

test\d\d finds test21

\D

Any character that is not a numeric character

test\D\D finds test_a

\s

Any space character or tab character or end-of-line character (CR)(LF)

test\s1 finds test 1

\S

Any character that is not a space character, not a tab character and not an end-of-line character (CR)(LF)

test\S1 finds test_1

\t

Tab character

a\tb finds a     b

\r\n

End-of-line characters

In the ST editor, lines end in "\r\n" (a carriage return followed by a line feed). These characters are not visible, but they are present in the ST editor and can be searched for.

See option: Extras > Options > Program options > Editors > ST editor > Show line endings

test\r\n finds test(CR)(LF)

.

Any single character, including space characters, tab characters and end-of-line characters (CR)(LF)

te.t1 finds text1 or test1

\

The following special characters that are used in regular expressions need to be escaped with the \ (backslash) if you want to search for them as text: ^ $ \ . * + ? ( ) [ ] { } |

A backslash followed by a character that forms one of the special character sequences listed above (for example \w or \d) is interpreted as that sequence, so these characters cannot be escaped.

test\( finds test(

[class]

The target character is part of the character class (see character classes below)

[abc] finds a, b or c

[^class]

The target character is not part of the character class (see character classes below)

[^xyz] finds any character except x, y or z
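The table rows above can be tried out outside the editor. The following is a minimal sketch that uses Python's re module as a stand-in; it is not the editor's own ECMAScript engine, and the helper name finds is hypothetical. For the simple cases shown, the special pattern characters should behave the same way.

```python
import re


def finds(pattern: str, text: str) -> bool:
    """Return True if the pattern occurs anywhere in the text."""
    # re.ASCII restricts \w, \d and \s to ASCII characters. This is an
    # assumption made here to stay close to the examples in the table.
    return re.search(pattern, text, flags=re.ASCII) is not None


# (pattern, text, expected result) triples based on the table rows.
checks = [
    (r"test\wnew", "test_new", True),
    (r"test\wnew", "test.new", False),
    (r"test\Wnew", "test-new", True),
    (r"test\d\d", "test21", True),
    (r"test\D\D", "test_a", True),
    (r"test\s1", "test 1", True),
    (r"test\S1", "test_1", True),
    (r"a\tb", "a\tb", True),
    (r"test\r\n", "test\r\n", True),
    (r"te.t1", "text1", True),
    (r"test\(", "test(", True),
]

results = [finds(pattern, text) for pattern, text, _ in checks]
for (pattern, text, expected), result in zip(checks, results):
    print(f"{pattern!r} in {text!r}: {result} (expected {expected})")
```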


Quantifiers

Quantifiers follow a character or a special pattern character. They can modify the number of times that a character is repeated in the match:

Quantifier

Number of matches

Example

+

1 or more

pla+ce finds place or plaace

*

0 or more

pla*ce finds plce or place or plaace

?

0 or 1

pla?ce finds plce or place

{int}

Exactly the number defined by int

tyco{3}n finds tycooon

{int,}

The number defined by int or more

tyco{2,}n finds tycoon, tycooon or tycoooon

{min,max}

At least min (minimum) and at most max (maximum)

tyco{1,2}n finds tycon or tycoon
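The quantifier examples above can be checked with a short script. This is a sketch in Python, whose re module is used only as a stand-in for the editor's ECMAScript engine. The helper name matching and the candidate word lists are hypothetical.

```python
import re


def matching(pattern: str, candidates: list[str]) -> list[str]:
    """Return the candidates that the pattern matches completely."""
    return [word for word in candidates if re.fullmatch(pattern, word)]


places = ["plce", "place", "plaace"]
tycoons = ["tycn", "tycon", "tycoon", "tycooon", "tycoooon"]

plus = matching(r"pla+ce", places)
star = matching(r"pla*ce", places)
optional = matching(r"pla?ce", places)
exactly_three = matching(r"tyco{3}n", tycoons)
two_or_more = matching(r"tyco{2,}n", tycoons)
one_to_two = matching(r"tyco{1,2}n", tycoons)

print(plus)           # ['place', 'plaace']
print(star)           # ['plce', 'place', 'plaace']
print(optional)       # ['plce', 'place']
print(exactly_three)  # ['tycooon']
print(two_or_more)    # ['tycoon', 'tycooon', 'tycoooon']
print(one_to_two)     # ['tycon', 'tycoon']
```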

By default, all these quantifiers are greedy (i.e. they take as many characters that meet the condition as possible). This behavior can be overridden to ungreedy (i.e. take as few characters that meet the condition as possible) by adding a question mark (?) after the quantifier.

Quantifier

Number of matches

Example

+?

1 or more characters, ungreedy search

Search pattern: '.+?'

Example text: sVar1 := 'abc'; sVar2 := 'ABC'; sVar3 := '';

Search result: 'abc' and 'ABC'

The search finds as few characters as possible between a single quotation mark and the next single quotation mark.

+

1 or more characters, greedy search

Search pattern: '.+'

Example text: sVar1 := 'abc'; sVar2 := 'ABC'; sVar3 := '';

Search result: 'abc'; sVar2 := 'ABC'; sVar3 := ''

The search finds as many characters as possible between the first single quotation mark and the last single quotation mark.

*?

0 or more characters, ungreedy search

Search pattern: '.*?'

Example text: sVar1 := 'abc'; sVar2 := 'ABC'; sVar3 := '';

Search result: 'abc' and 'ABC' and ''

*

0 or more characters, greedy search

Search pattern: '.*'

Example text: sVar1 := 'abc'; sVar2 := 'ABC'; sVar3 := '';

Search result: 'abc'; sVar2 := 'ABC'; sVar3 := ''
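The four search patterns above can be reproduced on the same example text. The sketch below uses Python's re module as a stand-in for the editor's search, so it illustrates greedy and ungreedy matching rather than the editor itself.

```python
import re

text = "sVar1 := 'abc'; sVar2 := 'ABC'; sVar3 := '';"

# Ungreedy: stop at the next single quotation mark.
lazy_plus = re.findall(r"'.+?'", text)
lazy_star = re.findall(r"'.*?'", text)

# Greedy: run on to the last single quotation mark in the line.
greedy_plus = re.findall(r"'.+'", text)
greedy_star = re.findall(r"'.*'", text)

print(lazy_plus)    # ["'abc'", "'ABC'"]
print(lazy_star)    # ["'abc'", "'ABC'", "''"]
print(greedy_plus)  # ["'abc'; sVar2 := 'ABC'; sVar3 := ''"]
print(greedy_star)  # ["'abc'; sVar2 := 'ABC'; sVar3 := ''"]
```

The empty string '' is found only by '.*?' because + requires at least one character between the quotation marks.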

Groups

Groups allow you to apply quantifiers to a sequence of characters (instead of a single character).

Group of characters

Description

Example

(subexpression)

Captures the characters that represent the subexpression in the target sequence and stores them as a submatch. Each submatch is numbered in the order in which its opening parenthesis appears (the first submatch is backreference number 1, the second is backreference number 2, and so on). This backreference number can be reused in the search or in the replace field.

i(\d)Var\1 finds i1Var1, i2Var2 or i6Var6 but not i1Var2

(\d) is the subexpression that captures the first digit found and stores it as the first backreference (N=1). \1 then matches the same content as stored in this backreference.

(?:subexpression)

Defines a non-capturing group.

i(?:\d)Var(\d)Value\1 finds i1Var2Value2 or i1Var3Value3

Because (?:\d) is not captured, \1 refers to the first capturing group, i.e. the (\d) that follows Var.

\number

Defines a backreference that allows a previously matched subexpression to be identified subsequently in the same regular expression.

Example 1:

Search pattern: [a-c](\d)(?:in|out)put_\1 finds a1input_1 or b2output_2

Explanation of the search pattern:

[a-c] finds a, b or c

(\d) followed by one digit, which is stored as the first backreference

(?:in|out) followed by in or out; this group is not stored as a backreference

put_ followed by put_

\1 followed by the first backreference

Backreferences can also be used in the replacement string.

Example 2:

Search pattern: iCount(\d) finds iCount followed by one digit, which is stored as the first backreference

Replacement pattern: iValue\1 replaces iCount with iValue and appends the first backreference

Result: if the captured digit is 3, for example, iCount3 is replaced by iValue3
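The group and backreference examples above can be sketched as follows. Python's re module is used as a stand-in for the editor's ECMAScript engine. It happens to accept the same \1 notation in both the search pattern and the replacement string. The sample ST statement is hypothetical.

```python
import re

# A capturing group and its backreference must match the same digit.
names = ["i1Var1", "i2Var2", "i6Var6", "i1Var2"]
same_digit = [n for n in names if re.fullmatch(r"i(\d)Var\1", n)]
print(same_digit)  # ['i1Var1', 'i2Var2', 'i6Var6']

# The non-capturing group is skipped when backreferences are numbered.
non_capturing = re.fullmatch(r"i(?:\d)Var(\d)Value\1", "i1Var2Value2")
captured = non_capturing.group(1)
print(captured)  # 2

# Example 1 from the text.
example1 = re.fullmatch(r"[a-c](\d)(?:in|out)put_\1", "b2output_2")
print(example1 is not None)  # True

# Example 2 from the text: the backreference in the replacement string.
replaced = re.sub(r"iCount(\d)", r"iValue\1", "iCount3 := iCount3 + 1;")
print(replaced)  # iValue3 := iValue3 + 1;
```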

Assertions

Assertions specify a position in the string where a match must occur. When you use an assertion in your search expression, the regular expression engine does not advance through the target sequence or consume characters; it looks for a match in the specified position only.

Assertion

Description

Example

\b

Word boundary

\bis finds island and is but not This

is\b finds This and is but not island

\bis\b finds is but not island

\B

Not a word boundary

\Bis finds This but not island

is\B finds island but not This

\Bis\B finds Wish but not is or island

^

Beginning of line

^iVar finds iVar only at the beginning of a line. Use ^\s*iVar to find iVar when it is indented with white space at the beginning of the line.

Tip

On some keyboard layouts, pressing the caret key <^> followed by a vowel, e.g. a, e, i, o, u, produces a letter with a circumflex, i.e. â, ê, î, ô, û. Press <^> followed by space to avoid this.

$

End of line

iVar$ finds iVar only at the end of a line.

(?=subexpression)

Positive lookahead

The characters following the assertion must match subexpression, but no characters are consumed.

Var(?=\d) finds all Var followed by a digit

(?!subexpression)

Negative lookahead

The characters following the assertion must not match subexpression, but no characters are consumed.

Var(?!\d) finds all Var not followed by a digit
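The assertions in the table above can be tried on a few sample words. This is a sketch in Python, used as a stand-in for the editor's ECMAScript engine. The helper name found_in and the sample strings are hypothetical. Python treats ^ and $ as line anchors only when re.MULTILINE is set, which is a Python detail and not a setting of the Find dialog.

```python
import re

words = ["This", "island", "is", "Wish"]


def found_in(pattern: str) -> list[str]:
    """Return the sample words in which the pattern occurs."""
    return [word for word in words if re.search(pattern, word)]


print(found_in(r"\bis"))    # ['island', 'is']
print(found_in(r"is\b"))    # ['This', 'is']
print(found_in(r"\bis\b"))  # ['is']
print(found_in(r"\Bis\B"))  # ['Wish']

# Lookaheads check what follows without making it part of the match.
text = "Var1 VarA Var2"
followed_by_digit = [m.start() for m in re.finditer(r"Var(?=\d)", text)]
not_followed_by_digit = [m.start() for m in re.finditer(r"Var(?!\d)", text)]
print(followed_by_digit)      # [0, 10]
print(not_followed_by_digit)  # [5]

# Line anchors: beginning and end of a line in a two-line string.
code = "iVar := 1;\nx := iVar"
line_starts = [m.start() for m in re.finditer(r"^iVar", code, re.MULTILINE)]
line_ends = [m.start() for m in re.finditer(r"iVar$", code, re.MULTILINE)]
print(line_starts)  # [0]
print(line_ends)    # [16]
```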

Alternatives

A pattern can include different alternatives:

Character

Description

Example

|

Separates two alternative patterns or subexpressions.

(i|o)Var finds iVar or oVar

To use multiple alternative patterns in a regular expression, separate them with the separator operator (|). The regular expression matches if any of the alternatives matches, and as soon as one does.

Subexpressions (in groups or assertions) can also use the separator operator to separate different alternatives.
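A short sketch of how alternatives behave, again using Python's re module as a stand-in for the editor's ECMAScript engine. The sample strings are hypothetical. The second part illustrates the phrase "as soon as one does": alternatives are tried from left to right, and the first one that matches is used.

```python
import re

text = "iVar oVar xVar"

# finditer is used so that the whole match is reported, not only the group.
hits = [m.group(0) for m in re.finditer(r"(i|o)Var", text)]
print(hits)  # ['iVar', 'oVar']

# Both alternatives could match "input", but the first one listed wins.
first = re.search(r"in|input", "input").group(0)
print(first)  # in

# Listing the longer alternative first changes the result.
longest = re.search(r"input|in", "input").group(0)
print(longest)  # input
```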

Character classes

A character class defines a set of characters which is enclosed in square brackets [ ].

The regular expression object attempts to match the entire character class against a single character in the target sequence (unless a quantifier specifies otherwise).

The character class can contain any combination of:

Type

Description

Example

Individual characters

Any character specified is considered part of the class, except the special characters - (hyphen), [ (left square bracket), and ] (right square bracket).

All special pattern characters such as \t, \d, etc. can also be used within a character class specification.

[abc] finds a, b or c

[^xyz] finds any character except x, y or z

Range

Place the - (hyphen) between two valid characters to specify a range.

[a-z] finds any lowercase letter in the range from a to z.

[^D-F] finds any character except the uppercase letters D, E or F.

[1-36-9] finds a digit from 1 to 3 or from 6 to 9.

[abc1-5] finds a, b or c or a digit from 1 to 5.

Escaped characters

The characters -]\^ have special meaning in character class definitions. To treat these characters as normal characters, add a backslash before the characters to suppress their special meaning.

[a\-z]Var finds aVar, -Var or zVar

[ab\t\]] finds a, b, tab or ]
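The character class examples above can be checked with the following sketch. Python's re module serves as a stand-in for the editor's ECMAScript engine, and the sample strings are hypothetical. Each call returns the characters (or words) that the class matches.

```python
import re

individual = re.findall(r"[abc]", "cab-x")
negated = re.findall(r"[^D-F]", "CDEFG")
two_ranges = re.findall(r"[1-36-9]", "0123456789")
mixed = re.findall(r"[abc1-5]", "a6c3")

# An escaped hyphen is a literal character, not a range operator.
escaped_hyphen = re.findall(r"[a\-z]Var", "aVar -Var zVar mVar")

# \t and an escaped ] inside a class.
escaped_bracket = re.findall(r"[ab\t\]]", "a]\tc")

print(individual)       # ['c', 'a', 'b']
print(negated)          # ['C', 'G']
print(two_ranges)       # ['1', '2', '3', '6', '7', '8', '9']
print(mixed)            # ['a', 'c', '3']
print(escaped_hyphen)   # ['aVar', '-Var', 'zVar']
print(escaped_bracket)  # ['a', ']', '\t']
```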

Modified on: 2022-08-22