A regular expression defines a text (i.e. character string) pattern and can assign a variable to each parenthesized component of the pattern.
The basic rules for patterns are:
- ^ Beginning of string
- $ End of string
- . Any character
- [ Start of character list
- ] End of character list
- ( Start of expression group
- ) End of expression group
- | ORs two expressions
- \ Escape character
- * Preceding expression occurs zero or more times
- ? Preceding expression occurs zero times or once
- + Preceding expression occurs one or more times
- all other characters match themselves
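As a rough illustration of the rules above, here is a minimal sketch using Python's standard re module as a stand-in. The application's own regular expression engine may differ in details, and the sample patterns and texts are made up for this example.

```python
import re

# Each entry is (pattern, text, should_match) and exercises one rule from the list.
CASES = [
    (r"^abc", "abcdef", True),     # ^ anchors the match at the beginning of the string
    (r"def$", "abcdef", True),     # $ anchors the match at the end of the string
    (r"a.c", "abc", True),         # . stands for any single character
    (r"cat|dog", "hotdog", True),  # | matches either of the two expressions
    (r"a\.c", "abc", False),       # \ escapes the dot, so only a literal "." matches
    (r"ab*c", "ac", True),         # * allows zero or more "b" characters
    (r"ab?c", "abbc", False),      # ? allows at most one "b"
    (r"ab+c", "ac", False),        # + requires at least one "b"
]


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


for pattern, text, should_match in CASES:
    assert matches(pattern, text) == should_match
```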
The [ and ] characters can enclose character lists:
- [ab] a single lowercase a or b
- [a-z] any lowercase letter
- [0-9] any digit
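- [0-9]+ one or more digits (i.e. any non-negative whole number)
- [a-zA-Z0-9] any letter or digit (do not separate the ranges with commas: a comma inside the brackets matches a literal comma)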
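The character lists can be tried out with a short sketch like the following. It uses Python's re module as an assumed stand-in for the application's engine, and the helper name full is hypothetical. Note that every character inside the brackets is taken literally unless it is part of a range, so a comma placed between ranges also matches a comma.

```python
import re


def full(pattern, text):
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


checks = {
    "a_in_ab": full(r"[ab]", "a"),                      # True: "a" is in the list
    "c_in_ab": full(r"[ab]", "c"),                      # False: "c" is not in the list
    "lower_q": full(r"[a-z]", "q"),                     # True: lowercase letter
    "upper_q": full(r"[a-z]", "Q"),                     # False: uppercase is outside a-z
    "digits": full(r"[0-9]+", "2024"),                  # True: one or more digits
    "empty": full(r"[0-9]+", ""),                       # False: + needs at least one digit
    "comma_plain": full(r"[a-zA-Z0-9]", ","),           # False: comma is not a letter or digit
    "comma_in_class": full(r"[a-z,A-Z,0-9]", ","),      # True: the commas are literal members
}
```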
Examples:
- .* denotes any sequence of characters
- .* .* denotes any string of characters that includes at least one space
You use "capture groups" to determine which parts of the text will be grouped together and put into a variable. You achieve this grouping by placing parentheses around the capture group:
- (.*) (.*) creates two capture groups $1 and $2; $1 will contain all characters before the space and $2 will contain all characters after the space (if the text contains several spaces, the split happens at the last one, because .* matches as many characters as possible)
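To see what the two capture groups contain, here is a small sketch in Python, used as an assumed stand-in for the application's engine. Python exposes the groups through group(1) and group(2) rather than $1 and $2, and the sample names are invented for illustration. The second example shows that with more than one space, the first group extends up to the last space.

```python
import re

PATTERN = r"(.*) (.*)"


def split_at_space(text):
    """Return (whole match, first group, second group) for the pattern above."""
    m = re.fullmatch(PATTERN, text)
    if m is None:
        return None  # the text contains no space at all
    return m.group(0), m.group(1), m.group(2)


one_space = split_at_space("Miles Davis")     # groups: "Miles" and "Davis"
two_spaces = split_at_space("Kind of Blue")   # groups: "Kind of" and "Blue"
no_space = split_at_space("Coltrane")         # no match
```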
As mentioned before, the different capture groups will be placed into capture variables. These variables are denoted by $1, $2, $3, $4, etc. $0 is a special variable that holds the entire matched pattern.
Please note that the old-style \1, \2, \3, etc. capture variable syntax is still supported in the "Re-arrange using regular expressions" action for backwards compatibility, but we recommend using the now standard $1, $2, $3, etc. syntax instead.
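The following sketch shows how capture variables can be used to re-arrange text. It is written in Python, whose re module uses the \1 (or \g<1>) style in replacement strings, so the hypothetical helper dollar_to_python translates $-style variables first. The function names and sample text are assumptions made for this example and are not part of the application; the helper also ignores edge cases such as backslashes in the template.

```python
import re


def dollar_to_python(template):
    """Translate $0, $1, $2, ... into Python's \\g<0>, \\g<1>, \\g<2>, ... form."""
    return re.sub(r"\$(\d+)", r"\\g<\1>", template)


def rearrange(pattern, template, text):
    """Replace each match of pattern in text using a $-style template."""
    return re.sub(pattern, dollar_to_python(template), text)


# Swap the two parts around the space and insert a comma.
swapped = rearrange(r"^(.*) (.*)$", "$2, $1", "Miles Davis")

# $0 stands for the entire matched text.
bracketed = rearrange(r"^(.*) (.*)$", "[$0]", "Miles Davis")
```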