Regular Expression in Python
A regular expression, also known as RegEx, is a sequence of characters that defines a search pattern. It is used to check whether a string contains the specified search pattern. To work with regular expressions in Python, the built-in module re must be imported in the script.
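As a minimal sketch of this workflow, the snippet below imports re and uses re.search to test whether a string contains a pattern. The sample text and the variable names are assumptions chosen for illustration only.

```python
import re

# Illustrative sample text (an assumption for this sketch).
text = "Python makes regular expressions easy"

# re.search returns a match object if the pattern occurs anywhere
# in the string, and None otherwise.
match = re.search(r"regular", text)
missing = re.search(r"Java", text)

print(match is not None)  # True
print(missing is None)    # True
print(match.span())       # (13, 20)
```

Writing patterns as raw strings (the r prefix) keeps backslashes from being interpreted by Python before they reach the RegEx engine.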
Meta Characters
In Python, meta characters are the special characters that are interpreted in a different way by the RegEx engine. They are listed below in a table:
Character | Description | Example |
[] | This meta character is used to indicate a set of characters. | “[A-Z]” |
. | This meta character is used to indicate any single character except a newline. | “He.” |
^ | This meta character is used to indicate that the pattern must match at the start of the string. | “^Study” |
$ | This meta character is used to indicate that the pattern must match at the end of the string. | “Experts!$” |
* | This meta character is used to indicate zero or more occurrences of the preceding character or group. | “Hello*” |
+ | This meta character is used to indicate one or more occurrences of the preceding character or group. | “Hello+” |
{} | This meta character is used to indicate exactly the specified number of occurrences of the preceding character or group. | “Hel{2}o” |
? | This meta character is used to indicate zero or one occurrence of the preceding character or group. | “Python?” |
| | This meta character is used to indicate either or condition. | “this|that” |
() | This meta character is used to indicate grouping of sub-patterns. | “(a|b)ayz” |
\ | This meta character is used to escape a meta character so that it is matched literally, or to begin a special sequence. | “\^” |
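The patterns in the table can be tried out with a short script. The sketch below reuses the example patterns from the table; the sample strings and the results dictionary are assumptions made up for illustration.

```python
import re

# Each entry applies one pattern from the table to a made-up sample string.
results = {
    "set": re.findall(r"[A-Z]", "Hello World"),             # ['H', 'W']
    "dot": re.findall(r"He.", "Hey Hello"),                 # ['Hey', 'Hel']
    "star": re.findall(r"Hello*", "Hell Hello Hellooo"),    # ['Hell', 'Hello', 'Hellooo']
    "plus": re.findall(r"Hello+", "Hell Hello Hellooo"),    # ['Hello', 'Hellooo']
    "braces": re.findall(r"Hel{2}o", "Helo Hello Helllo"),  # ['Hello']
    "optional": re.findall(r"Python?", "Pytho Python"),     # ['Pytho', 'Python']
    "either": re.findall(r"this|that", "this or that"),     # ['this', 'that']
    "escape": re.findall(r"\^", "2^3"),                     # ['^']
}

# Anchors and groups are easier to show with re.search.
starts_with_study = bool(re.search(r"^Study", "StudyExperts!"))    # True
ends_with_experts = bool(re.search(r"Experts!$", "StudyExperts!")) # True
grouped = re.search(r"(a|b)ayz", "xbayz").group()                  # 'bayz'

for name, found in results.items():
    print(name, found)
print(starts_with_study, ends_with_experts, grouped)
```

re.findall returns every non-overlapping match as a list, which makes the effect of each quantifier easy to compare side by side.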
Regular Expression in Python Example
# Python program to exemplify regular expressions
import re

text = "Studyexperts!"
x = re.search(r"^S...........!$", text)
if x:
    print("Pattern found.")
else:
    print("Pattern not found.")

text = "Studyexperts!."
x = re.search(r"^S...........!$", text)
if x:
    print("Pattern found.")
else:
    print("Pattern not found.")
Output:
Pattern found.
Pattern not found.
Special Sequences
Sequence | Description | Example |
\A | This special sequence is used to indicate that the specified characters are at the beginning of the string. | “\AYou” |
\b | This special sequence is used to indicate that the specified characters are at the beginning or at the end of a word. | “\bthat”, “that\b” |
\B | This special sequence is used to indicate that the specified characters are present but not at the beginning or at the end of a word. | “\Bthat”, “that\B” |
\d | This special sequence is used to match any digit (0-9). | “\d” |
\D | This special sequence is used to match any character that is not a digit. | “\D” |
\s | This special sequence is used to match any white space character (such as a space, tab, or newline). | “\s” |
\S | This special sequence is used to match any character that is not a white space character. | “\S” |
\w | This special sequence is used to match any word character: the letters a to z and A to Z, the digits 0-9, and the underscore _ character. | “\w” |
\W | This special sequence is used to match any character that is not a word character. | “\W” |
\Z | This special sequence is used to indicate that the specified characters are at the end of the string. | “StudyExperts!\Z” |
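The special sequences can be checked in the same way. The sketch below is illustrative only: the sample strings, the starts helper, and the variable names are assumptions. Raw strings matter here because "\b" in an ordinary Python string literal is a backspace character, not a word boundary.

```python
import re

sample = "that bathat thatch"  # made-up sample text

def starts(pattern, text):
    """Hypothetical helper: return the start index of every match."""
    return [m.start() for m in re.finditer(pattern, text)]

word_start = starts(r"\bthat", sample)   # [0, 12]  "that" begins a word
word_end = starts(r"that\b", sample)     # [0, 7]   "that" ends a word
inner_start = starts(r"\Bthat", sample)  # [7]      "that" does not begin a word
inner_end = starts(r"that\B", sample)    # [12]     "that" does not end a word

digits = re.findall(r"\d", "Room 42")      # ['4', '2']
non_digits = re.findall(r"\D", "a1b2")     # ['a', 'b']
spaces = re.findall(r"\s", "a b\tc")       # [' ', '\t']
non_spaces = re.findall(r"\S", "a b")      # ['a', 'b']
word_chars = re.findall(r"\w", "a_1!")     # ['a', '_', '1']
non_word_chars = re.findall(r"\W", "a_1!") # ['!']

at_start = bool(re.search(r"\AYou", "You are here"))                     # True
at_end = bool(re.search(r"StudyExperts!\Z", "Welcome to StudyExperts!")) # True

print(word_start, word_end, inner_start, inner_end)
print(digits, non_digits, spaces, non_spaces, word_chars, non_word_chars)
print(at_start, at_end)
```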
Sets
A set is a collection of characters placed inside a pair of square brackets. It matches any one character from that collection.
Set | Description |
[abc] | This set is used to indicate that a string should have one of the specified characters (a, b, or c). |
[a-d] | This set is used to indicate that a string should have one of the specified characters (a, b, c, or d). |
[^abc] | This set is used to indicate that a string should have any character except for a, b, and c. |
[123] | This set is used to indicate that a string should have one of the specified digits (1, 2, or 3). |
[0-9] | This set is used to indicate that a string should have any of the digits between 0 and 9. |
[1-8][0-9] | This set is used to indicate that a string should have a two digit-number between 10 and 89. |
[a-zA-Z] | This set is used to indicate that a string should have any alphabetic character between a and z, either lowercase or uppercase. |
[+] | In sets, +, *, ., |, (), $, and {} have no special meaning, so [+] means: return a match for any + character in the string. |
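To see how each set behaves, the sets from the table can be run through re.findall. This is a small sketch; the sample strings and the set_matches name are assumptions for illustration.

```python
import re

# Each entry applies one set from the table to a made-up sample string.
set_matches = {
    "[abc]": re.findall(r"[abc]", "cabbage"),                  # ['c', 'a', 'b', 'b', 'a']
    "[a-d]": re.findall(r"[a-d]", "code"),                     # ['c', 'd']
    "[^abc]": re.findall(r"[^abc]", "cabin"),                  # ['i', 'n']
    "[123]": re.findall(r"[123]", "2024-03"),                  # ['2', '2', '3']
    "[0-9]": re.findall(r"[0-9]", "a1b22"),                    # ['1', '2', '2']
    "[1-8][0-9]": re.findall(r"[1-8][0-9]", "9 10 89 90 05"),  # ['10', '89']
    "[a-zA-Z]": re.findall(r"[a-zA-Z]", "R2-D2"),              # ['R', 'D']
    "[+]": re.findall(r"[+]", "1+1=2"),                        # ['+']
}

for pattern, found in set_matches.items():
    print(pattern, found)
```

Note that [1-8][0-9] matches any two adjacent characters that fit the two sets, so it would also match the "12" inside "123"; adding \b on both sides is one way to restrict it to standalone two-digit numbers.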