Python regular expression cheat sheet
Python regular expression cheat sheet
I'd be delighted to provide you with a comprehensive Python regular expression (regex) cheat sheet!
Pattern Syntax
Python regex patterns are defined using the re
module, which provides several functions for searching and manipulating text:
re.compile(pattern)
: Compiles a pattern into a regex object. re.search(pattern, string)
: Searches for the first occurrence of the pattern in the given string. re.match(pattern, string)
: Searches for the pattern at the beginning of the given string.
Basic Patterns
Literal: Matches any literal character (e.g.,.
, [
, (
, etc.).
Example: r'.`` matches a dot (
.`)
Example: r'*'
matches zero or more of the preceding character
Example: r'[abc]'
matches either 'a', 'b', or 'c'
Example: r'[0-9]'
matches any digit (0-9)
Example: r'('
matches the opening parenthesis ((
)
Special Patterns
Dot (.
): Matches any single character (except newline).
Example: r'abc.def'
matches 'abc.' followed by any single character
*
): Matches zero or more of the preceding pattern.
Example: r'a.*'
matches 'a' followed by zero or more characters
+
): Matches one or more of the preceding pattern.
Example: r'a+'
matches 'a' repeated one or more times
?
): Matches zero or one of the preceding pattern.
Example: r'a?'
matches either an 'a' or no 'a'
{}
) and (^
): {n,m}
: Matches exactly n to m occurrences. {n,}
: Matches at least n occurrences. ^{pattern}
: Matches the start of a string.
Example: r'a{2,3}'
matches 'a' repeated between 2 and 3 times
|
): Matches either pattern A or B.
Example: r'(abc|def)'
matches either 'abc' or 'def'
()
): Defines a capturing group.
Example: r'(abc)+('
matches 'abc' repeated one or more times
(?=pattern)
: Positive lookahead. (?!pattern)
: Negative lookahead. (?<=pattern)
: Positive lookbehind. (?<!pattern)
: Negative lookbehind.
Example: r'Hello (?=w+)'
matches 'Hello' followed by one or more word characters
Flags
re.IGNORECASE: Makes the pattern case-insensitive.Example: re.compile(r"hello", re.IGNORECASE).search("HELLO")
Example: re.compile(r"^hello$", re.MULTILINE).search("hello worldnhello again")
Example: re.compile(r".*hello.*", re.DOTALL).search("Hello, World!nhello friend!")
Common Tasks
Validation: Verifies a string matches a pattern.Example: re.match(r"[a-zA-Z]+", "Hello")
returns the first match
Example: re.sub(r"old", "new", "old is gold")
replaces 'old' with 'new'
Example: re.split(r"W+", "Hello, World!")
) splits the string by non-alphanumeric characters
Example: re.findall(r"[a-zA-Z]+", "Hello, World! 123 abc")
returns all matches
Example: re.subn(r"(d+)", lambda x: str(int(x.group(1)) + 1), "abc123def456")
) replaces numbers with incremented values
Conclusion
This comprehensive Python regex cheat sheet covers the basics, special patterns, and flags. Familiarize yourself with these concepts to master the art of text manipulation in Python!
What is a re module in Python?
I'm happy to explain what a re
module is in Python!
The re
module is a built-in module in Python that provides support for regular expressions. Regular expressions are patterns used for string matching, searching, and replacing. They're incredibly powerful tools for processing text data.
In the re
module, you can:
search()
function to find the first occurrence of a pattern in a string. Match strings: Use the match()
function to check if a string matches a specific pattern at the beginning of the string. Findall strings: Use the findall()
function to return all non-overlapping matches of a pattern in a string as a list of tuples. Substitute strings: Use the sub()
function to replace occurrences of a pattern with another string. Compile patterns: Use the compile()
function to compile regular expressions into patterns that can be reused.
Regular expressions (regex) are composed of special characters, called "metacharacters," and literal characters. Metacharacters have specific meanings when used in regex:
.
, $
, ^
, |
, (
, )
, [
, ]
, {
, }
, (
, +
, ?
, *
, ~
(tilde), (
, )
, and
are metacharacters. All other characters are considered literal.
Some common regex patterns include:
Matching sets: Use square brackets[
and ]
to match a set of characters. For example, [a-zA-Z0-9_]
matches any letter (both uppercase and lowercase), number, or underscore. Repeating: Use special characters like *
, +
, {m,n}
, or ?
to specify repetition patterns. For instance, .*
matches any sequence of characters (including none). Grouping: Use parentheses ( )
to group parts of the pattern and treat them as a single unit. Character classes: Use special syntax like d
(digit), w
(word character), or s
(whitespace) to match specific types of characters.
The re
module also supports various flavors of regex, such as:
In summary, the re
module in Python provides a robust way to work with regular expressions, which are essential for text processing and pattern matching. You can use it for various tasks like searching, validating, extracting data from strings, and even performing data transformations.
Now that you know what the re
module is, go ahead and explore its many features!