Python regular expression cheat sheet

Alex 80 Published: 07/19/2024

This cheat sheet summarizes Python regular expression (regex) syntax, flags, and common tasks using the re module.

Pattern Syntax

Python regex support lives in the re module, which provides several functions for searching and manipulating text:

re.compile(pattern): Compiles a pattern into a reusable regex object.

re.search(pattern, string): Scans the string and returns the first occurrence of the pattern, wherever it appears.

re.match(pattern, string): Matches the pattern only at the beginning of the string.
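The difference between these three functions is easiest to see on one string. The following is a minimal sketch; the sample text and the variable names are illustrative assumptions, not part of the original cheat sheet.

```python
import re

# Compile once, then reuse the resulting pattern object.
word = re.compile(r"cat")

text = "concatenate"

# search() scans the whole string for the first occurrence.
found = word.search(text)
found_span = found.span()            # (3, 6)

# match() only succeeds if the pattern matches at position 0.
at_start = word.match(text)          # None, because 'cat' is not at the start

# The module-level functions accept the pattern string directly.
prefix = re.match(r"con", text).group()   # 'con'

print(found_span, at_start, prefix)
```

Both search() and match() return None when nothing matches, so it is worth checking the result before calling methods such as group() on it.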

Basic Patterns

Literal: An ordinary character matches itself. Special characters (e.g., ., [, (, etc.) must be escaped with a backslash to be matched literally.

Example: r'\.' matches a literal dot (.)

Wildcard: Matches any single character (except newline).

Example: r'a.c' matches 'abc', 'a-c', or 'a c'

Set Characters: Matches any character in the specified set.

Example: r'[abc]' matches either 'a', 'b', or 'c'

Range: Matches any character within a specified range.

Example: r'[0-9]' matches any digit (0-9)

Escape Characters: Escapes special characters, making them literal.

Example: r'\(' matches a literal opening parenthesis
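The basic patterns can be compared side by side with re.findall. This is a small sketch under assumed sample data; the string and variable names are made up for illustration.

```python
import re

sample = "a.c abc a(c 42"

# Unescaped dot: a wildcard, so it matches '.', 'b' and '(' alike.
wildcard = re.findall(r"a.c", sample)      # ['a.c', 'abc', 'a(c']

# Escaped dot: only a literal '.' matches.
literal_dot = re.findall(r"a\.c", sample)  # ['a.c']

# Character set: any one of the listed characters.
in_set = re.findall(r"[abc]", "cab!")      # ['c', 'a', 'b']

# Range inside a set: any single digit.
digits = re.findall(r"[0-9]", sample)      # ['4', '2']

# Escaped parenthesis: a literal '('.
parens = re.findall(r"\(", sample)         # ['(']

print(wildcard, literal_dot, in_set, digits, parens)
```

Writing patterns as raw strings (the r prefix) keeps the backslashes intact, so Python does not interpret them as string escapes first.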

Special Patterns

Dot (.): Matches any single character (except newline).

Example: r'abc.def' matches 'abc', then any single character, then 'def' (e.g., 'abcXdef' or 'abc.def')

Star (*): Matches zero or more of the preceding pattern.

Example: r'a.*' matches 'a' followed by zero or more of any character (except newline)

Plus (+): Matches one or more of the preceding pattern.

Example: r'a+' matches 'a' repeated one or more times

Question Mark (?): Matches zero or one of the preceding pattern.

Example: r'a?' matches either an 'a' or no 'a'

Caret (^): ^pattern matches the pattern only at the start of the string.

Curly Braces ({}): {n,m} matches between n and m occurrences of the preceding pattern (inclusive). {n,} matches at least n occurrences.

Example: r'a{2,3}' matches 'a' repeated between 2 and 3 times
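To see how the quantifiers differ, the sketch below tests each one against a few short strings. The helper whole() is a hypothetical convenience function written for this example; it wraps re.fullmatch so that a pattern must cover the entire string.

```python
import re

def whole(pattern, s):
    """Hypothetical helper: True if the pattern matches the entire string s."""
    return re.fullmatch(pattern, s) is not None

# *  : zero or more 'b'
star = [whole(r"ab*", s) for s in ("a", "ab", "abbb")]       # [True, True, True]

# +  : one or more 'b'
plus = [whole(r"ab+", s) for s in ("a", "ab", "abbb")]       # [False, True, True]

# ?  : zero or one 'b'
optional = [whole(r"ab?", s) for s in ("a", "ab", "abb")]    # [True, True, False]

# {2,3} : two or three 'a'
bounded = [whole(r"a{2,3}", s) for s in ("a", "aa", "aaa", "aaaa")]
# [False, True, True, False]

# {2,} : at least two 'a'
at_least = [whole(r"a{2,}", s) for s in ("a", "aaaaa")]      # [False, True]

# Quantifiers are greedy, and '.' stops at a newline by default.
greedy = re.search(r"a.*", "xabc\nz").group()                # 'abc'

print(star, plus, optional, bounded, at_least, greedy)
```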

Alternation (|): Matches either pattern A or B.

Example: r'(abc|def)' matches either 'abc' or 'def'

Groups (()): Defines a capturing group.

Example: r'(abc)+' matches 'abc' repeated one or more times

Lookarounds: Match at a position only if it is preceded or followed by another pattern, without including that pattern in the match.

(?=pattern): Positive lookahead.

(?!pattern): Negative lookahead.

(?<=pattern): Positive lookbehind.

(?<!pattern): Negative lookbehind.

Example: r'Hello (?=\w+)' matches 'Hello ' only when one or more word characters follow; those characters are not part of the match
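Alternation, groups, and lookarounds are easier to follow with concrete input. The following is an illustrative sketch; the sample strings and variable names are assumptions chosen for the example.

```python
import re

# Alternation: either branch may match. (?:...) groups without capturing.
either = re.findall(r"(?:abc|def)", "abc def xyz")        # ['abc', 'def']

# A capturing group repeated with +; group(1) holds the last repetition.
m = re.fullmatch(r"(abc)+", "abcabc")
whole_match, last_group = m.group(0), m.group(1)          # 'abcabc', 'abc'

# Positive lookahead: 'Hello ' only when word characters follow.
ahead = re.search(r"Hello (?=\w+)", "Hello world").group()   # 'Hello '
no_ahead = re.search(r"Hello (?=\w+)", "Hello !")            # None

# Negative lookahead: 'foo' not followed by 'bar'.
not_bar = [hit.start() for hit in re.finditer(r"foo(?!bar)", "foobar foobaz")]  # [7]

# Positive lookbehind: digits preceded by a dollar sign.
priced = re.findall(r"(?<=\$)\d+", "cost $40, 15 items")     # ['40']

# Negative lookbehind: whole numbers not preceded by a dollar sign.
unpriced = re.findall(r"(?<!\$)\b\d+", "cost $40, 15 items") # ['15']

print(either, whole_match, last_group, ahead, no_ahead, not_bar, priced, unpriced)
```

Note that lookbehind patterns in the re module must have a fixed width, so something like (?<=\d+) is rejected when the pattern is compiled.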

Flags

re.IGNORECASE: Makes the pattern case-insensitive.

Example: re.compile(r"hello", re.IGNORECASE).search("HELLO")

re.MULTILINE: Treats the string as multiple lines (when using '^' and '$').

Example: re.compile(r"^hello$", re.MULTILINE).search("hello world\nhello") matches the 'hello' on the second line; without the flag there is no match

re.DOTALL: Allows '.' to match newline characters.

Example: re.compile(r".*hello.*", re.DOTALL).search("Hello, World!\nhello friend!") matches the whole string, because '.' can now cross the newline
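The effect of each flag shows up most clearly when the same pattern is run with and without it. This is a minimal sketch on an assumed two-line sample string.

```python
import re

text = "Hello, World!\nhello friend!"

# IGNORECASE: 'hello' matches regardless of case.
any_case = re.findall(r"hello", text, re.IGNORECASE)       # ['Hello', 'hello']

# Without MULTILINE, ^ anchors only at the start of the whole string.
start_only = re.findall(r"^hello", text)                   # []
# With MULTILINE, ^ also anchors at the start of every line.
each_line = re.findall(r"^hello", text, re.MULTILINE)      # ['hello']

# Without DOTALL, '.' cannot cross the newline.
one_line = re.search(r"World.*friend", text)               # None
# With DOTALL, '.' matches the newline too.
across = re.search(r"World.*friend", text, re.DOTALL).group()
# 'World!\nhello friend'

# Flags are combined with the | operator.
both = re.findall(r"^hello", text, re.IGNORECASE | re.MULTILINE)  # ['Hello', 'hello']

print(any_case, start_only, each_line, one_line, repr(across), both)
```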

Common Tasks

Validation: Verifies a string matches a pattern.

Example: re.match(r"[a-zA-Z]+", "Hello") returns a match object because the string starts with letters; use re.fullmatch to require that the entire string matches

Replacement: Replaces parts of a string matching a pattern.

Example: re.sub(r"old", "new", "old is gold") replaces every 'old' with 'new', including the one inside 'gold', giving 'new is gnew'

Splitting: Splits a string into substrings based on a delimiter.

Example: re.split(r"\W+", "Hello, World!") splits on runs of non-word characters, giving ['Hello', 'World', '']

Finding: Finds all occurrences of a pattern in a string.

Example: re.findall(r"[a-zA-Z]+", "Hello, World! 123 abc") returns all matches

Substitution with a count: re.subn works like re.sub but also returns the number of replacements made.

Example: re.subn(r"(\d+)", lambda x: str(int(x.group(1)) + 1), "abc123def456") increments each number and returns ('abc124def457', 2)
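The common tasks above can be run together as one short script. This is a sketch using the same sample strings as the examples; the variable names and the \b word-boundary variant are additions for illustration.

```python
import re

# Validation: match() only checks the start, fullmatch() checks the whole string.
starts_with_letters = re.match(r"[a-zA-Z]+", "Hello123") is not None      # True
only_letters = re.fullmatch(r"[a-zA-Z]+", "Hello123") is not None         # False

# Replacement: every occurrence is replaced, even inside other words.
replaced = re.sub(r"old", "new", "old is gold")            # 'new is gnew'
# Word boundaries (\b) restrict the replacement to the standalone word.
replaced_word = re.sub(r"\bold\b", "new", "old is gold")   # 'new is gold'

# Splitting on runs of non-word characters; the trailing '!' leaves an empty string.
parts = re.split(r"\W+", "Hello, World!")                  # ['Hello', 'World', '']

# Finding all runs of letters.
words = re.findall(r"[a-zA-Z]+", "Hello, World! 123 abc")  # ['Hello', 'World', 'abc']

# subn() returns a (new_string, number_of_replacements) tuple.
incremented, count = re.subn(
    r"(\d+)",
    lambda m: str(int(m.group(1)) + 1),
    "abc123def456",
)                                                          # ('abc124def457', 2)

print(starts_with_letters, only_letters, replaced, replaced_word)
print(parts, words, incremented, count)
```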

Conclusion

This Python regex cheat sheet covers basic patterns, special patterns, flags, and common tasks. Familiarize yourself with these concepts to handle most text manipulation in Python.
