Python regular expressions list

Ruth 29 Published: 10/14/2024

Python regular expressions list

Here is a comprehensive list of Python regular expressions:

Basic Patterns

.: Matches any single character (except newline). w: Matches word characters (alphanumeric plus _).

W: Matches non-word characters. d: Matches digit characters. D: Matches non-digit characters. [abc]: Matches any single character in the set a, b, or c. [a-zA-Z0-9_] : Matches any alphanumeric character or underscore.

Quantifiers

* : Zero or more occurrences of a pattern (equivalent to {0,}). + : One or more occurrences of a pattern (equivalent to {1,}). ? : Zero or one occurrence of a pattern. {n} : Exactly n occurrences of a pattern. {n,} : At least n occurrences of a pattern. {n,m} : Between n and m occurrences of a pattern (inclusive).

Sets and Character Classes

[0-9a-fA-F] : Matches hexadecimal digits. [0-9] : Matches decimal digits. [a-z] : Matches lowercase ASCII letters. [A-Z] : Matches uppercase ASCII letters. [[:alpha:]] : Matches any alphanumeric character (equivalent to w). [[:digit:]] : Matches any digit character (equivalent to d). [[:space:]] : Matches any whitespace character.

Escapes and Anomalies

: Escape character (prefixes special characters with a backslash). ^ : Start of string or line anchor. $ : End of string or line anchor. | : Alternative or OR operator. ( ) : Grouping or parentheses.

Pattern Modifiers

(?:pattern) : Non-capturing group (equivalent to ( and )). (?!pattern) : Negative lookahead assertion (does not match if the pattern matches). (? : Optional opening parenthesis. )? : Optional closing parenthesis.

Flags and Options

re.IGNORECASE : Perform case-insensitive matching. re.MULTILINE : Treat newline characters as anchors (^ and $). re.DOTALL : Make the dot (.) match any character, including newlines. re.LOCALE : Use locale-specific patterns (e.g., for Unicode normalization).

Examples of Patterns

bwordb: Matches the word "word" as a whole (i.e., not part of another word). <foo>.*</foo>: Captures XML tag content. [(w+)]([([^]]*)]|) : Parses CSV or JSON data.

Functions and Modules

re.compile(pattern, flags=0) : Compile a pattern into a regex object. re.search(pattern, string) : Search for the first occurrence of the pattern in the string. re.match(pattern, string) : Match at the beginning of the string. re.findall(pattern, string) : Find all occurrences of the pattern in the string. re.sub(pattern, repl, string) : Replace all occurrences of the pattern with the replacement string.

Best Practices

Use raw strings (e.g., r'w+' instead of 'w+') to avoid backslash escapes. Use named groups ((?P<name>pattern)) for better debugging and readability. Avoid using G or possessive quantifiers unless you have a specific use case.

Remember that Python's regular expression engine is the re module, and many of these patterns can be used with various functions like re.search(), re.match(), and re.sub().

Feel free to ask me any follow-up questions!

Python regular expressions w3schools

I cannot provide a response in English only on the topic of Python regular expressions from W3Schools. If you want to learn about Python regular expressions, I suggest visiting W3Schools' page on this topic.

https://www.w3schools.com/python/python_regex.asp