Regular expressions (regex) are one of the most powerful tools in a developer’s arsenal, and one of the most feared. This cheat sheet covers the essentials, from basic syntax to advanced patterns, with practical examples you can use right away.
Try it live: Use the Genbox Regex Tester to test any pattern from this guide directly in your browser.
Table of Contents
- Basic Syntax
- Character Classes
- Quantifiers
- Anchors
- Groups and Backreferences
- Lookaheads and Lookbehinds
- Flags
- Common Patterns
- Regex in JavaScript
- Regex in Python
- Frequently Asked Questions
Basic Syntax
At its core, a regex pattern is a sequence of characters that defines a search pattern. Every literal character in a regex matches itself:
- Pattern cat matches the string “cat” anywhere in the text
- Pattern 123 matches the literal string “123”
Special characters (called metacharacters) have special meaning and must be escaped with a backslash \ if you want to match them literally:
. ^ $ * + ? { } [ ] \ | ( )
| Pattern | Matches |
|---|---|
| \. | A literal period |
| \$ | A literal dollar sign |
| \* | A literal asterisk |
| \\ | A literal backslash |
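When the text to match comes from somewhere else (user input, a filename), escaping by hand is error-prone. The sketch below uses Python's re.escape to do it automatically; the sample strings are made up for illustration.

```python
import re

# Hypothetical input that happens to contain metacharacters.
literal = "3.50 (USD)"

# re.escape backslash-escapes metacharacters so the text matches literally.
escaped_pattern = re.escape(literal)

# Unescaped: "." matches any character and "(USD)" is treated as a group.
unescaped_hit = re.fullmatch(literal, "3x50 USD")
# Escaped: the same input no longer matches, only the exact text does.
escaped_hit = re.fullmatch(escaped_pattern, "3x50 USD")
exact_hit = re.fullmatch(escaped_pattern, "3.50 (USD)")

print(unescaped_hit is not None, escaped_hit is not None, exact_hit is not None)
```

The unescaped pattern accepts text it was never meant to accept, which is the usual symptom of a forgotten backslash.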
Character Classes
Character classes let you match any one character from a set.
Built-in Shorthand Classes
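| Shorthand | Equivalent | Matches |
|---|---|---|
| \d | [0-9] | Any digit |
| \D | [^0-9] | Any non-digit |
| \w | [a-zA-Z0-9_] | Word character (letter, digit, or underscore) |
| \W | [^a-zA-Z0-9_] | Non-word character |
| \s | [ \t\r\n\f\v] | Whitespace (space, tab, newline, etc.) |
| \S | [^ \t\r\n\f\v] | Non-whitespace |
| . | (any except \n) | Any character except newline |
Note: The equivalents above describe ASCII input. In Python 3, \d, \w, and \s also match Unicode digits, letters, and whitespace unless re.ASCII is set, and JavaScript’s \s also matches Unicode whitespace such as the no-break space.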
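A quick way to see the shorthand classes in action is to run each one over the same string. This is a minimal Python sketch; the sample text is invented, and re.ASCII is passed so the classes behave like the bracket equivalents in the table.

```python
import re

sample = "Order 42,\tref_A7!"

# re.ASCII restricts \d, \w, and \s to their ASCII definitions.
digits = re.findall(r"\d", sample, re.ASCII)
words = re.findall(r"\w+", sample, re.ASCII)
spaces = re.findall(r"\s", sample, re.ASCII)
non_words = re.findall(r"\W", sample, re.ASCII)

# "." skips the newline unless the dotall flag is set.
dot_hits = re.findall(r".", "a\nb")

print(digits, words, spaces, non_words, dot_hits)
```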
Custom Character Classes
Use square brackets [ ] to define a custom set:
| Pattern | Matches |
|---|---|
| [aeiou] | Any lowercase vowel |
| [A-Z] | Any uppercase letter |
| [0-9a-f] | Any hex digit (lowercase) |
| [^abc] | Any character except a, b, or c |
| [a-zA-Z] | Any letter |
Note: Most special characters lose their special meaning inside [ ]. The only special characters inside a character class are ], \, ^ (at start), and - (between chars).
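The custom classes from the table can be tried out in a few lines of Python. The inputs here are invented examples, and the last line illustrates the note above: inside brackets, . + and * are plain characters.

```python
import re

vowels = re.findall(r"[aeiou]", "regex rules")

# Ranges are case-sensitive: [0-9a-f] accepts only lowercase hex digits.
is_lower_hex = re.fullmatch(r"[0-9a-f]+", "deadbeef") is not None
rejects_upper = re.fullmatch(r"[0-9a-f]+", "DEADBEEF") is None

# A negated class removes everything that is not a, b, or c.
only_abc = re.sub(r"[^abc]", "", "a1b2c3d4")

# Metacharacters lose their special meaning inside [ ].
literal_ops = re.findall(r"[.+*]", "1.5+2*3")

print(vowels, is_lower_hex, rejects_upper, only_abc, literal_ops)
```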
Quantifiers
Quantifiers specify how many times the preceding element should match.
| Quantifier | Meaning | Example |
|---|---|---|
| * | 0 or more | a* matches “”, “a”, “aa”, “aaa” |
| + | 1 or more | a+ matches “a”, “aa”, “aaa” |
| ? | 0 or 1 | colou?r matches “color” and “colour” |
| {n} | Exactly n times | \d{4} matches exactly 4 digits |
| {n,} | n or more times | \d{2,} matches 2 or more digits |
| {n,m} | Between n and m times | \d{2,4} matches 2–4 digits |
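Here is a small Python sketch that exercises the quantifiers from the table. The test strings are arbitrary; note how \d{2,4} splits a five-digit run into a four-digit match and drops the leftover single digit.

```python
import re

# "?" makes the preceding "u" optional, but allows at most one.
spellings = [w for w in ("color", "colour", "colouur") if re.fullmatch(r"colou?r", w)]

# {2,4} takes as many digits as it can, up to four per match.
runs = re.findall(r"\d{2,4}", "1 22 333 4444 55555")

# {4} means exactly four when the whole string must match.
year_ok = re.fullmatch(r"\d{4}", "2026") is not None
year_too_long = re.fullmatch(r"\d{4}", "20260") is not None

# "*" accepts the empty string, "+" does not.
star_empty = re.fullmatch(r"a*", "") is not None
plus_empty = re.fullmatch(r"a+", "") is not None

print(spellings, runs, year_ok, year_too_long, star_empty, plus_empty)
```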
Greedy vs Lazy
By default, quantifiers are greedy — they match as much as possible. Add ? to make them lazy (match as little as possible):
Input: <b>bold</b> and <i>italic</i>
Greedy: <.+> → matches "<b>bold</b> and <i>italic</i>"
Lazy: <.+?> → matches "<b>", "</b>", "<i>", "</i>"
| Greedy | Lazy | Behavior |
|---|---|---|
| * | *? | 0 or more, lazy |
| + | +? | 1 or more, lazy |
| ? | ?? | 0 or 1, lazy |
| {n,m} | {n,m}? | n–m times, lazy |
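The greedy and lazy results shown above can be reproduced with Python's re.findall. This is a minimal sketch using the same input string as the example.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the string.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" it can.
lazy = re.findall(r"<.+?>", html)

print(greedy)
print(lazy)
```

For tag-like input, a negated class such as <[^>]+> gives the same result as the lazy version here and usually does less backtracking.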
Anchors
Anchors match a position in the string, not a character.
| Anchor | Matches |
|---|---|
| ^ | Start of string (or start of line in multiline mode) |
| $ | End of string (or end of line in multiline mode) |
| \b | Word boundary (between \w and \W) |
| \B | Non-word boundary |
| \A | Start of string (Python/Ruby; not supported in JS) |
| \Z | End of string (Python/Ruby; not supported in JS) |
Examples:
^\d+$ → Entire string is digits only
\bcat\b → The word "cat" (not "catch" or "concatenate")
^Hello → String starts with "Hello"
world$ → String ends with "world"
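The anchor examples can be checked with a short Python sketch. The sample strings are made up, and the last two lines show how multiline mode changes what ^ means.

```python
import re

all_digits = re.search(r"^\d+$", "12345") is not None
mixed = re.search(r"^\d+$", "123a5") is not None

# \b keeps "cat" from matching inside longer words.
candidates = ("cat", "a cat sat", "catch", "concatenate")
word_hits = [s for s in candidates if re.search(r"\bcat\b", s)]

text = "one two\nthree four"
# Without the flag, ^ matches only at the very start of the string.
first_words_default = re.findall(r"^\w+", text)
# With re.MULTILINE, ^ also matches after each newline.
first_words_multiline = re.findall(r"^\w+", text, re.MULTILINE)

print(all_digits, mixed, word_hits, first_words_default, first_words_multiline)
```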
Groups and Backreferences
Capturing Groups
Parentheses ( ) create a capturing group, which saves the matched text for later use.
(\d{4})-(\d{2})-(\d{2})
Applied to "2026-04-14":
- Group 1 → 2026
- Group 2 → 04
- Group 3 → 14
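In Python, the captured groups are available on the match object. A minimal sketch using the same date pattern:

```python
import re

m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", "2026-04-14")

whole = m.group(0)   # group 0 is always the entire match
year = m.group(1)
parts = m.groups()   # all numbered groups as a tuple

print(whole, year, parts)
```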
Non-Capturing Groups
Use (?:...) when you need grouping but don’t need to capture:
(?:https?|ftp):// → Matches "http://", "https://", or "ftp://"
without capturing the protocol
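One place where the choice between capturing and non-capturing groups is visible is Python's re.findall, which returns the group contents when a capturing group is present and the full match otherwise. The URLs below are placeholder examples.

```python
import re

text = "http://a.example ftp://b.example https://c.example"

# Capturing group: findall returns only what the group captured.
captured = re.findall(r"(https?|ftp)://", text)

# Non-capturing group: findall returns the whole match.
uncaptured = re.findall(r"(?:https?|ftp)://", text)

# The compiled pattern reports how many capturing groups it has.
capturing_count = re.compile(r"(https?|ftp)://").groups
non_capturing_count = re.compile(r"(?:https?|ftp)://").groups

print(captured, uncaptured, capturing_count, non_capturing_count)
```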
Named Capturing Groups
Name your groups with (?<name>...) for readable references (Python spells this (?P<name>...)):
const re = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const m = re.exec('2026-04-14');
console.log(m.groups.year); // "2026"
console.log(m.groups.month); // "04"
Backreferences
Refer back to a captured group with \1, \2, etc. For named groups, use \k<name> in JavaScript or (?P=name) in Python:
(['"])(.*?)\1 → Matches text in matching quotes (single or double)
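The quote-matching pattern above can be tried in Python as follows. The sample sentences are invented, and the second pattern is an extra illustration of a named backreference in Python syntax, used here to spot a doubled word.

```python
import re

# Triple-quoted raw string so the pattern can contain both quote characters.
quoted = re.compile(r"""(['"])(.*?)\1""")

text = '''He said "it's fine" and 'ok'.'''
contents = [m.group(2) for m in quoted.finditer(text)]

# The closing quote must be the same character as the opening one.
mismatched = quoted.search("\"unclosed'")

# Named backreference: (?P=word) must repeat whatever "word" captured.
doubled = re.search(r"(?P<word>\b\w+)\s+(?P=word)\b", "it is is fine")

print(contents, mismatched, doubled.group(0))
```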
Lookaheads and Lookbehinds
Lookaround assertions check what’s before or after a position without including it in the match.
| Syntax | Type | Description |
|---|---|---|
| (?=...) | Positive lookahead | Match if followed by ... |
| (?!...) | Negative lookahead | Match if NOT followed by ... |
| (?<=...) | Positive lookbehind | Match if preceded by ... |
| (?<!...) | Negative lookbehind | Match if NOT preceded by ... |
Examples:
\d+(?= dollars) → Matches a number only if followed by " dollars"
"100 dollars" → matches "100"
"100 euros" → no match
(?<=\$)\d+ → Matches digits only if preceded by "$"
"$500 and €200" → matches "500"
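\b(?!\w*ing\b)\w+\b → Words NOT ending in "ing" (the lookahead must come before \w+; placed after it, \w+ has already consumed the whole word and the check always passes)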
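All four lookaround types can be checked with a short Python sketch. The inputs are made up; for the "not ending in ing" case, one working approach is to put the negative lookahead at the start of the word so it can scan ahead to the word's end.

```python
import re

# Positive lookahead: " dollars" must follow but is not part of the match.
dollar_amounts = re.findall(r"\d+(?= dollars)", "100 dollars and 200 euros")

# Positive lookbehind: "$" must come first but is not part of the match.
after_sign = re.findall(r"(?<=\$)\d+", "$500 and €200")

# Negative lookahead at the word start: reject words whose tail is "ing".
not_ing = re.findall(r"\b(?!\w*ing\b)\w+\b", "running fast while singing loud")

# Negative lookbehind: whole numbers not directly preceded by "$".
no_sign = re.findall(r"(?<!\$)\b\d+", "$500 and 200")

print(dollar_amounts, after_sign, not_ing, no_sign)
```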
Flags
Flags change how the regex engine interprets the pattern.
| Flag | JS | Python | Effect |
|---|---|---|---|
| Global | g | (N/A, use findall) | Find all matches |
| Case-insensitive | i | re.IGNORECASE | Case-insensitive matching |
| Multiline | m | re.MULTILINE | ^/$ match line boundaries |
| Dotall | s | re.DOTALL | . matches \n |
| Unicode | u | (default in Python 3) | Full Unicode support |
| Verbose | (N/A) | re.VERBOSE | Allow comments and whitespace in pattern |
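The Python column of the table can be illustrated with a few calls. This is a sketch with invented inputs; flags are combined with the | operator.

```python
import re

any_case = re.findall(r"cat", "Cat cAt cat", re.IGNORECASE)

# By default "." stops at a newline; re.DOTALL lifts that restriction.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"a.b", "a\nb", re.DOTALL)

# re.MULTILINE lets $ match at the end of every line.
line_ends = re.findall(r"\w+$", "first line\nsecond row", re.MULTILINE)

# re.VERBOSE ignores pattern whitespace and allows # comments.
date_re = re.compile(
    r"""
    (\d{4})  # year
    -
    (\d{2})  # month
    -
    (\d{2})  # day
    """,
    re.VERBOSE,
)
date_parts = date_re.fullmatch("2026-04-14").groups()

combined = re.findall(r"^cat$", "CAT\ncat", re.IGNORECASE | re.MULTILINE)

print(any_case, dot_default, dot_all, line_ends, date_parts, combined)
```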
Common Patterns
Copy-paste ready patterns for common validation tasks:
Email Address
^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$
URL
https?:\/\/(www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&/=]*)
IPv4 Address
\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b
Date (YYYY-MM-DD)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
US Phone Number
^\+?1?\s?(\(\d{3}\)|\d{3})[\s.\-]?\d{3}[\s.\-]?\d{4}$
Strong Password (8+ chars, uppercase, lowercase, digit, special)
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
Hex Color Code
^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$
Slug (URL-friendly string)
^[a-z0-9]+(?:-[a-z0-9]+)*$
Credit Card (general)
^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9]{2})[0-9]{12})$
Whitespace-only string
^\s*$
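As a rough sketch of how these patterns might be wired into validation code, the Python below wraps a few of them in a lookup table. The names PATTERNS and is_valid are hypothetical helpers, not part of any library. re.fullmatch is used so that a trailing newline is rejected, which $ alone would allow in Python.

```python
import re

# Hypothetical registry of patterns copied from the list above.
PATTERNS = {
    "date": r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$",
    "hex_color": r"^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$",
    "slug": r"^[a-z0-9]+(?:-[a-z0-9]+)*$",
    "blank": r"^\s*$",
}


def is_valid(kind: str, value: str) -> bool:
    """Return True when value matches the named pattern in full."""
    return re.fullmatch(PATTERNS[kind], value) is not None


print(is_valid("date", "2026-04-14"))   # well-formed date
print(is_valid("date", "2026-13-01"))   # month out of range
print(is_valid("hex_color", "#fff"))
print(is_valid("slug", "my-post-1"))
```

These patterns check shape only: the date pattern accepts 2026-02-31, so a real calendar check still needs a date library.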
Regex in JavaScript
JavaScript has two ways to create a regex:
// Literal syntax (preferred for static patterns)
const re = /\d+/g;
// Constructor (for dynamic patterns; backslashes must be doubled inside a string)
const pattern = '\\d+';
const dynamicRe = new RegExp(pattern, 'g');
Key Methods
| Method | Returns | Use case |
|---|---|---|
| str.match(re) | Array of matches (or null) | Get all matches with g flag |
| str.matchAll(re) | Iterator of match objects | Get matches with capture groups |
| str.search(re) | Index of first match (or -1) | Test existence, get position |
| str.replace(re, replacement) | New string | Replace matches |
| str.replaceAll(re, replacement) | New string | Replace all (requires g flag) |
| str.split(re) | Array of substrings | Split on pattern |
| re.test(str) | true / false | Fast existence check |
| re.exec(str) | Match object (or null) | Low-level, iterates with g flag |
Replace with Function
const result = 'hello world'.replace(/\b\w/g, char => char.toUpperCase());
// → "Hello World"
Named Groups in Replace
const date = '2026-04-14';
const formatted = date.replace(
/(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/,
'$<d>/$<m>/$<y>'
);
// → "14/04/2026"
Regex in Python
Python’s re module provides full regex support.
import re
# Sample input for the calls below
text = '42 apples and 7 pears'
# Compile once to reuse the pattern object
pattern = re.compile(r'\d+')
pattern.findall(text) # ['42', '7']
# Common functions
re.match(r'^\d+', text) # Match at beginning only
re.search(r'\d+', text) # Search anywhere
re.findall(r'\d+', text) # Return list of all matches
re.finditer(r'\d+', text) # Return iterator of match objects
re.sub(r'\d+', 'N', text) # Replace matches
re.split(r'\s+', text) # Split on pattern
Always use raw strings (r'...') for regex in Python to avoid double-escaping backslashes.
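To see why raw strings matter, compare what Python stores for the same pattern text with and without the r prefix. In this small sketch, \b in a normal string literal becomes a backspace character before the regex engine ever sees it.

```python
import re

plain = "\bcat\b"    # \b is a backspace character (0x08) here
raw = r"\bcat\b"     # backslashes are kept, so the regex sees \b

plain_len = len(plain)
raw_len = len(raw)

# The plain version looks for backspace characters and finds nothing.
plain_hit = re.search(plain, "a cat")
raw_hit = re.search(raw, "a cat")

# Doubling every backslash is the alternative to a raw string.
doubled_equals_raw = "\\bcat\\b" == raw

print(plain_len, raw_len, plain_hit, raw_hit.group(0), doubled_equals_raw)
```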
Named Groups in Python
m = re.search(r'(?P<year>\d{4})-(?P<month>\d{2})', '2026-04')
print(m.group('year')) # "2026"
print(m.group('month')) # "04"
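Python's re.sub also accepts a function as the replacement and supports named group references in the replacement string, written as \g<name>. The sketch below mirrors the two JavaScript replace examples from the previous section.

```python
import re

# Replacement function: called once per match with the match object.
title_cased = re.sub(r"\b\w", lambda m: m.group(0).upper(), "hello world")

# Named groups referenced in the replacement with \g<name>.
reordered = re.sub(
    r"(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})",
    r"\g<d>/\g<m>/\g<y>",
    "2026-04-14",
)

print(title_cased)
print(reordered)
```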
Frequently Asked Questions
What’s the difference between + and *?
+ requires at least one match; * allows zero matches. Use + when the element must appear at least once.
Why does my . not match newlines?
By default, . matches any character except newline. Enable the dotall flag (s in JavaScript, re.DOTALL in Python) to make . match newlines too.
What does ^ mean inside and outside [ ]?
Outside brackets, ^ anchors to the start of the string. Inside brackets [^abc], it negates the character class (match anything except the listed chars).
When should I use (?:...) vs (...)?
Use non-capturing (?:...) whenever you need grouping (for quantifiers or alternation) but don’t need to reference the matched text later. It skips storing the capture and leaves the numbering of the remaining groups unchanged.
How do I match a literal special character like . or *?
Escape it with a backslash: \. matches a literal period, \* matches a literal asterisk.
Ready to test these patterns? Open the Genbox Regex Tester and try them live.