Question 1

What are regular expressions and why are they useful?

Accepted Answer

Regular expressions (regex) are sequences of characters that define search patterns for matching, finding, and replacing text. They are built into virtually every programming language (JavaScript, Python, Java, C#, Go, Ruby, PHP) and text editor (VS Code, Sublime Text, vim). They are essential for tasks like input validation (email, phone numbers, URLs), log file parsing, find-and-replace operations, data extraction from unstructured text, and URL routing in web frameworks.

Question 2

What do the regex flags g, i, m, s, and u mean?

Accepted Answer

g (global) finds all matches instead of stopping at the first. i (case-insensitive) ignores case differences, so "hello" matches "Hello" and "HELLO". m (multiline) makes ^ and $ match at the start and end of each line instead of only at the start and end of the entire string. s (dotAll) makes the dot metacharacter match newline characters, which it does not match by default. u (unicode) enables full Unicode matching, including surrogate pairs and Unicode property escapes like \p{Letter}.
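
The single-letter flags above are spelled differently from language to language. As a rough sketch, here is how the same behaviors can be reproduced with Python's standard `re` module, where the flags are named constants and "global" matching is a choice of function (`re.findall`) rather than a flag. The sample text is made up for illustration. Unicode matching is already the default for `str` patterns in Python 3, so no counterpart to u is shown.

```python
import re

text = "Hello\nhello\nHELLO"  # illustrative sample, three lines

# g (global): re.search stops at the first match, re.findall returns all of them.
first = re.search(r"hello", text, flags=re.IGNORECASE).group()   # "Hello"

# i (case-insensitive): without the flag only the lowercase line matches.
case_sensitive = re.findall(r"hello", text)                       # ["hello"]
case_insensitive = re.findall(r"hello", text, flags=re.IGNORECASE)
# ["Hello", "hello", "HELLO"]

# m (multiline): ^ and $ match at every line, not just the whole string.
whole_string = re.findall(r"^h\w+$", text, flags=re.IGNORECASE)   # []
per_line = re.findall(r"^h\w+$", text, flags=re.IGNORECASE | re.MULTILINE)
# ["Hello", "hello", "HELLO"]

# s (dotAll): by default the dot does not match the newline between lines.
dot_default = re.search(r"Hello.hello", text)                     # None
dot_all = re.search(r"Hello.hello", text, flags=re.DOTALL)        # a match
```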

Question 3

What is the difference between greedy and lazy quantifiers?

Accepted Answer

Greedy quantifiers (*, +, ?) match as much text as possible, then backtrack if needed for the rest of the pattern to match. Lazy quantifiers (*?, +?, ??) match as little text as possible, expanding only if needed. For example, given the text "<b>bold</b>", the greedy pattern "<.*>" matches the entire string from the first < to the last >, while the lazy pattern "<.*?>" matches just "<b>" and then "</b>" as two separate matches.
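
The difference is easy to check directly. A minimal sketch in Python, assuming a sample string with one pair of HTML-style tags (the string itself is an illustrative choice):

```python
import re

text = "<b>bold</b>"  # assumed sample input

# Greedy: .* grabs as much as it can, so one match spans first < to last >.
greedy = re.findall(r"<.*>", text)    # ["<b>bold</b>"]

# Lazy: .*? stops at the first > it can, so each tag matches separately.
lazy = re.findall(r"<.*?>", text)     # ["<b>", "</b>"]

print(greedy)
print(lazy)
```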

Question 4

What are lookahead and lookbehind assertions?

Accepted Answer

Lookahead (?=...) and lookbehind (?<=...) are zero-width assertions that match a position without consuming characters. Positive lookahead (?=X) matches a position followed by X. Negative lookahead (?!X) matches a position NOT followed by X. Positive lookbehind (?<=X) matches a position preceded by X. Negative lookbehind (?<!X) matches a position NOT preceded by X. These are powerful for matching patterns that depend on surrounding context — for example, (?<=\$)\d+ matches digits that follow a dollar sign without including the dollar sign in the match.
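
A short sketch of all four assertions in Python follows. The sample string and the " USD" suffix are illustrative assumptions. The `\b` anchors in the negative cases keep the engine from satisfying the assertion by matching only part of a number.

```python
import re

text = "10 USD, 20 EUR, $30"  # assumed sample input

# Positive lookahead: digits followed by " USD" (the suffix is not consumed).
followed_by_usd = re.findall(r"\d+(?= USD)", text)          # ["10"]

# Negative lookahead: whole numbers NOT followed by " USD".
not_followed_by_usd = re.findall(r"\b\d+\b(?! USD)", text)  # ["20", "30"]

# Positive lookbehind: digits preceded by a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", text)              # ["30"]

# Negative lookbehind: whole numbers NOT preceded by a dollar sign.
not_after_dollar = re.findall(r"(?<!\$)\b\d+", text)        # ["10", "20"]
```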

Question 5

What is catastrophic backtracking in regex?

Accepted Answer

Catastrophic backtracking occurs when a pattern gives a backtracking regex engine many equivalent ways to match the same text, so that on an input that almost matches, the engine tries a number of paths that grows exponentially with the input length, freezing or crashing your application. Common triggers include nested quantifiers like (a+)+ and alternatives that can match the same text. To avoid it, use specific character classes instead of dot-star, avoid nesting quantifiers, and test patterns against worst-case inputs such as long strings with near-matches before deploying to production.
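
The effect can be observed with a small timing experiment. The sketch below, in Python, compares the nested pattern (a+)+$ with a+$, which accepts the same strings using a single quantifier. The helper name `time_match` and the input sizes are illustrative choices, and actual timings depend on the machine and engine, so none are shown. The sizes are kept small on purpose; increasing them by even a few characters can make the nested pattern take a very long time.

```python
import re
import time

NESTED = re.compile(r"(a+)+$")  # nested quantifiers: many ways to split a run of a's
FLAT = re.compile(r"a+$")       # same accepted strings, one quantifier

def time_match(pattern, text):
    """Return (matched, seconds) for one match attempt anchored at the start."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start

if __name__ == "__main__":
    for n in (12, 14, 16, 18):
        # A near-match: a run of a's followed by one character that breaks the match.
        text = "a" * n + "b"
        _, nested_seconds = time_match(NESTED, text)
        _, flat_seconds = time_match(FLAT, text)
        print(f"n={n}: nested {nested_seconds:.5f}s, flat {flat_seconds:.5f}s")
```

On a backtracking engine the nested column should grow much faster than the flat one as n increases, which is the signature to look for when testing your own patterns.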

Question 6

How do I validate an email address with regex?

Accepted Answer

A practical email regex is [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,} which covers the vast majority of real-world email addresses. The full RFC 5322 specification is extremely complex and rarely implemented in practice. For production applications, use a simple regex for basic format checking, then verify the address by sending a confirmation email. No regex can confirm an email address actually exists or is reachable.
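
As a minimal sketch of the basic format check, here is the pattern above wrapped in a Python helper. The function name `looks_like_email` and the sample addresses are hypothetical. `fullmatch` is used so the whole string has to fit the pattern, not just a substring of it.

```python
import re

# The pattern from the answer above, compiled once for reuse.
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def looks_like_email(address: str) -> bool:
    """Format check only: says nothing about whether the mailbox exists."""
    return EMAIL_RE.fullmatch(address) is not None

print(looks_like_email("user.name+tag@example.co.uk"))  # True
print(looks_like_email("user@example"))                 # False: no dot and TLD
print(looks_like_email("@example.com"))                 # False: empty local part
```

A passing result here only means the address is worth sending a confirmation email to.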

Pattern	Name	Meaning	Example
`.`	Dot	Any character except newline	`a.c` matches "abc", "a1c"
`^`	Caret	Start of string (or line with m flag)	`^Hello` matches "Hello World"
`$`	Dollar	End of string (or line with m flag)	`end$` matches "the end"
`\d`	Digit	Any digit [0-9]	`\d{3}` matches "123"
`\w`	Word	Word character [a-zA-Z0-9_]	`\w+` matches "hello_world"
`\s`	Space	Whitespace (space, tab, newline)	`\s+` matches " " (spaces)
`\b`	Boundary	Word boundary (between \w and \W)	`\bcat\b` matches "cat" not "cats"
`[abc]`	Char class	Any one of the listed characters	`[aeiou]` matches vowels
`[^abc]`	Negated class	Any character NOT listed	`[^0-9]` matches non-digits
`(x|y)`	Alternation	Match x or y	`(cat|dog)` matches either
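
Several rows of the table can be checked with a few lines of Python. This is only a sketch: the input strings are illustrative, chosen to match or extend the examples in the table.

```python
import re

# Dot: any single character between "a" and "c".
dot = [s for s in ("abc", "a1c", "ac") if re.fullmatch(r"a.c", s)]  # ["abc", "a1c"]

# Anchors: start and end of string.
starts = re.search(r"^Hello", "Hello World") is not None            # True
ends = re.search(r"end$", "the end") is not None                    # True

# Word boundary: "cat" as a whole word, not inside "cats" or "concat".
whole_words = re.findall(r"\bcat\b", "cat cats concat cat.")        # ["cat", "cat"]

# Character classes: listed characters vs. everything except them.
vowels = re.findall(r"[aeiou]", "regex")                            # ["e", "e"]
non_digits = re.findall(r"[^0-9]", "a1b2")                          # ["a", "b"]

# Alternation: either branch may match.
animals = re.findall(r"(cat|dog)", "cat, dog, bird")                # ["cat", "dog"]
```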

Quantifier	Meaning	Greedy	Lazy
`*`	0 or more	`a*` — as many as possible	`a*?` — as few as possible
`+`	1 or more	`a+`	`a+?`
`?`	0 or 1	`a?`	`a??`
`{n}`	Exactly n	`\d{4}` — exactly 4 digits	N/A
`{n,m}`	Between n and m	`\d{2,4}`	`\d{2,4}?`
`{n,}`	n or more	`\w{3,}`	`\w{3,}?`
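
The greedy and lazy columns behave differently on the same input. A small Python sketch, with made-up input strings:

```python
import re

# * and its lazy form: as many a's as possible vs. as few as possible (zero).
star_greedy = re.match(r"a*", "aaab").group()    # "aaa"
star_lazy = re.match(r"a*?", "aaab").group()     # ""

# + must match at least one character even when lazy.
plus_lazy = re.match(r"a+?", "aaab").group()     # "a"

# {n}: exactly four digits, nothing more.
exactly_four = re.fullmatch(r"\d{4}", "2024") is not None   # True
five_digits = re.fullmatch(r"\d{4}", "20245") is not None   # False

# {n,m}: greedy takes up to four digits, lazy stops at two.
range_greedy = re.findall(r"\d{2,4}", "12345")   # ["1234"]
range_lazy = re.findall(r"\d{2,4}?", "12345")    # ["12", "34"]
```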

Purpose	Pattern	Notes
Email (basic)	`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`	Covers most common email formats
URL	`https?://[^\s/$.?#].[^\s]*`	HTTP and HTTPS URLs
IPv4 address	`\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b`	Basic format; does not validate range 0-255
Date (YYYY-MM-DD)	`\d{4}-\d{2}-\d{2}`	ISO 8601 date shape; does not validate month or day ranges
Phone (US)	`\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}`	Matches (555) 123-4567, 555.123.4567, etc.
HTML tag	`<[^>]+>`	Matches opening and closing tags
Hex color	`#([0-9a-fA-F]{3}){1,2}\b`	Matches #FFF and #FF00FF
Password strength	`^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$`	Min 8 chars, upper, lower, digit
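
Several of these patterns check shape only, so it is common to pair them with a second, non-regex check. The Python sketch below shows one way to do that for the IPv4 and date rows, plus direct checks for the phone and hex color rows. The helper names (`is_valid_ipv4`, `is_real_date`) are hypothetical, and the follow-up checks are one reasonable approach rather than the only one.

```python
import re
from datetime import datetime

# Patterns copied from the table above.
IPV4_RE = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")
HEX_RE = re.compile(r"#([0-9a-fA-F]{3}){1,2}\b")

def is_valid_ipv4(text: str) -> bool:
    """Regex checks the shape; the range 0-255 is checked numerically."""
    if IPV4_RE.fullmatch(text) is None:
        return False
    return all(0 <= int(part) <= 255 for part in text.split("."))

def is_real_date(text: str) -> bool:
    """Regex checks the shape; datetime rejects dates such as February 30."""
    if DATE_RE.fullmatch(text) is None:
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True

print(is_valid_ipv4("192.168.1.1"), is_valid_ipv4("999.1.1.1"))   # True False
print(is_real_date("2024-02-29"), is_real_date("2024-02-30"))     # True False
print(PHONE_RE.fullmatch("(555) 123-4567") is not None)           # True
print(HEX_RE.fullmatch("#FFF") is not None)                       # True
```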

Regex Tester

Regular Expressions Explained: Syntax, Metacharacters, Patterns, and Advanced Features
