Dev Tools
Client-side: files never upload

Regex builder & live tester

Paste a pattern + a sample string. See matches highlighted, with all capture groups and named groups.

/(?<word>[\w.+-]+)@(?<host>[\w.-]+\.\w+)/i
(g is always on)
Matches (2)
alice@example.com sent the email.
bob@dev.9niche.com replied.
ignore this line.
Capture groups
  1. #1 @ 0–17
    alice@example.com
    • $1: alice
    • $2: example.com
    • word: alice
    • host: example.com
  2. #2 @ 34–52
    bob@dev.9niche.com
    • $1: bob
    • $2: dev.9niche.com
    • word: bob
    • host: dev.9niche.com
How to use
  1. Enter the pattern in the top-left box. Don't include the leading/trailing /.
  2. Toggle flags below: i / m / s / u / y. g is always on so you get every match.
  3. Right side shows match count, highlighted ranges, and capture / named groups for each match.
  4. Pattern errors appear inline (the browser's native message).
Tips
  • Need to grab an email? Try (?<word>[\w.+-]+)@(?<host>[\w.-]+\.\w+).
  • To prevent catastrophic backtracking freezing your tab, input is capped at 100,000 chars and 500 matches.
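
For readers who want the same output in a script, here is a rough sketch of what the right-hand panel computes. It assumes nothing about the tool's real source: the function name listMatches and the result shape are made up for illustration. It forces the g flag on, passes the native SyntaxError message through for a bad pattern, and lists each match's range and groups.

```javascript
// Hypothetical helper: not the tool's actual code, just the same idea.
function listMatches(pattern, flags, text) {
  // Force the g flag: matchAll() throws a TypeError without it.
  const allFlags = flags.includes('g') ? flags : flags + 'g';
  let re;
  try {
    re = new RegExp(pattern, allFlags);
  } catch (err) {
    // Invalid patterns throw a SyntaxError; pass the native message through.
    return { error: err.message, matches: [] };
  }
  const matches = [];
  for (const m of text.matchAll(re)) {
    matches.push({
      start: m.index,
      end: m.index + m[0].length,
      text: m[0],
      numbered: m.slice(1),           // $1, $2, ...
      named: { ...(m.groups ?? {}) }, // named groups, if the pattern has any
    });
  }
  return { error: null, matches };
}

const sample =
  'alice@example.com sent the email.\nbob@dev.9niche.com replied.\nignore this line.';
const result = listMatches('(?<word>[\\w.+-]+)@(?<host>[\\w.-]+\\.\\w+)', 'i', sample);
for (const m of result.matches) {
  console.log(`${m.start}-${m.end}`, m.text, m.named);
}
```

Run against the sample text above, this reports the same two ranges the panel shows (0 to 17 and 34 to 52).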

Regex in depth: capture groups, lookaround, catastrophic backtracking, cross-language differences

Regex is something most engineers use daily but never feel fully fluent in. Here are the techniques I actually reach for after years of QA / SRE work, plus the real cases where my browser locked up because of one careless quantifier.

Capture vs non-capturing vs named groups

All three use parentheses but mean different things:
- () Capture group: captures, referenceable as $1, $2
- (?:) Non-capturing group: groups but doesn't capture; slightly faster, so use these freely in complex patterns
- (?<name>) Named group: captures and can be referenced by name; wins on readability because you don't have to count positions weeks later

Example: matching an email
- Bad: (\w+)@(\w+)\.(\w+). Using $1 $2 $3 later, you have no idea which is which after a month.
- Good: name every group, e.g. (?<user>\w+)@(?<domain>\w+)\.(?<tld>\w+). Using groups.user is self-documenting.
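
A short sketch of the three group kinds side by side. The group names user / domain / tld / host are chosen for this example only.

```javascript
const email = 'alice@example.com';

// Numbered groups: you have to remember what $1, $2, $3 mean.
const numbered = email.match(/(\w+)@(\w+)\.(\w+)/);
console.log(numbered[1], numbered[2], numbered[3]); // alice example com

// Named groups: the same data under self-describing keys.
const named = email.match(/(?<user>\w+)@(?<domain>\w+)\.(?<tld>\w+)/);
const { user, domain, tld } = named.groups;

// Named groups also work in replacement strings via $<name>.
const masked = email.replace(/(?<user>\w+)@(?<host>[\w.]+)/, '***@$<host>');

// A non-capturing group groups without taking a slot:
// (?:www\.)? is optional, and the host is still group 1.
const host = 'www.example.com'.match(/^(?:www\.)?([\w.-]+)$/)[1];

console.log(user, domain, tld, masked, host);
```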

Lookahead / lookbehind: conditions without consuming

Lookaround tests a condition inside the regex engine without advancing the cursor, which is perfect when you want "simultaneous conditions":
- (?=...) Positive lookahead: what follows must match
- (?!...) Negative lookahead: what follows must not match
- (?<=...) Positive lookbehind: what precedes must match
- (?<!...) Negative lookbehind: what precedes must not match

Classic example, a password of 8+ characters with at least one digit AND one uppercase letter:
`` ^(?=.*\d)(?=.*[A-Z]).{8,}$ ``
Two lookaheads in parallel, no characters "consumed", just conditions checked. Much cleaner than chaining multiple regexes.
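
To see the password pattern and the lookbehind forms in action, here is a minimal sketch. The sample passwords and the price string are made up.

```javascript
// Each lookahead scans ahead from position 0 and then "rewinds",
// so both conditions are checked against the whole string.
const strongEnough = /^(?=.*\d)(?=.*[A-Z]).{8,}$/;

const candidates = ['Passw0rdX', 'password1', 'PASSWORD', 'Ab1'];
const verdicts = candidates.map((pw) => strongEnough.test(pw));
console.log(verdicts); // [ true, false, false, false ]

// Lookbehind checks what precedes the match without including it.
const text = 'cost: $42, qty 7';
const priced = text.match(/(?<=\$)\d+/)[0];     // digits right after "$"
const unpriced = text.match(/(?<!\$)\b\d+/)[0]; // a number NOT right after "$"
console.log(priced, unpriced); // 42 7
```

The \b in the last pattern matters: without it the engine would happily match the "2" of "42", since that digit is preceded by "4" rather than "$".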

Catastrophic backtracking: how regex freezes your browser

Nested quantifiers are the classic foot-gun. Anti-patterns: (a+)+, (a*)*, (a|a)*

With (a+)+b against aaaaaaaaaaaaaaaaa, the engine tries every possible way to split the run of a's into groups before giving up: O(2^n). I've seen 30 characters of test input lock up a browser for 8 seconds.

How to avoid:
1. Atomic groups (?>...) (no JS support, and Node.js is no exception because it runs the same V8 regex engine; Java / .NET have them)
2. Possessive quantifiers ++ *+ (same: no JS)
3. Audit your quantifiers for overlap ((\w+)+ collapses to \w+)
4. Cap input length (this site's tools cap regex input at 100,000 chars for exactly this reason)

JS-land usually relies on (3) and (4).
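
Options (3) and (4) can be sketched roughly like this. The limits reuse the numbers quoted for this site's tool; the function name cappedMatches is hypothetical and this is not the tool's actual implementation.

```javascript
const MAX_CHARS = 100_000;
const MAX_MATCHES = 500;

// (4) Cap the work: truncate the input and stop collecting after a fixed count.
function cappedMatches(re, input) {
  const text = input.length > MAX_CHARS ? input.slice(0, MAX_CHARS) : input;
  const flags = re.flags.includes('g') ? re.flags : re.flags + 'g';
  const globalRe = new RegExp(re.source, flags);
  const out = [];
  for (const m of text.matchAll(globalRe)) {
    out.push(m[0]);
    if (out.length >= MAX_MATCHES) break; // stop early
  }
  return out;
}

// (3) Audit for overlap: (\w+)+ accepts exactly the same strings as \w+,
// but the nested form can split one run of word characters in many ways.
const nested = /^(\w+)+$/;
const flat = /^\w+$/;
const probes = ['abc', 'a_1', '', 'a b'];
const agree = probes.every((s) => nested.test(s) === flat.test(s));

console.log(agree);
console.log(cappedMatches(/\d/, '7'.repeat(1000)).length);
```

Caps only bound the damage. A short pathological input can still be slow against a nested quantifier, so rewriting the pattern is the fix that actually removes the problem.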

JavaScript vs Python: differences that bite

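- Start / end anchors: JS doesn't have \A / \Z; use ^ and $ without the m flag (with m they match at every line boundary, not just at the ends of the string)
- Unicode: JS needs the u flag to get \p{Letter}; Python str patterns are Unicode-aware by default (\w matches non-ASCII letters), although the stdlib re has no \p{...} classes
- Lookbehind: Safari < 16.4 has no lookbehind at all, so your site breaks for those users. Build the pattern with new RegExp(...) inside try/catch and keep a fallback regex; a regex literal fails at parse time, before try/catch can help.
- re vs regex module (Python): the stdlib re doesn't support variable-length lookbehind; install the third-party regex module if you need it
- Sticky flag y: JS only; useful when writing tokenizers / lexers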
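
A minimal sketch of the lookbehind fallback and of how the anchors behave with and without m. The helper name firstSupported and the price patterns are illustrative assumptions.

```javascript
// Return a RegExp built from the first pattern source this engine can compile.
// new RegExp() is used on purpose: a regex literal with unsupported syntax
// fails when the script is parsed, before any try/catch can run.
function firstSupported(sources, flags) {
  for (const source of sources) {
    try {
      return new RegExp(source, flags);
    } catch (err) {
      // SyntaxError: this engine cannot compile the pattern; try the next one.
    }
  }
  throw new Error('no supported pattern');
}

// Preferred: lookbehind keeps the "$" out of the match.
// Fallback: consume the "$" and read the digits from group 1.
// Both variants expose the digits as group 1, so callers don't care which one won.
const price = firstSupported(['(?<=\\$)(\\d+)', '\\$(\\d+)'], '');
const amount = 'total: $42'.match(price)[1];

// Anchors: without m, ^ and $ refer to the whole string (like \A and \Z);
// with m, they also match at line boundaries.
const lines = 'first\nsecond';
const wholeString = /^second$/.test(lines);
const perLine = /^second$/m.test(lines);

console.log(amount, wholeString, perLine); // 42 false true
```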

Real QA scenarios where I use regex

The patterns I reach for most weeks:
- nginx access log parsing: extract IP / status / response time → feed into [percentile analysis](/en/tools/percentile)
- API response body checks: Robot Framework's Should Match Regexp is much sharper than Should Contain
- Test data validation: confirm the [credit card test data](/en/tools/tw-test-data) you generated matches the expected (\d{4}) (\d{4}) (\d{4}) (\d{4}) format
- Selenium dynamic IDs: grab userdata-([a-f0-9]{8}) and use the captured suffix
- Error log classification: pull file path + line number out of stack traces to rank flakiest modules
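
As one worked case, here is a sketch of the nginx access log extraction. It assumes the "combined" log format with $request_time appended as the last field; your log_format may differ, and the sample lines are invented.

```javascript
// ASSUMPTION: nginx "combined" format plus $request_time as the final field.
const ACCESS_LINE =
  /^(?<ip>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<request>[^"]*)" (?<status>\d{3}) (?<bytes>\d+|-) "[^"]*" "[^"]*" (?<rt>\d+(?:\.\d+)?)$/;

function parseAccessLine(line) {
  const m = ACCESS_LINE.exec(line);
  if (!m) return null; // line does not follow the assumed format
  const { ip, status, rt } = m.groups;
  return { ip, status: Number(status), responseTime: Number(rt) };
}

// Made-up sample lines (documentation-range IP addresses).
const logLines = [
  '203.0.113.7 - - [12/Mar/2024:10:15:32 +0000] "GET /api/users HTTP/1.1" 200 512 "-" "curl/8.4.0" 0.123',
  '198.51.100.2 - - [12/Mar/2024:10:15:33 +0000] "POST /api/login HTTP/1.1" 500 87 "-" "curl/8.4.0" 1.250',
  'not an access log line',
];

const parsed = logLines.map(parseAccessLine).filter(Boolean);
// Sorted response times, ready to feed into a percentile calculation.
const responseTimes = parsed.map((e) => e.responseTime).sort((a, b) => a - b);
console.log(parsed, responseTimes);
```

Lines that don't fit the assumed format come back as null and are dropped, which is usually what you want when a log mixes in error output.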

Try the patterns: paste each one into the [Regex tool](/en/tools/regex) and confirm matches live. The lookbehind-on-Safari case is the one that catches everyone.
