Regex Find and Replace: Capture Groups, Backreferences, and Practical Examples

A plain text search finds an exact string. Regex find-and-replace goes further: it matches patterns and lets you reuse parts of the match in the replacement. That combination handles tasks that are tedious or impossible with literal substitution — reformatting dates, redacting sensitive tokens, converting naming conventions. The String Replacer tool lets you run all the examples in this article directly in your browser, with no data leaving your machine.

Literal Replace vs Regex Replace

Literal (plain-text) replacement is faster, safer, and easier to read when you know the exact string. Use it for fixed tokens: replacing http:// with https://, or swapping one identifier for another.

Switch to regex when any of these apply:

  • The target varies in length or spelling — e.g., all timestamps regardless of value.
  • You need to reuse part of the matched text in the replacement.
  • You need to apply the operation to every line independently (multiline mode).
  • The match spans optional characters — e.g., trailing whitespace that may or may not exist.

The risk with regex is over-matching. A pattern like .* in the wrong place swallows more than intended. Start narrow, test on a sample, then broaden.
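As a small sketch of that risk, assume a made-up config line with two quoted values. The broad pattern runs to the last quote on the line, while a negated character class keeps each match inside one pair of quotes:

```javascript
// Hypothetical input: two quoted values on one line.
const line = 'name="alice" role="admin"';

// Too broad: .* is greedy, so the match runs to the LAST quote on the line.
const broad = line.replace(/".*"/, '"[X]"');
// => 'name="[X]"'

// Narrow: [^"]* cannot cross a quote, so each value is matched on its own.
const narrow = line.replace(/"[^"]*"/g, '"[X]"');
// => 'name="[X]" role="[X]"'
```

The narrow version is also easier to reason about, because the set of characters it can consume is stated in the pattern.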

Capture Groups

Parentheses in a regex pattern create a capture group. Whatever the group matches is remembered and available in the replacement string via a backreference. Groups are numbered left-to-right starting at 1, by their opening parenthesis.

Pattern:     (\w+)\s+(\d+)
Input:       file 42
Match:       file 42
  Group 1:   file
  Group 2:   42

To swap the order in the replacement, reference the groups by number. JavaScript uses $1, $2; Python and most POSIX tools use \1, \2:

// JavaScript
"file 42".replace(/(\w+)\s+(\d+)/, "$2 $1");
// => "42 file"

# Python
import re
re.sub(r"(\w+)\s+(\d+)", r"\2 \1", "file 42")
# => "42 file"

Non-capturing groups use (?:...). They group without consuming a slot number, which keeps your backreference numbering clean when you only need grouping for alternation or repetition.
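A brief illustration of how the numbering shifts, using an invented input string. Both calls keep only the surname, but the capturing version has to reach for $2 because the title alternation occupies slot 1:

```javascript
// Hypothetical input for illustration.
const input = "Dr. Smith";

// Capturing alternation: the title takes slot 1, so the surname is $2.
const withCapture = input.replace(/(Mr|Ms|Dr)\.\s+(\w+)/, "$2");
// => "Smith"

// Non-capturing alternation: the surname is the only group, so it is $1.
const withoutCapture = input.replace(/(?:Mr|Ms|Dr)\.\s+(\w+)/, "$1");
// => "Smith"
```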

Special Replacement Tokens

Replacement strings are not plain text: certain sequences carry special meaning. The table below covers the tokens supported by JavaScript's String.prototype.replace. Other engines spell them differently; Python, for example, writes \g<0> for the whole match, and sed uses &.

Token      Meaning                            Example replacement
$&         The entire matched substring       [$&] wraps each match in brackets
$`         The substring before the match     Rarely useful; repeats prefix text
$'         The substring after the match      Rarely useful; repeats suffix text
$1 to $9   Numbered capture group             $2-$1 reverses two captured words
$<name>    Named capture group (JavaScript)   $<year>-$<month>
$$         A literal $ sign                   Use when replacement text contains a dollar sign

A common pitfall: $& inserts the whole match, not group 1. If your pattern has one group and you write $& expecting the captured text, you get the full match instead. Use $1 for the first group.
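The tokens are easier to tell apart side by side. This sketch runs several of them against small invented inputs; the variable names are illustrative only:

```javascript
const pair = "key=value";
const pattern = /(\w+)=(\w+)/;

// $& is the whole match, regardless of how many groups the pattern has.
const wrapped = pair.replace(pattern, "[$&]");
// => "[key=value]"

// $1 and $2 are the individual groups.
const swapped = pair.replace(pattern, "$2=$1");
// => "value=key"

// $$ produces one literal dollar sign.
const dollar = pair.replace(pattern, "$1 costs $$5");
// => "key costs $5"

// $` and $' are the text before and after the match.
const context = "a-b-c".replace("b", "[$`|$']");
// => "a-[a-|-c]-c"
```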

Named Capture Groups

Numbered groups break when you add or remove a group, because the indices shift. Named groups solve this. The syntax is (?<name>...), and the backreference is $<name> in JavaScript or \g<name> in Python.

// Reformat ISO date: 2026-04-20 → 20/04/2026
const result = "2026-04-20".replace(
  /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/,
  "$<day>/$<month>/$<year>"
);
// => "20/04/2026"

# Python
import re
result = re.sub(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})",
    r"\g<day>/\g<month>/\g<year>",
    "2026-04-20"
)
# => "20/04/2026"

Python uses (?P<name>...) for the group and \g<name> for the backreference. JavaScript uses (?<name>...) and $<name>. .NET shares the (?<name>...) group syntax but writes the replacement as ${name}.

Greedy vs Non-Greedy Quantifiers

By default, quantifiers are greedy — they match as much as possible. Adding ? after a quantifier makes it non-greedy (lazy), matching as little as possible.

Consider stripping HTML tags. The greedy pattern <.*> on the string <b>bold</b> matches the entire string from the first < to the last > — not what you want. The non-greedy version matches each tag individually:

const html = "<b>bold</b> and <i>italic</i>";

// Greedy — matches from first < to last >
html.replace(/<.*>/g, "");
// => ""  (entire string consumed)

// Non-greedy — matches each tag separately
html.replace(/<.*?>/g, "");
// => "bold and italic"

Use non-greedy quantifiers (*?, +?, ??) whenever your pattern sits between two known delimiters and you want the shortest match. See the Regex Cheatsheet for a full quantifier reference.
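The same effect shows up with capture groups between delimiters. As a rough sketch, assume a Markdown-like input where ** marks bold text:

```javascript
// Hypothetical input: two bold spans on one line.
const md = "**one** and **two**";

// Greedy: the group stretches from the first ** to the last **.
const greedy = md.replace(/\*\*(.+)\*\*/g, "<b>$1</b>");
// => "<b>one** and **two</b>"

// Lazy: the group stops at the nearest closing **.
const lazy = md.replace(/\*\*(.+?)\*\*/g, "<b>$1</b>");
// => "<b>one</b> and <b>two</b>"
```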

Multiline Mode

Two flags change how ^ and $ behave:

  • m (multiline): ^ matches the start of each line; $ matches the end of each line. Without m, they only match the very start and end of the entire string.
  • s (dotall / single-line): . matches newline characters as well as everything else. Without s, . stops at \n.

A practical example: adding a prefix to every line in a block of text requires the m flag so ^ anchors to each line start:

const lines = "alpha\nbeta\ngamma";

// Without m flag: only matches start of entire string
lines.replace(/^/g, "- ");
// => "- alpha\nbeta\ngamma"

// With m flag: matches start of each line
lines.replace(/^/gm, "- ");
// => "- alpha\n- beta\n- gamma"

For multi-line block matching (e.g., everything between two markers), combine m and s. See Regex Newline Behavior for a deep dive into how different engines handle line endings.
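One way to sketch the combined flags: remove everything between two marker lines, markers included. The BEGIN and END comment markers here are invented for the example. The m flag lets ^ and $ anchor to the marker lines, and the s flag lets .*? cross the newlines in between:

```javascript
// Hypothetical document with a marked block in the middle.
const doc = "keep\n<!-- BEGIN -->\nsecret 1\nsecret 2\n<!-- END -->\nkeep too";

// m: ^ and $ match at line boundaries. s: . also matches \n.
const stripped = doc.replace(/^<!-- BEGIN -->$.*?^<!-- END -->$\n?/ms, "");
// => "keep\nkeep too"

// Without s, . cannot cross a newline, so nothing matches and the text is unchanged.
const unchanged = doc.replace(/^<!-- BEGIN -->$.*?^<!-- END -->$\n?/m, "");
```

The lazy .*? matters here as well: with several marked blocks in one document, a greedy .* would remove everything from the first BEGIN to the last END.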

Case-Preserving Replace

Most regex engines, JavaScript's included, have no built-in case-preserving replace. When you rename a symbol (say userId to accountId), you often want each occurrence to keep its casing style: userId becomes accountId, UserId becomes AccountId, and USER_ID becomes ACCOUNT_ID.

The manual approach in JavaScript is to pass a function as the replacement argument, detect the casing of the match, and apply the same transform to the replacement string:

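function preserveCase(match, replacement) {
  // All-caps match (USER_ID): convert the camelCase replacement to SCREAMING_SNAKE_CASE
  if (match === match.toUpperCase()) {
    return replacement.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toUpperCase();
  }
  // Leading capital (UserId): capitalize the first letter
  if (match[0] === match[0].toUpperCase()) {
    return replacement[0].toUpperCase() + replacement.slice(1);
  }
  // Otherwise (userId): use the replacement as written
  return replacement;
}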

const text = "userId, UserId, USER_ID";
const result = text.replace(/userId|UserId|USER_ID/gi, (match) =>
  preserveCase(match, "accountId")
);
// => "accountId, AccountId, ACCOUNT_ID"

Common Tasks

Log Redaction — Emails and Tokens

Before sharing logs or committing fixtures, redact personally identifiable information and secrets. Two patterns cover the most common cases:

// Redact email addresses
text.replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[EMAIL]");

// Redact bearer tokens (Authorization header values)
text.replace(/Bearer\s+[\w.-]+/gi, "Bearer [REDACTED]");

// Redact secrets passed as query parameters (api_key=, token=, secret=)
text.replace(/([?&](?:api_key|token|secret)=)[\w-]+/gi, "$1[REDACTED]");
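To see the three patterns working together, here is a minimal sketch that chains them in one function and runs it on a made-up log line. The redact name and the sample values are assumptions for illustration, not part of any library:

```javascript
// Hypothetical helper that applies the three redaction patterns in sequence.
function redact(text) {
  return text
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[EMAIL]")
    .replace(/Bearer\s+[\w.-]+/gi, "Bearer [REDACTED]")
    .replace(/([?&](?:api_key|token|secret)=)[\w-]+/gi, "$1[REDACTED]");
}

// Invented log line containing an email, a query-string key, and a bearer token.
const sample =
  "user=jane.doe@example.com GET /v1/items?api_key=abc123 Authorization: Bearer eyJhbGci.payload.sig";

const clean = redact(sample);
// => "user=[EMAIL] GET /v1/items?api_key=[REDACTED] Authorization: Bearer [REDACTED]"
```

These patterns are deliberately loose and will miss unusual formats, so it is worth checking the output against real samples before relying on it.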

Rename Patterns — camelCase to snake_case

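Converting camelCase identifiers to snake_case is a classic backreference task. Each uppercase letter that follows a lowercase letter or digit marks a word boundary. Acronyms need a second rule: a run of capitals ends just before a capital that is followed by a lowercase letter.

function camelToSnake(str) {
  return str
    // Split an acronym from the next word: "APIClient" -> "API_Client"
    .replace(/([A-Z]+)([A-Z][a-z])/g, "$1_$2")
    // Split a lowercase letter or digit from the following capital
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2")
    .toLowerCase();
}

camelToSnake("getUserById");   // => "get_user_by_id"
camelToSnake("myAPIClient");   // => "my_api_client"
camelToSnake("parseHTTPSUrl"); // => "parse_https_url"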

Reordering — Date Format Conversion

Rearranging date fields is the textbook use case for named groups:

// US format MM/DD/YYYY → ISO 8601 YYYY-MM-DD
const usToIso = (date) =>
  date.replace(
    /(?<month>\d{2})\/(?<day>\d{2})\/(?<year>\d{4})/g,
    "$<year>-$<month>-$<day>"
  );

usToIso("04/20/2026");  // => "2026-04-20"
usToIso("12/31/2025");  // => "2025-12-31"

Code Examples Across Languages

JavaScript — replace vs replaceAll

const s = "foo bar foo";

// replace with string: only first occurrence
s.replace("foo", "baz");        // => "baz bar foo"

// replace with regex + g flag: all occurrences
s.replace(/foo/g, "baz");       // => "baz bar baz"

// replaceAll with string: all occurrences (ES2021)
s.replaceAll("foo", "baz");     // => "baz bar baz"

// replaceAll with regex: requires g flag, else TypeError
s.replaceAll(/foo/g, "baz");    // => "baz bar baz"

Python — re.sub with a Callable

Python's re.sub accepts a function as the replacement argument. The function receives the match object and returns the replacement string — useful when the replacement depends on the matched content:

import re

def mask_digits(m):
    return "*" * len(m.group())

result = re.sub(r"\d+", mask_digits, "Order 12345, invoice 678")
# => "Order *****, invoice ***"

# Count-limited substitution: replace only first 2 occurrences
re.sub(r"\d+", "N", "1 and 2 and 3", count=2)
# => "N and N and 3"

sed — Basic vs Extended Regex

sed defaults to Basic Regular Expressions (BRE), where grouping parentheses must be escaped as \( and \). The -E flag enables Extended Regular Expressions (ERE) with unescaped (), +, and ?:

# BRE: parentheses must be escaped
echo "2026-04-20" | sed 's/\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)/\3\/\2\/\1/'
# => 20/04/2026

# ERE: cleaner syntax
echo "2026-04-20" | sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3\/\2\/\1/'
# => 20/04/2026

In sed replacement strings, backreferences are always \1, \2, etc.; there is no $1 syntax. The & token (no dollar sign) inserts the whole match, equivalent to $& in JavaScript.

Pitfalls

  • replace with a string replacement replaces only the first match. In JavaScript, "a a a".replace("a", "b") returns "b a a". Use a regex with the g flag, or replaceAll, for global replacement.
  • A $ in the replacement string can be read as a token. In JavaScript, "$10" inserts capture group 10 if the pattern has one, or group 1 followed by a literal 0 if the pattern has between one and nine groups. To insert an actual dollar sign, write "$$10".
  • Catastrophic backtracking (ReDoS). Patterns like (a+)+ or (\w+\s*)+$ can take exponential time on certain inputs because the engine tries an enormous number of ways to match. Avoid nested quantifiers over the same character class, and test suspicious patterns against long inputs before deploying.
  • $& vs $1 confusion. $& is the entire match; $1 is the first capture group. If your pattern wraps the whole match in a single group, both refer to the same text — but they diverge the moment the pattern contains any structure outside the group.
  • Forgetting the g flag with regex in JavaScript. A regex without g passed to replace replaces only the first occurrence, silently. No error, no warning — just an incomplete substitution.
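The dollar-sign pitfall is the easiest one to hit when the replacement text comes from a variable. A minimal sketch of two ways around it, assuming a hypothetical escapeReplacement helper (not a built-in):

```javascript
// Hypothetical helper: double every $ so the text is inserted literally.
// "$$$$" in a replacement string yields two literal dollar signs.
function escapeReplacement(text) {
  return text.replace(/\$/g, "$$$$");
}

const pattern = /\((\w+)\)/;
const price = "$1 off"; // meant literally, but $1 looks like a group reference

// Naive: $1 is expanded to the captured word.
const naive = "Deal: (save)".replace(pattern, price);
// => "Deal: save off"

// Escaped: the dollar sign survives.
const escaped = "Deal: (save)".replace(pattern, escapeReplacement(price));
// => "Deal: $1 off"

// Function replacement: the return value is never scanned for tokens.
const viaFunction = "Deal: (save)".replace(pattern, () => price);
// => "Deal: $1 off"
```

The function form is usually the simpler choice when the replacement is user input, since it needs no escaping at all.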

Try all the patterns from this article interactively in the String Replacer — paste your text, enter a pattern and replacement, toggle the g, m, and i flags, and see results update in real time with no data leaving your browser.