Text Diff Guide: Compare Files, Configs, and Code Like a Developer
A diff tool answers one question: what changed? Whether you're reviewing a pull request, comparing two config files, tracking down a regression, or auditing what a deployment script modified — diffing is one of the most frequently used operations in a developer's day. This guide covers how diff algorithms work, how to read their output, when to use each granularity, and practical use cases. Use the Text Comparer to follow along.
How Diff Works: The Myers Algorithm
The standard diff algorithm used by git diff, GNU diff, and most modern tools is the Myers algorithm, published by Eugene Myers in 1986. It solves the Longest Common Subsequence (LCS) problem: find the largest set of lines that appear in both texts in the same order, then mark everything else as added or removed.
Consider two versions of a config file:
# Version A              # Version B
host: localhost          host: db.prod.example.com
port: 5432               port: 5432
database: myapp_dev      database: myapp_prod
user: admin              user: app_user
password: secret         password: ••••••••
                         ssl: true
                         pool_size: 10
The Myers algorithm finds the longest sequence of lines common to both versions (here only port: 5432), then reports the minimal set of changes needed to transform A into B. The goal is always the shortest edit script: the fewest insertions and deletions.
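To make the LCS idea concrete, here is a minimal sketch of a line diff built on the classic dynamic-programming LCS table. It is not Myers' O(ND) algorithm, only a simpler quadratic stand-in that produces the same kind of edit script; the function name lcsLineDiff and the shortened config are illustrative assumptions.

```javascript
// Line diff via a classic LCS dynamic-programming table.
// Assumption: this is a simple O(n*m) sketch, not Myers' O(ND) algorithm.
function lcsLineDiff(oldText, newText) {
  const a = oldText.split("\n");
  const b = newText.split("\n");

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table, emitting kept (" "), removed ("-") and added ("+") lines.
  const ops = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      ops.push({ op: " ", line: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      ops.push({ op: "-", line: a[i] });
      i++;
    } else {
      ops.push({ op: "+", line: b[j] });
      j++;
    }
  }
  while (i < a.length) ops.push({ op: "-", line: a[i++] });
  while (j < b.length) ops.push({ op: "+", line: b[j++] });
  return ops;
}

// Shortened version of the config example above.
const versionA = ["host: localhost", "port: 5432", "database: myapp_dev"].join("\n");
const versionB = [
  "host: db.prod.example.com",
  "port: 5432",
  "database: myapp_prod",
  "ssl: true",
].join("\n");

for (const { op, line } of lcsLineDiff(versionA, versionB)) {
  console.log(op + line);
}
// -host: localhost
// +host: db.prod.example.com
//  port: 5432
// -database: myapp_dev
// +database: myapp_prod
// +ssl: true
```

The only kept line is port: 5432, which is the longest common subsequence of the two inputs; everything else becomes an insertion or a deletion.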
Reading Diff Output
Unified diff format
Unified diff is what git diff produces. It's the format you read in pull requests. Each changed section (called a hunk) shows a few lines of context around the change:
--- a/config.yml
+++ b/config.yml
@@ -1,5 +1,7 @@
-host: localhost
+host: db.prod.example.com
 port: 5432
-database: myapp_dev
-user: admin
-password: secret
+database: myapp_prod
+user: app_user
+password: ••••••••
+ssl: true
+pool_size: 10
How to read it:
- --- is the original file, +++ is the new file
- @@ -1,5 +1,7 @@ means: the hunk starts at line 1 of the original and covers 5 lines; it starts at line 1 of the new version and covers 7 lines
- Lines starting with - were removed
- Lines starting with + were added
- Lines starting with a space are context (unchanged)
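The hunk header is easy to decode mechanically. The following is a small sketch, with a hypothetical helper named parseHunkHeader, that extracts the four numbers from a header such as @@ -1,5 +1,7 @@. It assumes the standard unified format, where an omitted line count means 1.

```javascript
// Parse a unified-diff hunk header like "@@ -1,5 +1,7 @@".
// parseHunkHeader is an illustrative name, not part of any library.
function parseHunkHeader(header) {
  const match = /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/.exec(header);
  if (!match) return null; // not a hunk header
  const [, oldStart, oldCount, newStart, newCount] = match;
  return {
    oldStart: Number(oldStart),
    // In unified format, a missing count means the hunk covers one line.
    oldCount: oldCount === undefined ? 1 : Number(oldCount),
    newStart: Number(newStart),
    newCount: newCount === undefined ? 1 : Number(newCount),
  };
}

console.log(parseHunkHeader("@@ -1,5 +1,7 @@"));
// { oldStart: 1, oldCount: 5, newStart: 1, newCount: 7 }
```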
Side-by-side diff
Side-by-side view shows original and modified text in two columns. It's easier to read for prose and config files where you want to see old and new values next to each other. The Text Comparer renders diffs side-by-side, which is better for scanning what a value changed from and to.
Diff Granularity: Line, Word, Character
The same two texts can be diffed at different granularities. Each serves different use cases:
| Granularity | Unit of comparison | Best For |
|---|---|---|
| Line diff | Entire lines | Code, config files, log files — anything structured by line |
| Word diff | Whitespace-separated tokens | Prose, documentation, long lines with small edits |
| Character diff | Individual characters | Spotting typos, single-character changes in IDs or flags |
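The granularities in the table differ mainly in how the text is split into units before comparison. As a rough sketch (not a full diff algorithm), the snippet below splits the same pair of strings three ways and trims the common prefix and suffix to show what remains as "changed". The helper names are illustrative assumptions.

```javascript
// Three ways to split text into comparable units.
const splitLines = (text) => text.split("\n");
const splitWords = (text) => text.split(/\s+/).filter(Boolean);
const splitChars = (text) => Array.from(text);

// Trim the shared prefix and suffix; whatever is left is the changed span.
// This is a simplification: a real diff can find several changed regions.
function changedSpan(oldUnits, newUnits) {
  let start = 0;
  while (
    start < oldUnits.length &&
    start < newUnits.length &&
    oldUnits[start] === newUnits[start]
  ) {
    start++;
  }
  let endOld = oldUnits.length;
  let endNew = newUnits.length;
  while (endOld > start && endNew > start && oldUnits[endOld - 1] === newUnits[endNew - 1]) {
    endOld--;
    endNew--;
  }
  return {
    removed: oldUnits.slice(start, endOld),
    added: newUnits.slice(start, endNew),
  };
}

const before = "timeout: 30000";
const after = "timeout: 300000";

console.log(changedSpan(splitLines(before), splitLines(after)));
// { removed: [ 'timeout: 30000' ], added: [ 'timeout: 300000' ] }
console.log(changedSpan(splitWords(before), splitWords(after)));
// { removed: [ '30000' ], added: [ '300000' ] }
console.log(changedSpan(splitChars(before), splitChars(after)));
// { removed: [], added: [ '0' ] }
```

The finer the unit, the smaller the reported change: a whole line, one token, or a single inserted character.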
Line diff can be misleading when a single word changes in a long line — the entire line shows as removed and re-added. Word or character diff reveals the actual change:
# Line diff — hard to spot what changed:
- timeout: 30000
+ timeout: 300000
# Word/character diff — immediately obvious:
timeout: [30000 → 300000]
YAML Comparison: A Common Source of Confusion
YAML comparison deserves special attention because YAML's implicit typing creates surprises. Two YAML files can be semantically identical but textually different, or vice versa.
Whitespace and indentation
YAML uses indentation for structure, not braces. A diff that shows only whitespace changes may actually represent a structural change:
# Version A          # Version B
servers:             servers:
  - name: web          - name: web
    port: 80             port: 80
  - name: api          - name: api
    port: 8080             port: 8080
                       # ↑ Extra indent: now nested under 'api'
Value type changes
YAML implicit typing means the string "true" and the boolean true look similar in a text diff but behave differently in code. A diff tool shows the text difference; it cannot warn you about semantic type changes. See the full breakdown in the YAML Implicit Typing Pitfalls article.
Anchor and alias expansion
# Version A: uses anchor
defaults: &defaults
  timeout: 30
  retries: 3
service_a:
  <<: *defaults
  port: 8080
# Version B: expanded inline
service_a:
  timeout: 30
  retries: 3
  port: 8080
The service_a mapping is semantically equivalent in both files (Version A also keeps the top-level defaults key), but a text diff reports the files as completely different. When comparing YAML, consider whether you want a text diff or a semantic diff (parse both files and compare the resulting data structures).
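A semantic diff can be sketched as a recursive comparison of parsed values. The objects below are hand-written stand-ins for what a YAML parser that resolves anchors and merge keys would return (no parser is used here, and semanticDiff is a hypothetical helper, not a library function).

```javascript
// Recursively compare two parsed documents and list the paths that differ.
function semanticDiff(a, b, path = "") {
  if (Object.is(a, b)) return [];
  const bothContainers =
    a !== null &&
    b !== null &&
    typeof a === "object" &&
    typeof b === "object" &&
    Array.isArray(a) === Array.isArray(b);
  if (!bothContainers) {
    // Scalars, type mismatches, and missing keys end up here.
    return [{ path: path || "(root)", from: a, to: b }];
  }
  const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
  const changes = [];
  for (const key of keys) {
    changes.push(...semanticDiff(a[key], b[key], path ? `${path}.${key}` : key));
  }
  return changes;
}

// Assumed parse results for the two versions shown above.
const defaults = { timeout: 30, retries: 3 };
const versionA = { defaults, service_a: { ...defaults, port: 8080 } };
const versionB = { service_a: { timeout: 30, retries: 3, port: 8080 } };

console.log(semanticDiff(versionA.service_a, versionB.service_a)); // []
console.log(semanticDiff(versionA, versionB).map((c) => c.path)); // [ 'defaults' ]
console.log(semanticDiff({ enabled: "true" }, { enabled: true }));
// [ { path: 'enabled', from: 'true', to: true } ]
```

Comparing service_a alone reports no changes, comparing the whole documents reports only the extra defaults key, and the last call shows that a string-versus-boolean change is caught because values are compared after parsing.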
Practical Use Cases
Config file auditing
Before deploying to production, diff your staging and production configs to find environment-specific values that may have been accidentally hardcoded:
# On the command line
diff staging.env production.env
# Or in git
git diff staging..production -- config/
Common things to catch: database hostnames pointing to dev, debug flags left on, rate-limit values that differ between environments.
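For .env files, a key-aware comparison can be easier to audit than a raw line diff, because it ignores ordering and comments. The sketch below assumes simple KEY=value lines (no quoting or multi-line values); parseEnv, diffEnv, and the sample values are all hypothetical.

```javascript
// Parse simple KEY=value lines; blank lines and # comments are skipped.
function parseEnv(text) {
  const vars = new Map();
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    vars.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }
  return vars;
}

// Group keys by how they compare across the two environments.
function diffEnv(firstText, secondText) {
  const first = parseEnv(firstText);
  const second = parseEnv(secondText);
  const report = { differs: [], same: [], onlyInFirst: [], onlyInSecond: [] };
  const keys = [...new Set([...first.keys(), ...second.keys()])].sort();
  for (const key of keys) {
    if (!second.has(key)) report.onlyInFirst.push(key);
    else if (!first.has(key)) report.onlyInSecond.push(key);
    else if (first.get(key) === second.get(key)) report.same.push(key);
    else report.differs.push(key);
  }
  return report;
}

// Made-up sample values for illustration only.
const staging = "# staging\nDB_HOST=staging-db.internal\nDEBUG=true\nRATE_LIMIT=100";
const production = "DB_HOST=prod-db.internal\nRATE_LIMIT=1000\nDEBUG=true\nSSL=true";

console.log(diffEnv(staging, production));
// {
//   differs: [ 'DB_HOST', 'RATE_LIMIT' ],
//   same: [ 'DEBUG' ],
//   onlyInFirst: [],
//   onlyInSecond: [ 'SSL' ]
// }
```

The same list is worth a look too: a debug flag that is identical in staging and production is exactly the kind of value that may have been carried over by accident. The report deliberately returns key names only, so secrets are not echoed.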
PR review
When reviewing a pull request, the unified diff is what GitHub, GitLab, and Bitbucket show. Tips for reading PR diffs efficiently:
- Start with the file list. Understanding which files changed tells you the scope before reading any line.
- Read context lines. The unchanged lines around a hunk tell you what function or block the change is inside.
- Rename detection. Git detects renames — a file showing as deleted and added with high similarity is a move, not a rewrite.
- Ignore whitespace. git diff -w hides whitespace-only changes that clutter the diff without changing logic.
Debugging regressions
When a feature worked last week but breaks today, git bisect + diff narrows the cause:
# Find the commit that broke it
git bisect start
git bisect bad HEAD
git bisect good v1.2.0
# After bisect finds the culprit commit:
git show <commit-hash>
# Shows the diff of exactly what changed in that commit
Comparing API responses
Paste two JSON responses into the Text Comparer to spot what a new API version changed. Format both with the JSON Formatter first — this normalises key ordering and indentation so the diff shows actual data changes, not formatting noise.
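If a formatter is not at hand, the normalisation step can be approximated in a few lines of Node.js. This is a minimal sketch with hypothetical helper names (sortKeysDeep, normalise); it sorts object keys recursively and re-serialises with fixed indentation, leaving array order untouched because array order is usually meaningful.

```javascript
// Recursively rebuild objects with their keys in sorted order.
function sortKeysDeep(value) {
  if (Array.isArray(value)) return value.map(sortKeysDeep);
  if (value !== null && typeof value === "object") {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = sortKeysDeep(value[key]);
    }
    return sorted;
  }
  return value; // strings, numbers, booleans, null
}

// Parse, sort keys, and pretty-print with two-space indentation.
const normalise = (jsonText) =>
  JSON.stringify(sortKeysDeep(JSON.parse(jsonText)), null, 2);

const v1 = '{"b":1,"a":{"d":2,"c":3}}';
const v2 = '{ "a": { "c": 3, "d": 2 }, "b": 1 }';

console.log(normalise(v1) === normalise(v2)); // true
```

After normalisation the two responses are byte-identical, so any remaining diff lines reflect real data changes.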
# Normalise JSON for clean diffing
cat response_v1.json | jq --sort-keys . > v1_sorted.json
cat response_v2.json | jq --sort-keys . > v2_sorted.json
diff v1_sorted.json v2_sorted.json
Database migration review
Before running a migration in production, diff the generated SQL against the previous migration:
diff migrations/0042_add_users.sql migrations/0043_add_roles.sql
Look for: unexpected DROP statements, column type changes on large tables, index changes that will lock the table.
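That checklist can be partly automated with a rough line scanner. The patterns below are illustrative heuristics written for PostgreSQL-style syntax, not a SQL parser, and the names RISKY_PATTERNS and scanMigration are assumptions; treat hits as prompts for a human review.

```javascript
// Heuristic patterns for statements that deserve a second look.
// Assumption: PostgreSQL-flavoured syntax, one statement per line.
const RISKY_PATTERNS = [
  { label: "DROP statement", regex: /\bDROP\s+(TABLE|COLUMN|DATABASE|SCHEMA)\b/i },
  { label: "column type change", regex: /\bALTER\s+COLUMN\s+\S+\s+(SET\s+DATA\s+)?TYPE\b/i },
  { label: "index change", regex: /\b(CREATE|DROP)\s+(UNIQUE\s+)?INDEX\b/i },
];

// Return one finding per (line, pattern) match, with 1-based line numbers.
function scanMigration(sql) {
  const findings = [];
  sql.split("\n").forEach((text, index) => {
    for (const { label, regex } of RISKY_PATTERNS) {
      if (regex.test(text)) findings.push({ line: index + 1, label, text: text.trim() });
    }
  });
  return findings;
}

// Made-up migration for illustration.
const migration = [
  "ALTER TABLE users DROP COLUMN legacy_flag;",
  "ALTER TABLE users ALTER COLUMN age TYPE bigint;",
  "CREATE INDEX users_email_idx ON users (email);",
  "CREATE TABLE roles (id serial PRIMARY KEY);",
].join("\n");

for (const { line, label } of scanMigration(migration)) {
  console.log(`line ${line}: ${label}`);
}
// line 1: DROP statement
// line 2: column type change
// line 3: index change
```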
Useful git diff Options
| Command | What it does |
|---|---|
| git diff | Unstaged changes in working directory |
| git diff --staged | Staged changes (what will be committed) |
| git diff HEAD~1 | Changes since the commit before HEAD: the last commit plus any uncommitted work |
| git diff main..feature | Differences between the tips of main and feature (use main...feature to see only what feature changed since it diverged) |
| git diff -w | Ignore all whitespace changes |
| git diff --word-diff | Word-level diff instead of line-level |
| git diff --stat | Summary: files changed, insertions, deletions |
| git diff -- path/to/file | Diff a specific file only |
| git diff --no-color \| pbcopy | Copy plain-text diff to clipboard (macOS) |
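To show what the insertion and deletion counts in --stat are measuring, here is a small sketch that tallies them from unified diff text. It assumes git-style output (a diff --git line before each file) and uses a hypothetical function name, diffStat; it does not reproduce git's per-file histogram.

```javascript
// Count added and removed lines inside hunks of a git-style unified diff.
function diffStat(unifiedDiff) {
  let insertions = 0;
  let deletions = 0;
  let inHunk = false; // file headers (---, +++) appear before the first @@
  for (const line of unifiedDiff.split("\n")) {
    if (line.startsWith("diff --git")) inHunk = false;
    else if (line.startsWith("@@")) inHunk = true;
    else if (inHunk && line.startsWith("+")) insertions++;
    else if (inHunk && line.startsWith("-")) deletions++;
  }
  return { insertions, deletions };
}

const sample = [
  "diff --git a/config.yml b/config.yml",
  "--- a/config.yml",
  "+++ b/config.yml",
  "@@ -1,2 +1,3 @@",
  "-host: localhost",
  "+host: db.prod.example.com",
  " port: 5432",
  "+ssl: true",
].join("\n");

console.log(diffStat(sample)); // { insertions: 2, deletions: 1 }
```

Tracking whether the scanner is inside a hunk keeps the --- and +++ file headers from being miscounted as changes.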
Programmatic Diffing in Node.js
For building diff views in applications, diff is the standard Node.js library:
npm install diff
import { diffLines, diffWords, diffChars, createPatch } from 'diff';
const oldText = 'host: localhost\nport: 5432\n';
const newText = 'host: db.prod.example.com\nport: 5432\n';
// Line diff: each part is a run of lines flagged as added, removed, or unchanged
const lineDiff = diffLines(oldText, newText);
for (const part of lineDiff) {
  const symbol = part.added ? '+' : part.removed ? '-' : ' ';
  process.stdout.write(symbol + part.value);
}
// -host: localhost
// +host: db.prod.example.com
//  port: 5432
// Word diff
const wordDiff = diffWords(
  'The quick brown fox',
  'The fast brown fox'
);
// [unchanged] The
// [removed] quick
// [added] fast
// [unchanged] brown fox
// Generate unified patch
const patch = createPatch('config.yml', oldText, newText);
console.log(patch);
// Output (after an "Index: config.yml" header and a separator line):
// --- config.yml
// +++ config.yml
// @@ -1,2 +1,2 @@
// -host: localhost
// +host: db.prod.example.com
//  port: 5432
When Text Diff Is Not Enough
Text diff compares sequences of lines, words, or characters. It has no understanding of structure or semantics. In some cases you need more:
- JSON/YAML semantic diff: Parse both files and diff the resulting data structures. Tools like jsondiffpatch or dyff do this and ignore formatting noise.
- SQL schema diff: Tools like pgdiff compare database schemas rather than migration files, catching structural changes even when SQL is re-ordered.
- Binary files: Text diff is meaningless on binary formats. Use format-specific tools: exiftool for image metadata, specialized PDF or Word diff tools.
- Minified code: Always pretty-print before diffing. A minified JS file shows as one giant changed line. Use the JavaScript Formatter first.
Compare any two texts directly in your browser with the Text Comparer — YAML, JSON, configs, code, or plain prose. No data leaves your machine.