
String Escaping Guide: JSON, HTML, URL, SQL, Shell, and More

10 min read

String escaping is one of those fundamentals that every developer encounters daily yet rarely stops to think about systematically. You escape a backslash in JSON, a less-than sign in HTML, a space in a URL, a single quote in SQL — but the rules are completely different in each case, and mixing them up causes bugs that range from broken output to security vulnerabilities. Use the Text Escape tool to convert strings across all these contexts instantly.

Why Escaping Is Context-Dependent

The ampersand character & is a useful example. In plain text it is completely harmless; it is just a character. In HTML it is the start of an entity reference like &amp;amp; or &amp;copy;, so an unescaped & in text or an attribute value can silently corrupt markup. In JSON it carries no special meaning whatsoever and does not need escaping. In a URL query string it is a parameter separator, so Q&A becomes two parameters (Q and A) unless you encode it as Q%26A.

Each language or protocol defines its own reserved characters — characters that have structural meaning within that format. Escaping is how you tell the parser: "treat this as literal data, not as syntax." Because the reserved sets differ, every context needs its own escaping rules.
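
To make the point concrete, here is a minimal sketch that pushes one string through three Python standard-library escapers. The sample string and the variable names are illustrative assumptions, not part of any particular tool.

```python
import html
import json
from urllib.parse import quote

raw = 'Q&A <draft> "v1"'  # illustrative sample value

as_json = json.dumps(raw)      # JSON: only the double quotes need escaping here
as_html = html.escape(raw)     # HTML: &, < and > plus quotes become entities
as_url = quote(raw, safe="")   # URL: everything outside the unreserved set is percent-encoded

print(as_json)  # "Q&A <draft> \"v1\""
print(as_html)  # Q&amp;A &lt;draft&gt; &quot;v1&quot;
print(as_url)   # Q%26A%20%3Cdraft%3E%20%22v1%22
```

The same input yields three different outputs, and none of them is a valid substitute for another.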

Context Overview

Context   Mechanism                                         Key characters
JSON      Backslash sequences                               " \ / \n \r \t \uXXXX
HTML      Named / numeric entities                          < > & " '
URL       Percent-encoding                                  Everything outside the RFC 3986 unreserved set
SQL       Doubled quotes or backslash (dialect-dependent)   ' \ and NUL bytes
Shell     Single/double quoting or backslash                $ ` \ " ' ! & ; | ( ) < >
Regex     Backslash before metacharacter                    . * + ? ( ) [ ] { } ^ $ | \
CSV       Double-quote wrapping + doubled quotes            , " and line breaks

JSON Escaping

JSON strings are delimited by double quotes, so a literal double quote inside a string must be escaped as \". The backslash is the escape character itself, so it must be doubled as \\. Beyond those two, JSON defines a short list of recognized sequences:

Sequence   Meaning
\"         Double quote
\\         Backslash
\/         Forward slash (optional but allowed)
\n         Newline (U+000A)
\r         Carriage return (U+000D)
\t         Tab (U+0009)
\b         Backspace (U+0008)
\f         Form feed (U+000C)
\uXXXX     UTF-16 code unit as four hex digits (characters above U+FFFF need a surrogate pair)

A critical but often overlooked rule: every control character from U+0000 through U+001F is forbidden in raw form inside a JSON string. Those with a short escape in the table above can use it (\n, \t, and so on); the rest must be written as \u00XX. If you serialize a string that contains a raw tab character without escaping it, you produce invalid JSON that strict parsers will reject.

A valid JSON document with every special character properly escaped (JSON itself has no comment syntax):
{
  "message": "Line one\nLine two",
  "path": "C:\\Users\\alice",
  "name": "She said \"hello\"",
  "unicode": "\u00e9"
}
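
In practice you rarely write these sequences by hand. The sketch below, which assumes Python's standard json module and an arbitrary sample string, shows a serializer applying the rules above and a strict parser rejecting a raw control character.

```python
import json

# Sample text containing a tab, a double quote, a backslash, and a bell (U+0007).
text = 'Tab:\there, quote: ", backslash: \\, bell: \x07'

encoded = json.dumps(text)
print(encoded)

# Round trip: decoding restores the original string exactly.
assert json.loads(encoded) == text

# A raw (unescaped) tab inside a JSON string is rejected by the default strict parser.
try:
    json.loads('"raw\ttab"')
    raw_tab_accepted = True
except json.JSONDecodeError:
    raw_tab_accepted = False

print(raw_tab_accepted)
```

The tab is written with its short escape, while the bell character, which has none, falls back to the \u00XX form.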

See also: Unescape JSON Strings for the reverse operation — turning escape sequences back into readable text.

HTML Entities

HTML parsers treat < as the start of a tag and & as the start of an entity reference. Those two characters must always be escaped in text content. Inside attribute values the rules tighten: you also need to escape the quote character used as the attribute delimiter.

Character   Named entity   Numeric (decimal)   Where required
<           &lt;           &#60;               Text content, attribute values
>           &gt;           &#62;               Text content (technically optional but recommended)
&           &amp;          &#38;               Text content, attribute values
"           &quot;         &#34;               Inside double-quoted attribute values
'           &apos;         &#39;               Inside single-quoted attribute values

The &apos; entity is worth calling out. It is valid in XML and HTML5 but was not defined in HTML4. For maximum compatibility in attribute values use &#39; instead. And remember: if you use double quotes for your attribute delimiters you do not need to escape single quotes inside them, and vice versa — but escaping both is always safe.
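
As a rough illustration, Python's html.escape covers both cases: with quote=False it handles text content, and with quote=True it also escapes both quote characters for attribute values. The sample value and variable names are assumptions made for the example.

```python
import html

user_value = "Tom & Jerry's \"<b>best</b>\" hits"  # illustrative input

text_safe = html.escape(user_value, quote=False)  # enough for text content
attr_safe = html.escape(user_value, quote=True)   # also escapes " and '

snippet = f'<input value="{attr_safe}">'

print(text_safe)
print(snippet)
```

Note that html.escape emits a numeric reference (&#x27;) for the apostrophe rather than &apos;, which lines up with the compatibility advice above.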

See also: HTML Entities Reference Guide for a complete list of named entities.

URL Percent-Encoding

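RFC 3986 defines a set of unreserved characters that can appear anywhere in a URL without encoding: A-Z a-z 0-9 - _ . ~. Reserved characters such as / ? # & = may appear literally only where they act as delimiters; when they are part of the data, they must be percent-encoded. Every other byte is percent-encoded as %XX, where XX is the hexadecimal byte value (uppercase by convention; decoders accept either case). Non-ASCII characters are first encoded as UTF-8, then each byte is percent-encoded.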

The encoding rules differ depending on which component of the URL you are building:

Component      Extra allowed unencoded                    Must encode
Path segment   ! $ & ' ( ) * + , ; = : @ / (structural)   ? # [ ] and spaces
Query value    ! $ ' ( ) * , ; : @ / ?                    & = + # [ ] and spaces
Fragment       Most printable ASCII                       # [ ]

A space encodes to %20 in standard percent-encoding. application/x-www-form-urlencoded (HTML form submission) uses + for spaces instead — these two formats are not interchangeable. Always use encodeURIComponent() in JavaScript for query parameter values, never encodeURI() (which leaves structural characters like & and = unencoded).
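
The same distinction exists outside JavaScript. This sketch uses Python's urllib.parse with an illustrative value: quote() does standard percent-encoding, while quote_plus() and urlencode() produce the form-style encoding with + for spaces.

```python
from urllib.parse import quote, quote_plus, urlencode

value = "Q&A: 50% off?"  # illustrative value

percent_encoded = quote(value, safe="")   # spaces become %20
form_encoded = quote_plus(value)          # spaces become + (form encoding)
query = urlencode({"q": value, "page": 2})

print(percent_encoded)  # Q%26A%3A%2050%25%20off%3F
print(form_encoded)     # Q%26A%3A+50%25+off%3F
print(query)            # q=Q%26A%3A+50%25+off%3F&page=2
```

Passing safe="" matters: by default quote() leaves / unencoded, which is fine for a whole path but wrong for a single path segment or query value.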

See also: URL Encoding Edge Cases for a deeper dive into component-level encoding rules.

SQL Escaping

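SQL string literals are delimited by single quotes. The standard SQL way to include a literal single quote inside a string is to double it: 'it''s'. MySQL and some other databases also accept a backslash escape, 'it\'s', but this is non-standard: MySQL stops treating the backslash as an escape character when the NO_BACKSLASH_ESCAPES SQL mode is enabled, and PostgreSQL treats backslashes in ordinary string literals as plain characters by default.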

-- Standard SQL (works everywhere)
SELECT * FROM users WHERE name = 'O''Brien';

-- MySQL non-standard (avoid)
SELECT * FROM users WHERE name = 'O\'Brien';

Manual escaping is error-prone, and building queries by concatenating user input is the root cause of SQL injection. The real answer is parameterized queries (often implemented as prepared statements). The query text is written with placeholders, and the driver either sends the values separately or escapes them for you, so user input can never change the query structure:

// Never do this
const query = `SELECT * FROM users WHERE name = '${userInput}'`;

// Always do this — parameterized query
const result = await db.query(
  'SELECT * FROM users WHERE name = $1',
  [userInput]
);

Parameterized queries eliminate the injection risk for values because bound input is never interpreted as SQL syntax. They cannot bind identifiers such as table or column names, so those still need an allowlist. Manual escaping, however careful, can be bypassed by multi-byte character tricks in certain character set configurations.
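
For a self-contained demonstration, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table, the rows, and the hostile input are all made up for the example; sqlite3 uses ? as its placeholder rather than $1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("O'Brien",), ("alice",)],
)

# A classic injection attempt; as a bound value it is just an odd-looking name.
hostile = "x' OR '1'='1"

rows_for_hostile = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

# A legitimate value containing a quote needs no manual doubling.
rows_for_obrien = conn.execute(
    "SELECT name FROM users WHERE name = ?", ("O'Brien",)
).fetchall()

print(rows_for_hostile)  # []
print(rows_for_obrien)   # [("O'Brien",)]
```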

Shell Escaping

Shell interpreters expand a long list of special characters: $ ` \ " ' ! & ; | ( ) < > { } and whitespace. Incorrectly handling these in a shell command is the cause of command injection vulnerabilities and unexpected behavior.

Shell quoting has three distinct mechanisms:

Quote style           What it does                                 What still expands
Single quotes '...'   Preserves every character literally          Nothing (strongest quoting)
Double quotes "..."   Prevents word splitting and glob expansion   $VAR, $(cmd), backtick `cmd`, \
Backslash \           Escapes the next character only              Everything else

# Single quotes: safest for literal strings
echo 'The price is $5.00 (no expansion)'

# Double quotes: allow variable expansion
echo "Hello, $USER"

# Dangerous: unquoted variable with spaces or special chars
file="report (final).txt"
rm $file          # word-splits into two args: report and (final).txt
rm "$file"        # correct: rm "report (final).txt"

# Python (not shell): shlex.quote makes a string safe to use as one shell argument
import shlex
user_input = "report (final).txt"  # example untrusted value
safe = shlex.quote(user_input)  # wraps in single quotes and escapes internal ones

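The $(command) construct and backtick substitution are particularly dangerous in user-supplied strings passed to exec or system() calls. Single-quoting a value prevents both forms of command substitution, provided any single quote inside the value is itself escaped; otherwise the input can close the quote and break out. Where possible, skip the shell entirely and pass arguments as a list.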
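
The sketch below illustrates both options in Python under a few assumptions: the hostile-looking input is invented, and the child process is simply the current Python interpreter echoing back its first argument. It checks that shlex.quote output parses back to a single word, then shows the list form of subprocess.run, which never involves a shell.

```python
import shlex
import subprocess
import sys

# Illustrative input containing spaces, parentheses, a substitution, and a single quote.
user_input = "report (final).txt; echo $(whoami) it's"

quoted = shlex.quote(user_input)
# shlex.split parses with POSIX shell quoting rules, so the quoted value comes back as one word.
assert shlex.split(quoted) == [user_input]

# Preferred where possible: pass an argument list so no shell parses the value at all.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.rstrip("\n")

print(quoted)
print(received == user_input)
```

With the list form there is nothing to escape, because the value is handed to the child process as a single argument without any shell parsing.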

Regex Escaping

Regular expression engines reserve a set of metacharacters that control matching behavior. To match them literally, prefix each with a backslash:

. * + ? ( ) [ ] { } ^ $ | \

The easiest way to escape an arbitrary string for use inside a regex pattern is a helper function that escapes all metacharacters:

// JavaScript — escapeRegExp helper
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Example: match a literal URL parameter
const param = 'status=200 (OK)';
const pattern = new RegExp(escapeRegExp(param));

# Python: re.escape handles all metacharacters
import re

user_input = 'status=200 (OK)'  # example value
pattern = re.compile(re.escape(user_input))

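Note that the set of metacharacters varies slightly between regex flavors (PCRE, ECMAScript, RE2, POSIX). Escaping every non-alphanumeric ASCII character works in PCRE and Python, but it is not universally safe: ECMAScript patterns using the u flag reject unnecessary escapes as syntax errors, and in POSIX basic regular expressions a backslash turns ( and { into metacharacters rather than literals. Prefer the escape helper that ships with the engine you are targeting.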
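
To see what escaping buys you, this small sketch compares an escaped and an unescaped pattern using Python's re module. The needle and the haystack strings are invented for the example.

```python
import re

needle = "a.c (v1+)"  # illustrative literal text containing metacharacters

pattern = re.compile(re.escape(needle))

# The escaped pattern matches only the literal text.
matches_literal = pattern.search("see a.c (v1+) here") is not None
matches_lookalike = pattern.search("see abc (v11) here") is not None

# Without escaping, the dot matches any character.
naive_matches_lookalike = re.search("a.c", "abc") is not None

print(matches_literal, matches_lookalike, naive_matches_lookalike)  # True False True
```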

CSV Escaping

RFC 4180 defines the standard CSV format. Fields that contain commas, double quotes, or newlines must be wrapped in double quotes. A double quote inside a quoted field is escaped by doubling it: "".

# Plain field — no quoting needed
Alice,30,Engineer

# Field with a comma — must be quoted
"Smith, John",45,Manager

# Field with a double quote — quote the field, double the internal quote
"She said ""hello""",22,Intern

# Field with a newline — must be quoted
"Line one
Line two",55,Director

The most common CSV bug is forgetting to handle embedded newlines. A cell value containing a newline is valid per RFC 4180 as long as the field is double-quoted, but many ad-hoc CSV parsers (line-by-line readers) choke on it. Use a proper CSV library rather than splitting on commas.
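
As a sketch of the library approach, Python's csv module applies the quoting rules on write and reverses them on read. The rows mirror the examples above, and the line count at the end shows why a line-by-line reader miscounts records.

```python
import csv
import io

rows = [
    ["Smith, John", "45", "Manager"],
    ['She said "hello"', "22", "Intern"],
    ["Line one\nLine two", "55", "Director"],
]

# newline="" lets the csv module control line endings itself.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
encoded = buffer.getvalue()

# Reading back with the csv module restores every field, including the embedded newline.
parsed = list(csv.reader(io.StringIO(encoded, newline="")))

# A naive line-based split sees four lines for three records.
naive_line_count = len(encoded.splitlines())

print(encoded)
print(parsed == rows, naive_line_count)  # True 4
```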

Cross-Context Bugs

The most subtle escaping bugs happen when data passes through multiple contexts in sequence. A string that is correctly escaped for context A may arrive at context B already partially processed, leading to double encoding or broken escaping.

JSON Inside JavaScript Inside HTML

A common pattern: server-side code serializes data to JSON, inlines it in a <script> tag, and the browser parses the HTML then executes the JS. Each layer has its own parser:

<!-- WRONG: closing script tag injected via JSON string -->
<script>
  var data = {"message": "</script><script>alert(1)</script>"};
</script>

<!-- CORRECT: escape forward slashes in JSON for HTML embedding -->
<script>
  var data = {"message": "<\/script><script>alert(1)<\/script>"};
</script>

The fix is to escape / as \/ in JSON that will be embedded in HTML. JSON allows this optionally, and HTML-embedding is exactly the case where you want it.
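
A related hardening, shown here as an alternative technique rather than the exact fix described above, is to escape < (plus > and & for good measure) as \uXXXX sequences after serializing. The result is still valid JSON, and no tag-like sequence ever reaches the HTML parser. The helper name json_for_script is hypothetical.

```python
import json


def json_for_script(data):
    """Serialize data as JSON that is safer to inline inside a <script> element."""
    # ensure_ascii=True (the default) already escapes non-ASCII characters,
    # including U+2028 and U+2029.
    text = json.dumps(data)
    return (
        text.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


payload = {"message": "</script><script>alert(1)</script>"}
inline = json_for_script(payload)

print(inline)
# Decoding the escaped text yields the original data unchanged.
print(json.loads(inline) == payload)  # True
```

These replacements are safe because in valid JSON output the characters <, > and & can only occur inside string values, where \uXXXX escapes are allowed.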

Double Encoding

Double encoding happens when you escape a string that is already escaped. A space becomes %20 after the first pass, then %2520 after the second (because % itself gets encoded as %25). A lone %25 is just a legitimately encoded percent sign, but sequences such as %2520 or %253D are almost always a sign of double encoding. The rule: escape exactly once, at the point of assembly.
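
The effect is easy to reproduce. This short sketch uses Python's urllib.parse on an illustrative two-word value:

```python
from urllib.parse import quote, unquote

original = "a b"

once = quote(original, safe="")   # "a%20b"
twice = quote(once, safe="")      # "a%2520b" because the % was encoded again

# One decode of the double-encoded value does not get back to the original.
decoded_once = unquote(twice)     # "a%20b"

print(once, twice, decoded_once)
```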

Escaped Twice, Unescaped Once

The same problem appears in HTML: some frameworks or template engines auto-escape output. If you pre-escape your data before passing it to such a template, it is escaped a second time, and the rendered page shows a literal &amp; instead of &. The rule: let your framework handle context-appropriate escaping; pass raw data, not pre-escaped strings.
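
Here is a minimal sketch of that failure, with html.escape standing in for an auto-escaping template engine (an assumption for the example; real engines differ in detail):

```python
import html

raw = "Q&A"

pre_escaped = html.escape(raw)              # "Q&amp;A", escaped by the caller
template_output = html.escape(pre_escaped)  # "Q&amp;amp;A", escaped again by the template

# The browser decodes entities once, so this is what the reader sees.
displayed = html.unescape(template_output)  # "Q&amp;A"

# Passing raw data and letting the template escape once gives the right result.
displayed_correctly = html.unescape(html.escape(raw))  # "Q&A"

print(displayed, displayed_correctly)
```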


The Text Escape tool converts strings across all the contexts described in this article — JSON, HTML, URL, SQL, shell, regex, and CSV — in a single pass, client-side, with no data leaving your browser. Related articles: HTML Entities Guide, Unescape JSON Strings, and URL Encoding Edge Cases.