URL Encoding Edge Cases: Space, Plus, and Percent
At some point every developer ships a feature that mysteriously breaks as soon as someone pastes a value with spaces, plus signs, or non-ASCII characters into a URL. The root cause is almost always the same: subtle differences in how URL encoding works between browsers, backend frameworks, and reverse proxies.
In this article we will walk through real debugging scenarios around percent encoding, + vs %20, and the difference between RFC 3986 URLs and application/x-www-form-urlencoded form data. We will also look at how to use a URL encoder/decoder to sanity-check query strings before they ever hit your backend.
Two Worlds: RFC 3986 vs application/x-www-form-urlencoded
When people say "URL encoding", they usually mix two slightly different rulesets:
- RFC 3986 URL encoding – describes how characters are percent encoded in URIs and URLs.
- application/x-www-form-urlencoded – describes how browsers encode HTML form fields into the query string or request body.
The tricky part: both use percent encoding, but form encoding treats space differently.
| Context | Space character | Example |
|---|---|---|
| RFC 3986 (generic URL) | %20 | q=hello%20world |
| application/x-www-form-urlencoded | + (plus sign) | q=hello+world |
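In form-encoded data, + means space, not a literal plus. In a generic RFC 3986 URL, + is a reserved character that stands for itself, but because many servers decode query strings with form rules, a literal plus sign in a query value should be percent encoded as %2B.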
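The two rows of the table can be reproduced in JavaScript, assuming a runtime that provides the standard encodeURIComponent and URLSearchParams globals (modern browsers and Node.js do). This is only a sketch of the two rulesets; the variable names are made up for the example.

```javascript
// RFC 3986 style: encodeURIComponent writes a space as %20.
const rfcStyle = "q=" + encodeURIComponent("hello world");

// Form style: URLSearchParams serializes with
// application/x-www-form-urlencoded rules, so a space becomes +.
const formStyle = new URLSearchParams({ q: "hello world" }).toString();

// Both encoders write a literal plus sign as %2B.
const rfcPlus = encodeURIComponent("C++");
const formPlus = new URLSearchParams({ q: "C++" }).toString();

console.log(rfcStyle);  // q=hello%20world
console.log(formStyle); // q=hello+world
console.log(rfcPlus);   // C%2B%2B
console.log(formPlus);  // q=C%2B%2B
```

Either output is safe to send to a server that decodes with form rules, because both avoid an unencoded + for a literal plus.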
Debugging Story: "C++" Turns into "C "
Imagine you are debugging a search feature where users can search for programming languages. Someone reports that searching for C++ returns results for C instead.
The frontend sends a request that looks correct at first glance:
GET /search?query=C++ HTTP/1.1
Host: example.test
But it matters whether the browser or client percent encoded the plus signs. The actual request line is one of these two:
GET /search?query=C%2B%2B # Correct (percent encoded plus signs)
GET /search?query=C++ # Ambiguous (form decoding turns each + into a space)
On the server side, if the framework treats the query as application/x-www-form-urlencoded, it decodes each + to a space. The result is "C" followed by two spaces instead of "C++", which a search backend that trims or tokenizes its input will treat as plain C.
The safe rule: encode literal plus signs as %2B. When in doubt, run the value through a URL encoder and verify the encoded string.
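The server-side half of this story can be reproduced with URLSearchParams, which parses its input using form rules. This is a sketch of what a form-decoding framework is likely to do, not the behavior of any specific backend.

```javascript
// URLSearchParams parses with form rules: each + becomes a space.
const ambiguous = new URLSearchParams("query=C++").get("query");
const explicit = new URLSearchParams("query=C%2B%2B").get("query");

// JSON.stringify makes the trailing spaces visible in the log.
console.log(JSON.stringify(ambiguous)); // "C  " (C plus two spaces)
console.log(JSON.stringify(explicit));  // "C++"
```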
Reserved, Unreserved, and Special Characters
RFC 3986 splits characters into three broad groups:
- Unreserved: safe to use without encoding (letters, digits, and the characters - . _ ~)
- Reserved: have a structural meaning in the URL (the characters : / ? # [ ] @ ! $ & ' ( ) * + , ; =)
- Everything else: must be percent encoded (including spaces, non-ASCII, control characters)
In the query string, reserved characters are frequently used as separators:
- & separates key-value pairs
- = separates names and values
- ? starts the query
If you need these characters inside a value, you must encode them. For example, a query parameter that contains JSON should encode ", { and }, and any & characters to avoid breaking the structure of the URL.
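One detail worth knowing in JavaScript: encodeURIComponent leaves the reserved characters ! ' ( ) * unencoded. When a stricter result is needed, a small wrapper can encode them as well. The function name encodeRfc3986 is hypothetical, chosen for this sketch.

```javascript
// Stricter variant of encodeURIComponent: also percent encodes
// the reserved characters ! ' ( ) * that the built-in leaves alone.
function encodeRfc3986(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeURIComponent("it's (a*b)!")); // it's%20(a*b)!
console.log(encodeRfc3986("it's (a*b)!"));      // it%27s%20%28a%2Ab%29%21
```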
How Percent Encoding Works
Percent encoding replaces a byte in the URL with % followed by two hexadecimal digits representing that byte's value. RFC 3986 recommends uppercase digits, although decoders accept either case.
" " (space) -> %20
"+" (plus) -> %2B
"%" (percent) -> %25
"?" (question) -> %3F
"&" (ampersand) -> %26
For non-ASCII characters in UTF-8, each byte of the encoded sequence is percent encoded separately:
"привет" (UTF-8 bytes):
пр -> D0 BF D1 80
ив -> D0 B8 D0 B2
ет -> D0 B5 D1 82
Encoded URL value:
%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82
A URL encoder/decoder is useful here: paste a Unicode string, verify the percent encoded output, and check whether your backend decodes it back to the original value without mangling bytes.
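The byte-by-byte process can be sketched by hand to see what an encoder does internally. The example assumes a runtime with the standard TextEncoder global; percentEncodeUtf8 is a hypothetical name, and in real code encodeURIComponent already does this work.

```javascript
// Percent encode every UTF-8 byte that is not an unreserved character.
function percentEncodeUtf8(text) {
  const unreserved = /^[A-Za-z0-9\-._~]$/;
  let out = "";
  for (const byte of new TextEncoder().encode(text)) {
    const ch = String.fromCharCode(byte);
    out += byte < 0x80 && unreserved.test(ch)
      ? ch
      : "%" + byte.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

console.log(percentEncodeUtf8("привет"));
// %D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82
console.log(percentEncodeUtf8("a b+c%")); // a%20b%2Bc%25
```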
Query String Encoding in Practice
Suppose you want to send multiple filters in a single query string parameter using JSON. The raw JSON might look like this:
{
  "status": ["open", "closed"],
  "assignee": "alice@example.com"
}
If you naively embed it into a URL without encoding, you get an invalid query string:
/issues?filter={"status":["open","closed"],"assignee":"alice@example.com"}
Characters like ", {, }, [ and ] are not valid in a query string unless they are percent encoded, and any & or = inside a value would be read as a parameter delimiter.
The safe version percent encodes the JSON value:
/issues?filter=%7B%22status%22%3A%5B%22open%22%2C%22closed%22%5D%2C%22assignee%22%3A%22alice%40example.com%22%7D
This is where a query string encoding workflow helps: assemble the JSON, run it through a URL encoder, and only then append it to the URL.
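In JavaScript that workflow is two calls: JSON.stringify, then encodeURIComponent. The sketch below also decodes the value again, the way a server using form rules would. The /issues path and filter parameter come from the example above; the variable names are illustrative.

```javascript
const filter = {
  status: ["open", "closed"],
  assignee: "alice@example.com",
};

// Serialize first, then encode the whole JSON string as one value.
const url = "/issues?filter=" + encodeURIComponent(JSON.stringify(filter));
console.log(url);

// Round trip: parse the query string and rebuild the object.
const received = new URLSearchParams(url.split("?")[1]).get("filter");
const decoded = JSON.parse(received);
console.log(decoded.assignee); // alice@example.com
```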
Common URL Encoding Bugs
Bug 1: Double Encoding
Double encoding happens when a value is percent encoded more than once. For example, space becomes %20, then the % itself is encoded to %25, producing %2520.
// First encoding:
"hello world" -> "hello%20world"
// Encoded again:
"hello%20world" -> "hello%2520world"
On the server, a single decoding pass produces hello%20world, not hello world. You end up storing literally %20 in the database or comparing the wrong string during lookups.
To debug, decode the value once using a URL decoder. If you still see percent sequences, you know the value was encoded multiple times somewhere in the pipeline.
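The step-by-step decoding can be automated. The helper below, with the hypothetical name decodeLayers, keeps decoding until the value stops changing and returns every intermediate form. Treat it as a debugging aid only: a value that legitimately contains text such as %20 will look double encoded too.

```javascript
// Decode repeatedly and collect each intermediate value.
function decodeLayers(value, maxPasses = 5) {
  const layers = [value];
  for (let i = 0; i < maxPasses; i++) {
    const current = layers[layers.length - 1];
    let next;
    try {
      next = decodeURIComponent(current);
    } catch {
      break; // malformed sequence such as a lone "%": stop decoding
    }
    if (next === current) break; // nothing left to decode
    layers.push(next);
  }
  return layers;
}

console.log(decodeLayers("hello%2520world"));
// [ 'hello%2520world', 'hello%20world', 'hello world' ]
```

More than two entries in the result means more than one decoding pass changed the value, which is the signature of double encoding.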
Bug 2: Missing Encoding for Query String Values
Constructing URLs by string concatenation is a classic source of bugs:
// Fragile: no encoding
const fragileUrl = "/search?query=" + query;
// Safer: encode the value
const safeUrl = "/search?query=" + encodeURIComponent(query);
Forgetting to encode values means characters like & and = will break the structure of the query string and silently drop or truncate parameters. Always feed raw user input through a URL encoder before inserting it into the query.
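An alternative to concatenation is to let the standard URL and URLSearchParams APIs do the encoding. The sketch below assumes those globals are available; buildSearchUrl and the example.test origin are illustrative. Note that this API serializes with form rules, so spaces come out as +.

```javascript
// Build the URL through the URL API so every value is encoded for us.
function buildSearchUrl(query) {
  const url = new URL("/search", "https://example.test");
  url.searchParams.set("query", query);
  return url.toString();
}

console.log(buildSearchUrl("C++ & more"));
// https://example.test/search?query=C%2B%2B+%26+more
```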
Bug 3: Encoding the Whole URL Instead of Just Values
Another common mistake is calling a URL encoder on the entire URL, including delimiters and path separators:
// Wrong: encodes "/", "?", "&" and "=" as well
encodeURIComponent("/search?query=hello world");
// Correct: encode only the dynamic value
"/search?query=" + encodeURIComponent("hello world");
When you over-encode the whole string, the server no longer sees a normal URL with a path and query string. Many frameworks will treat it as a path with literal %3F and %3D characters, and you lose all query parameters.
Using DevToys-Style URL Encoding Tools in Your Workflow
A dedicated URL encoder/decoder is invaluable when debugging web payloads. With the DevToys Pro URL Encoder/Decoder you can:
- Paste a raw query string and immediately see decoded keys and values
- Encode arbitrary text (including spaces, plus signs, and Unicode) as a safe URL value
- Verify whether a string is double encoded by decoding it step-by-step
- Compare how different inputs change the encoded output for edge cases
- Experiment with query string encoding before wiring it into production code
Best Practices for URL Encoding
1. Encode Values, Not the Entire URL
Apply URL encoding only to the dynamic pieces of the URL (query parameters, path segments), not to the entire string. This keeps separators like ?, &, =, and / intact so that the URL parser can do its job.
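A minimal sketch of this rule encodes each path segment and each query value on its own and adds the separators afterwards. The helper name buildUrl and the sample values are hypothetical.

```javascript
// Encode the dynamic pieces individually; add separators afterwards.
function buildUrl(segments, params) {
  const path = "/" + segments.map((s) => encodeURIComponent(s)).join("/");
  const query = Object.entries(params)
    .map(([name, value]) => encodeURIComponent(name) + "=" + encodeURIComponent(value))
    .join("&");
  return query ? path + "?" + query : path;
}

console.log(buildUrl(["files", "a/b report.txt"], { tag: "c++", q: "x&y=z" }));
// /files/a%2Fb%20report.txt?tag=c%2B%2B&q=x%26y%3Dz

// For contrast: encoding the assembled string destroys the separators.
console.log(encodeURIComponent("/files/a b?tag=c++"));
// %2Ffiles%2Fa%20b%3Ftag%3Dc%2B%2B
```

The slash inside the file name is encoded as %2F, so it cannot be mistaken for a path separator, while the structural / ? & = stay intact.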
2. Treat "+" Carefully
- In form-encoded data, + represents a space.
- To send a literal + in a value, encode it as %2B.
- If you see spaces where plus signs should be, check how the server decodes application/x-www-form-urlencoded payloads.
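The same care applies when decoding by hand. JavaScript's decodeURIComponent follows generic percent decoding and leaves + untouched, so form data needs an extra step. The helper name decodeFormValue is made up for this sketch.

```javascript
// Form decoding: turn + into a space first, then percent decode.
// The order matters: %2B must still decode to a literal plus.
function decodeFormValue(raw) {
  return decodeURIComponent(raw.replace(/\+/g, " "));
}

console.log(decodeURIComponent("hello+world")); // hello+world
console.log(decodeFormValue("hello+world"));    // hello world
console.log(decodeFormValue("C%2B%2B"));        // C++
```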
3. Always Use a Proper Encoder
Avoid hand-written replacements like replace(" ", "%20"). Use a proper URL encoder in your language of choice, and cross-check behavior using the online URL encoder/decoder so that browsers, servers, and tools all agree on semantics.
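A quick comparison shows why the hand-written version falls short. In JavaScript, replace with a string pattern changes only the first match, and it ignores every other character that needs encoding. The sample input is arbitrary.

```javascript
const input = "a b c&d";

// Hand-written: only the first space is replaced, & is left as-is.
const handWritten = input.replace(" ", "%20");

// Proper encoder: every space and the & are encoded.
const proper = encodeURIComponent(input);

console.log(handWritten); // a%20b c&d
console.log(proper);      // a%20b%20c%26d
```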
4. Log Both Raw and Decoded Values
When debugging production issues, log the raw query string as received and the decoded parameters as interpreted by your framework. Differences between the two often reveal double encoding, missing encoding, or incorrect handling of special characters.
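One possible shape for such a log entry is sketched below. It pairs each raw name and value with what a form decoder makes of them. The function name describeQuery is hypothetical, and how you obtain the raw query string depends on your framework.

```javascript
// Pair every raw key/value with its form-decoded interpretation.
function describeQuery(rawQuery) {
  return rawQuery
    .split("&")
    .filter(Boolean)
    .map((pair) => {
      const eq = pair.indexOf("=");
      const rawName = eq === -1 ? pair : pair.slice(0, eq);
      const rawValue = eq === -1 ? "" : pair.slice(eq + 1);
      // URLSearchParams applies application/x-www-form-urlencoded decoding.
      const [name, value] = [...new URLSearchParams(pair)][0];
      return { rawName, rawValue, name, value };
    });
}

console.log(describeQuery("q=hello%2520world&lang=C++"));
```

In this sample the first entry decodes to hello%20world (double encoding) and the second to C followed by two spaces (an unencoded plus), which are the bugs described earlier.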
Conclusion
URL encoding looks simple until spaces, plus signs, and non-ASCII characters enter the picture. Understanding the difference between RFC 3986 encoding and application/x-www-form-urlencoded, treating + with care, and consistently percent encoding special characters will save you from subtle query string bugs.
The combination of correct query string encoding in code and a reliable URL encoder/decoder in your toolbox gives you confidence that what the user sees in the address bar is exactly what your backend receives and decodes.