URL Encoding: The Complete Developer's Guide
URLs look simple, but they're surprisingly strict. The spec only allows a small subset of ASCII characters — and the moment you try to put a space, an accented letter, or a Chinese character into a query string, the browser will silently rewrite it (or break entirely). That's where URL encoding (also called percent encoding) comes in.
This guide covers how URL encoding works, when to use it, and the most common pitfalls — including the crucial difference between encodeURI and encodeURIComponent in JavaScript, plus the equivalents in Python.
The Problem: URLs Only Carry ASCII
URL syntax (originally specified in RFC 1738, now defined in RFC 3986) was designed so that URLs could be typed on a keyboard, copy-pasted between emails, and read on a teletype. To keep that simple, only a restricted character set is allowed unescaped:
Unreserved (never encoded): A-Z a-z 0-9 - _ . ~
Reserved (sometimes encoded): ! * ' ( ) ; : @ & = + $ , / ? # [ ]
Anything else (spaces, emoji, CJK characters, and any reserved character used as data rather than as a delimiter, such as a slash inside a query value) must be encoded as %XX, where XX is the two-digit hexadecimal value of one byte. For multi-byte characters like 中, you first encode to UTF-8 bytes (E4 B8 AD), then percent-encode each one: %E4%B8%AD.
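The two-step process (UTF-8 bytes first, then %XX per byte) can be sketched by hand in Python. The function name percent_encode is hypothetical and exists only for illustration; in real code the standard library's urllib.parse.quote does the same job.

```python
from urllib.parse import quote

# The unreserved set listed above, stored as byte values.
UNRESERVED = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~"
)

def percent_encode(text: str) -> str:
    """Hypothetical helper: percent-encode every byte outside the unreserved set."""
    pieces = []
    for byte in text.encode("utf-8"):       # step 1: text -> UTF-8 bytes
        if byte in UNRESERVED:
            pieces.append(chr(byte))        # unreserved bytes pass through
        else:
            pieces.append(f"%{byte:02X}")   # step 2: each other byte -> %XX
    return "".join(pieces)

encoded = percent_encode("中")              # '%E4%B8%AD'
same_as_stdlib = encoded == quote("中", safe="")
```

Passing safe="" to quote makes it encode everything except the unreserved characters, which is why the two results agree.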
The Rules in Practice
Spaces
A literal space is illegal in a URL. There are two valid representations:
%20: the canonical percent-encoding, works everywhere.
+: a legacy convention from application/x-www-form-urlencoded form submissions. Works in query strings, not in URL paths.
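The difference shows up most clearly when decoding. As a small sketch using Python's urllib.parse (the variable names are only illustrative), the general-purpose decoder leaves + alone, while the form-style decoder turns it into a space:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Encoding: the two spellings of a space.
percent_form = quote("hello world")       # 'hello%20world'
plus_form = quote_plus("hello world")     # 'hello+world'

# Decoding: only the form-style decoder treats "+" as a space.
path_decoded = unquote("a+b%20c")         # 'a+b c'
form_decoded = unquote_plus("a+b%20c")    # 'a b c'
```

This is why a + inside a path segment stays a literal plus sign, while the same character in a form-encoded query string reads as a space.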
Non-ASCII Characters
Every character above U+007F is first converted to its UTF-8 byte sequence, then each byte becomes %XX:
"中" → UTF-8: E4 B8 AD → %E4%B8%AD
"日" → UTF-8: E6 97 A5 → %E6%97%A5
"é" → UTF-8: C3 A9 → %C3%A9
Reserved Characters in the Wrong Place
A / separates path segments. A ? introduces a query string. A # marks a fragment. If any of these appears in a value (like a search query), it must be encoded — otherwise the server misinterprets the URL structure.
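To see the misinterpretation concretely, here is a short Python sketch with urllib.parse.urlsplit; the search term "C#" is just an example value. Left unencoded, the # cuts the query short and everything after it is treated as a fragment:

```python
from urllib.parse import quote, urlsplit

# Unencoded "#" in the value: the parser sees a fragment, not part of the query.
raw = urlsplit("https://example.com/search?q=C#&page=2")
raw_query = raw.query          # 'q=C'
raw_fragment = raw.fragment    # '&page=2'

# Encoding the value first keeps the URL structure intact.
encoded_url = "https://example.com/search?q=" + quote("C#", safe="") + "&page=2"
ok = urlsplit(encoded_url)
ok_query = ok.query            # 'q=C%23&page=2'
ok_fragment = ok.fragment      # ''
```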
3 Real-World Use Cases
1. Query String Parameters
The bread and butter of web development. Whenever you build a search URL or pass a filter through a link, every value must be percent-encoded so special characters don't break parsing on the server side.
// Search for "cats & dogs" on example.com
https://example.com/search?q=cats%20%26%20dogs&page=2
2. Form Submissions
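When a browser submits an HTML form with method="GET", each field is serialized as a name=value pair using application/x-www-form-urlencoded rules (spaces become +), and the pairs are joined with & and appended to the action URL as the query string. Server-side form parsers rely on this convention to recover the original field values.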
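The round trip can be approximated in Python: urlencode plays the role of the browser and parse_qsl the role of the server. The field names and values below are made up for illustration, and browsers differ from urlencode in a few minor characters, so treat this as a sketch rather than an exact emulation.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical form fields, in submission order.
fields = [("email", "ana@example.com"), ("note", "50% off & free shipping")]

# "Browser" side: form-style encoding, pairs joined with "&".
query = urlencode(fields)
# 'email=ana%40example.com&note=50%25+off+%26+free+shipping'

# "Server" side: decode back into the original pairs.
decoded = parse_qsl(query)
```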
3. REST API Identifiers
Resource paths built from user-facing text, like blog post titles, must be made URL-safe. A post titled "C++ vs C#: A Comparison" would either get a sanitized slug such as c-vs-c-a-comparison for the path, or have its title percent-encoded and sent as a path segment or query parameter.
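Both options can be sketched in a few lines of Python. The slugify function is a hypothetical minimal version (real slug generators usually also handle Unicode and collisions); the second option uses quote with safe="" so that every reserved character in the title is escaped.

```python
import re
from urllib.parse import quote

def slugify(title: str) -> str:
    """Hypothetical slug rule: lowercase, runs of other characters become one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

title = "C++ vs C#: A Comparison"

slug = slugify(title)              # 'c-vs-c-a-comparison'
segment = quote(title, safe="")    # 'C%2B%2B%20vs%20C%23%3A%20A%20Comparison'
```

The slug is lossy (C++ and C# both collapse to c), while the percent-encoded form can be decoded back to the exact title.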
JavaScript: encodeURI vs encodeURIComponent
This is one of the most common sources of URL bugs in web code, and the distinction is worth memorizing.
encodeURI
Designed to encode a complete URL. It leaves the structural characters alone so the URL stays valid:
encodeURI("https://example.com/path with spaces/?q=hello world")
// "https://example.com/path%20with%20spaces/?q=hello%20world"
encodeURI("https://example.com/中国")
// "https://example.com/%E4%B8%AD%E5%9B%BD"
Characters not encoded by encodeURI: A-Z a-z 0-9 - _ . ! ~ * ' ( ) ; / ? : @ & = + $ , #
encodeURIComponent
Designed to encode a single value (a query parameter, a path segment, a form field). It escapes far more aggressively, including the structural characters that encodeURI leaves alone:
encodeURIComponent("hello world?")
// "hello%20world%3F"
encodeURIComponent("a&b=c")
// "a%26b%3Dc"
Characters not encoded by encodeURIComponent: A-Z a-z 0-9 - _ . ! ~ * ' ( )
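For readers working in Python, the two character lists above map directly onto the safe parameter of urllib.parse.quote. The function names encode_uri and encode_uri_component below are hypothetical wrappers written for this comparison; they are approximations of the JavaScript functions, not part of any library.

```python
from urllib.parse import quote

# Punctuation that encodeURIComponent leaves unescaped (quote already
# keeps letters, digits, and - _ . ~ as they are).
COMPONENT_SAFE = "!~*'()"
# encodeURI additionally leaves the structural characters alone.
URI_SAFE = COMPONENT_SAFE + ";/?:@&=+$,#"

def encode_uri_component(value: str) -> str:
    """Hypothetical analogue of JavaScript's encodeURIComponent."""
    return quote(value, safe=COMPONENT_SAFE)

def encode_uri(url: str) -> str:
    """Hypothetical analogue of JavaScript's encodeURI."""
    return quote(url, safe=URI_SAFE)

whole = encode_uri("https://example.com/path with spaces/?q=hello world")
# 'https://example.com/path%20with%20spaces/?q=hello%20world'
value = encode_uri_component("a&b=c")
# 'a%26b%3Dc'
```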
The Practical Rule
Use encodeURIComponent for any value you build into a query string, and encodeURI only when you're encoding a complete URL whose structure you've already verified. The classic bug:
// ❌ WRONG: encodeURI leaves & and = untouched, so they leak into the query string
const brokenUrl = "https://api.example.com/search?q=" + encodeURI("cats & dogs");
// brokenUrl = "https://api.example.com/search?q=cats%20&%20dogs" (broken!)
// ✅ RIGHT: encode the value, not the whole URL
const fixedUrl = "https://api.example.com/search?q=" + encodeURIComponent("cats & dogs");
// fixedUrl = "https://api.example.com/search?q=cats%20%26%20dogs" (correct)
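What the server actually receives in each case can be checked with a short Python sketch. Here quote with a permissive safe string stands in for encodeURI and quote with safe="" stands in for encodeURIComponent; both are approximations chosen for this illustration.

```python
from urllib.parse import parse_qs, quote, urlsplit

value = "cats & dogs"
base = "https://api.example.com/search?q="

# encodeURI-style: "&" is left alone and splits the value in two.
broken = base + quote(value, safe=";/?:@&=+$,#!~*'()")
# Component-style: "&" becomes %26 and stays inside the value.
fixed = base + quote(value, safe="")

broken_params = parse_qs(urlsplit(broken).query)   # {'q': ['cats ']}
fixed_params = parse_qs(urlsplit(fixed).query)     # {'q': ['cats & dogs']}
```

In the broken case the parser silently drops " dogs" because it looks like a second parameter with no value, which is what makes this bug easy to miss.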
Python: urllib.parse.quote and quote_plus
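Python's standard library offers the same building blocks. Use quote for general URL components and quote_plus for form-style data (where spaces become + instead of %20). Note that quote leaves / unencoded by default; pass safe="" when encoding a single value, which is the closest match to encodeURIComponent: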
from urllib.parse import quote, quote_plus, urlencode
quote("hello world")
# 'hello%20world'
quote_plus("hello world")
# 'hello+world' (form-style)
# Building a query string the safe way
params = urlencode({"q": "cats & dogs", "page": 2})
# 'q=cats+%26+dogs&page=2'
Prefer urlencode over manual string concatenation: it encodes keys and values (including Unicode) for you, accepts a list of (key, value) pairs for repeated keys, and expands list values when called with doseq=True.
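A few of those cases, sketched with made-up parameter names. Note that list values are only expanded when doseq=True is passed, and that quote_via=quote switches the output from + to %20 for spaces.

```python
from urllib.parse import quote, urlencode

# Repeated keys: pass a sequence of pairs instead of a dict.
repeated = urlencode([("tag", "a"), ("tag", "b")])            # 'tag=a&tag=b'

# List values: doseq=True expands them into repeated keys.
expanded = urlencode({"tag": ["a", "b"]}, doseq=True)         # 'tag=a&tag=b'

# Unicode values are encoded as UTF-8 and then percent-encoded.
unicode_query = urlencode({"q": "中"})                         # 'q=%E4%B8%AD'

# Prefer %20 over "+" for spaces by swapping the quoting function.
percent_spaces = urlencode({"q": "cats & dogs"}, quote_via=quote)
# 'q=cats%20%26%20dogs'
```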
Frequently Asked Questions
Should I use %20 or + for spaces in query strings?
Both are valid in query strings. + is the legacy application/x-www-form-urlencoded convention; %20 is the strict RFC 3986 form. Most server frameworks decode both in query strings, so it rarely matters, but %20 is the safer default: it means a space everywhere in a URL, whereas + means a space only to form-style decoders.
Do I need to encode URLs from users before storing them?
Generally, store them in their original form and encode only at the point of insertion into a URL. Double-encoding (encoding an already-encoded URL) is a real bug that produces nonsense like %2520 for a space. Libraries cannot reliably detect it, because %2520 is also the correct encoding of the literal text %20, so the fix is to encode exactly once.
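The %2520 artifact is easy to reproduce. In this minimal Python sketch, encoding twice escapes the % sign produced by the first pass, and a single decode no longer returns the original text:

```python
from urllib.parse import quote, unquote

original = "a b"
once = quote(original, safe="")    # 'a%20b'
twice = quote(once, safe="")       # 'a%2520b'  (the "%" itself became %25)

decoded_once = unquote(twice)              # 'a%20b', still encoded
decoded_twice = unquote(decoded_once)      # 'a b'
```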
Why is there a / character in some "encoded" URLs?
/ is reserved because it separates path segments. Whole-URL encoders such as encodeURI, and Python's quote with its default safe="/", leave it alone; encode it as %2F (encodeURIComponent, or quote with safe="") when the slash is part of a value rather than a separator. Be cautious: some servers and proxies decode %2F back to / before routing, which can cause path-traversal issues if you rely on it for security.
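The effect on path structure can be seen by counting segments. In this Python sketch the /files/ prefix and the file name are invented for illustration:

```python
from urllib.parse import quote, unquote

filename = "reports/2024 q1.pdf"   # one value that happens to contain a slash

# Default quoting keeps "/" and silently adds a path segment.
default_path = "/files/" + quote(filename)
# '/files/reports/2024%20q1.pdf'

# safe="" escapes the slash, so the value stays a single segment.
strict_path = "/files/" + quote(filename, safe="")
# '/files/reports%2F2024%20q1.pdf'

# Split on the separator first, decode each segment afterwards.
segments = [unquote(part) for part in strict_path.split("/")]
# ['', 'files', 'reports/2024 q1.pdf']
```

Splitting before decoding is what keeps the encoded slash from being mistaken for a separator.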
What's the difference between percent-encoding and HTML entity encoding?
They solve different problems. URL encoding (%XX) makes characters safe for transport in a URL. HTML entity encoding (&amp;amp;, &amp;lt;) makes characters safe to display in an HTML document. Use URL encoding when building links; use HTML entity encoding when inserting text into a page. They are not interchangeable.
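A link inside an HTML page needs both layers, applied in order: URL-encode the values to build the URL, then HTML-escape the finished URL and the link text when writing them into markup. A minimal Python sketch, with an invented search URL and label:

```python
from html import escape
from urllib.parse import urlencode

label = "cats & dogs <new>"

# Layer 1: URL encoding for the query values.
href = "https://example.com/search?" + urlencode({"q": "cats & dogs", "page": 2})
# 'https://example.com/search?q=cats+%26+dogs&page=2'

# Layer 2: HTML escaping for the attribute value and the visible text.
anchor = f'<a href="{escape(href)}">{escape(label)}</a>'
```

The & that separates the two parameters is escaped for HTML, while the & inside the value was already turned into %26 by the URL layer.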
Conclusion
URL encoding looks trivial until you ship a bug caused by an unencoded ampersand. Internalize the rule: use encodeURIComponent for values, encodeURI for whole URLs, and your language's URL-encoding helpers (such as Python's urlencode) everywhere else. Build the encoding into the helper function that constructs your URLs, not into every call site, and broken links become far less likely.
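One possible shape for such a helper, sketched in Python. The name build_url and its parameters are hypothetical, and the design (encode each path segment separately, delegate the query string to urlencode) is one reasonable choice among several.

```python
from urllib.parse import quote, urlencode

def build_url(base, segments=(), params=None):
    """Hypothetical helper: append encoded path segments and query parameters to base."""
    # Encode each segment on its own so a "/" inside a segment becomes %2F.
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    url = base.rstrip("/")
    if path:
        url += "/" + path
    if params:
        # doseq=True expands list values into repeated keys.
        url += "?" + urlencode(params, doseq=True)
    return url

search_url = build_url(
    "https://api.example.com", ["search"], {"q": "cats & dogs", "page": 2}
)
# 'https://api.example.com/search?q=cats+%26+dogs&page=2'
```

Call sites pass raw values only, so there is a single place where encoding happens and no opportunity to encode twice.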
For quick testing or one-off conversions, the DevToolbox URL encoder handles edge cases (Unicode, double encoding, form-style + spaces) without leaving your browser.