URL Encoding: The Complete Developer's Guide

URLs look simple, but they're surprisingly strict. The spec only allows a small subset of ASCII characters — and the moment you try to put a space, an accented letter, or a Chinese character into a query string, the browser will silently rewrite it (or break entirely). That's where URL encoding (also called percent encoding) comes in.

✍️ Author: DevToolbox Team · 📅 Updated: 2026-06-24 · 📎 References: RFC 3986 (URI Generic Syntax), WHATWG URL Standard

📌 Key Takeaways

Percent-encoding replaces unsafe or non-ASCII characters with %XX byte escapes, following RFC 3986
In JavaScript, use encodeURIComponent for individual values and encodeURI only for complete URLs
The DevToolbox URL encoder is free and runs locally, with no data upload
The FAQ section at the bottom answers common questions

This guide covers how URL encoding works, when to use it, and the most common pitfalls — including the crucial difference between encodeURI and encodeURIComponent in JavaScript, plus the equivalents in Python.

The Problem: URLs Only Carry ASCII

URLs (first specified in RFC 1738 and now defined by RFC 3986) were designed to be typed on a keyboard, copy-pasted between emails, and read on a teletype. To keep that simple, only a restricted character set is allowed unescaped:

Unreserved (never encoded):  A-Z  a-z  0-9  -  _  .  ~
Reserved (sometimes encoded):  !  *  '  (  )  ;  :  @  &  =  +  $  ,  /  ?  #  [  ]

Anything else (spaces, emoji, CJK characters), plus any reserved character used as data rather than as a delimiter, must be encoded as %XX, where XX is the two-digit hexadecimal value of one byte. For multi-byte characters like 中, you first encode to UTF-8 bytes (E4 B8 AD), then percent-encode each one: %E4%B8%AD.
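
The two-step process (UTF-8 first, then one %XX per byte) can be sketched in a few lines of Python. The helper name percent_encode is made up for illustration, and unlike a real encoder it escapes every byte, including unreserved ones; the standard urllib.parse.quote is shown alongside for comparison.

```python
from urllib.parse import quote

def percent_encode(text: str) -> str:
    """Illustrative helper: percent-encode every UTF-8 byte of text."""
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))

print(percent_encode("中"))  # %E4%B8%AD
print(percent_encode(" "))   # %20
print(quote("中"))           # %E4%B8%AD (the standard library agrees)
```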

The Rules in Practice

Spaces

A literal space is illegal in a URL. There are two valid representations:

%20: the canonical percent-encoding, works everywhere.
+: a legacy convention from application/x-www-form-urlencoded form submissions. It means a space only in query strings; in a URL path, + is a literal plus sign.
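
The same split exists on the decoding side. A small sketch, assuming Python's standard urllib.parse, of how each convention encodes a space and how + is decoded differently depending on which function you choose:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Encoding: the same space, two representations
path_style = quote("hello world")       # 'hello%20world'
form_style = quote_plus("hello world")  # 'hello+world'

# Decoding: unquote keeps + as a literal plus sign,
# while unquote_plus turns it back into a space.
literal_plus = unquote("1+1")      # '1+1'
form_space = unquote_plus("1+1")   # '1 1'

print(path_style, form_style, repr(literal_plus), repr(form_space))
```

Picking the wrong decoder for a path or a query string is how a + silently becomes a space, or the other way around.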

Non-ASCII Characters

Every character above U+007F is first converted to its UTF-8 byte sequence, then each byte becomes %XX:

"中"  →  UTF-8: E4 B8 AD  →  %E4%B8%AD
"日"  →  UTF-8: E6 97 A5  →  %E6%97%A5
"é"   →  UTF-8: C3 A9     →  %C3%A9

Reserved Characters in the Wrong Place

A / separates path segments. A ? introduces a query string. A # marks a fragment. If any of these appears in a value (like a search query), it must be encoded — otherwise the server misinterprets the URL structure.
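
To see the misinterpretation concretely, here is a short Python sketch. The search term a/b?c#d is a made-up value; the parsing functions are the standard urllib.parse ones.

```python
from urllib.parse import parse_qs, quote, urlsplit

value = "a/b?c#d"  # hypothetical search term containing reserved characters

raw = urlsplit("https://example.com/search?q=" + value)
encoded = urlsplit("https://example.com/search?q=" + quote(value, safe=""))

# Unencoded: everything after # is parsed as a fragment, so the value is cut short
print(raw.query, raw.fragment)   # q=a/b?c d

# Encoded: the whole value survives parsing
print(parse_qs(encoded.query))   # {'q': ['a/b?c#d']}
```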

3 Real-World Use Cases

1. Query String Parameters

The bread and butter of web development. Whenever you build a search URL or pass a filter through a link, every value must be percent-encoded so special characters don't break parsing on the server side.

// Search for "cats & dogs" on example.com
https://example.com/search?q=cats%20%26%20dogs&page=2

2. Form Submissions

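When a browser submits an HTML form with method="GET", each field name and value is encoded as application/x-www-form-urlencoded (spaces become +), and the name=value pairs are joined with & to form the query string. Values from every input type, including <input type="email"> and <input type="url">, go through the same encoding.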
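
As a rough illustration of that round trip, the sketch below encodes two made-up form fields with Python's urlencode, which approximates what a browser sends for a GET form, and then decodes them the way a server-side framework typically would.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical form fields, roughly as a GET form submission would encode them
fields = {"name": "Ada Lovelace", "email": "ada@example.com"}
query = urlencode(fields)
print(query)    # name=Ada+Lovelace&email=ada%40example.com

# Server side: parse the query string back into field values
decoded = parse_qs(query)
print(decoded)  # {'name': ['Ada Lovelace'], 'email': ['ada@example.com']}
```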

3. REST API Identifiers

Path segments built from free text, such as blog post titles, must be encoded. A post titled "C++ vs C#: A Comparison" would either be reduced to a slug such as c-vs-c-a-comparison, or have its title sent percent-encoded as C%2B%2B%20vs%20C%23%3A%20A%20Comparison.
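
Both options can be sketched in Python. The slugify helper below is hypothetical and deliberately simple (real slug generators also handle Unicode and collisions); the encoding call is the standard urllib.parse.quote with safe="" so that / would be escaped as well.

```python
import re
from urllib.parse import quote

title = "C++ vs C#: A Comparison"

def slugify(text: str) -> str:
    """Hypothetical helper: lowercase, collapse non-alphanumeric runs to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

slug = slugify(title)            # 'c-vs-c-a-comparison'
segment = quote(title, safe="")  # 'C%2B%2B%20vs%20C%23%3A%20A%20Comparison'

print(slug)
print(segment)
```

The slug is lossy (C++ and C# both collapse to c), which is why an ID or the encoded title is usually needed when the original text matters.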

JavaScript: encodeURI vs encodeURIComponent

This is a common source of bugs in web code, and the distinction is worth memorizing.

encodeURI

Designed to encode a complete URL. It leaves the structural characters alone so the URL stays valid:

encodeURI("https://example.com/path with spaces/?q=hello world")
// "https://example.com/path%20with%20spaces/?q=hello%20world"

encodeURI("https://example.com/中国")
// "https://example.com/%E4%B8%AD%E5%9B%BD"

Characters not encoded by encodeURI: A-Z a-z 0-9 - _ . ! ~ * ' ( ) ; / ? : @ & = + $ , #

encodeURIComponent

Designed to encode a single value (a query parameter, a path segment, a form field). It escapes far more aggressively, including the structural characters that encodeURI leaves alone:

encodeURIComponent("hello world?")
// "hello%20world%3F"

encodeURIComponent("a&b=c")
// "a%26b%3Dc"

Characters not encoded by encodeURIComponent: A-Z a-z 0-9 - _ . ! ~ * ' ( )
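
For readers working on the server side, the two unescaped sets above can be approximated in Python by passing them as the safe argument of urllib.parse.quote. The function names encode_uri and encode_uri_component are illustrative only, and this is an approximation rather than an exact port (for instance, JavaScript throws on lone surrogates, which this sketch does not replicate).

```python
from urllib.parse import quote

# Characters each JavaScript function leaves alone, beyond letters, digits and - _ .
ENCODE_URI_SAFE = ";/?:@&=+$,#!*'()~"
ENCODE_URI_COMPONENT_SAFE = "!*'()~"

def encode_uri(url: str) -> str:
    """Approximation of encodeURI: keeps URL structure intact."""
    return quote(url, safe=ENCODE_URI_SAFE)

def encode_uri_component(value: str) -> str:
    """Approximation of encodeURIComponent: escapes structural characters too."""
    return quote(value, safe=ENCODE_URI_COMPONENT_SAFE)

print(encode_uri("https://example.com/path with spaces/?q=hello world"))
print(encode_uri_component("a&b=c"))  # a%26b%3Dc
```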

The Practical Rule

Use encodeURIComponent for any value you build into a query string, and encodeURI only when you're encoding a complete URL whose structure you've already verified. The classic bug:

// ❌ WRONG: encodeURI leaves & unescaped, so it leaks into the query string
const broken = "https://api.example.com/search?q=" + encodeURI("cats & dogs");
// broken = "https://api.example.com/search?q=cats%20&%20dogs"

// ✅ RIGHT: encode the value, not the whole URL
const fixed = "https://api.example.com/search?q=" + encodeURIComponent("cats & dogs");
// fixed = "https://api.example.com/search?q=cats%20%26%20dogs"
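
What the server actually receives in each case can be checked with a short Python sketch that parses both resulting URLs with the standard urllib.parse functions. The variable names are illustrative.

```python
from urllib.parse import parse_qsl, urlsplit

broken = "https://api.example.com/search?q=cats%20&%20dogs"
fixed = "https://api.example.com/search?q=cats%20%26%20dogs"

# keep_blank_values=True keeps the stray " dogs" key visible instead of dropping it
broken_fields = parse_qsl(urlsplit(broken).query, keep_blank_values=True)
fixed_fields = parse_qsl(urlsplit(fixed).query, keep_blank_values=True)

print(broken_fields)  # [('q', 'cats '), (' dogs', '')]
print(fixed_fields)   # [('q', 'cats & dogs')]
```

In the broken URL the search term is truncated to "cats " and a second, empty parameter named " dogs" appears.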

Python: urllib.parse.quote and quote_plus

Python's standard library offers comparable tools. Use quote for URL components (by default it leaves / unencoded; pass safe="" to encode it too) and quote_plus for form-style data (where spaces become + instead of %20):

from urllib.parse import quote, quote_plus, urlencode

quote("hello world")
# 'hello%20world'

quote_plus("hello world")
# 'hello+world'  (form-style)

# Building a query string the safe way
params = urlencode({"q": "cats & dogs", "page": 2})
# 'q=cats+%26+dogs&page=2'

Prefer urlencode over manual string concatenation. It handles escaping and Unicode for you; for repeated keys pass a list of (key, value) tuples, and for list values pass doseq=True.
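
Repeated keys and list values need a little care. The sketch below, using a made-up tag parameter, shows the two forms that produce repeated keys and what happens when a list is passed without doseq=True.

```python
from urllib.parse import urlencode

# Repeated keys: pass a sequence of (key, value) tuples
repeated = urlencode([("tag", "a"), ("tag", "b")])

# List values: doseq=True expands each element into its own key=value pair
expanded = urlencode({"tag": ["a", "b"]}, doseq=True)

# Without doseq, the list is converted with str() and encoded as one value
stringified = urlencode({"tag": ["a", "b"]})

print(repeated)     # tag=a&tag=b
print(expanded)     # tag=a&tag=b
print(stringified)
```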

Frequently Asked Questions

Should I use %20 or + for spaces in query strings?

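Both are valid in query strings. + is the legacy application/x-www-form-urlencoded convention; %20 is the strict RFC 3986 form. Most server-side frameworks decode both, so it rarely matters, but some decoders (JavaScript's decodeURIComponent, for example) leave + as a literal plus sign. %20 is the safer default because it is unambiguous everywhere in a URL.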

Do I need to encode URLs from users before storing them?

Generally, store them in their original form and encode only at the point of insertion into a URL. Double-encoding (encoding an already-encoded URL) is a real bug that produces nonsense like %2520 for a space. Libraries generally cannot detect it, because %2520 is also the valid encoding of the literal text %20, so the only reliable fix is to encode exactly once.
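
Double encoding is easy to reproduce. A minimal Python sketch with the standard quote and unquote functions:

```python
from urllib.parse import quote, unquote

once = quote("a b")   # 'a%20b'
twice = quote(once)   # 'a%2520b' (the % itself was encoded as %25)

print(once, twice)
print(unquote(twice))           # 'a%20b', one decode is not enough
print(unquote(unquote(twice)))  # 'a b'
```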

Why is there a / character in some "encoded" URLs?

/ is reserved because it separates path segments. Encoders meant for whole URLs or paths (encodeURI, or Python's quote with its default safe="/") leave it alone, while encodeURIComponent and quote(value, safe="") encode it as %2F, which is what you want when the slash is part of a value rather than a separator. Be cautious: many servers normalize %2F back to /, which can cause path-traversal issues if you rely on it for security.
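
The effect of the default is visible when a path is assembled from segments. In this sketch the segment list is made up, and the middle segment contains a slash that belongs to the value:

```python
from urllib.parse import quote

segments = ["files", "reports/2024", "q1.pdf"]  # hypothetical path segments

# Default safe="/" leaves the embedded slash alone: the path gains an extra level
default_path = "/" + "/".join(quote(s) for s in segments)

# safe="" encodes it, so the segment stays a single unit
strict_path = "/" + "/".join(quote(s, safe="") for s in segments)

print(default_path)  # /files/reports/2024/q1.pdf
print(strict_path)   # /files/reports%2F2024/q1.pdf
```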

What's the difference between percent-encoding and HTML entity encoding?

They solve different problems. URL encoding (%XX) makes characters safe for transport in a URL. HTML entity encoding (&amp;, &lt;) makes characters safe to display in an HTML document. Use URL encoding when building links and HTML entity encoding when inserting text into a page; they are not interchangeable.
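
A link rendered into a page usually needs both, applied in order: URL-encode the value into the href, then HTML-escape the href and the visible text. A sketch under those assumptions, with a made-up search term and Python's standard html.escape and urllib.parse:

```python
from html import escape
from urllib.parse import quote, urlencode

term = "cats & dogs <3"  # hypothetical user-supplied search term

# Step 1: URL-encode the value while building the link target
href = "https://example.com/search?" + urlencode({"q": term, "page": 2}, quote_via=quote)

# Step 2: HTML-escape both the attribute value and the visible text
link = f'<a href="{escape(href)}">{escape(term)}</a>'

print(href)
print(link)
```

Note that the & separating q and page becomes &amp;amp; in the HTML source, while the & inside the value was already turned into %26 by the URL encoding step.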

Conclusion

URL encoding looks trivial until you ship a bug caused by an unencoded ampersand. Internalize the rule: use encodeURIComponent for values, encodeURI for whole URLs, and your language's standard URL-encoding functions everywhere else. Build the encoding into the helper function that constructs your URLs, not into every call site, and most broken-link bugs disappear.
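
One possible shape for such a helper, sketched in Python. The name build_url and its signature are assumptions for illustration, not part of any library; it encodes each path segment and each query value exactly once.

```python
from typing import Mapping, Sequence
from urllib.parse import quote, urlencode, urlsplit, urlunsplit

def build_url(base: str, segments: Sequence[str], params: Mapping[str, str]) -> str:
    """Hypothetical helper: append encoded path segments and query parameters to base."""
    scheme, netloc, base_path, _, _ = urlsplit(base)
    # safe="" so a slash inside a segment is encoded rather than read as a separator
    path = base_path.rstrip("/") + "".join("/" + quote(s, safe="") for s in segments)
    # quote_via=quote emits %20 for spaces instead of +
    query = urlencode(params, quote_via=quote)
    return urlunsplit((scheme, netloc, path, query, ""))

url = build_url("https://api.example.com/v1", ["posts", "C++ vs C#"], {"q": "cats & dogs"})
print(url)
# https://api.example.com/v1/posts/C%2B%2B%20vs%20C%23?q=cats%20%26%20dogs
```

Callers pass raw, unencoded values; the helper is the only place where encoding happens, which also rules out double encoding.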

For quick testing or one-off conversions, the DevToolbox URL encoder handles edge cases (Unicode, double encoding, form-style + spaces) without leaving your browser.

FAQ: Common Questions

Q: What is URL encoding?

URL encoding (percent-encoding) converts non-ASCII and reserved characters in a URL into `%XX` form (for example, a space becomes `%20`), so the URL is transmitted correctly across systems and browsers.

Q: Why is URL encoding needed?

URLs can only contain ASCII characters. Chinese characters and spaces fall outside the allowed set, while `&`, `?`, and `#` have special meaning in a URL, so all of them must be encoded when they appear as data.

Q: URL encoding vs Base64 encoding?

URL encoding makes text safe inside a URL by converting characters to %XX. Base64 represents arbitrary bytes as ASCII text for general data transfer. They serve different purposes and can be combined: standard Base64 output may contain +, /, and =, which themselves need percent-encoding before they go into a URL.
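
A short Python sketch of combining the two. The three input bytes are chosen only because they produce + and / in standard Base64; the URL-safe alphabet from the standard base64 module is shown as an alternative.

```python
import base64
from urllib.parse import quote

data = b"\xfb\xff\xfe"  # arbitrary bytes that map to + and / in standard Base64

standard = base64.b64encode(data).decode("ascii")          # '+//+'
url_safe = base64.urlsafe_b64encode(data).decode("ascii")  # '-__-'

# Standard Base64 must still be percent-encoded before it goes into a URL
in_query = quote(standard, safe="")                        # '%2B%2F%2F%2B'

print(standard, url_safe, in_query)
```

The URL-safe alphabet replaces + and / with - and _, although = padding, when present, still needs percent-encoding if it is used inside a query value.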