URL Encoding Explained: When and Why to Encode URLs
If you've ever seen %20 in a URL and wondered what it meant, or if an API broke because a query parameter contained an ampersand, this guide is for you. URL encoding is a fundamental web skill that every developer needs to understand, and getting it wrong produces some of the most frustrating bugs in web development.
Table of Contents
- What URL Encoding Is
- What Percent-Encoding Means
- Why Spaces Become %20 or +
- Characters That Need Encoding
- Query Strings and UTM Parameters
- Try It with URL Architect
1. What URL Encoding Is
URLs can only contain a limited set of characters defined by RFC 3986. The allowed set includes uppercase and lowercase ASCII letters, digits, and a handful of special characters: - _ . ~ (unreserved characters) and : / ? # [ ] @ ! $ & ' ( ) * + , ; = (reserved characters with special meaning).
Any character outside this set (spaces, accented letters, emoji, angle brackets, curly braces, pipe symbols, and all non-ASCII characters) must be encoded before it can appear in a URL. This encoding process replaces each byte of the unsafe character with a percent sign followed by two hexadecimal digits representing that byte's value.
The purpose is simple: ensure that every URL is unambiguous. Without encoding, a space in a filename could be read as the end of the URL, and an ampersand in a search term could be interpreted as a parameter separator.
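The ampersand case is easy to reproduce. The snippet below is a minimal sketch that uses the standard URLSearchParams parser; the search term "rock&roll" and the variable names are made up for illustration.

```javascript
// A search term that contains an ampersand (illustrative value).
const term = "rock&roll";

// Pasted in unencoded, the parser splits on "&" and sees two parameters.
const broken = new URLSearchParams("q=" + term);
console.log(broken.get("q"));    // "rock"
console.log(broken.has("roll")); // true

// Encoded first, the ampersand travels as %26 and the value survives intact.
const fixed = new URLSearchParams("q=" + encodeURIComponent(term));
console.log(fixed.get("q"));     // "rock&roll"
```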
2. What Percent-Encoding Means
Percent-encoding (sometimes called URL encoding) converts a character to its UTF-8 byte representation, then writes each byte as %XX where XX is the hexadecimal value. Examples:
| Character | UTF-8 Bytes | Percent-Encoded |
|---|---|---|
| space | 0x20 | %20 |
| & | 0x26 | %26 |
| = | 0x3D | %3D |
| é | 0xC3 0xA9 | %C3%A9 |
| 🚀 | 0xF0 0x9F 0x9A 0x80 | %F0%9F%9A%80 |
Notice how a single emoji becomes four percent-encoded sequences because its UTF-8 encoding is 4 bytes. This is why URLs with emoji or non-Latin text can become extremely long.
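To make the byte-by-byte process concrete, here is a small sketch that encodes every UTF-8 byte of a string by hand. The helper name percentEncodeAll is hypothetical, and unlike encodeURIComponent() it encodes all characters, including safe ones, so it is meant for illustration rather than production use.

```javascript
// Hypothetical helper: percent-encode every byte of a string's UTF-8 form.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

console.log(percentEncodeAll(" "));         // "%20"
console.log(percentEncodeAll("\u00E9"));    // e with acute accent: "%C3%A9"
console.log(percentEncodeAll("\u{1F680}")); // rocket emoji: "%F0%9F%9A%80"
```

For characters that need encoding, the result matches what encodeURIComponent() returns.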
3. Why Spaces Become %20 or +
This is one of the most confusing aspects of URL encoding, and it has a historical explanation. There are actually two different encoding standards at play:
- RFC 3986 (URI standard): Spaces are encoded as %20. This is what encodeURIComponent() produces in JavaScript.
- application/x-www-form-urlencoded: Spaces are encoded as +. This format is used by HTML form submissions and some legacy systems.
Both are valid, but mixing them causes bugs. If your backend expects %20 and receives +, the plus sign might be interpreted literally. When in doubt, use encodeURIComponent() on each value when building URLs programmatically: it produces %20, which decodes to a space in both path segments and query strings.
// JavaScript examples
encodeURIComponent("hello world") // "hello%20world"
encodeURIComponent("a&b=c") // "a%26b%3Dc"
// Form encoding (different!)
new URLSearchParams({q: "hello world"}).toString() // "q=hello+world"
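The same split shows up on the decoding side. The following sketch compares the two built-in decoders on the same input; the sample strings are illustrative.

```javascript
const formEncoded = "hello+world";
const percentEncoded = "hello%20world";

// decodeURIComponent follows RFC 3986: "+" is just a plus sign.
console.log(decodeURIComponent(formEncoded));    // "hello+world"
console.log(decodeURIComponent(percentEncoded)); // "hello world"

// URLSearchParams follows form encoding: "+" and "%20" both decode to a space.
console.log(new URLSearchParams("q=" + formEncoded).get("q"));    // "hello world"
console.log(new URLSearchParams("q=" + percentEncoded).get("q")); // "hello world"

// A literal plus therefore has to be sent as %2B in form-encoded data.
console.log(new URLSearchParams({ q: "1+1" }).toString()); // "q=1%2B1"
```

If a value decoded on the server still contains plus signs where spaces were expected, a mismatch between these two decoders is a likely cause.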
4. Characters That Need Encoding
Not every special character needs encoding; it depends on where in the URL it appears. Here is a practical guide:
- Always encode in query values: & = + # % ? / and spaces. These have structural meaning in URLs.
- Always encode in paths: ? # [ ] % and spaces. Slashes separate path segments, so a slash that belongs inside a single segment must be encoded as %2F.
- Safe everywhere: A-Z a-z 0-9 - _ . ~ never need encoding.
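One way to apply the path rule is to encode each segment separately and only then join the segments with slashes. The helper below is a hypothetical sketch, not part of any library, and the file name is invented.

```javascript
// Hypothetical helper: build a path from raw segments, encoding each one.
function buildPath(segments) {
  return "/" + segments.map((s) => encodeURIComponent(s)).join("/");
}

// The slash and the space inside the file name are encoded;
// the slashes between segments are left alone.
console.log(buildPath(["files", "Q1/Q2 report.pdf"]));
// "/files/Q1%2FQ2%20report.pdf"

// Unreserved characters pass through untouched.
console.log(buildPath(["docs", "read-me_v1.0~draft"]));
// "/docs/read-me_v1.0~draft"
```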
The most common bug is double encoding: encoding a value, then encoding the entire URL again. This turns %20 into %2520, which the server receives as the literal string "%20" instead of a space. Always encode values before assembling the URL, then leave the assembled URL alone.
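Double encoding can be reproduced in a few lines. This is a minimal sketch with an illustrative value; the variable names are arbitrary.

```javascript
const value = "hello world";

const once = encodeURIComponent(value); // correct: encode the raw value once
const twice = encodeURIComponent(once); // bug: the "%" itself becomes "%25"

console.log(once);  // "hello%20world"
console.log(twice); // "hello%2520world"

// A server that decodes once now sees the literal text "%20", not a space.
console.log(decodeURIComponent(twice)); // "hello%20world"
```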
5. Query Strings and UTM Parameters
Query strings are the part of a URL after the ?, consisting of key-value pairs separated by &. Each key and value must be individually encoded so the parser can distinguish separators from literal characters.
// Building a URL with query parameters
const base = "https://example.com/search";
const params = new URLSearchParams();
params.set("q", "price > 100 & color = blue"); // encoded automatically on output
params.set("page", "1");
console.log(`${base}?${params}`); // the template literal calls params.toString()
// https://example.com/search?q=price+%3E+100+%26+color+%3D+blue&page=1
UTM parameters follow the same rules. Marketing teams add utm_source, utm_medium, utm_campaign, and other tags to links for Google Analytics tracking. If a campaign name contains spaces or special characters, they must be encoded. Otherwise, the analytics platform misreads the data.
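As a rough sketch of how UTM tagging might look in code, the helper below uses the standard URL API, which form-encodes each value as it is set. The function name addUtm, the landing-page URL, and the campaign values are all assumptions made for this example.

```javascript
// Hypothetical helper: append utm_* tags to a link.
function addUtm(link, tags) {
  const url = new URL(link);
  for (const [key, val] of Object.entries(tags)) {
    url.searchParams.set("utm_" + key, val); // the value is encoded for us
  }
  return url.toString();
}

const tagged = addUtm("https://example.com/landing", {
  source: "newsletter",
  medium: "email",
  campaign: "Spring Sale & More",
});

console.log(tagged);
// https://example.com/landing?utm_source=newsletter&utm_medium=email&utm_campaign=Spring+Sale+%26+More

// Parsing the link back returns the original campaign name.
console.log(new URL(tagged).searchParams.get("utm_campaign")); // "Spring Sale & More"
```

Because searchParams uses form encoding, spaces appear as + here rather than %20; both decode to a space in a query string.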
URL Architect includes a dedicated UTM builder that handles encoding automatically, so you can focus on the campaign values without worrying about escaping.
Try It with URL Architect
Need to encode a URL, decode a messy percent-encoded string, or build UTM-tagged links? Open URL Architect and do it instantly. Encode, decode, and build, all in one interface, all in your browser.
- Encode and decode URLs with one click
- Build UTM-tagged campaign links
- 100% client-side: no data leaves your browser