URL Architect
Encode/decode URLs and build tracking links with UTM parameters.
Deep-Dive Technical Documentation
RFC 3986: The Definitive URI Syntax Standard
Every URL you construct is governed by RFC 3986 (published January 2005, obsoleting RFC 2396). The RFC defines the generic URI syntax as scheme://authority/path?query#fragment. The path component uses '/' as a delimiter and permits unreserved characters (A–Z, a–z, 0–9, '-', '.', '_', '~') to appear literally, along with the sub-delimiters (such as '!', '$', '&', '+', ',', ';', and '=') and the characters ':' and '@'. Anything outside that set, such as spaces, '?', '#', and non-ASCII code points, must be percent-encoded: each byte is written as %HH, where HH is its two-digit hexadecimal value (the RFC recommends uppercase digits). For example, a space (byte 0x20) becomes %20. The query component follows slightly different rules: it additionally permits '/' and '?' to appear unencoded, and although the RFC allows '&' and '=' there, by convention they carry syntactic meaning as key-value delimiters. This distinction is why encodeURI and encodeURIComponent behave differently in JavaScript: encodeURI preserves structural characters like ':', '/', '?', '&', '=', and '#', while encodeURIComponent escapes everything except the unreserved characters and a handful of marks ('!', '*', the apostrophe, '(', and ')'). URL Architect uses encodeURIComponent for parameter values because a value containing '&' or '=' must be escaped, or it will corrupt the query structure.
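The difference between the two functions is easiest to see side by side. The following is a minimal sketch; the sample value and the example.com URL are made up for illustration and are not taken from URL Architect itself.

```javascript
// Hypothetical parameter value that contains query delimiters.
const value = "fish & chips=yum?";

// encodeURI keeps structural characters, so '&', '=', and '?' survive.
const viaEncodeURI = encodeURI(value);

// encodeURIComponent escapes them, which keeps the value in one piece.
const viaComponent = encodeURIComponent(value);

console.log(viaEncodeURI); // fish%20&%20chips=yum?
console.log(viaComponent); // fish%20%26%20chips%3Dyum%3F

// Assembling a query string from the safely encoded value.
const url = "https://example.com/search?q=" + viaComponent;

// Parsing the URL back recovers the original value unchanged.
console.log(new URL(url).searchParams.get("q") === value); // true
```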
Percent-Encoding and UTF-8: How Non-ASCII Characters Travel Through URLs
When a URL contains non-ASCII characters (accented letters, CJK ideographs, emoji, Arabic script), the encoding pipeline has two stages. First, the string is encoded as UTF-8 bytes (RFC 3629). For example, the German 'ü' (U+00FC) becomes the two-byte sequence 0xC3 0xBC in UTF-8. Second, each byte is percent-encoded: %C3%BC. A single emoji like the rocket (U+1F680) expands to four UTF-8 bytes and therefore four percent-encoded triplets: %F0%9F%9A%80. This follows RFC 3986 Section 2.5, which says that new URI schemes representing textual data should encode it as UTF-8 before percent-encoding. Older systems sometimes used Latin-1 or Shift-JIS encoding, which is a common source of mojibake (garbled characters) when URLs are shared across systems. URL Architect always encodes through the JavaScript runtime's built-in UTF-8 pipeline, so the non-ASCII characters found in Internationalized Resource Identifiers (IRIs, defined in RFC 3987) are converted without requiring users to understand the underlying byte-level mechanics.
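The two stages can be reproduced by hand to check the byte values quoted above. This is a sketch only: percentEncodeAll is a hypothetical helper written for this illustration, and it escapes every byte (including unreserved ASCII) so that the byte-to-triplet mapping stays visible.

```javascript
// Stage 1: string -> UTF-8 bytes. Stage 2: each byte -> %HH triplet.
// percentEncodeAll is a hypothetical helper, not a built-in function.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always emits UTF-8
  return Array.from(bytes, (b) =>
    "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const umlaut = "\u00FC";   // 'ü'
const rocket = "\u{1F680}"; // rocket emoji

console.log(percentEncodeAll(umlaut)); // %C3%BC
console.log(percentEncodeAll(rocket)); // %F0%9F%9A%80

// The built-in function produces the same triplets for non-ASCII input.
console.log(encodeURIComponent(umlaut) === percentEncodeAll(umlaut)); // true
console.log(encodeURIComponent(rocket) === percentEncodeAll(rocket)); // true
```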
UTM Parameters: The Google Analytics Attribution Framework
UTM (Urchin Tracking Module) parameters are query string key-value pairs that Google Analytics reads to attribute traffic sources. The five standard parameters are: utm_source (the referrer, e.g., 'newsletter', 'twitter'), utm_medium (the marketing medium, e.g., 'email', 'cpc', 'social'), utm_campaign (the specific campaign name, e.g., 'spring_sale_2024'), utm_term (paid search keywords, used for tracking which keyword triggered an ad), and utm_content (used for A/B testing or differentiating links that point to the same URL). When Google Analytics processes an incoming request, it extracts these parameters from the URL and populates the Source, Medium, and Campaign dimensions in the Acquisition reports. The values are case-sensitive — 'Email' and 'email' create separate entries in your reports — which is why consistent naming conventions matter. URL Architect enforces lowercase formatting and strips trailing whitespace from parameter values to prevent this fragmentation. The tool also supports custom parameters beyond the standard five, which is useful for internal analytics systems or platforms like HubSpot, Mixpanel, or Amplitude that recognize their own parameter namespaces.
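A tagged link along these lines could be assembled as follows. This is a rough sketch under stated assumptions, not URL Architect's actual implementation: buildUtmUrl, normalize, and UTM_KEYS are hypothetical names, the example.com URLs are placeholders, and the sketch handles only the five standard parameters. It trims whitespace on both ends, which is slightly broader than the trailing-whitespace stripping described above.

```javascript
// Hypothetical helper names; only the five standard UTM keys are handled.
const UTM_KEYS = ["utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"];

// Trim and lowercase so 'Email ' and 'email' do not become separate report entries.
function normalize(value) {
  return String(value).trim().toLowerCase();
}

function buildUtmUrl(baseUrl, params) {
  const pairs = [];
  for (const key of UTM_KEYS) {
    const raw = params[key];
    if (raw === undefined || raw === null) continue;
    const value = normalize(raw);
    if (value === "") continue; // skip empty optional fields
    pairs.push(key + "=" + encodeURIComponent(value));
  }
  if (pairs.length === 0) return baseUrl;

  // The query must precede any '#fragment', so split the fragment off first.
  const hashIndex = baseUrl.indexOf("#");
  const head = hashIndex === -1 ? baseUrl : baseUrl.slice(0, hashIndex);
  const fragment = hashIndex === -1 ? "" : baseUrl.slice(hashIndex);

  // Pick '?' for the first parameter, '&' when a query already exists.
  let separator = "&";
  if (!head.includes("?")) separator = "?";
  else if (/[?&]$/.test(head)) separator = "";

  return head + separator + pairs.join("&") + fragment;
}

console.log(buildUtmUrl("https://example.com/landing", {
  utm_source: "Newsletter ",
  utm_medium: "Email",
  utm_campaign: "Spring Sale 2024",
}));
// https://example.com/landing?utm_source=newsletter&utm_medium=email&utm_campaign=spring%20sale%202024
```

Building the string by hand rather than with URLSearchParams is a deliberate choice in this sketch: URLSearchParams serializes spaces as '+', whereas encodeURIComponent produces %20.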
Common URL Encoding Pitfalls and How to Avoid Them
The most frequent URL encoding bug is double-encoding: taking an already-encoded string like 'hello%20world' and encoding it again to produce 'hello%2520world' (because '%' itself gets encoded to '%25'). This happens when developers pass a value through encodeURIComponent more than once, or when a framework's HTTP client applies encoding on top of manually encoded parameters. The reverse problem, under-encoding, occurs when developers use encodeURI on a full URL but forget that query parameter values need encodeURIComponent. A value like 'q=foo&bar' passed through encodeURI keeps its '&' intact, splitting what should be a single parameter into two. Another common issue involves the '+' character: in the application/x-www-form-urlencoded format (used by HTML form submissions), spaces are encoded as '+' instead of '%20'. Some server-side frameworks decode '+' as a space, while others treat it as a literal plus sign. URL Architect always uses percent-encoding (%20 for spaces) because %20 is read as a space by both RFC 3986 decoders and form decoders, which avoids the ambiguity introduced by the legacy form-encoding format.
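Both pitfalls can be reproduced in a few lines. The snippet below is an illustrative sketch using only built-in functions; the sample strings are arbitrary.

```javascript
// Double-encoding: the '%' from the first pass becomes '%25' in the second.
const once = encodeURIComponent("hello world"); // hello%20world
const twice = encodeURIComponent(once);         // hello%2520world

// Decoding once removes only one layer.
console.log(decodeURIComponent(twice)); // hello%20world

// '+' handling differs between the two decoding conventions.
const plainDecode = decodeURIComponent("a+b");             // 'a+b' (literal plus)
const formDecode = new URLSearchParams("q=a+b").get("q");  // 'a b' (form decoding)
console.log(plainDecode, "|", formDecode);

// A literal plus survives form decoding when it is percent-encoded first.
const encodedPlus = encodeURIComponent("1+1");                          // 1%2B1
const recovered = new URLSearchParams("q=" + encodedPlus).get("q");     // 1+1
console.log(encodedPlus, "|", recovered);
```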
What is URL Architect?
URLs break in subtle, infuriating ways. A space in a query parameter, an unescaped ampersand, a hash character that silently truncates everything after it: these are the kinds of bugs that eat an hour of your afternoon before you realize the problem was encoding. URL Architect gives you two tools in one: a URL encoder/decoder and a UTM parameter builder. The encoder/decoder uses JavaScript's encodeURIComponent under the hood, which percent-encodes every character that isn't an ASCII letter, a digit, or one of -_.!~*'(), including forward slashes, question marks, and ampersands that would otherwise be interpreted as URL structure. This is the correct function for encoding parameter values (as opposed to encodeURI, which deliberately leaves structural characters intact and so lets an unescaped '&' or '#' through without any warning). On the UTM side, the builder lets you construct Google Analytics tracking URLs by filling in utm_source, utm_medium, utm_campaign, and the optional utm_term and utm_content fields. It assembles the full URL with properly encoded parameters, ready to paste into your campaign. You can also add arbitrary custom parameters for internal tracking systems. Developers use the encoder for API query strings, redirect URLs with nested parameters, and debugging percent-encoded values from server logs. Marketers use the UTM builder to tag every link in a newsletter or ad campaign without manually concatenating query strings. Both modes run entirely in your browser, and no URL data is transmitted anywhere.
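The nested-redirect case shows how an unescaped '&' or '#' changes the meaning of a URL. The following sketch uses made-up example.com addresses and the standard URL class to inspect how each version is parsed.

```javascript
// Hypothetical redirect target, for illustration only.
const target = "https://shop.example.com/cart?item=42&ref=mail#checkout";

// Unencoded: the inner '&' and '#' are read as part of the outer URL.
const broken = new URL("https://example.com/login?next=" + target);
console.log(broken.searchParams.get("next")); // https://shop.example.com/cart?item=42
console.log(broken.searchParams.get("ref"));  // mail
console.log(broken.hash);                     // #checkout

// Encoded: the whole target travels as a single parameter value.
const safe = new URL("https://example.com/login?next=" + encodeURIComponent(target));
console.log(safe.searchParams.get("next") === target); // true
console.log(safe.hash === "");                         // true
```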
How to Use
- For encoding/decoding: Enter your URL or string and select encode or decode
- For UTM building: Enter your base URL and fill in the UTM parameters
- Add custom parameters if needed using the 'Add Parameter' button
- Copy the result to use in your application or campaign
Common Use Cases
- Encoding query parameters for API requests
- Decoding URL-encoded strings for debugging
- Building UTM-tagged URLs for Google Analytics
- Creating trackable marketing campaign links
- Encoding special characters in file paths
Client-Side Sandbox Security Verification
Zero server transmission. All processing runs entirely within your browser's JavaScript sandbox using native browser APIs. None of your data crosses an external server boundary, origin log, or third-party endpoint.
Browser-native compilation. Operations like JSON.parse(), btoa()/atob(), encodeURIComponent(), and the Intl API are executed by the browser engine itself (V8, SpiderMonkey, or JavaScriptCore) — no WebAssembly payloads, no remote execution, no server-side eval.
Independently verifiable. Open your browser's DevTools > Network tab while using any tool. You will see zero outbound requests containing your data. This is a verifiable, auditable privacy architecture.