URL Encoder & Decoder
URL Encoding Explained: Percent Encoding, Reserved Characters, and the encodeURI vs encodeURIComponent Distinction
URL encoding (officially called percent-encoding, defined in RFC 3986) is the mechanism that allows arbitrary text — including special characters, spaces, non-English characters, and binary data — to be safely included in a Uniform Resource Locator. Since URLs can only contain a limited subset of ASCII characters, any character outside this "safe" set must be converted to its byte representation using a percent sign (%) followed by two hexadecimal digits. For example, a space becomes %20, a hash becomes %23, and the Japanese character 日 becomes %E6%97%A5 (three bytes in UTF-8).
Understanding URL encoding is essential for web developers, API designers, and anyone who works with HTTP. Incorrect encoding leads to broken links, security vulnerabilities (injection attacks), and garbled text. This tool encodes and decodes text using JavaScript's encodeURIComponent() and decodeURIComponent(), which percent-encode the UTF-8 bytes of the input and handle full Unicode text. One caveat: encodeURIComponent() leaves five characters that RFC 3986 lists as reserved (! ' ( ) *) unencoded, so its output is slightly less strict than the RFC's definition.
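As a quick illustration of the examples above, the following sketch round-trips a few strings through the two built-in functions. The helper name roundTrip is hypothetical and is not part of this tool.

```javascript
// Encode a string, then decode it again, returning both forms.
// roundTrip is an illustrative helper name, not a built-in.
function roundTrip(text) {
  const encoded = encodeURIComponent(text);
  const decoded = decodeURIComponent(encoded);
  return { encoded, decoded };
}

for (const text of [" ", "#", "日"]) {
  const { encoded, decoded } = roundTrip(text);
  console.log(JSON.stringify(text), "->", encoded, "->", JSON.stringify(decoded));
}
```

Decoding the encoded form should always give back the original text, which is a useful sanity check when debugging a link.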
URL Character Categories: Reserved, Unreserved, and Unsafe
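RFC 3986 formally defines two character sets, unreserved and reserved. Everything else (spaces, control characters, non-ASCII text, and symbols such as " < > that earlier specifications called unsafe) must always be percent-encoded.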
Unreserved Characters (never need encoding)
| Category | Characters |
|---|---|
| Uppercase letters | A B C D E F G H I J K L M N O P Q R S T U V W X Y Z |
| Lowercase letters | a b c d e f g h i j k l m n o p q r s t u v w x y z |
| Digits | 0 1 2 3 4 5 6 7 8 9 |
| Special safe chars | - _ . ~ |
Reserved Characters (have special meaning in URLs)
| Character | Encoded | URL Purpose |
|---|---|---|
| : | %3A | Scheme separator (http:), port (:8080) |
| / | %2F | Path separator |
| ? | %3F | Query string start |
| # | %23 | Fragment identifier |
| & | %26 | Query parameter separator |
| = | %3D | Key-value separator in query |
| @ | %40 | User info separator |
| + | %2B | Space in form data (legacy) |
| (space) | %20 | Not a reserved character, but never allowed unencoded in a URL |
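The two tables can be expressed as a pair of character classes. This is a minimal sketch: the function name classify and the label "must-encode" are illustrative choices, and the character sets are taken from RFC 3986 sections 2.2 and 2.3.

```javascript
// Unreserved set from RFC 3986 section 2.3.
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;
// Reserved set (gen-delims plus sub-delims) from RFC 3986 section 2.2.
const RESERVED = /^[:\/?#\[\]@!$&'()*+,;=]$/;

// Classify a single character. Anything in neither set must be encoded.
function classify(ch) {
  if (UNRESERVED.test(ch)) return "unreserved";
  if (RESERVED.test(ch)) return "reserved";
  return "must-encode";
}

for (const ch of ["a", "~", "&", "[", " ", "日"]) {
  console.log(JSON.stringify(ch), classify(ch));
}
```

Whether a reserved character needs encoding still depends on where it appears: a / that separates path segments stays as is, while a / inside a parameter value should be encoded.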
encodeURI() vs encodeURIComponent(): The Critical Difference
JavaScript provides two built-in URL encoding functions, and using the wrong one is one of the most common web development bugs.
encodeURI() is designed for encoding a complete URI. It leaves characters that have structural meaning in URLs untouched: : / ? # @ ! $ & ' ( ) * + , ; =. Square brackets are the exception: encodeURI() encodes [ and ] as %5B and %5D. Use this when you have a full URL and just want to fix unsafe characters like spaces. Example: encodeURI("https://example.com/my page.html") produces https://example.com/my%20page.html, with the scheme (https:) and path separator (/) preserved.
encodeURIComponent() is designed for encoding a single component of a URI, typically a query parameter value. It encodes the reserved characters that delimit URL structure (: / ? # [ ] @ $ & + , ; =) because within a parameter value, characters like &, =, and # should be treated as literal text, not as URL structure. It leaves ! ' ( ) * unencoded. Example: encodeURIComponent("Tom & Jerry") produces Tom%20%26%20Jerry, so the ampersand will not be mistaken for a parameter separator.
The rule of thumb: use encodeURIComponent() for query parameter values and path segments, and encodeURI() only when you have a full URL and want to make it safe. This tool uses encodeURIComponent(), which is correct for the most common use case — encoding text for use in a URL parameter.
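To see why the choice matters, the sketch below builds the same query string with both functions and then parses it back with the standard URL class. The /search path and the helper name readBack are assumptions made for illustration.

```javascript
const value = "Tom & Jerry";

// Wrong tool for a parameter value: the ampersand survives unencoded.
const withEncodeURI = "https://example.com/search?q=" + encodeURI(value);
// Right tool: the ampersand becomes %26.
const withComponent = "https://example.com/search?q=" + encodeURIComponent(value);

// Parse the URL back and read the q parameter as a server would see it.
function readBack(url) {
  return new URL(url).searchParams.get("q");
}

console.log(withEncodeURI, "=>", JSON.stringify(readBack(withEncodeURI)));
console.log(withComponent, "=>", JSON.stringify(readBack(withComponent)));
```

With encodeURI() the value is cut off at the ampersand, so the parsed q is only "Tom " and the rest is treated as a second parameter. With encodeURIComponent() the full value comes back intact.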
How Percent Encoding Works with Unicode (UTF-8)
When encoding non-ASCII characters, the text is first converted to its UTF-8 byte representation, and then each byte is percent-encoded individually. A single Unicode character can produce one to four percent-encoded bytes. For example, the Euro sign (€) has the UTF-8 byte sequence E2 82 AC, which URL-encodes to %E2%82%AC. Chinese characters typically require three bytes: 中 (meaning "middle/China") encodes to %E4%B8%AD. Emoji characters require four bytes: the thumbs-up emoji encodes to %F0%9F%91%8D.
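The two-step process (text to UTF-8 bytes, then bytes to %XX) can be made visible with the standard TextEncoder API. This is only a sketch: the function name percentEncodeBytes is hypothetical, and it deliberately encodes every byte, including unreserved ASCII, so that the byte-to-hex mapping is easy to follow.

```javascript
// Step 1: convert the text to UTF-8 bytes.
// Step 2: write each byte as % followed by two uppercase hex digits.
function percentEncodeBytes(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const euro = "\u20AC";        // Euro sign, 3 UTF-8 bytes
const zhong = "\u4E2D";       // 中, 3 UTF-8 bytes
const thumbsUp = "\u{1F44D}"; // thumbs-up emoji, 4 UTF-8 bytes

for (const ch of [euro, zhong, thumbsUp]) {
  // For non-ASCII input this matches what encodeURIComponent() produces.
  console.log(ch, percentEncodeBytes(ch), encodeURIComponent(ch));
}
```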
This UTF-8 based encoding is recommended by RFC 3986 (the current URI standard) and is the basis of RFC 3987 (Internationalized Resource Identifiers, IRIs). Older systems sometimes used other encodings like Latin-1 or Shift-JIS for URL encoding, which led to mojibake (garbled text) when systems disagreed on encoding. Today, UTF-8 is the de facto standard for URL encoding, and modern browsers and server frameworks default to it.
Spaces in URLs: %20 vs + (Plus Sign)
Both %20 and + can represent a space character in URLs, but they have different scopes and histories. The + convention comes from the application/x-www-form-urlencoded content type used by HTML forms (defined in the HTML specification, not the URI specification). In this format, spaces are encoded as + instead of %20. This only applies to query string parameters submitted by HTML forms.
The percent-encoding %20 is defined by RFC 3986 and works anywhere percent-encoding is allowed, such as the path, query, and fragment. The scheme never contains percent-encoded characters. For maximum compatibility, use %20 in path segments and + or %20 in query strings. JavaScript's encodeURIComponent() always produces %20 for spaces, while URLSearchParams produces + for spaces in query parameters.
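The difference between the two conventions shows up on both the encoding and the decoding side. The sketch below uses an arbitrary sample string and a hypothetical parameter name q.

```javascript
const text = "a b+c";

// RFC 3986 style: space becomes %20, a literal plus becomes %2B.
const percentStyle = encodeURIComponent(text);
// Form style: space becomes +, a literal plus still becomes %2B.
const formStyle = new URLSearchParams({ q: text }).toString();

console.log(percentStyle); // a%20b%2Bc
console.log(formStyle);    // q=a+b%2Bc

// Decoding differs too: decodeURIComponent() does not turn + into a space,
// while URLSearchParams does.
const plusViaDecode = decodeURIComponent("a+b");
const plusViaParams = new URLSearchParams("q=a+b").get("q");
console.log(JSON.stringify(plusViaDecode), JSON.stringify(plusViaParams));
```

This is why a form-encoded query string should be parsed with URLSearchParams (or an equivalent form decoder) rather than with decodeURIComponent() alone.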
When to Encode: Practical Scenarios
Building API URLs: When constructing URLs with user-supplied query parameters, always encode parameter values. If a user searches for "Tom & Jerry", the query string must be ?q=Tom%20%26%20Jerry, not ?q=Tom & Jerry (which would be parsed as two separate parameters).
Redirects and links: When generating redirect URLs that include other URLs as parameters (e.g., ?redirect=https://example.com/page?id=5), the embedded URL must be encoded to prevent its reserved characters from being interpreted as part of the outer URL structure.
File paths in URLs: Filenames with spaces, special characters, or non-ASCII characters need encoding. A file named "Q&A Report.pdf" must be accessed as Q%26A%20Report.pdf in the URL.
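OAuth and authentication flows: OAuth 2.0 sends the redirect (callback) URI and scope values as request parameters, so they must be encoded exactly once. A redirect URI that arrives unencoded or double-encoded will not match the registered value, which is a frequent cause of OAuth integration failures.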
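The first three scenarios above can be sketched with plain string concatenation and encodeURIComponent(). The paths /search, /login, and /files/ are placeholders chosen for illustration, not endpoints of any real service.

```javascript
// Query parameter with user-supplied text.
const searchUrl =
  "https://example.com/search?q=" + encodeURIComponent("Tom & Jerry");

// A full URL embedded as a parameter value: every reserved character
// of the inner URL is encoded so it cannot alter the outer URL.
const redirectUrl =
  "https://example.com/login?redirect=" +
  encodeURIComponent("https://example.com/page?id=5");

// A filename used as a single path segment.
const fileUrl =
  "https://example.com/files/" + encodeURIComponent("Q&A Report.pdf");

console.log(searchUrl);
console.log(redirectUrl);
console.log(fileUrl);

// The receiving side gets the embedded URL back unchanged.
const recovered = new URL(redirectUrl).searchParams.get("redirect");
console.log(recovered);
```

Encoding each dynamic value separately, at the point where it is inserted, keeps the structural characters of the outer URL under the developer's control.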
URL Encoding in Different Languages
| Language | Encode Function | Decode Function | Notes |
|---|---|---|---|
| JavaScript | encodeURIComponent() | decodeURIComponent() | Also: encodeURI() for full URLs |
| Python | urllib.parse.quote() | urllib.parse.unquote() | quote() leaves / unencoded by default (pass safe=""); quote_plus() for form data |
| PHP | rawurlencode() | rawurldecode() | urlencode() uses + for spaces |
| Java | URLEncoder.encode() | URLDecoder.decode() | Uses + for spaces (legacy) |
| Go | url.QueryEscape() | url.QueryUnescape() | PathEscape for path segments |
| C# | Uri.EscapeDataString() | Uri.UnescapeDataString() | Avoid HttpUtility.UrlEncode for RFC 3986 |
Common URL Encoding Mistakes
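Double-encoding: Encoding an already-encoded string produces garbled results. %20 becomes %2520 (the percent sign itself gets encoded as %25). If your URLs contain %25 followed by two hexadecimal digits, such as %2520 or %253D, you are likely double-encoding. A lone %25 is simply a correctly encoded literal percent sign.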
Using encodeURI() for parameter values: encodeURI() does not encode &, =, or + characters, so using it for query parameter values will break URLs when the value contains these characters. Always use encodeURIComponent() for parameter values.
Not encoding at all: Building URLs through string concatenation without encoding user input creates both broken URLs and security vulnerabilities (URL injection, open redirects). Always encode dynamic values.
Encoding characters that do not need encoding: Letters, digits, hyphens, underscores, periods, and tildes are unreserved and never need encoding. Over-encoding wastes bytes and reduces URL readability.
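The double-encoding mistake is easy to reproduce and, to a degree, to detect. The following is a heuristic sketch only: looksDoubleEncoded is a hypothetical helper, and it will give a false positive for text that legitimately contained a literal sequence such as %20 before it was encoded.

```javascript
const raw = "a b";
const once = encodeURIComponent(raw);   // encoded correctly
const twice = encodeURIComponent(once); // encoded again by mistake

console.log(once);  // a%20b
console.log(twice); // a%2520b

// Heuristic: an encoded percent sign (%25) directly followed by two hex
// digits suggests that an existing %XX escape was encoded a second time.
function looksDoubleEncoded(s) {
  return /%25[0-9A-Fa-f]{2}/.test(s);
}

console.log(looksDoubleEncoded(once), looksDoubleEncoded(twice));
```

A check like this is best used as a warning in logs or tests rather than as a reason to decode automatically, since the heuristic cannot tell a mistake from intentional content.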
How This Tool Works
This encoder uses JavaScript's encodeURIComponent() for encoding and decodeURIComponent() for decoding, which perform percent-encoding with full UTF-8 Unicode support. Note that encodeURIComponent() leaves ! ' ( ) * unencoded, although RFC 3986 lists them as reserved. Type text in the encode field to see the percent-encoded result instantly, or paste an encoded string to decode it back to readable text. All processing happens in your browser and no data is sent to any server, making this tool safe for encoding sensitive parameters like API keys or authentication tokens.
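The tool's own source is not shown here, so the following is only a guess at how a decode handler might be written. One detail worth knowing is that decodeURIComponent() throws a URIError on malformed input (a stray % or a truncated UTF-8 sequence), so a wrapper such as the hypothetical safeDecode below is a common precaution.

```javascript
// Decode percent-encoded text without letting malformed input throw.
// safeDecode and its return shape are illustrative, not this tool's API.
function safeDecode(input) {
  try {
    return { ok: true, value: decodeURIComponent(input) };
  } catch (err) {
    if (err instanceof URIError) {
      return { ok: false, error: err.message };
    }
    throw err; // anything else is unexpected, so re-throw it
  }
}

console.log(safeDecode("%E6%97%A5")); // valid three-byte sequence
console.log(safeDecode("%E6%97"));    // truncated sequence
console.log(safeDecode("100%"));      // stray percent sign
```

Returning a result object instead of throwing lets a user interface show an "invalid input" message while the user is still typing.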
Frequently Asked Questions
What is URL encoding (percent encoding)?
URL encoding (also called percent encoding, defined in RFC 3986) converts characters that are unsafe or have special meaning in URLs into a percent sign followed by two hexadecimal digits representing the character's byte value. For example, a space becomes %20, an ampersand becomes %26, and non-ASCII characters are first converted to UTF-8 bytes, each percent-encoded individually. This ensures URLs are transmitted correctly across all systems, since URLs can only contain a limited set of ASCII characters.
What is the difference between encodeURI() and encodeURIComponent()?
encodeURI() encodes a complete URI, leaving reserved characters like : / ? # and & untouched since they have structural meaning in URLs. encodeURIComponent() encodes everything except the unreserved characters (letters, digits, -, _, ., ~) and five additional marks (! ' ( ) *), making it suitable for encoding individual query parameter values where characters like & and = should be treated as literal text. Use encodeURI() for full URLs and encodeURIComponent() for parameter values. Using the wrong one is a common source of bugs.
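In practice encodeURIComponent() leaves the characters ! ' ( ) * unencoded even though RFC 3986 lists them as reserved. When a stricter result is needed, a small wrapper can encode those five as well. The function name encodeRFC3986Component is an assumption for this sketch, and this tool does not necessarily apply such a step.

```javascript
// Encode a URI component, additionally escaping ! ' ( ) * so that only
// RFC 3986 unreserved characters remain unencoded.
function encodeRFC3986Component(str) {
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

const sample = "it's (ok)!*";
console.log(encodeURIComponent(sample));     // it's%20(ok)!*
console.log(encodeRFC3986Component(sample)); // it%27s%20%28ok%29%21%2A
```

The stricter output still decodes with decodeURIComponent(), so the wrapper only changes the encoding side.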
Which characters need to be URL-encoded?
Characters that must be encoded include spaces (as %20 or + in form data), all non-ASCII characters (UTF-8 bytes encoded individually), and reserved characters when used outside their designated structural purpose. Reserved characters include : / ? # [ ] @ ! $ & ' ( ) * + , ; =. Unreserved characters that never need encoding are uppercase and lowercase letters (A-Z, a-z), digits (0-9), and four symbols: hyphen (-), underscore (_), period (.), and tilde (~).
Why does my URL have %20 or + for spaces?
Both %20 and + represent spaces in URLs, but in different contexts. %20 is the standard RFC 3986 percent-encoding for a space character and works in any component that allows percent-encoding (path, query, fragment). The + sign represents a space only in the application/x-www-form-urlencoded format used by HTML form submissions, and only within the query string. JavaScript's encodeURIComponent() produces %20, while URLSearchParams and HTML forms produce +. For maximum compatibility, %20 is safer.
What happens if I double-encode a URL?
Double encoding occurs when an already-encoded string is encoded again, converting percent signs into %25. For example, a space encoded once becomes %20, but encoded twice becomes %2520 (the % is encoded to %25, followed by 20). This is a common bug in web applications that causes broken links, failed API requests, and garbled text. To avoid it, encode raw input exactly once, at the point where the URL is built, and decode a value before encoding it again. If you receive a URL containing sequences like %2520, it has likely been double-encoded: one round of decoding restores the correctly encoded URL, and a second round yields the original text.
How does URL encoding handle non-English characters like Chinese or Arabic?
Non-ASCII characters are first converted to their UTF-8 byte representation, then each byte is percent-encoded individually. For example, the Chinese character for sun (U+65E5) has the UTF-8 bytes E6, 97, A5, so it becomes %E6%97%A5. A single emoji like the smiley face (U+1F600) uses four UTF-8 bytes and becomes %F0%9F%98%80. Modern browsers display the original characters in the address bar using Internationalized Resource Identifiers (IRIs), while the actual HTTP request uses the percent-encoded form. This system allows URLs to contain text in any language.