Not all characters are valid in a URL. Characters fall into three groups: valid characters that can be used "as is"; reserved characters that can be used "as is" but have a special meaning; and all other characters, which must be "URL-encoded" into a numerical (percent-encoded) format so that they are valid, usable, and do not conflict with characters that have a special meaning.
Valid Characters:
The following characters can be used directly within a URL and do not require any special encoding:
Characters | Explanation
A to Z | Uppercase alphabetical
a to z | Lowercase alphabetical
0 to 9 | Numerical
- | Hyphen
_ | Underscore
. | Dot / full stop
~ | Tilde
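As a quick check of the list above, the following minimal sketch passes every valid character through Python's standard `urllib.parse.quote` and confirms that none of them is changed. It assumes Python 3.7 or later, where the tilde is treated as a valid (unreserved) character.

```python
from urllib.parse import quote

# Every character from the table: letters, digits, hyphen, underscore, dot, tilde.
UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-_.~"
)

# safe="" asks quote() to escape everything it is allowed to escape,
# so any character that comes back unchanged needed no encoding.
encoded = quote(UNRESERVED, safe="")

print(encoded == UNRESERVED)  # True: nothing was percent-encoded
```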
Reserved Characters:
These characters can also be used "as is", but they have a special meaning in a URI. If they are used outside of their reserved meaning, they must be URL-encoded.
Failure to do so may cause the URL to be processed incorrectly. Notably, problems can occur if you pass a URI as a redirect parameter inside another URL. If these characters are not URL-encoded, the server may be unable to determine where a URL ends and where a parameter starts, or how many parameters the received URL contains.
Characters | Explanation
: | Separates the protocol (scheme) from the rest of the URL; also separates the username from the password when credentials are specified in the URL
@ | Separates the credentials from the host
/ | Separates directories in resource or folder paths
? | Marks the start of the query string
& | Separates key-value pairs when more than one key-value pair is present in the URI
= | Assigns a value to a key in a URI
# | Marks the start of the fragment (anchor); a browser jumps to that anchor in the HTML page if it is present in the source code
% | Introduces a "percent-encoded" (URL-encoded) character. It is followed by two hexadecimal digits that represent a character that otherwise could corrupt the meaning of a URI
+ | Represents a space in some query strings (often used in place of a space character)
[ ] | Enclose an IPv6 address in the URL (e.g., http://[2001:db8::1]/)
! | Reserved sub-delimiter with no meaning fixed by the generic URI syntax; individual schemes or applications may assign it one
$ | Reserved sub-delimiter; some applications use it in the query component to mark special parameters
( ) | Reserved sub-delimiters with no meaning fixed by the generic URI syntax; individual schemes or applications may assign them one
* | Reserved sub-delimiter with no meaning fixed by the generic URI syntax; individual schemes or applications may assign it one
, | Reserved sub-delimiter; applications sometimes use it to separate values in the path or query components
; | Sometimes used to separate parameters in the path component (e.g., http://example.com/path;param=value)
Any character that is not in the valid character list, or that is not being used for its reserved purpose, must be URL-encoded.
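To illustrate how the reserved characters act as delimiters, here is a minimal sketch that splits a URL with Python's standard `urllib.parse` module. The URL, credentials, and parameter names are hypothetical and chosen only so that each delimiter appears in its reserved role.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL that uses each delimiter in its reserved role.
url = "http://user:secret@example.com/docs/page?key=value&other=2#top"

parts = urlsplit(url)
print(parts.scheme)    # text before the first ":"
print(parts.username)  # credentials before ":" and "@"
print(parts.password)
print(parts.hostname)  # host after "@"
print(parts.path)      # segments separated by "/"
print(parts.query)     # everything between "?" and "#"
print(parts.fragment)  # anchor after "#"

# "&" separates the pairs and "=" separates each key from its value.
pairs = parse_qs(parts.query)
print(pairs)
```

Each component is found purely by looking for the reserved characters, which is why a reserved character that appears in data, rather than as a delimiter, has to be encoded.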
Encoding Reserved and Invalid Characters in URLs:
When working with URLs, it's crucial to properly encode reserved and invalid characters to ensure they are correctly interpreted by web servers and browsers. To avoid issues with these characters, you can use online tools like urlencoder.org or the URL encoding features of your programming language.
Reserved characters have specific meanings defined by RFC documents, and if misused, they can break a URL. If you need to use one of these characters for a different purpose, encode it to its URL-encoded value.
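As one example of a programming language's URL encoding features, the sketch below uses Python's standard `urllib.parse.quote` and `unquote`. The sample value is made up for illustration. Note that `quote` leaves `/` unencoded by default, so `safe=""` is passed here to encode every reserved character in the value.

```python
from urllib.parse import quote, unquote

# Hypothetical value containing several reserved characters and spaces.
value = "Tom & Jerry: 50/50?"

# safe="" encodes "/" as well; by default quote() leaves it untouched.
encoded = quote(value, safe="")
print(encoded)  # Tom%20%26%20Jerry%3A%2050%2F50%3F

# Decoding restores the original text exactly.
decoded = unquote(encoded)
print(decoded == value)  # True
```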
Encoding an Email Address in a URL
Consider a variable email in your query string set to mail@example.com. The @ symbol is reserved, so it must be encoded to prevent it from being misinterpreted as the separator between credentials and host.
For example, the @ symbol is encoded as %40. Here’s a valid query string example:
In this example:
- The unencoded @ symbol before the host correctly separates the username from the server address.
- The @ symbol in the email value is encoded as %40, ensuring the server accurately interprets the email address.
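The two roles of @ described above can be sketched in Python as follows. The base URL, the username `user`, and the `/subscribe` path are assumptions made for illustration; only the email value comes from the text.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL: the "@" here is a delimiter and stays unencoded.
base = "http://user@example.com/subscribe"

# urlencode() percent-encodes the "@" inside the value.
query = urlencode({"email": "mail@example.com"})
url = f"{base}?{query}"
print(url)  # http://user@example.com/subscribe?email=mail%40example.com

# Parsing the result recovers both the username and the original email.
parts = urlsplit(url)
email = parse_qs(parts.query)["email"][0]
print(parts.username, parts.hostname, email)
```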
Encoding the % character
The % character is used to denote encoded characters in a URL. If you need to use % as a literal character, it must be encoded as %25.
For example:
Here, %25 represents the literal % character, preventing it from being misinterpreted as the start of an encoded character sequence.
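A minimal Python sketch of this behavior, using made-up sample strings, shows what happens when a literal % is and is not encoded:

```python
from urllib.parse import quote, unquote

# A literal percent sign is encoded as %25.
literal = "100% cotton"
encoded = quote(literal, safe="")
print(encoded)           # 100%25%20cotton
print(unquote(encoded))  # 100% cotton

# If the literal text "%41" is sent without encoding its "%",
# a decoder reads it as an encoded character instead.
misread = unquote("%41")
print(misread)           # A

# Encoding the "%" first keeps the text intact after decoding.
protected = quote("%41", safe="")
print(protected)           # %2541
print(unquote(protected))  # %41
```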
Handling Invalid URLs with Variables:
Proper URL encoding is crucial when embedding one URL within another, such as in redirect links, to avoid conflicts and misinterpretation.
Consider this invalid URL with unencoded query strings:
The problem here is the presence of multiple unencoded ? and & characters, leading to incorrect parsing of query parameters.
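To see the kind of misparsing this causes, the sketch below feeds a hypothetical URL of this shape (the hosts and parameter names are assumptions, not the original example) to Python's standard query-string parser:

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical outer URL with an unencoded inner URL as the redirect value.
url = "http://example.com/login?redirect=http://example.org/page?id=1&lang=en"

params = parse_qs(urlsplit(url).query)
print(params)
# {'redirect': ['http://example.org/page?id=1'], 'lang': ['en']}
```

The inner URL's `lang=en` is split off at the unencoded `&` and treated as a parameter of the outer URL, so the redirect value arrives truncated.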
Valid URL: Encoding the Redirect Parameter
To make the above URL valid, URL-encode the redirect parameter:
In this example:
- Characters such as :, /, ?, =, and & within the redirect value are encoded to their respective percent-encoded values.
- This encoding ensures the redirect parameter is treated as a single value and not parsed incorrectly.
Servers processing this URL must decode the redirect value to interpret and handle it correctly.
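The encoding and decoding steps can be sketched in Python as follows. The hosts and parameter names are hypothetical; `urlencode` is used on the client side and `parse_qs` stands in for the decoding a server would perform.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical redirect target containing its own query string.
target = "http://example.org/page?id=1&lang=en"

# Client side: encode the whole target as a single parameter value.
url = "http://example.com/login?" + urlencode({"redirect": target})
print(url)
# http://example.com/login?redirect=http%3A%2F%2Fexample.org%2Fpage%3Fid%3D1%26lang%3Den

# Server side: decoding yields one parameter holding the full target.
params = parse_qs(urlsplit(url).query)
print(params["redirect"][0] == target)  # True
```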
By encoding reserved and invalid characters in URLs, you ensure accurate communication and functionality between clients and servers, avoiding common pitfalls of URL misinterpretation.
URL Encoding Special Character Chart
Character | Name | URL Encoded
(space) | Space | %20 or +
! | Exclamation mark | %21
" | Double quote | %22
# | Number sign (hash) | %23
$ | Dollar sign | %24
% | Percent | %25
& | Ampersand | %26
' | Apostrophe (single quote) | %27
( | Left parenthesis | %28
) | Right parenthesis | %29
* | Asterisk | %2A
+ | Plus sign | %2B
, | Comma | %2C
/ | Forward slash | %2F
: | Colon | %3A
; | Semicolon | %3B
< | Less than | %3C
= | Equals sign | %3D
> | Greater than | %3E
? | Question mark | %3F
@ | At sign | %40
[ | Left square bracket | %5B
\ | Backslash | %5C
] | Right square bracket | %5D
^ | Caret | %5E
_ | Underscore | %5F
` | Grave accent | %60
{ | Left curly brace | %7B
| | Vertical bar (pipe) | %7C
} | Right curly brace | %7D
~ | Tilde | %7E
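The values in this chart follow a simple rule: a percent sign followed by the character's byte value as two hexadecimal digits. The sketch below reproduces a few rows in Python; `percent_encode` is a hypothetical helper written for illustration, not a library function. It also shows the two encodings of a space produced by the standard `quote` and `quote_plus` functions.

```python
from urllib.parse import quote, quote_plus

def percent_encode(char: str) -> str:
    """Percent-encode every UTF-8 byte of char (hypothetical helper)."""
    return "".join(f"%{byte:02X}" for byte in char.encode("utf-8"))

# Reproduce a few rows of the chart.
for char in " !#%&*@~":
    print(repr(char), percent_encode(char))

# A space becomes %20 with quote() and + with quote_plus(),
# the form commonly used in query strings.
print(quote(" "))       # %20
print(quote_plus(" "))  # +
```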
For more information, view our Connect Self-Serve Guide.