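When URL encoding is needed, and why

RFC 1738 specifies the character set that may be used in a URL as follows:

Only the alphanumerics 0-9, a-z, A-Z and the special characters $-_.+!*'(), may appear unencoded (reserved characters are also allowed when used for their reserved purpose).

HTML 4 later extended this so that characters from the full Unicode character set can be used in URLs.

So which characters actually need to be encoded?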

1. ASCII control characters

 Reason: they are not printable.

 Character range: ISO-8859-1 code points 00-1F and 7F (hex)

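2. Non-ASCII characters

Reason: these characters are outside the ASCII set, so they are not considered legal in a URL.

Character range: the upper half of ISO-Latin, code points 80-FF (hex)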

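3. Reserved characters

Reason: URLs use a set of reserved characters to define their own syntax. When one of these characters appears in a URL without playing its special role, it must be encoded.

Characters: $, &, +, the comma (,), /, :, ;, =, ?, @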

Character                       Code point (hex)   Code point (dec)
Dollar ("$")                    24                 36
Ampersand ("&")                 26                 38
Plus ("+")                      2B                 43
Comma (",")                     2C                 44
Forward slash/Virgule ("/")     2F                 47
Colon (":")                     3A                 58
Semi-colon (";")                3B                 59
Equals ("=")                    3D                 61
Question mark ("?")             3F                 63
'At' symbol ("@")               40                 64
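
To make the rule for reserved characters concrete, here is a minimal sketch in JavaScript. The base URL and the parameter value are hypothetical, chosen only for illustration. It shows what happens when a value containing "&" and "=" is placed in a query string with and without encoding.

```javascript
// Hypothetical base URL and query value, used only for illustration.
const base = "https://example.com/search";
const query = "a&b=c";

// Without encoding, "&" and "=" inside the value are read as URL syntax.
const naive = base + "?q=" + query;

// encodeURIComponent escapes the reserved characters inside the value.
const safe = base + "?q=" + encodeURIComponent(query);

console.log(naive); // https://example.com/search?q=a&b=c
console.log(safe);  // https://example.com/search?q=a%26b%3Dc

// Parsing both URLs shows the difference.
const naiveParams = new URL(naive).searchParams;
const safeParams = new URL(safe).searchParams;
console.log([...naiveParams.keys()]); // two keys: q and b
console.log(safeParams.get("q"));     // a&b=c
```

The unencoded URL is split into two parameters, while the encoded one round-trips the original value.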

4. Unsafe characters

Reason: some characters can cause ambiguity when they appear in a URL. These characters must also be encoded:

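Character                       Hex   Dec   Why encode?
Space                           20    32    Significant sequences of spaces may be lost in some uses (especially multiple spaces).
Quotation marks ('"')           22    34    These characters are often used to delimit URLs in plain text.
'Less Than' symbol ("<")        3C    60    These characters are often used to delimit URLs in plain text.
'Greater Than' symbol (">")     3E    62    These characters are often used to delimit URLs in plain text.
'Pound' character ("#")         23    35    This is used in URLs to indicate where a fragment identifier (bookmarks/anchors in HTML) begins.
Percent character ("%")         25    37    This is used to URL encode/escape other characters, so it should itself also be encoded.
Misc. characters:
   Left Curly Brace ("{")       7B    123   Some systems can possibly modify these characters.
   Right Curly Brace ("}")      7D    125   Some systems can possibly modify these characters.
   Vertical Bar/Pipe ("|")      7C    124   Some systems can possibly modify these characters.
   Backslash ("\")              5C    92    Some systems can possibly modify these characters.
   Caret ("^")                  5E    94    Some systems can possibly modify these characters.
   Tilde ("~")                  7E    126   Some systems can possibly modify these characters.
   Left Square Bracket ("[")    5B    91    Some systems can possibly modify these characters.
   Right Square Bracket ("]")   5D    93    Some systems can possibly modify these characters.
   Grave Accent ("`")           60    96    Some systems can possibly modify these characters.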
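
As a quick check of the unsafe characters listed above, the following sketch runs each one through JavaScript's encodeURIComponent and collects the results. The variable names are illustrative only. Note that encodeURIComponent leaves the tilde unescaped, because newer URI rules treat it as safe, so a tool that follows the older list would have to handle it separately.

```javascript
// The unsafe characters discussed above, one per entry.
const unsafe = [" ", '"', "<", ">", "#", "%", "{", "}", "|", "\\", "^", "~", "[", "]", "`"];

// Map each character to its percent-encoded form.
const encoded = new Map(unsafe.map((ch) => [ch, encodeURIComponent(ch)]));

for (const [ch, enc] of encoded) {
  console.log(JSON.stringify(ch), "->", enc);
}

console.log(encoded.get(" ")); // %20
console.log(encoded.get("#")); // %23
console.log(encoded.get("%")); // %25
console.log(encoded.get("~")); // ~ (left as-is by encodeURIComponent)
```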

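How is URL encoding done?

The URL encoding of a character consists of a % sign followed by the two-digit hexadecimal value of the character's ISO-Latin code point.

For example:

space = %20

In JavaScript, use the encodeURIComponent function to do this.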
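
The rule above can be sketched by hand. The helper name percentEncodeLatin1 is hypothetical and exists only for this illustration. It applies the "% plus two hex digits" rule to a single ISO-Latin character and is then compared with encodeURIComponent. One difference worth knowing: encodeURIComponent encodes non-ASCII characters as UTF-8 bytes, so a character above 7F yields more than one %XX group.

```javascript
// Hypothetical helper: percent-encode one character using its
// ISO-Latin (ISO-8859-1) code point, as described in the text.
function percentEncodeLatin1(ch) {
  const code = ch.codePointAt(0);
  if (code > 0xff) {
    throw new RangeError("character is outside ISO-8859-1");
  }
  // "%" followed by exactly two uppercase hex digits.
  return "%" + code.toString(16).toUpperCase().padStart(2, "0");
}

console.log(percentEncodeLatin1(" "));      // %20
console.log(percentEncodeLatin1("\n"));     // %0A
console.log(percentEncodeLatin1("\u00e9")); // %E9

// The built-in function agrees for ASCII but uses UTF-8 beyond it.
console.log(encodeURIComponent(" "));       // %20
console.log(encodeURIComponent("\u00e9"));  // %C3%A9
```

For ASCII input the two agree. For anything else, the receiving side has to know which byte encoding was used before it can decode the value.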
Original article: https://www.cnblogs.com/kidsitcn/p/6694249.html