URL encoding and decoding in Python - Blog


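Some requests and URLs cannot be used as-is when they contain Chinese characters or special symbols, so those characters have to be percent-encoded first. Conversely, a URL you receive may need to be decoded; otherwise the Chinese text shows up as percent signs followed by letters and digits, for example %E7%A8%8B.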

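1. URL encoding
from urllib.parse import quote

# Percent-encode the string '程序设计'. The encoding is passed by keyword,
# because the second positional argument of quote() is `safe`, not `encoding`.
text = quote("程序设计", encoding='utf-8')
print(text)
# Output:
# %E7%A8%8B%E5%BA%8F%E8%AE%BE%E8%AE%A1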
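The effect of the `safe` argument is easier to see on a value that contains slashes and spaces. The following is a minimal sketch; the path `/docs/程序设计/a b.html` is a made-up example and not taken from any real site.

```python
from urllib.parse import quote

# Hypothetical path used only for illustration.
path = "/docs/程序设计/a b.html"

# Default safe='/' keeps the slashes, so the path structure survives.
encoded_path = quote(path)

# safe='' escapes the slashes too, which suits a single query-string value.
encoded_value = quote(path, safe="")

# Pitfall: the second positional argument is `safe`, not `encoding`.
# Here the characters u, t, f, - and 8 are marked safe and '/' no longer is.
accidental = quote("a/b", "utf-8")

print(encoded_path)
print(encoded_value)
print(accidental)  # a%2Fb
```

For ASCII-only or Chinese-only input the positional form happens to give the same result as the keyword form, which is why the mistake is easy to miss.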
2. URL decoding
from urllib.parse import unquote

# Decode the string '%E7%A8%8B%E5%BA%8F%E8%AE%BE%E8%AE%A1'
text = unquote("%E7%A8%8B%E5%BA%8F%E8%AE%BE%E8%AE%A1", encoding='utf-8')
print(text)
# Output:
# 程序设计
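Decoding only gives back the original text when the same character encoding is used on both sides. The sketch below assumes a server that percent-encodes with GBK instead of UTF-8; that scenario is an illustration, not something stated in this post.

```python
from urllib.parse import quote, unquote

original = "程序设计"

# Round trip with the default UTF-8 encoding.
utf8_encoded = quote(original)
round_trip = unquote(utf8_encoded)

# Assumed scenario: the text was percent-encoded as GBK,
# so it has to be decoded with encoding='gbk'.
gbk_encoded = quote(original, encoding="gbk")
decoded_ok = unquote(gbk_encoded, encoding="gbk")

# Decoding the GBK bytes as UTF-8 does not raise, because unquote()
# defaults to errors='replace'; invalid bytes become U+FFFD placeholders.
decoded_wrong = unquote(gbk_encoded)

print(round_trip == original)
print(decoded_ok)
print("\ufffd" in decoded_wrong)
```

If silent replacement is not wanted, passing `errors='strict'` to `unquote` makes the mismatch raise a `UnicodeDecodeError` instead.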
3. Source code (urllib.parse)
def quote(string, safe='/', encoding=None, errors=None):
    """quote('abc def') -> 'abc%20def'

    Each part of a URL, e.g. the path info, the query, etc., has a
    different set of reserved characters that must be quoted. The
    quote function offers a cautious (not minimal) way to quote a
    string for most of these parts.

    RFC 3986 Uniform Resource Identifier (URI): Generic Syntax lists
    the following (un)reserved characters.

    unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
    reserved      = gen-delims / sub-delims
    gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="

    Each of the reserved characters is reserved in some component of a URL,
    but not necessarily in all of them.

    The quote function %-escapes all characters that are neither in the
    unreserved chars ("always safe") nor the additional chars set via the
    safe arg.

    The default for the safe arg is '/'. The character is reserved, but in
    typical usage the quote function is being called on a path where the
    existing slash characters are to be preserved.

    Python 3.7 updates from using RFC 2396 to RFC 3986 to quote URL strings.
    Now, "~" is included in the set of unreserved characters.

    string and safe may be either str or bytes objects. encoding and errors
    must not be specified if string is a bytes object.

    The optional encoding and errors parameters specify how to deal with
    non-ASCII characters, as accepted by the str.encode method.
    By default, encoding='utf-8' (characters are encoded with UTF-8), and
    errors='strict' (unsupported characters raise a UnicodeEncodeError).
    """
    if isinstance(string, str):
        if not string:
            return string
        if encoding is None:
            encoding = 'utf-8'
        if errors is None:
            errors = 'strict'
        string = string.encode(encoding, errors)
    else:
        if encoding is not None:
            raise TypeError("quote() doesn't support 'encoding' for bytes")
        if errors is not None:
            raise TypeError("quote() doesn't support 'errors' for bytes")
    return quote_from_bytes(string, safe)


def unquote(string, encoding='utf-8', errors='replace'):
    """Replace %xx escapes by their single-character equivalent. The optional
    encoding and errors parameters specify how to decode percent-encoded
    sequences into Unicode characters, as accepted by the bytes.decode()
    method.
    By default, percent-encoded sequences are decoded with UTF-8, and invalid
    sequences are replaced by a placeholder character.

    unquote('abc%20def') -> 'abc def'.
    """
    if '%' not in string:
        string.split
        return string
    if encoding is None:
        encoding = 'utf-8'
    if errors is None:
        errors = 'replace'
    bits = _asciire.split(string)
    res = [bits[0]]
    append = res.append
    for i in range(1, len(bits), 2):
        append(unquote_to_bytes(bits[i]).decode(encoding, errors))
        append(bits[i + 1])
    return ''.join(res)
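A few of the branches in the source above can be checked directly from the public API. This is a small sketch that exercises them; the variable names are illustrative only.

```python
from urllib.parse import quote, unquote

# bytes input is accepted by quote() and takes the `else` branch.
from_bytes = quote(b"a b")

# Combining bytes input with `encoding` hits the TypeError branch.
try:
    quote(b"a b", encoding="utf-8")
    bytes_error = None
except TypeError as exc:
    bytes_error = str(exc)

# An empty str is returned as-is by quote().
empty = quote("")

# A string without '%' is returned unchanged by unquote().
untouched = unquote("plain-text")

# In a mixed string, the non-ASCII part passes through and only the
# %xx run is decoded, matching the loop over `bits`.
mixed = unquote("程序%E8%AE%BE%E8%AE%A1")

print(from_bytes)   # a%20b
print(bytes_error)
print(repr(empty))
print(untouched)
print(mixed)
```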
