<!DOCTYPE html>
|
|||
|
|
|||
|
<html xmlns="http://www.w3.org/1999/xhtml">
|
|||
|
<head>
|
|||
|
<meta charset="utf-8" />
|
|||
|
<title>urllib.parse — Parse URLs into components — Python 3.7.4 documentation</title>
|
|||
|
<link rel="stylesheet" href="../_static/pydoctheme.css" type="text/css" />
|
|||
|
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
|
|||
|
|
|||
|
<script type="text/javascript" id="documentation_options" data-url_root="../" src="../_static/documentation_options.js"></script>
|
|||
|
<script type="text/javascript" src="../_static/jquery.js"></script>
|
|||
|
<script type="text/javascript" src="../_static/underscore.js"></script>
|
|||
|
<script type="text/javascript" src="../_static/doctools.js"></script>
|
|||
|
<script type="text/javascript" src="../_static/language_data.js"></script>
|
|||
|
|
|||
|
<script type="text/javascript" src="../_static/sidebar.js"></script>
|
|||
|
|
|||
|
<link rel="search" type="application/opensearchdescription+xml"
|
|||
|
title="Search within Python 3.7.4 documentation"
|
|||
|
href="../_static/opensearch.xml"/>
|
|||
|
<link rel="author" title="About these documents" href="../about.html" />
|
|||
|
<link rel="index" title="Index" href="../genindex.html" />
|
|||
|
<link rel="search" title="Search" href="../search.html" />
|
|||
|
<link rel="copyright" title="Copyright" href="../copyright.html" />
|
|||
|
<link rel="next" title="urllib.error — Exception classes raised by urllib.request" href="urllib.error.html" />
|
|||
|
<link rel="prev" title="urllib.request — Extensible library for opening URLs" href="urllib.request.html" />
|
|||
|
<link rel="shortcut icon" type="image/png" href="../_static/py.png" />
|
|||
|
<link rel="canonical" href="https://docs.python.org/3/library/urllib.parse.html" />
|
|||
|
|
|||
|
<script type="text/javascript" src="../_static/copybutton.js"></script>
|
|||
|
<script type="text/javascript" src="../_static/switchers.js"></script>
|
|||
|
|
|||
|
|
|||
|
|
|||
|
<style>
|
|||
|
@media only screen {
|
|||
|
table.full-width-table {
|
|||
|
width: 100%;
|
|||
|
}
|
|||
|
}
|
|||
|
</style>
|
|||
|
|
|||
|
|
|||
|
</head><body>
|
|||
|
|
|||
|
<div class="related" role="navigation" aria-label="related navigation">
|
|||
|
<h3>Navigation</h3>
|
|||
|
<ul>
|
|||
|
<li class="right" style="margin-right: 10px">
|
|||
|
<a href="../genindex.html" title="General Index"
|
|||
|
accesskey="I">index</a></li>
|
|||
|
<li class="right" >
|
|||
|
<a href="../py-modindex.html" title="Python Module Index"
|
|||
|
>modules</a> |</li>
|
|||
|
<li class="right" >
|
|||
|
<a href="urllib.error.html" title="urllib.error — Exception classes raised by urllib.request"
|
|||
|
accesskey="N">next</a> |</li>
|
|||
|
<li class="right" >
|
|||
|
<a href="urllib.request.html" title="urllib.request — Extensible library for opening URLs"
|
|||
|
accesskey="P">previous</a> |</li>
|
|||
|
<li><img src="../_static/py.png" alt=""
|
|||
|
style="vertical-align: middle; margin-top: -1px"/></li>
|
|||
|
<li><a href="https://www.python.org/">Python</a> »</li>
|
|||
|
<li>
|
|||
|
<span class="language_switcher_placeholder">en</span>
|
|||
|
<span class="version_switcher_placeholder">3.7.4</span>
|
|||
|
<a href="../index.html">Documentation </a> »
|
|||
|
</li>
|
|||
|
|
|||
|
<li class="nav-item nav-item-1"><a href="index.html" >The Python Standard Library</a> »</li>
|
|||
|
<li class="nav-item nav-item-2"><a href="internet.html" accesskey="U">Internet Protocols and Support</a> »</li>
|
|||
|
<li class="right">
|
|||
|
|
|||
|
|
|||
|
<div class="inline-search" style="display: none" role="search">
|
|||
|
<form class="inline-search" action="../search.html" method="get">
|
|||
|
<input placeholder="Quick search" type="text" name="q" />
|
|||
|
<input type="submit" value="Go" />
|
|||
|
<input type="hidden" name="check_keywords" value="yes" />
|
|||
|
<input type="hidden" name="area" value="default" />
|
|||
|
</form>
|
|||
|
</div>
|
|||
|
<script type="text/javascript">$('.inline-search').show(0);</script>
|
|||
|
|
|
|||
|
</li>
|
|||
|
|
|||
|
</ul>
|
|||
|
</div>
|
|||
|
|
|||
|
<div class="document">
|
|||
|
<div class="documentwrapper">
|
|||
|
<div class="bodywrapper">
|
|||
|
<div class="body" role="main">
|
|||
|
|
|||
|
<div class="section" id="module-urllib.parse">
|
|||
|
<span id="urllib-parse-parse-urls-into-components"></span><h1><a class="reference internal" href="#module-urllib.parse" title="urllib.parse: Parse URLs into or assemble them from components."><code class="xref py py-mod docutils literal notranslate"><span class="pre">urllib.parse</span></code></a> — Parse URLs into components<a class="headerlink" href="#module-urllib.parse" title="Permalink to this headline">¶</a></h1>
|
|||
|
<p><strong>Source code:</strong> <a class="reference external" href="https://github.com/python/cpython/tree/3.7/Lib/urllib/parse.py">Lib/urllib/parse.py</a></p>
|
|||
|
<hr class="docutils" id="index-0" />
|
|||
|
<p>This module defines a standard interface to break Uniform Resource Locator (URL)
strings up into components (addressing scheme, network location, path, etc.), to
combine the components back into a URL string, and to convert a “relative URL”
to an absolute URL given a “base URL.”</p>
|
|||
|
<p>The module has been designed to match the Internet RFC on Relative Uniform
|
|||
|
Resource Locators. It supports the following URL schemes: <code class="docutils literal notranslate"><span class="pre">file</span></code>, <code class="docutils literal notranslate"><span class="pre">ftp</span></code>,
|
|||
|
<code class="docutils literal notranslate"><span class="pre">gopher</span></code>, <code class="docutils literal notranslate"><span class="pre">hdl</span></code>, <code class="docutils literal notranslate"><span class="pre">http</span></code>, <code class="docutils literal notranslate"><span class="pre">https</span></code>, <code class="docutils literal notranslate"><span class="pre">imap</span></code>, <code class="docutils literal notranslate"><span class="pre">mailto</span></code>, <code class="docutils literal notranslate"><span class="pre">mms</span></code>,
|
|||
|
<code class="docutils literal notranslate"><span class="pre">news</span></code>, <code class="docutils literal notranslate"><span class="pre">nntp</span></code>, <code class="docutils literal notranslate"><span class="pre">prospero</span></code>, <code class="docutils literal notranslate"><span class="pre">rsync</span></code>, <code class="docutils literal notranslate"><span class="pre">rtsp</span></code>, <code class="docutils literal notranslate"><span class="pre">rtspu</span></code>, <code class="docutils literal notranslate"><span class="pre">sftp</span></code>,
|
|||
|
<code class="docutils literal notranslate"><span class="pre">shttp</span></code>, <code class="docutils literal notranslate"><span class="pre">sip</span></code>, <code class="docutils literal notranslate"><span class="pre">sips</span></code>, <code class="docutils literal notranslate"><span class="pre">snews</span></code>, <code class="docutils literal notranslate"><span class="pre">svn</span></code>, <code class="docutils literal notranslate"><span class="pre">svn+ssh</span></code>, <code class="docutils literal notranslate"><span class="pre">telnet</span></code>,
|
|||
|
<code class="docutils literal notranslate"><span class="pre">wais</span></code>, <code class="docutils literal notranslate"><span class="pre">ws</span></code>, <code class="docutils literal notranslate"><span class="pre">wss</span></code>.</p>
|
|||
|
<p>The <a class="reference internal" href="#module-urllib.parse" title="urllib.parse: Parse URLs into or assemble them from components."><code class="xref py py-mod docutils literal notranslate"><span class="pre">urllib.parse</span></code></a> module defines functions that fall into two broad
|
|||
|
categories: URL parsing and URL quoting. These are covered in detail in
|
|||
|
the following sections.</p>
|
|||
|
<div class="section" id="url-parsing">
|
|||
|
<h2>URL Parsing<a class="headerlink" href="#url-parsing" title="Permalink to this headline">¶</a></h2>
|
|||
|
<p>The URL parsing functions focus on splitting a URL string into its components,
|
|||
|
or on combining URL components into a URL string.</p>
|
|||
|
<dl class="function">
|
|||
|
<dt id="urllib.parse.urlparse">
|
|||
|
<code class="descclassname">urllib.parse.</code><code class="descname">urlparse</code><span class="sig-paren">(</span><em>urlstring</em>, <em>scheme=''</em>, <em>allow_fragments=True</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.urlparse" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Parse a URL into six components, returning a 6-item <a class="reference internal" href="../glossary.html#term-named-tuple"><span class="xref std std-term">named tuple</span></a>. This
corresponds to the general structure of a URL:
<code class="docutils literal notranslate"><span class="pre">scheme://netloc/path;parameters?query#fragment</span></code>.
Each tuple item is a string, possibly empty. The components are not broken up into
smaller parts (for example, the network location is a single string), and %
escapes are not expanded. The delimiters as shown above are not part of the
result, except for a leading slash in the <em>path</em> component, which is retained if
present. For example:</p>
|
|||
|
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">urllib.parse</span> <span class="k">import</span> <span class="n">urlparse</span>
<span class="gp">>>> </span><span class="n">o</span> <span class="o">=</span> <span class="n">urlparse</span><span class="p">(</span><span class="s1">'http://www.cwi.nl:80/</span><span class="si">%7E</span><span class="s1">guido/Python.html'</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">o</span> <span class="c1"># doctest: +NORMALIZE_WHITESPACE</span>
<span class="go">ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',</span>
<span class="go"> params='', query='', fragment='')</span>
<span class="gp">>>> </span><span class="n">o</span><span class="o">.</span><span class="n">scheme</span>
<span class="go">'http'</span>
<span class="gp">>>> </span><span class="n">o</span><span class="o">.</span><span class="n">port</span>
<span class="go">80</span>
<span class="gp">>>> </span><span class="n">o</span><span class="o">.</span><span class="n">geturl</span><span class="p">()</span>
<span class="go">'http://www.cwi.nl:80/%7Eguido/Python.html'</span>
</pre></div>
</div>
|
|||
|
<p>Following the syntax specifications in <span class="target" id="index-1"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc1808.html"><strong>RFC 1808</strong></a>, urlparse recognizes
|
|||
|
a netloc only if it is properly introduced by ‘//’. Otherwise the
|
|||
|
input is presumed to be a relative URL and thus to start with
|
|||
|
a path component.</p>
|
|||
|
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="go"> >>> from urllib.parse import urlparse</span>
<span class="go"> >>> urlparse('//www.cwi.nl:80/%7Eguido/Python.html')</span>
<span class="go"> ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',</span>
<span class="go"> params='', query='', fragment='')</span>
<span class="go"> >>> urlparse('www.cwi.nl/%7Eguido/Python.html')</span>
<span class="go"> ParseResult(scheme='', netloc='', path='www.cwi.nl/%7Eguido/Python.html',</span>
<span class="go"> params='', query='', fragment='')</span>
<span class="go"> >>> urlparse('help/Python.html')</span>
<span class="go"> ParseResult(scheme='', netloc='', path='help/Python.html', params='',</span>
<span class="go"> query='', fragment='')</span>
</pre></div>
</div>
|
|||
|
<p>The <em>scheme</em> argument gives the default addressing scheme, to be
|
|||
|
used only if the URL does not specify one. It should be the same type
|
|||
|
(text or bytes) as <em>urlstring</em>, except that the default value <code class="docutils literal notranslate"><span class="pre">''</span></code> is
|
|||
|
always allowed, and is automatically converted to <code class="docutils literal notranslate"><span class="pre">b''</span></code> if appropriate.</p>
|
|||
|
<p>If the <em>allow_fragments</em> argument is false, fragment identifiers are not
|
|||
|
recognized. Instead, they are parsed as part of the path, parameters
|
|||
|
or query component, and <code class="xref py py-attr docutils literal notranslate"><span class="pre">fragment</span></code> is set to the empty string in
|
|||
|
the return value.</p>
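<p>The effect of the <em>scheme</em> and <em>allow_fragments</em> arguments can be seen in a short sketch. The URLs below are illustrative only, reusing the host from the examples above:</p>

```python
from urllib.parse import urlparse

# The default scheme is applied only when the URL carries no scheme of its own.
no_scheme = urlparse('//www.cwi.nl/%7Eguido/Python.html', scheme='https')
has_scheme = urlparse('http://www.cwi.nl/%7Eguido/Python.html', scheme='https')
print(no_scheme.scheme)   # https
print(has_scheme.scheme)  # http

# With allow_fragments=False the '#...' suffix stays in the preceding component.
with_frag = urlparse('http://www.cwi.nl/Python.html#intro')
no_frag = urlparse('http://www.cwi.nl/Python.html#intro', allow_fragments=False)
print(with_frag.path, with_frag.fragment)          # /Python.html intro
print(repr(no_frag.path), repr(no_frag.fragment))  # '/Python.html#intro' ''
```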
|
|||
|
<p>The return value is a <a class="reference internal" href="../glossary.html#term-named-tuple"><span class="xref std std-term">named tuple</span></a>, which means that its items can
|
|||
|
be accessed by index or as named attributes, which are:</p>
|
|||
|
<table class="docutils align-center">
|
|||
|
<colgroup>
|
|||
|
<col style="width: 25%" />
|
|||
|
<col style="width: 10%" />
|
|||
|
<col style="width: 36%" />
|
|||
|
<col style="width: 30%" />
|
|||
|
</colgroup>
|
|||
|
<thead>
|
|||
|
<tr class="row-odd"><th class="head"><p>Attribute</p></th>
|
|||
|
<th class="head"><p>Index</p></th>
|
|||
|
<th class="head"><p>Value</p></th>
|
|||
|
<th class="head"><p>Value if not present</p></th>
|
|||
|
</tr>
|
|||
|
</thead>
|
|||
|
<tbody>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">scheme</span></code></p></td>
|
|||
|
<td><p>0</p></td>
|
|||
|
<td><p>URL scheme specifier</p></td>
|
|||
|
<td><p><em>scheme</em> parameter</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-odd"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">netloc</span></code></p></td>
|
|||
|
<td><p>1</p></td>
|
|||
|
<td><p>Network location part</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">path</span></code></p></td>
|
|||
|
<td><p>2</p></td>
|
|||
|
<td><p>Hierarchical path</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-odd"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">params</span></code></p></td>
|
|||
|
<td><p>3</p></td>
|
|||
|
<td><p>Parameters for last path
|
|||
|
element</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">query</span></code></p></td>
|
|||
|
<td><p>4</p></td>
|
|||
|
<td><p>Query component</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-odd"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">fragment</span></code></p></td>
|
|||
|
<td><p>5</p></td>
|
|||
|
<td><p>Fragment identifier</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">username</span></code></p></td>
|
|||
|
<td></td>
|
|||
|
<td><p>User name</p></td>
|
|||
|
<td><p><a class="reference internal" href="constants.html#None" title="None"><code class="xref py py-const docutils literal notranslate"><span class="pre">None</span></code></a></p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-odd"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">password</span></code></p></td>
|
|||
|
<td></td>
|
|||
|
<td><p>Password</p></td>
|
|||
|
<td><p><a class="reference internal" href="constants.html#None" title="None"><code class="xref py py-const docutils literal notranslate"><span class="pre">None</span></code></a></p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">hostname</span></code></p></td>
|
|||
|
<td></td>
|
|||
|
<td><p>Host name (lower case)</p></td>
|
|||
|
<td><p><a class="reference internal" href="constants.html#None" title="None"><code class="xref py py-const docutils literal notranslate"><span class="pre">None</span></code></a></p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-odd"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">port</span></code></p></td>
|
|||
|
<td></td>
|
|||
|
<td><p>Port number as integer,
|
|||
|
if present</p></td>
|
|||
|
<td><p><a class="reference internal" href="constants.html#None" title="None"><code class="xref py py-const docutils literal notranslate"><span class="pre">None</span></code></a></p></td>
|
|||
|
</tr>
|
|||
|
</tbody>
|
|||
|
</table>
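<p>As a rough illustration of the table above, the following sketch reads the same result by index and by attribute. The URL, user name and password are made up for the example:</p>

```python
from urllib.parse import urlparse

result = urlparse('http://user:secret@WWW.Example.COM:8080/a/b;type=a?x=1#top')

# The six tuple items are available by position or by name.
scheme, netloc, path, params, query, fragment = result
print(result[1] == result.netloc)      # True
print(path, params, query, fragment)   # /a/b type=a x=1 top

# The extra attributes are derived from netloc and are not tuple items.
print(result.username, result.password)  # user secret
print(result.hostname)                   # www.example.com
print(result.port)                       # 8080

# Parts that are missing from the URL come back as None.
bare = urlparse('http://example.com/')
print(bare.username, bare.port)          # None None
```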
|
|||
|
<p>Reading the <code class="xref py py-attr docutils literal notranslate"><span class="pre">port</span></code> attribute will raise a <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a> if
|
|||
|
an invalid port is specified in the URL. See section
|
|||
|
<a class="reference internal" href="#urlparse-result-object"><span class="std std-ref">Structured Parse Results</span></a> for more information on the result object.</p>
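<p>Because the port is validated when the attribute is read rather than when the URL is parsed, callers may want to guard the access. A minimal sketch, where <code class="docutils literal notranslate"><span class="pre">port_or_default</span></code> is a hypothetical helper and not part of the module:</p>

```python
from urllib.parse import urlparse


def port_or_default(url, default=None):
    """Return the port of *url*, or *default* if it is absent or invalid."""
    parsed = urlparse(url)  # parsing succeeds even if the port is bad
    try:
        port = parsed.port  # the ValueError, if any, is raised here
    except ValueError:
        return default
    return default if port is None else port


print(port_or_default('http://example.com:8080/'))       # 8080
print(port_or_default('http://example.com:99999/', 80))  # 80
print(port_or_default('http://example.com/', 443))       # 443
```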
|
|||
|
<p>Unmatched square brackets in the <code class="xref py py-attr docutils literal notranslate"><span class="pre">netloc</span></code> attribute will raise a
|
|||
|
<a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a>.</p>
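<p>Unlike an invalid port, unmatched brackets are rejected during parsing itself. A small sketch with illustrative loopback URLs:</p>

```python
from urllib.parse import urlparse

# A bracketed IPv6 literal parses normally; hostname drops the brackets.
ok = urlparse('http://[::1]:8080/index.html')
print(ok.hostname, ok.port)  # ::1 8080

# A missing closing bracket is rejected while parsing, not on attribute access.
try:
    urlparse('http://[::1:8080/index.html')
    bracket_error = None
except ValueError as exc:
    bracket_error = exc
print('rejected:', bracket_error is not None)  # rejected: True
```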
|
|||
|
<p>Characters in the <code class="xref py py-attr docutils literal notranslate"><span class="pre">netloc</span></code> attribute that decompose under NFKC
|
|||
|
normalization (as used by the IDNA encoding) into any of <code class="docutils literal notranslate"><span class="pre">/</span></code>, <code class="docutils literal notranslate"><span class="pre">?</span></code>,
|
|||
|
<code class="docutils literal notranslate"><span class="pre">#</span></code>, <code class="docutils literal notranslate"><span class="pre">@</span></code>, or <code class="docutils literal notranslate"><span class="pre">:</span></code> will raise a <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a>. If the URL is
|
|||
|
decomposed before parsing, no error will be raised.</p>
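<p>The NFKC check can be illustrated with U+2100 (ACCOUNT OF), which normalizes to <code class="docutils literal notranslate"><span class="pre">a/c</span></code> and would therefore introduce a <code class="docutils literal notranslate"><span class="pre">/</span></code> into the host. The URL is made up for the example, and the sketch assumes Python 3.7.3 or later:</p>

```python
import unicodedata
from urllib.parse import urlparse

# '\u2100' is a single character whose NFKC form is the three characters 'a/c'.
url = 'http://example.com\u2100.test/path'

try:
    urlparse(url)
    rejected = False
except ValueError:
    rejected = True
print(rejected)

# Normalizing first makes the '/' explicit, so parsing succeeds, but the
# host is now shorter than it may have looked in the original string.
normalized = unicodedata.normalize('NFKC', url)
print(urlparse(normalized).netloc)
```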
|
|||
|
<p>As is the case with all named tuples, the subclass has a few additional methods
|
|||
|
and attributes that are particularly useful. One such method is <code class="xref py py-meth docutils literal notranslate"><span class="pre">_replace()</span></code>.
|
|||
|
The <code class="xref py py-meth docutils literal notranslate"><span class="pre">_replace()</span></code> method will return a new ParseResult object replacing specified
|
|||
|
fields with new values.</p>
|
|||
|
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="go"> >>> from urllib.parse import urlparse</span>
<span class="go"> >>> u = urlparse('//www.cwi.nl:80/%7Eguido/Python.html')</span>
<span class="go"> >>> u</span>
<span class="go"> ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',</span>
<span class="go"> params='', query='', fragment='')</span>
<span class="go"> >>> u._replace(scheme='http')</span>
<span class="go"> ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',</span>
<span class="go"> params='', query='', fragment='')</span>
</pre></div>
</div>
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.2: </span>Added IPv6 URL parsing capabilities.</p>
|
|||
|
</div>
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.3: </span>The fragment is now parsed for all URL schemes (unless <em>allow_fragments</em> is
false), in accordance with <span class="target" id="index-2"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc3986.html"><strong>RFC 3986</strong></a>. Previously, a whitelist of
schemes that support fragments existed.</p>
|
|||
|
</div>
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.6: </span>Out-of-range port numbers now raise <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a>, instead of
|
|||
|
returning <a class="reference internal" href="constants.html#None" title="None"><code class="xref py py-const docutils literal notranslate"><span class="pre">None</span></code></a>.</p>
|
|||
|
</div>
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.7.3: </span>Characters that affect netloc parsing under NFKC normalization will
|
|||
|
now raise <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a>.</p>
|
|||
|
</div>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="function">
|
|||
|
<dt id="urllib.parse.parse_qs">
|
|||
|
<code class="descclassname">urllib.parse.</code><code class="descname">parse_qs</code><span class="sig-paren">(</span><em>qs</em>, <em>keep_blank_values=False</em>, <em>strict_parsing=False</em>, <em>encoding='utf-8'</em>, <em>errors='replace'</em>, <em>max_num_fields=None</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.parse_qs" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Parse a query string given as a string argument (data of type
|
|||
|
<em class="mimetype">application/x-www-form-urlencoded</em>). Data are returned as a
|
|||
|
dictionary. The dictionary keys are the unique query variable names and the
|
|||
|
values are lists of values for each name.</p>
|
|||
|
<p>The optional argument <em>keep_blank_values</em> is a flag indicating whether blank
|
|||
|
values in percent-encoded queries should be treated as blank strings. A true value
|
|||
|
indicates that blanks should be retained as blank strings. The default false
|
|||
|
value indicates that blank values are to be ignored and treated as if they were
|
|||
|
not included.</p>
|
|||
|
<p>The optional argument <em>strict_parsing</em> is a flag indicating what to do with
|
|||
|
parsing errors. If false (the default), errors are silently ignored. If true,
|
|||
|
errors raise a <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a> exception.</p>
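<p>A short sketch of how <em>keep_blank_values</em> and <em>strict_parsing</em> change the result for the same input. The query string is an arbitrary example:</p>

```python
from urllib.parse import parse_qs

qs = 'a=1&a=2&b=&flag'

# By default, blank values and fields without '=' are dropped.
print(parse_qs(qs))  # {'a': ['1', '2']}

# keep_blank_values=True retains them as empty strings.
print(parse_qs(qs, keep_blank_values=True))
# {'a': ['1', '2'], 'b': [''], 'flag': ['']}

# strict_parsing=True turns the field without '=' into an error.
try:
    parse_qs(qs, strict_parsing=True)
    strict_error = None
except ValueError as exc:
    strict_error = exc
print('strict parsing failed:', strict_error is not None)
```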
|
|||
|
<p>The optional <em>encoding</em> and <em>errors</em> parameters specify how to decode
|
|||
|
percent-encoded sequences into Unicode characters, as accepted by the
|
|||
|
<a class="reference internal" href="stdtypes.html#bytes.decode" title="bytes.decode"><code class="xref py py-meth docutils literal notranslate"><span class="pre">bytes.decode()</span></code></a> method.</p>
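<p>The following sketch shows the decoding parameters at work on a value that is valid Latin-1 but not valid UTF-8. The field name and value are illustrative:</p>

```python
from urllib.parse import parse_qs

# %E9 is 'é' in Latin-1 but is not a complete UTF-8 sequence on its own.
qs = 'name=caf%E9'

# Default: utf-8 with errors='replace' substitutes U+FFFD for the bad byte.
default_result = parse_qs(qs)
print(default_result == {'name': ['caf\ufffd']})  # True

# Naming the real encoding recovers the intended character.
latin1_result = parse_qs(qs, encoding='latin-1')
print(latin1_result)  # {'name': ['café']}

# errors='strict' raises instead of substituting.
try:
    parse_qs(qs, errors='strict')
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc
print(decode_error is not None)  # True
```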
|
|||
|
<p>The optional argument <em>max_num_fields</em> is the maximum number of fields to
read. If set, a <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a> is raised if there are more than
<em>max_num_fields</em> fields read.</p>
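<p>One possible use of <em>max_num_fields</em> is to cap the size of untrusted input. A minimal sketch, assuming Python 3.7.2 or later; <code class="docutils literal notranslate"><span class="pre">parse_limited</span></code> and its default limit are hypothetical:</p>

```python
from urllib.parse import parse_qs


def parse_limited(qs, limit=2):
    """Parse *qs*, returning None if it has more than *limit* fields."""
    try:
        return parse_qs(qs, max_num_fields=limit)
    except ValueError:
        return None


print(parse_limited('a=1&b=2'))      # {'a': ['1'], 'b': ['2']}
print(parse_limited('a=1&b=2&c=3'))  # None
```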
|
|||
|
<p>Use the <a class="reference internal" href="#urllib.parse.urlencode" title="urllib.parse.urlencode"><code class="xref py py-func docutils literal notranslate"><span class="pre">urllib.parse.urlencode()</span></code></a> function (with the <code class="docutils literal notranslate"><span class="pre">doseq</span></code>
|
|||
|
parameter set to <code class="docutils literal notranslate"><span class="pre">True</span></code>) to convert such dictionaries into query
|
|||
|
strings.</p>
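<p>The round trip through <code class="docutils literal notranslate"><span class="pre">urlencode()</span></code> can be sketched as follows, using an arbitrary query string:</p>

```python
from urllib.parse import parse_qs, urlencode

parsed = parse_qs('a=1&a=2&b=x+y')
print(parsed)  # {'a': ['1', '2'], 'b': ['x y']}

# doseq=True emits one name=value pair per list element.
rebuilt = urlencode(parsed, doseq=True)
print(rebuilt)  # a=1&a=2&b=x+y

# Without doseq, each list is converted with str() and quoted as a whole,
# which is rarely what is wanted for parse_qs() output.
print(urlencode(parsed))
```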
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.2: </span>Added <em>encoding</em> and <em>errors</em> parameters.</p>
|
|||
|
</div>
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.7.2: </span>Added <em>max_num_fields</em> parameter.</p>
|
|||
|
</div>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="function">
|
|||
|
<dt id="urllib.parse.parse_qsl">
|
|||
|
<code class="descclassname">urllib.parse.</code><code class="descname">parse_qsl</code><span class="sig-paren">(</span><em>qs</em>, <em>keep_blank_values=False</em>, <em>strict_parsing=False</em>, <em>encoding='utf-8'</em>, <em>errors='replace'</em>, <em>max_num_fields=None</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.parse_qsl" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Parse a query string given as a string argument (data of type
|
|||
|
<em class="mimetype">application/x-www-form-urlencoded</em>). Data are returned as a list of
|
|||
|
name, value pairs.</p>
|
|||
|
<p>The optional argument <em>keep_blank_values</em> is a flag indicating whether blank
|
|||
|
values in percent-encoded queries should be treated as blank strings. A true value
|
|||
|
indicates that blanks should be retained as blank strings. The default false
|
|||
|
value indicates that blank values are to be ignored and treated as if they were
|
|||
|
not included.</p>
|
|||
|
<p>The optional argument <em>strict_parsing</em> is a flag indicating what to do with
|
|||
|
parsing errors. If false (the default), errors are silently ignored. If true,
|
|||
|
errors raise a <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a> exception.</p>
|
|||
|
<p>The optional <em>encoding</em> and <em>errors</em> parameters specify how to decode
|
|||
|
percent-encoded sequences into Unicode characters, as accepted by the
|
|||
|
<a class="reference internal" href="stdtypes.html#bytes.decode" title="bytes.decode"><code class="xref py py-meth docutils literal notranslate"><span class="pre">bytes.decode()</span></code></a> method.</p>
|
|||
|
<p>The optional argument <em>max_num_fields</em> is the maximum number of fields to
read. If set, a <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a> is raised if there are more than
<em>max_num_fields</em> fields read.</p>
|
|||
|
<p>Use the <a class="reference internal" href="#urllib.parse.urlencode" title="urllib.parse.urlencode"><code class="xref py py-func docutils literal notranslate"><span class="pre">urllib.parse.urlencode()</span></code></a> function to convert such lists of pairs into
|
|||
|
query strings.</p>
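<p>A brief sketch of the list form, which keeps repeated names and their original order. The query string is an arbitrary example:</p>

```python
from urllib.parse import parse_qsl, urlencode

pairs = parse_qsl('b=2&a=1&b=3')
print(pairs)  # [('b', '2'), ('a', '1'), ('b', '3')]

# A list of pairs can be passed to urlencode() directly; doseq is not needed.
encoded = urlencode(pairs)
print(encoded)  # b=2&a=1&b=3
```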
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.2: </span>Added <em>encoding</em> and <em>errors</em> parameters.</p>
|
|||
|
</div>
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.7.2: </span>Added <em>max_num_fields</em> parameter.</p>
|
|||
|
</div>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="function">
|
|||
|
<dt id="urllib.parse.urlunparse">
|
|||
|
<code class="descclassname">urllib.parse.</code><code class="descname">urlunparse</code><span class="sig-paren">(</span><em>parts</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.urlunparse" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Construct a URL from a tuple as returned by <code class="docutils literal notranslate"><span class="pre">urlparse()</span></code>. The <em>parts</em>
|
|||
|
argument can be any six-item iterable. This may result in a slightly
|
|||
|
different, but equivalent URL, if the URL that was parsed originally had
|
|||
|
unnecessary delimiters (for example, a <code class="docutils literal notranslate"><span class="pre">?</span></code> with an empty query; the RFC
|
|||
|
states that these are equivalent).</p>
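<p>Both points can be seen in a short sketch: building a URL from a plain six-item tuple, and losing a redundant <code class="docutils literal notranslate"><span class="pre">?</span></code> on a round trip. The URLs are illustrative:</p>

```python
from urllib.parse import urlparse, urlunparse

# Any six-item iterable works: scheme, netloc, path, params, query, fragment.
parts = ('https', 'example.com', '/a/b', '', 'q=1', 'top')
built = urlunparse(parts)
print(built)  # https://example.com/a/b?q=1#top

# A trailing '?' with an empty query is not reproduced.
original = 'http://example.com/path?'
rebuilt = urlunparse(urlparse(original))
print(rebuilt)  # http://example.com/path
```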
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="function">
|
|||
|
<dt id="urllib.parse.urlsplit">
|
|||
|
<code class="descclassname">urllib.parse.</code><code class="descname">urlsplit</code><span class="sig-paren">(</span><em>urlstring</em>, <em>scheme=''</em>, <em>allow_fragments=True</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.urlsplit" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>This is similar to <a class="reference internal" href="#urllib.parse.urlparse" title="urllib.parse.urlparse"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlparse()</span></code></a>, but does not split the params from the URL.
|
|||
|
This should generally be used instead of <a class="reference internal" href="#urllib.parse.urlparse" title="urllib.parse.urlparse"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlparse()</span></code></a> if the more recent URL
|
|||
|
syntax allowing parameters to be applied to each segment of the <em>path</em> portion
|
|||
|
of the URL (see <span class="target" id="index-3"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc2396.html"><strong>RFC 2396</strong></a>) is wanted. A separate function is needed to
|
|||
|
separate the path segments and parameters. This function returns a 5-item
|
|||
|
<a class="reference internal" href="../glossary.html#term-named-tuple"><span class="xref std std-term">named tuple</span></a>:</p>
|
|||
|
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="p">(</span><span class="n">addressing</span> <span class="n">scheme</span><span class="p">,</span> <span class="n">network</span> <span class="n">location</span><span class="p">,</span> <span class="n">path</span><span class="p">,</span> <span class="n">query</span><span class="p">,</span> <span class="n">fragment</span> <span class="n">identifier</span><span class="p">)</span><span class="o">.</span>
|
|||
|
</pre></div>
|
|||
|
</div>
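<p>The difference from <code class="docutils literal notranslate"><span class="pre">urlparse()</span></code> shows up when more than one path segment carries parameters. In the sketch below the URL is made up, and <code class="docutils literal notranslate"><span class="pre">segment_params</span></code> is a hypothetical helper standing in for the separate function mentioned above:</p>

```python
from urllib.parse import urlparse, urlsplit

url = 'http://example.com/a;x=1/b;y=2?q=1#frag'

parsed = urlparse(url)
split = urlsplit(url)

# urlparse() strips only the parameters of the last path segment.
print(parsed.path, parsed.params)  # /a;x=1/b y=2

# urlsplit() leaves the path untouched and returns five items, not six.
print(split.path)                  # /a;x=1/b;y=2
print(len(parsed), len(split))     # 6 5


def segment_params(path):
    """Split each non-empty path segment into (name, params) at the first ';'."""
    return [tuple(seg.partition(';')[::2]) for seg in path.split('/') if seg]


print(segment_params(split.path))  # [('a', 'x=1'), ('b', 'y=2')]
```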
|
|||
|
<p>The return value is a <a class="reference internal" href="../glossary.html#term-named-tuple"><span class="xref std std-term">named tuple</span></a>; its items can be accessed by index
or as named attributes:</p>
|
|||
|
<table class="docutils align-center">
|
|||
|
<colgroup>
|
|||
|
<col style="width: 25%" />
|
|||
|
<col style="width: 10%" />
|
|||
|
<col style="width: 35%" />
|
|||
|
<col style="width: 31%" />
|
|||
|
</colgroup>
|
|||
|
<thead>
|
|||
|
<tr class="row-odd"><th class="head"><p>Attribute</p></th>
|
|||
|
<th class="head"><p>Index</p></th>
|
|||
|
<th class="head"><p>Value</p></th>
|
|||
|
<th class="head"><p>Value if not present</p></th>
|
|||
|
</tr>
|
|||
|
</thead>
|
|||
|
<tbody>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">scheme</span></code></p></td>
|
|||
|
<td><p>0</p></td>
|
|||
|
<td><p>URL scheme specifier</p></td>
|
|||
|
<td><p><em>scheme</em> parameter</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-odd"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">netloc</span></code></p></td>
|
|||
|
<td><p>1</p></td>
|
|||
|
<td><p>Network location part</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">path</span></code></p></td>
|
|||
|
<td><p>2</p></td>
|
|||
|
<td><p>Hierarchical path</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-odd"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">query</span></code></p></td>
|
|||
|
<td><p>3</p></td>
|
|||
|
<td><p>Query component</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">fragment</span></code></p></td>
|
|||
|
<td><p>4</p></td>
|
|||
|
<td><p>Fragment identifier</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-odd"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">username</span></code></p></td>
|
|||
|
<td></td>
|
|||
|
<td><p>User name</p></td>
|
|||
|
<td><p><a class="reference internal" href="constants.html#None" title="None"><code class="xref py py-const docutils literal notranslate"><span class="pre">None</span></code></a></p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">password</span></code></p></td>
|
|||
|
<td></td>
|
|||
|
<td><p>Password</p></td>
|
|||
|
<td><p><a class="reference internal" href="constants.html#None" title="None"><code class="xref py py-const docutils literal notranslate"><span class="pre">None</span></code></a></p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-odd"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">hostname</span></code></p></td>
|
|||
|
<td></td>
|
|||
|
<td><p>Host name (lower case)</p></td>
|
|||
|
<td><p><a class="reference internal" href="constants.html#None" title="None"><code class="xref py py-const docutils literal notranslate"><span class="pre">None</span></code></a></p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">port</span></code></p></td>
|
|||
|
<td></td>
|
|||
|
<td><p>Port number as integer,
|
|||
|
if present</p></td>
|
|||
|
<td><p><a class="reference internal" href="constants.html#None" title="None"><code class="xref py py-const docutils literal notranslate"><span class="pre">None</span></code></a></p></td>
|
|||
|
</tr>
|
|||
|
</tbody>
|
|||
|
</table>
|
|||
|
<p>Reading the <code class="xref py py-attr docutils literal notranslate"><span class="pre">port</span></code> attribute will raise a <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a> if
|
|||
|
an invalid port is specified in the URL. See section
|
|||
|
<a class="reference internal" href="#urlparse-result-object"><span class="std std-ref">Structured Parse Results</span></a> for more information on the result object.</p>
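<p>As a minimal sketch of the table above, the following reads the same result by index and by attribute. The host name, credentials and port are placeholder values chosen for illustration, not taken from this documentation.</p>

```python
from urllib.parse import urlsplit

# Placeholder URL: the host, user name, password and port are made up.
parts = urlsplit('https://user:secret@Example.COM:8443/a/b?x=1#top')

# Index access and attribute access return the same items.
scheme_by_index = parts[0]
scheme_by_name = parts.scheme

# hostname is lower-cased, while netloc keeps the original spelling.
host = parts.hostname
netloc = parts.netloc
port = parts.port

# An out-of-range port still parses; the error appears when .port is read.
bad = urlsplit('https://example.com:99999/')
try:
    bad.port
    port_error = None
except ValueError as exc:
    port_error = exc
```

<p>Because the check is deferred until the attribute is read, code that validates untrusted URLs may want to touch <code class="docutils literal notranslate"><span class="pre">port</span></code> explicitly inside a <code class="docutils literal notranslate"><span class="pre">try</span></code> block.</p>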
|
|||
|
<p>Unmatched square brackets in the <code class="xref py py-attr docutils literal notranslate"><span class="pre">netloc</span></code> attribute will raise a
|
|||
|
<a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a>.</p>
|
|||
|
<p>Characters in the <code class="xref py py-attr docutils literal notranslate"><span class="pre">netloc</span></code> attribute that decompose under NFKC
|
|||
|
normalization (as used by the IDNA encoding) into any of <code class="docutils literal notranslate"><span class="pre">/</span></code>, <code class="docutils literal notranslate"><span class="pre">?</span></code>,
|
|||
|
<code class="docutils literal notranslate"><span class="pre">#</span></code>, <code class="docutils literal notranslate"><span class="pre">@</span></code>, or <code class="docutils literal notranslate"><span class="pre">:</span></code> will raise a <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a>. If the URL is
|
|||
|
decomposed before parsing, no error will be raised.</p>
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.6: </span>Out-of-range port numbers now raise <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a>, instead of
|
|||
|
returning <a class="reference internal" href="constants.html#None" title="None"><code class="xref py py-const docutils literal notranslate"><span class="pre">None</span></code></a>.</p>
|
|||
|
</div>
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.7.3: </span>Characters that affect netloc parsing under NFKC normalization will
|
|||
|
now raise <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a>.</p>
|
|||
|
</div>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="function">
|
|||
|
<dt id="urllib.parse.urlunsplit">
|
|||
|
<code class="descclassname">urllib.parse.</code><code class="descname">urlunsplit</code><span class="sig-paren">(</span><em>parts</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.urlunsplit" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Combine the elements of a tuple as returned by <a class="reference internal" href="#urllib.parse.urlsplit" title="urllib.parse.urlsplit"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlsplit()</span></code></a> into a
|
|||
|
complete URL as a string. The <em>parts</em> argument can be any five-item
|
|||
|
iterable. This may result in a slightly different, but equivalent URL, if the
|
|||
|
URL that was parsed originally had unnecessary delimiters (for example, a ?
|
|||
|
with an empty query; the RFC states that these are equivalent).</p>
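<p>The round trip described above can be sketched as follows. The URLs are placeholders; the second call only illustrates that a plain list is accepted as the five-item iterable.</p>

```python
from urllib.parse import urlsplit, urlunsplit

# A trailing '?' with an empty query is an unnecessary delimiter.
original = 'http://example.com/path?'
rebuilt = urlunsplit(urlsplit(original))

# Any five-item iterable works, not only a urlsplit() result.
from_list = urlunsplit(['http', 'example.com', '/path', 'x=1', 'frag'])
```

<p>Here <code class="docutils literal notranslate"><span class="pre">rebuilt</span></code> no longer ends with the bare <code class="docutils literal notranslate"><span class="pre">?</span></code>, so comparing it to the original string byte for byte would report a difference even though the two URLs are equivalent.</p>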
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="function">
|
|||
|
<dt id="urllib.parse.urljoin">
|
|||
|
<code class="descclassname">urllib.parse.</code><code class="descname">urljoin</code><span class="sig-paren">(</span><em>base</em>, <em>url</em>, <em>allow_fragments=True</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.urljoin" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Construct a full (“absolute”) URL by combining a “base URL” (<em>base</em>) with
|
|||
|
another URL (<em>url</em>). Informally, this uses components of the base URL, in
|
|||
|
particular the addressing scheme, the network location and (part of) the
|
|||
|
path, to provide missing components in the relative URL. For example:</p>
|
|||
|
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">urllib.parse</span> <span class="k">import</span> <span class="n">urljoin</span>
|
|||
|
<span class="gp">>>> </span><span class="n">urljoin</span><span class="p">(</span><span class="s1">'http://www.cwi.nl/</span><span class="si">%7E</span><span class="s1">guido/Python.html'</span><span class="p">,</span> <span class="s1">'FAQ.html'</span><span class="p">)</span>
|
|||
|
<span class="go">'http://www.cwi.nl/%7Eguido/FAQ.html'</span>
|
|||
|
</pre></div>
|
|||
|
</div>
|
|||
|
<p>The <em>allow_fragments</em> argument has the same meaning and default as for
|
|||
|
<a class="reference internal" href="#urllib.parse.urlparse" title="urllib.parse.urlparse"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlparse()</span></code></a>.</p>
|
|||
|
<div class="admonition note">
|
|||
|
<p class="admonition-title">Note</p>
|
|||
|
<p>If <em>url</em> is an absolute URL (that is, starting with <code class="docutils literal notranslate"><span class="pre">//</span></code> or <code class="docutils literal notranslate"><span class="pre">scheme://</span></code>),
|
|||
|
the <em>url</em>’s host name and/or scheme will be present in the result. For example:</p>
|
|||
|
</div>
|
|||
|
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">urljoin</span><span class="p">(</span><span class="s1">'http://www.cwi.nl/</span><span class="si">%7E</span><span class="s1">guido/Python.html'</span><span class="p">,</span>
|
|||
|
<span class="gp">... </span> <span class="s1">'//www.python.org/</span><span class="si">%7E</span><span class="s1">guido'</span><span class="p">)</span>
|
|||
|
<span class="go">'http://www.python.org/%7Eguido'</span>
|
|||
|
</pre></div>
|
|||
|
</div>
|
|||
|
<p>If you do not want that behavior, preprocess the <em>url</em> with <a class="reference internal" href="#urllib.parse.urlsplit" title="urllib.parse.urlsplit"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlsplit()</span></code></a> and
|
|||
|
<a class="reference internal" href="#urllib.parse.urlunsplit" title="urllib.parse.urlunsplit"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlunsplit()</span></code></a>, removing possible <em>scheme</em> and <em>netloc</em> parts.</p>
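<p>One possible way to do that preprocessing is sketched below, reusing the URLs from the example above. The variable names are illustrative, and the sketch does not try to cover every malformed input.</p>

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

base = 'http://www.cwi.nl/%7Eguido/Python.html'
untrusted = '//www.python.org/%7Eguido'

# Blank out scheme and netloc so only path, query and fragment remain.
parts = urlsplit(untrusted)
relative = urlunsplit(parts._replace(scheme='', netloc=''))

# The join now stays on the host of the base URL.
joined = urljoin(base, relative)
```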
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.5: </span>Behaviour updated to match the semantics defined in <span class="target" id="index-4"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc3986.html"><strong>RFC 3986</strong></a>.</p>
|
|||
|
</div>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="function">
|
|||
|
<dt id="urllib.parse.urldefrag">
|
|||
|
<code class="descclassname">urllib.parse.</code><code class="descname">urldefrag</code><span class="sig-paren">(</span><em>url</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.urldefrag" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>If <em>url</em> contains a fragment identifier, return a modified version of <em>url</em>
|
|||
|
with no fragment identifier, and the fragment identifier as a separate
|
|||
|
string. If there is no fragment identifier in <em>url</em>, return <em>url</em> unmodified
|
|||
|
and an empty string.</p>
|
|||
|
<p>The return value is a <a class="reference internal" href="../glossary.html#term-named-tuple"><span class="xref std std-term">named tuple</span></a>; its items can be accessed by index
|
|||
|
or as named attributes:</p>
|
|||
|
<table class="docutils align-center">
|
|||
|
<colgroup>
|
|||
|
<col style="width: 25%" />
|
|||
|
<col style="width: 10%" />
|
|||
|
<col style="width: 35%" />
|
|||
|
<col style="width: 31%" />
|
|||
|
</colgroup>
|
|||
|
<thead>
|
|||
|
<tr class="row-odd"><th class="head"><p>Attribute</p></th>
|
|||
|
<th class="head"><p>Index</p></th>
|
|||
|
<th class="head"><p>Value</p></th>
|
|||
|
<th class="head"><p>Value if not present</p></th>
|
|||
|
</tr>
|
|||
|
</thead>
|
|||
|
<tbody>
|
|||
|
<tr class="row-even"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">url</span></code></p></td>
|
|||
|
<td><p>0</p></td>
|
|||
|
<td><p>URL with no fragment</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
<tr class="row-odd"><td><p><code class="xref py py-attr docutils literal notranslate"><span class="pre">fragment</span></code></p></td>
|
|||
|
<td><p>1</p></td>
|
|||
|
<td><p>Fragment identifier</p></td>
|
|||
|
<td><p>empty string</p></td>
|
|||
|
</tr>
|
|||
|
</tbody>
|
|||
|
</table>
|
|||
|
<p>See section <a class="reference internal" href="#urlparse-result-object"><span class="std std-ref">Structured Parse Results</span></a> for more information on the result
|
|||
|
object.</p>
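<p>A short sketch of both cases, using placeholder URLs:</p>

```python
from urllib.parse import urldefrag

# With a fragment: the result separates the URL from the fragment.
with_frag = urldefrag('http://example.com/doc.html#section-2')
url, fragment = with_frag  # the result still unpacks like a 2-tuple

# Without a fragment: the URL comes back unmodified with an empty fragment.
no_frag = urldefrag('http://example.com/doc.html')
```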
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.2: </span>Result is a structured object rather than a simple 2-tuple.</p>
|
|||
|
</div>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
</div>
|
|||
|
<div class="section" id="parsing-ascii-encoded-bytes">
|
|||
|
<span id="id1"></span><h2>Parsing ASCII Encoded Bytes<a class="headerlink" href="#parsing-ascii-encoded-bytes" title="Permalink to this headline">¶</a></h2>
|
|||
|
<p>The URL parsing functions were originally designed to operate on character
|
|||
|
strings only. In practice, it is useful to be able to manipulate properly
|
|||
|
quoted and encoded URLs as sequences of ASCII bytes. Accordingly, the
|
|||
|
URL parsing functions in this module all operate on <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> and
|
|||
|
<a class="reference internal" href="stdtypes.html#bytearray" title="bytearray"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytearray</span></code></a> objects in addition to <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> objects.</p>
|
|||
|
<p>If <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> data is passed in, the result will also contain only
|
|||
|
<a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> data. If <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> or <a class="reference internal" href="stdtypes.html#bytearray" title="bytearray"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytearray</span></code></a> data is
|
|||
|
passed in, the result will contain only <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> data.</p>
|
|||
|
<p>Attempting to mix <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> data with <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> or
|
|||
|
<a class="reference internal" href="stdtypes.html#bytearray" title="bytearray"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytearray</span></code></a> in a single function call will result in a
|
|||
|
<a class="reference internal" href="exceptions.html#TypeError" title="TypeError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">TypeError</span></code></a> being raised, while attempting to pass in non-ASCII
|
|||
|
byte values will trigger <a class="reference internal" href="exceptions.html#UnicodeDecodeError" title="UnicodeDecodeError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">UnicodeDecodeError</span></code></a>.</p>
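<p>The type rules above can be illustrated with a small sketch. The URLs are placeholders, and the errors are captured in variables only so that the example runs to completion.</p>

```python
from urllib.parse import urljoin, urlsplit

# str in, str out; bytes or bytearray in, bytes out.
str_result = urlsplit('http://example.com/a?b=1')
bytes_result = urlsplit(b'http://example.com/a?b=1')
from_bytearray = urlsplit(bytearray(b'http://example.com/a'))

# Mixing str and bytes in one call is rejected.
try:
    urljoin('http://example.com/', b'page.html')
    mix_error = None
except TypeError as exc:
    mix_error = exc

# Non-ASCII byte values are rejected as well.
try:
    urlsplit(b'http://example.com/caf\xc3\xa9')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc
```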
|
|||
|
<p>To support easier conversion of result objects between <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> and
|
|||
|
<a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>, all return values from URL parsing functions provide
|
|||
|
either an <code class="xref py py-meth docutils literal notranslate"><span class="pre">encode()</span></code> method (when the result contains <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>
|
|||
|
data) or a <code class="xref py py-meth docutils literal notranslate"><span class="pre">decode()</span></code> method (when the result contains <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>
|
|||
|
data). The signatures of these methods match those of the corresponding
|
|||
|
<a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> and <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> methods (except that the default encoding
|
|||
|
is <code class="docutils literal notranslate"><span class="pre">'ascii'</span></code> rather than <code class="docutils literal notranslate"><span class="pre">'utf-8'</span></code>). Each produces a value of a
|
|||
|
corresponding type that contains either <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> data (for
|
|||
|
<code class="xref py py-meth docutils literal notranslate"><span class="pre">encode()</span></code> methods) or <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> data (for
|
|||
|
<code class="xref py py-meth docutils literal notranslate"><span class="pre">decode()</span></code> methods).</p>
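<p>As a sketch of these conversion methods, the following encodes a result, decodes it again, and then passes an explicit encoding for a path that is not pure ASCII. The URLs are placeholders.</p>

```python
from urllib.parse import SplitResult, SplitResultBytes, urlsplit

text_result = urlsplit('http://example.com/a?b=1')

# encode() defaults to 'ascii' and returns the bytes counterpart.
as_bytes = text_result.encode()

# decode() goes back to the str-based class.
round_trip = as_bytes.decode()

# A non-ASCII component needs an explicit encoding such as 'utf-8'.
non_ascii = urlsplit('http://example.com/caf\u00e9')
utf8_path = non_ascii.encode('utf-8').path
```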
|
|||
|
<p>Applications that need to operate on potentially improperly quoted URLs
|
|||
|
that may contain non-ASCII data will need to do their own decoding from
|
|||
|
bytes to characters before invoking the URL parsing methods.</p>
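<p>A minimal sketch of that approach follows. It assumes the incoming bytes are UTF-8, which is a choice made for this example; an application has to decide the encoding for itself.</p>

```python
from urllib.parse import urlsplit

# Raw bytes containing an unquoted non-ASCII character (assumed UTF-8).
raw = b'http://example.com/caf\xc3\xa9?q=1'

# Decode first, then parse the resulting str.
text = raw.decode('utf-8')
parts = urlsplit(text)
```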
|
|||
|
<p>The behaviour described in this section applies only to the URL parsing
|
|||
|
functions. The URL quoting functions use their own rules when producing
|
|||
|
or consuming byte sequences as detailed in the documentation of the
|
|||
|
individual URL quoting functions.</p>
|
|||
|
<div class="versionchanged">
|
|||
|
<p><span class="versionmodified changed">Changed in version 3.2: </span>URL parsing functions now accept ASCII-encoded byte sequences.</p>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<div class="section" id="structured-parse-results">
|
|||
|
<span id="urlparse-result-object"></span><h2>Structured Parse Results<a class="headerlink" href="#structured-parse-results" title="Permalink to this headline">¶</a></h2>
|
|||
|
<p>The result objects from the <a class="reference internal" href="#urllib.parse.urlparse" title="urllib.parse.urlparse"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlparse()</span></code></a>, <a class="reference internal" href="#urllib.parse.urlsplit" title="urllib.parse.urlsplit"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlsplit()</span></code></a> and
|
|||
|
<a class="reference internal" href="#urllib.parse.urldefrag" title="urllib.parse.urldefrag"><code class="xref py py-func docutils literal notranslate"><span class="pre">urldefrag()</span></code></a> functions are subclasses of the <a class="reference internal" href="stdtypes.html#tuple" title="tuple"><code class="xref py py-class docutils literal notranslate"><span class="pre">tuple</span></code></a> type.
|
|||
|
These subclasses add the attributes listed in the documentation for
|
|||
|
those functions, the encoding and decoding support described in the
|
|||
|
previous section, as well as an additional method:</p>
|
|||
|
<dl class="method">
|
|||
|
<dt id="urllib.parse.urllib.parse.SplitResult.geturl">
|
|||
|
<code class="descclassname">urllib.parse.SplitResult.</code><code class="descname">geturl</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.urllib.parse.SplitResult.geturl" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Return the re-combined version of the original URL as a string. This may
|
|||
|
differ from the original URL in that the scheme may be normalized to lower
|
|||
|
case and empty components may be dropped. Specifically, empty parameters,
|
|||
|
queries, and fragment identifiers will be removed.</p>
|
|||
|
<p>For <a class="reference internal" href="#urllib.parse.urldefrag" title="urllib.parse.urldefrag"><code class="xref py py-func docutils literal notranslate"><span class="pre">urldefrag()</span></code></a> results, only empty fragment identifiers will be removed.
|
|||
|
For <a class="reference internal" href="#urllib.parse.urlsplit" title="urllib.parse.urlsplit"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlsplit()</span></code></a> and <a class="reference internal" href="#urllib.parse.urlparse" title="urllib.parse.urlparse"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlparse()</span></code></a> results, all noted changes will be
|
|||
|
made to the URL returned by this method.</p>
|
|||
|
<p>The result of this method remains unchanged if passed back through the original
|
|||
|
parsing function:</p>
|
|||
|
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">urllib.parse</span> <span class="k">import</span> <span class="n">urlsplit</span>
|
|||
|
<span class="gp">>>> </span><span class="n">url</span> <span class="o">=</span> <span class="s1">'HTTP://www.Python.org/doc/#'</span>
|
|||
|
<span class="gp">>>> </span><span class="n">r1</span> <span class="o">=</span> <span class="n">urlsplit</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
|
|||
|
<span class="gp">>>> </span><span class="n">r1</span><span class="o">.</span><span class="n">geturl</span><span class="p">()</span>
|
|||
|
<span class="go">'http://www.Python.org/doc/'</span>
|
|||
|
<span class="gp">>>> </span><span class="n">r2</span> <span class="o">=</span> <span class="n">urlsplit</span><span class="p">(</span><span class="n">r1</span><span class="o">.</span><span class="n">geturl</span><span class="p">())</span>
|
|||
|
<span class="gp">>>> </span><span class="n">r2</span><span class="o">.</span><span class="n">geturl</span><span class="p">()</span>
|
|||
|
<span class="go">'http://www.Python.org/doc/'</span>
|
|||
|
</pre></div>
|
|||
|
</div>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<p>The following classes provide the implementations of the structured parse
|
|||
|
results when operating on <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> objects:</p>
|
|||
|
<dl class="class">
|
|||
|
<dt id="urllib.parse.DefragResult">
|
|||
|
<em class="property">class </em><code class="descclassname">urllib.parse.</code><code class="descname">DefragResult</code><span class="sig-paren">(</span><em>url</em>, <em>fragment</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.DefragResult" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Concrete class for <a class="reference internal" href="#urllib.parse.urldefrag" title="urllib.parse.urldefrag"><code class="xref py py-func docutils literal notranslate"><span class="pre">urldefrag()</span></code></a> results containing <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>
|
|||
|
data. The <code class="xref py py-meth docutils literal notranslate"><span class="pre">encode()</span></code> method returns a <a class="reference internal" href="#urllib.parse.DefragResultBytes" title="urllib.parse.DefragResultBytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">DefragResultBytes</span></code></a>
|
|||
|
instance.</p>
|
|||
|
<div class="versionadded">
|
|||
|
<p><span class="versionmodified added">New in version 3.2.</span></p>
|
|||
|
</div>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="class">
|
|||
|
<dt id="urllib.parse.ParseResult">
|
|||
|
<em class="property">class </em><code class="descclassname">urllib.parse.</code><code class="descname">ParseResult</code><span class="sig-paren">(</span><em>scheme</em>, <em>netloc</em>, <em>path</em>, <em>params</em>, <em>query</em>, <em>fragment</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.ParseResult" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Concrete class for <a class="reference internal" href="#urllib.parse.urlparse" title="urllib.parse.urlparse"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlparse()</span></code></a> results containing <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>
|
|||
|
data. The <code class="xref py py-meth docutils literal notranslate"><span class="pre">encode()</span></code> method returns a <a class="reference internal" href="#urllib.parse.ParseResultBytes" title="urllib.parse.ParseResultBytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">ParseResultBytes</span></code></a>
|
|||
|
instance.</p>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="class">
|
|||
|
<dt id="urllib.parse.SplitResult">
|
|||
|
<em class="property">class </em><code class="descclassname">urllib.parse.</code><code class="descname">SplitResult</code><span class="sig-paren">(</span><em>scheme</em>, <em>netloc</em>, <em>path</em>, <em>query</em>, <em>fragment</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.SplitResult" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Concrete class for <a class="reference internal" href="#urllib.parse.urlsplit" title="urllib.parse.urlsplit"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlsplit()</span></code></a> results containing <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>
|
|||
|
data. The <code class="xref py py-meth docutils literal notranslate"><span class="pre">encode()</span></code> method returns a <a class="reference internal" href="#urllib.parse.SplitResultBytes" title="urllib.parse.SplitResultBytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">SplitResultBytes</span></code></a>
|
|||
|
instance.</p>
|
|||
|
</dd></dl>
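<p>The relationship between the parsing functions and these classes can be sketched as follows. The URL is a placeholder; the last line builds a <code class="docutils literal notranslate"><span class="pre">SplitResult</span></code> directly from the fields shown in its signature.</p>

```python
from urllib.parse import ParseResult, SplitResult, urlparse, urlsplit

url = 'https://example.com/a;v=2?x=1'

# urlsplit() leaves ';v=2' in the path; urlparse() splits it into params.
split = urlsplit(url)
parsed = urlparse(url)

# The classes can also be instantiated directly, field by field.
built = SplitResult('https', 'example.com', '/a', 'x=1', '')
built_url = built.geturl()
```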
|
|||
|
|
|||
|
<p>The following classes provide the implementations of the parse results when
|
|||
|
operating on <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> or <a class="reference internal" href="stdtypes.html#bytearray" title="bytearray"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytearray</span></code></a> objects:</p>
|
|||
|
<dl class="class">
|
|||
|
<dt id="urllib.parse.DefragResultBytes">
|
|||
|
<em class="property">class </em><code class="descclassname">urllib.parse.</code><code class="descname">DefragResultBytes</code><span class="sig-paren">(</span><em>url</em>, <em>fragment</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.DefragResultBytes" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Concrete class for <a class="reference internal" href="#urllib.parse.urldefrag" title="urllib.parse.urldefrag"><code class="xref py py-func docutils literal notranslate"><span class="pre">urldefrag()</span></code></a> results containing <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>
|
|||
|
data. The <code class="xref py py-meth docutils literal notranslate"><span class="pre">decode()</span></code> method returns a <a class="reference internal" href="#urllib.parse.DefragResult" title="urllib.parse.DefragResult"><code class="xref py py-class docutils literal notranslate"><span class="pre">DefragResult</span></code></a>
|
|||
|
instance.</p>
|
|||
|
<div class="versionadded">
|
|||
|
<p><span class="versionmodified added">New in version 3.2.</span></p>
|
|||
|
</div>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="class">
|
|||
|
<dt id="urllib.parse.ParseResultBytes">
|
|||
|
<em class="property">class </em><code class="descclassname">urllib.parse.</code><code class="descname">ParseResultBytes</code><span class="sig-paren">(</span><em>scheme</em>, <em>netloc</em>, <em>path</em>, <em>params</em>, <em>query</em>, <em>fragment</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.ParseResultBytes" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Concrete class for <a class="reference internal" href="#urllib.parse.urlparse" title="urllib.parse.urlparse"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlparse()</span></code></a> results containing <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>
|
|||
|
data. The <code class="xref py py-meth docutils literal notranslate"><span class="pre">decode()</span></code> method returns a <a class="reference internal" href="#urllib.parse.ParseResult" title="urllib.parse.ParseResult"><code class="xref py py-class docutils literal notranslate"><span class="pre">ParseResult</span></code></a>
|
|||
|
instance.</p>
|
|||
|
<div class="versionadded">
|
|||
|
<p><span class="versionmodified added">New in version 3.2.</span></p>
|
|||
|
</div>
|
|||
|
</dd></dl>
|
|||
|
|
|||
|
<dl class="class">
|
|||
|
<dt id="urllib.parse.SplitResultBytes">
|
|||
|
<em class="property">class </em><code class="descclassname">urllib.parse.</code><code class="descname">SplitResultBytes</code><span class="sig-paren">(</span><em>scheme</em>, <em>netloc</em>, <em>path</em>, <em>query</em>, <em>fragment</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.SplitResultBytes" title="Permalink to this definition">¶</a></dt>
|
|||
|
<dd><p>Concrete class for <a class="reference internal" href="#urllib.parse.urlsplit" title="urllib.parse.urlsplit"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlsplit()</span></code></a> results containing <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>
|
|||
|
data. The <code class="xref py py-meth docutils literal notranslate"><span class="pre">decode()</span></code> method returns a <a class="reference internal" href="#urllib.parse.SplitResult" title="urllib.parse.SplitResult"><code class="xref py py-class docutils literal notranslate"><span class="pre">SplitResult</span></code></a>
|
|||
|
instance.</p>
<div class="versionadded">
<p><span class="versionmodified added">New in version 3.2.</span></p>
</div>
</dd></dl>

</div>
<div class="section" id="url-quoting">
<h2>URL Quoting<a class="headerlink" href="#url-quoting" title="Permalink to this headline">¶</a></h2>
<p>The URL quoting functions focus on taking program data and making it safe
for use as URL components by quoting special characters and appropriately
encoding non-ASCII text. They also support reversing these operations to
recreate the original data from the contents of a URL component if that
task isn’t already covered by the URL parsing functions above.</p>
<dl class="function">
<dt id="urllib.parse.quote">
<code class="descclassname">urllib.parse.</code><code class="descname">quote</code><span class="sig-paren">(</span><em>string</em>, <em>safe='/'</em>, <em>encoding=None</em>, <em>errors=None</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.quote" title="Permalink to this definition">¶</a></dt>
<dd><p>Replace special characters in <em>string</em> using the <code class="docutils literal notranslate"><span class="pre">%xx</span></code> escape. Letters,
digits, and the characters <code class="docutils literal notranslate"><span class="pre">'_.-~'</span></code> are never quoted. By default, this
function is intended for quoting the path section of a URL. The optional <em>safe</em>
parameter specifies additional ASCII characters that should not be quoted;
its default value is <code class="docutils literal notranslate"><span class="pre">'/'</span></code>.</p>
<p><em>string</em> may be either a <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> or a <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>.</p>
<div class="versionchanged">
<p><span class="versionmodified changed">Changed in version 3.7: </span>Moved from <span class="target" id="index-5"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc2396.html"><strong>RFC 2396</strong></a> to <span class="target" id="index-6"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc3986.html"><strong>RFC 3986</strong></a> for quoting URL strings. “~” is now
included in the set of unreserved characters.</p>
</div>
<p>The optional <em>encoding</em> and <em>errors</em> parameters specify how to deal with
non-ASCII characters, as accepted by the <a class="reference internal" href="stdtypes.html#str.encode" title="str.encode"><code class="xref py py-meth docutils literal notranslate"><span class="pre">str.encode()</span></code></a> method.
<em>encoding</em> defaults to <code class="docutils literal notranslate"><span class="pre">'utf-8'</span></code>.
<em>errors</em> defaults to <code class="docutils literal notranslate"><span class="pre">'strict'</span></code>, meaning unsupported characters raise a
<a class="reference internal" href="exceptions.html#UnicodeEncodeError" title="UnicodeEncodeError"><code class="xref py py-class docutils literal notranslate"><span class="pre">UnicodeEncodeError</span></code></a>.
<em>encoding</em> and <em>errors</em> must not be supplied if <em>string</em> is a
<a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> object; otherwise a <a class="reference internal" href="exceptions.html#TypeError" title="TypeError"><code class="xref py py-class docutils literal notranslate"><span class="pre">TypeError</span></code></a> is raised.</p>
<p>Note that <code class="docutils literal notranslate"><span class="pre">quote(string,</span> <span class="pre">safe,</span> <span class="pre">encoding,</span> <span class="pre">errors)</span></code> is equivalent to
<code class="docutils literal notranslate"><span class="pre">quote_from_bytes(string.encode(encoding,</span> <span class="pre">errors),</span> <span class="pre">safe)</span></code>.</p>
<p>Example: <code class="docutils literal notranslate"><span class="pre">quote('/El</span> <span class="pre">Niño/')</span></code> yields <code class="docutils literal notranslate"><span class="pre">'/El%20Ni%C3%B1o/'</span></code>.</p>
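<p>The effect of the <em>safe</em> and <em>encoding</em> parameters can be sketched as follows. The input strings are arbitrary illustrations, and the variable names are not part of the module.</p>

```python
from urllib.parse import quote

# Default safe='/': slashes survive, spaces and non-ASCII text are escaped as UTF-8.
default_quoted = quote("/El Niño/")

# An empty safe string also escapes '/', which suits a single path segment.
segment = quote("a/b c", safe="")

# A different encoding changes which bytes are escaped.
latin1_quoted = quote("Niño", encoding="latin-1")

# bytes input is quoted as-is; combining it with encoding raises TypeError.
bytes_quoted = quote(b"a&\xef")
try:
    quote(b"a&\xef", encoding="utf-8")
    bytes_with_encoding_rejected = False
except TypeError:
    bytes_with_encoding_rejected = True

print(default_quoted)  # /El%20Ni%C3%B1o/
print(segment)         # a%2Fb%20c
print(latin1_quoted)   # Ni%F1o
print(bytes_quoted)    # a%26%EF
```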
</dd></dl>

<dl class="function">
<dt id="urllib.parse.quote_plus">
<code class="descclassname">urllib.parse.</code><code class="descname">quote_plus</code><span class="sig-paren">(</span><em>string</em>, <em>safe=''</em>, <em>encoding=None</em>, <em>errors=None</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.quote_plus" title="Permalink to this definition">¶</a></dt>
<dd><p>Like <a class="reference internal" href="#urllib.parse.quote" title="urllib.parse.quote"><code class="xref py py-func docutils literal notranslate"><span class="pre">quote()</span></code></a>, but also replace spaces with plus signs, as required for
quoting HTML form values when building up a query string to go into a URL.
Plus signs in the original string are escaped unless they are included in
<em>safe</em>. Also, <em>safe</em> does not default to <code class="docutils literal notranslate"><span class="pre">'/'</span></code>; it defaults to the empty string.</p>
<p>Example: <code class="docutils literal notranslate"><span class="pre">quote_plus('/El</span> <span class="pre">Niño/')</span></code> yields <code class="docutils literal notranslate"><span class="pre">'%2FEl+Ni%C3%B1o%2F'</span></code>.</p>
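<p>To illustrate the difference from <code class="docutils literal notranslate"><span class="pre">quote()</span></code>, the sketch below quotes one value that contains a space, a literal plus sign, and a slash. The sample value is made up for the example.</p>

```python
from urllib.parse import quote, quote_plus

value = "a+b c/d"

# Space becomes '+', while the literal '+' and '/' are percent-encoded.
plus_quoted = quote_plus(value)

# quote() uses %20 for the space and keeps '/' because its safe default is '/'.
path_quoted = quote(value)

# Adding '+' to safe leaves literal plus signs alone, so they can no longer
# be told apart from encoded spaces when the value is decoded.
ambiguous = quote_plus(value, safe="+")

print(plus_quoted)  # a%2Bb+c%2Fd
print(path_quoted)  # a%2Bb%20c/d
print(ambiguous)    # a+b+c%2Fd
```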
</dd></dl>

<dl class="function">
<dt id="urllib.parse.quote_from_bytes">
<code class="descclassname">urllib.parse.</code><code class="descname">quote_from_bytes</code><span class="sig-paren">(</span><em>bytes</em>, <em>safe='/'</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.quote_from_bytes" title="Permalink to this definition">¶</a></dt>
<dd><p>Like <a class="reference internal" href="#urllib.parse.quote" title="urllib.parse.quote"><code class="xref py py-func docutils literal notranslate"><span class="pre">quote()</span></code></a>, but accepts a <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> object rather than a
<a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>, and does not perform string-to-bytes encoding.</p>
<p>Example: <code class="docutils literal notranslate"><span class="pre">quote_from_bytes(b'a&\xef')</span></code> yields
<code class="docutils literal notranslate"><span class="pre">'a%26%EF'</span></code>.</p>
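<p>A short sketch of how this function relates to <code class="docutils literal notranslate"><span class="pre">quote()</span></code>: encoding a string by hand and quoting the bytes should give the same result as quoting the string directly. The sample inputs are illustrative.</p>

```python
from urllib.parse import quote, quote_from_bytes

text = "/El Niño/"

# Encode explicitly, then quote the resulting bytes.
via_bytes = quote_from_bytes(text.encode("utf-8", "strict"), safe="/")

# Let quote() perform the same encoding step internally.
via_str = quote(text, safe="/", encoding="utf-8", errors="strict")

# Binary data that is not valid text in any encoding can also be quoted.
binary_quoted = quote_from_bytes(b"\x00\xff/data")

print(via_bytes == via_str)  # True
print(binary_quoted)         # %00%FF/data
```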
</dd></dl>

<dl class="function">
<dt id="urllib.parse.unquote">
<code class="descclassname">urllib.parse.</code><code class="descname">unquote</code><span class="sig-paren">(</span><em>string</em>, <em>encoding='utf-8'</em>, <em>errors='replace'</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.unquote" title="Permalink to this definition">¶</a></dt>
<dd><p>Replace <code class="docutils literal notranslate"><span class="pre">%xx</span></code> escapes with their single-character equivalent.
The optional <em>encoding</em> and <em>errors</em> parameters specify how to decode
percent-encoded sequences into Unicode characters, as accepted by the
<a class="reference internal" href="stdtypes.html#bytes.decode" title="bytes.decode"><code class="xref py py-meth docutils literal notranslate"><span class="pre">bytes.decode()</span></code></a> method.</p>
<p><em>string</em> must be a <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>.</p>
<p><em>encoding</em> defaults to <code class="docutils literal notranslate"><span class="pre">'utf-8'</span></code>.
<em>errors</em> defaults to <code class="docutils literal notranslate"><span class="pre">'replace'</span></code>, meaning invalid sequences are replaced
by a placeholder character.</p>
<p>Example: <code class="docutils literal notranslate"><span class="pre">unquote('/El%20Ni%C3%B1o/')</span></code> yields <code class="docutils literal notranslate"><span class="pre">'/El</span> <span class="pre">Niño/'</span></code>.</p>
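<p>The following sketch shows how <em>encoding</em> and <em>errors</em> affect decoding. The escape <code class="docutils literal notranslate"><span class="pre">%F1</span></code> is a valid Latin-1 character but an invalid UTF-8 sequence on its own, which makes it a convenient test input.</p>

```python
from urllib.parse import unquote

# Valid UTF-8 escapes decode to the original text.
utf8_text = unquote("/El%20Ni%C3%B1o/")

# The same kind of text quoted as Latin-1 needs the matching encoding.
latin1_text = unquote("Ni%F1o", encoding="latin-1")

# With the defaults, the invalid byte becomes U+FFFD (the replacement character).
replaced = unquote("Ni%F1o")

# errors='strict' raises instead of substituting a placeholder.
try:
    unquote("Ni%F1o", errors="strict")
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True

# Plus signs are left untouched by unquote().
plus_kept = unquote("a+b%20c")
print(plus_kept)  # a+b c
```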
</dd></dl>

<dl class="function">
<dt id="urllib.parse.unquote_plus">
<code class="descclassname">urllib.parse.</code><code class="descname">unquote_plus</code><span class="sig-paren">(</span><em>string</em>, <em>encoding='utf-8'</em>, <em>errors='replace'</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.unquote_plus" title="Permalink to this definition">¶</a></dt>
<dd><p>Like <a class="reference internal" href="#urllib.parse.unquote" title="urllib.parse.unquote"><code class="xref py py-func docutils literal notranslate"><span class="pre">unquote()</span></code></a>, but also replace plus signs with spaces, as required for
unquoting HTML form values.</p>
<p><em>string</em> must be a <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>.</p>
<p>Example: <code class="docutils literal notranslate"><span class="pre">unquote_plus('/El+Ni%C3%B1o/')</span></code> yields <code class="docutils literal notranslate"><span class="pre">'/El</span> <span class="pre">Niño/'</span></code>.</p>
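<p>As a small illustration, the sketch below decodes one form-style value with both functions and checks that <code class="docutils literal notranslate"><span class="pre">quote_plus()</span></code> output round-trips. The sample strings are invented for the example.</p>

```python
from urllib.parse import quote_plus, unquote, unquote_plus

encoded = "El+Ni%C3%B1o%2B1"

# '+' becomes a space; the escaped %2B becomes a literal plus sign.
form_value = unquote_plus(encoded)

# unquote() leaves the unescaped '+' as it is.
plain_value = unquote(encoded)

# Values produced by quote_plus() decode back to the original text.
original = "1 + 1 = 2"
round_trip = unquote_plus(quote_plus(original))
print(round_trip == original)  # True
```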
</dd></dl>

<dl class="function">
<dt id="urllib.parse.unquote_to_bytes">
<code class="descclassname">urllib.parse.</code><code class="descname">unquote_to_bytes</code><span class="sig-paren">(</span><em>string</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.unquote_to_bytes" title="Permalink to this definition">¶</a></dt>
<dd><p>Replace <code class="docutils literal notranslate"><span class="pre">%xx</span></code> escapes with their single-octet equivalent, and return a
<a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> object.</p>
<p><em>string</em> may be either a <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> or a <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>.</p>
<p>If it is a <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>, unescaped non-ASCII characters in <em>string</em>
are encoded into UTF-8 bytes.</p>
<p>Example: <code class="docutils literal notranslate"><span class="pre">unquote_to_bytes('a%26%EF')</span></code> yields <code class="docutils literal notranslate"><span class="pre">b'a&\xef'</span></code>.</p>
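<p>A minimal sketch of the cases described above: str input, bytes input, and a str containing an unescaped non-ASCII character. Decoding the returned bytes afterwards is a choice made here for illustration, not something the function does itself.</p>

```python
from urllib.parse import unquote_to_bytes

# str and bytes input give the same raw octets.
from_str = unquote_to_bytes("a%26%EF")
from_bytes = unquote_to_bytes(b"a%26%EF")

# An unescaped non-ASCII character in a str is encoded as UTF-8.
mixed = unquote_to_bytes("ñ%20")

# The caller decides how, or whether, to decode the result.
decoded = unquote_to_bytes("Ni%F1o").decode("latin-1")

print(from_str)  # b'a&\xef'
print(mixed)     # b'\xc3\xb1 '
```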
</dd></dl>

<dl class="function">
<dt id="urllib.parse.urlencode">
<code class="descclassname">urllib.parse.</code><code class="descname">urlencode</code><span class="sig-paren">(</span><em>query</em>, <em>doseq=False</em>, <em>safe=''</em>, <em>encoding=None</em>, <em>errors=None</em>, <em>quote_via=quote_plus</em><span class="sig-paren">)</span><a class="headerlink" href="#urllib.parse.urlencode" title="Permalink to this definition">¶</a></dt>
<dd><p>Convert a mapping object or a sequence of two-element tuples, which may
contain <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> or <a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> objects, to a percent-encoded ASCII
text string. If the resultant string is to be used as the <em>data</em> for a POST
operation with the <a class="reference internal" href="urllib.request.html#urllib.request.urlopen" title="urllib.request.urlopen"><code class="xref py py-func docutils literal notranslate"><span class="pre">urlopen()</span></code></a> function, then
it should be encoded to bytes; otherwise it would result in a
<a class="reference internal" href="exceptions.html#TypeError" title="TypeError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">TypeError</span></code></a>.</p>
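<p>Preparing POST data might look like the sketch below. The URL and field names are placeholders, and the request object is only constructed, never sent.</p>

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"name": "El Niño", "year": 1997}

# urlencode() returns str; the result is pure ASCII, so encode it to bytes.
body = urlencode(params).encode("ascii")

# Hypothetical endpoint, used only to build a request object offline.
req = Request("http://www.example.com/submit", data=body)

print(body)              # b'name=El+Ni%C3%B1o&year=1997'
print(req.get_method())  # POST
```

<p>Passing the un-encoded <code class="docutils literal notranslate"><span class="pre">str</span></code> as <em>data</em> instead is what leads to the <code class="docutils literal notranslate"><span class="pre">TypeError</span></code> mentioned above once the request is opened.</p>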
<p>The resulting string is a series of <code class="docutils literal notranslate"><span class="pre">key=value</span></code> pairs separated by <code class="docutils literal notranslate"><span class="pre">'&'</span></code>
characters, where both <em>key</em> and <em>value</em> are quoted using the <em>quote_via</em>
function. By default, <a class="reference internal" href="#urllib.parse.quote_plus" title="urllib.parse.quote_plus"><code class="xref py py-func docutils literal notranslate"><span class="pre">quote_plus()</span></code></a> is used to quote the values, which
means spaces are quoted as a <code class="docutils literal notranslate"><span class="pre">'+'</span></code> character and ‘/’ characters are
encoded as <code class="docutils literal notranslate"><span class="pre">%2F</span></code>, which follows the standard for GET requests
(<code class="docutils literal notranslate"><span class="pre">application/x-www-form-urlencoded</span></code>). An alternative function that can be
passed as <em>quote_via</em> is <a class="reference internal" href="#urllib.parse.quote" title="urllib.parse.quote"><code class="xref py py-func docutils literal notranslate"><span class="pre">quote()</span></code></a>, which will encode spaces as <code class="docutils literal notranslate"><span class="pre">%20</span></code>.
Because <em>safe</em> is passed through to <em>quote_via</em> and defaults to <code class="docutils literal notranslate"><span class="pre">''</span></code>,
‘/’ characters are left unencoded only when <code class="docutils literal notranslate"><span class="pre">safe='/'</span></code> is also given. For maximum control of what is quoted, use
<code class="docutils literal notranslate"><span class="pre">quote</span></code> and specify a value for <em>safe</em>.</p>
<p>When a sequence of two-element tuples is used as the <em>query</em>
argument, the first element of each tuple is a key and the second is a
value. The value element itself can be a sequence; in that case, if
the optional parameter <em>doseq</em> evaluates to <code class="docutils literal notranslate"><span class="pre">True</span></code>, individual
<code class="docutils literal notranslate"><span class="pre">key=value</span></code> pairs separated by <code class="docutils literal notranslate"><span class="pre">'&'</span></code> are generated for each element of
the value sequence for the key. The order of parameters in the encoded
string will match the order of parameter tuples in the sequence.</p>
<p>The <em>safe</em>, <em>encoding</em>, and <em>errors</em> parameters are passed down to
<em>quote_via</em> (the <em>encoding</em> and <em>errors</em> parameters are only passed
when a query element is a <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>).</p>
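<p>The interaction of <em>doseq</em>, <em>quote_via</em>, and <em>safe</em> can be sketched as follows. The query data is invented for the example.</p>

```python
from urllib.parse import quote, urlencode

query = [("tag", ["a b", "c/d"]), ("page", 2)]

# doseq=True expands the list into one key=value pair per element.
expanded = urlencode(query, doseq=True)

# quote_via=quote uses %20 for spaces; safe="/" is forwarded to quote().
with_quote = urlencode(query, doseq=True, quote_via=quote, safe="/")

# Without safe="/", urlencode() forwards its own default safe="" and "/" is escaped.
with_quote_default_safe = urlencode(query, doseq=True, quote_via=quote)

print(expanded)                 # tag=a+b&tag=c%2Fd&page=2
print(with_quote)               # tag=a%20b&tag=c/d&page=2
print(with_quote_default_safe)  # tag=a%20b&tag=c%2Fd&page=2
```

<p>Without <em>doseq</em>, a list value is converted with <code class="docutils literal notranslate"><span class="pre">str()</span></code> and quoted as a single value, which is rarely what is wanted.</p>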
<p>To reverse this encoding process, <a class="reference internal" href="#urllib.parse.parse_qs" title="urllib.parse.parse_qs"><code class="xref py py-func docutils literal notranslate"><span class="pre">parse_qs()</span></code></a> and <a class="reference internal" href="#urllib.parse.parse_qsl" title="urllib.parse.parse_qsl"><code class="xref py py-func docutils literal notranslate"><span class="pre">parse_qsl()</span></code></a> are
provided in this module to parse query strings into Python data structures.</p>
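<p>A round trip through encoding and parsing might look like this sketch. The pairs are sample data only.</p>

```python
from urllib.parse import parse_qs, parse_qsl, urlencode

pairs = [("name", "El Niño"), ("tag", "a b"), ("tag", "c/d")]
encoded = urlencode(pairs)

# parse_qsl() returns the pairs in their original order.
as_list = parse_qsl(encoded)

# parse_qs() groups repeated keys into lists.
as_dict = parse_qs(encoded)

# Re-encoding the dict form needs doseq=True to expand the lists again.
re_encoded = urlencode(as_dict, doseq=True)

print(encoded)                # name=El+Ni%C3%B1o&tag=a+b&tag=c%2Fd
print(re_encoded == encoded)  # True
```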
<p>Refer to <a class="reference internal" href="urllib.request.html#urllib-examples"><span class="std std-ref">urllib examples</span></a> to find out how the <code class="docutils literal notranslate"><span class="pre">urlencode()</span></code>
function can be used to generate the query string for a URL or the data for a POST request.</p>
<div class="versionchanged">
<p><span class="versionmodified changed">Changed in version 3.2: </span>The <em>query</em> parameter supports bytes and string objects.</p>
</div>
<div class="versionadded">
<p><span class="versionmodified added">New in version 3.5: </span><em>quote_via</em> parameter.</p>
</div>
</dd></dl>

<div class="admonition seealso">
<p class="admonition-title">See also</p>
<dl class="simple">
<dt><span class="target" id="index-7"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc3986.html"><strong>RFC 3986</strong></a> - Uniform Resource Identifiers</dt><dd><p>This is the current standard (STD66). Any changes to the urllib.parse module
should conform to this. Certain deviations may be observed, which are
mostly for backward compatibility purposes and for certain de-facto
parsing requirements as commonly observed in major browsers.</p>
</dd>
<dt><span class="target" id="index-8"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc2732.html"><strong>RFC 2732</strong></a> - Format for Literal IPv6 Addresses in URL’s.</dt><dd><p>This specifies the parsing requirements of IPv6 URLs.</p>
</dd>
<dt><span class="target" id="index-9"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc2396.html"><strong>RFC 2396</strong></a> - Uniform Resource Identifiers (URI): Generic Syntax</dt><dd><p>Document describing the generic syntactic requirements for both Uniform Resource
Names (URNs) and Uniform Resource Locators (URLs).</p>
</dd>
<dt><span class="target" id="index-10"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc2368.html"><strong>RFC 2368</strong></a> - The mailto URL scheme.</dt><dd><p>Parsing requirements for mailto URL schemes.</p>
</dd>
<dt><span class="target" id="index-11"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc1808.html"><strong>RFC 1808</strong></a> - Relative Uniform Resource Locators</dt><dd><p>This Request For Comments includes the rules for joining an absolute and a
relative URL, including a fair number of “Abnormal Examples” which govern the
treatment of border cases.</p>
</dd>
<dt><span class="target" id="index-12"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc1738.html"><strong>RFC 1738</strong></a> - Uniform Resource Locators (URL)</dt><dd><p>This specifies the formal syntax and semantics of absolute URLs.</p>
</dd>
</dl>
</div>
</div>
</div>

</div>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer">
© <a href="../copyright.html">Copyright</a> 2001-2019, Python Software Foundation.
<br />
The Python Software Foundation is a non-profit corporation.
<a href="https://www.python.org/psf/donations/">Please donate.</a>
<br />
Last updated on Jul 13, 2019.
<a href="../bugs.html">Found a bug</a>?
<br />
Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 2.0.1.
</div>

</body>
</html>