<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<title>email.charset: Representing character sets &#8212; Python 3.7.4 documentation</title>
<link rel="stylesheet" href="../_static/pydoctheme.css" type="text/css" />
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<script type="text/javascript" id="documentation_options" data-url_root="../" src="../_static/documentation_options.js"></script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<script type="text/javascript" src="../_static/language_data.js"></script>
<script type="text/javascript" src="../_static/sidebar.js"></script>
<link rel="search" type="application/opensearchdescription+xml"
title="Search within Python 3.7.4 documentation"
href="../_static/opensearch.xml"/>
<link rel="author" title="About these documents" href="../about.html" />
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="copyright" title="Copyright" href="../copyright.html" />
<link rel="next" title="email.encoders: Encoders" href="email.encoders.html" />
<link rel="prev" title="email.header: Internationalized headers" href="email.header.html" />
<link rel="shortcut icon" type="image/png" href="../_static/py.png" />
<link rel="canonical" href="https://docs.python.org/3/library/email.charset.html" />
<script type="text/javascript" src="../_static/copybutton.js"></script>
<script type="text/javascript" src="../_static/switchers.js"></script>
<style>
@media only screen {
table.full-width-table {
width: 100%;
}
}
</style>
</head><body>
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="../genindex.html" title="General Index"
accesskey="I">index</a></li>
<li class="right" >
<a href="../py-modindex.html" title="Python Module Index"
>modules</a> |</li>
<li class="right" >
<a href="email.encoders.html" title="email.encoders: Encoders"
accesskey="N">next</a> |</li>
<li class="right" >
<a href="email.header.html" title="email.header: Internationalized headers"
accesskey="P">previous</a> |</li>
<li><img src="../_static/py.png" alt=""
style="vertical-align: middle; margin-top: -1px"/></li>
<li><a href="https://www.python.org/">Python</a> &#187;</li>
<li>
<span class="language_switcher_placeholder">en</span>
<span class="version_switcher_placeholder">3.7.4</span>
<a href="../index.html">Documentation </a> &#187;
</li>
<li class="nav-item nav-item-1"><a href="index.html" >The Python Standard Library</a> &#187;</li>
<li class="nav-item nav-item-2"><a href="netdata.html" >Internet Data Handling</a> &#187;</li>
<li class="nav-item nav-item-3"><a href="email.html" accesskey="U"><code class="xref py py-mod docutils literal notranslate"><span class="pre">email</span></code> — An email and MIME handling package</a> &#187;</li>
<li class="right">
<div class="inline-search" style="display: none" role="search">
<form class="inline-search" action="../search.html" method="get">
<input placeholder="Quick search" type="text" name="q" />
<input type="submit" value="Go" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
<script type="text/javascript">$('.inline-search').show(0);</script>
|
</li>
</ul>
</div>
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<div class="section" id="module-email.charset">
<span id="email-charset-representing-character-sets"></span><h1><a class="reference internal" href="#module-email.charset" title="email.charset: Character Sets"><code class="xref py py-mod docutils literal notranslate"><span class="pre">email.charset</span></code></a>: Representing character sets<a class="headerlink" href="#module-email.charset" title="Permalink to this headline"></a></h1>
<p><strong>Source code:</strong> <a class="reference external" href="https://github.com/python/cpython/tree/3.7/Lib/email/charset.py">Lib/email/charset.py</a></p>
<hr class="docutils" />
<p>This module is part of the legacy (<code class="docutils literal notranslate"><span class="pre">Compat32</span></code>) email API. In the new
API only the aliases table is used.</p>
<p>The remaining text in this section is the original documentation of the module.</p>
<p>This module provides a class <a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code class="xref py py-class docutils literal notranslate"><span class="pre">Charset</span></code></a> for representing character sets
and character set conversions in email messages, as well as a character set
registry and several convenience functions for manipulating this registry.
Instances of <a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code class="xref py py-class docutils literal notranslate"><span class="pre">Charset</span></code></a> are used in several other modules within the
<a class="reference internal" href="email.html#module-email" title="email: Package supporting the parsing, manipulating, and generating email messages."><code class="xref py py-mod docutils literal notranslate"><span class="pre">email</span></code></a> package.</p>
<p>Import this class from the <a class="reference internal" href="#module-email.charset" title="email.charset: Character Sets"><code class="xref py py-mod docutils literal notranslate"><span class="pre">email.charset</span></code></a> module.</p>
<dl class="class">
<dt id="email.charset.Charset">
<em class="property">class </em><code class="descclassname">email.charset.</code><code class="descname">Charset</code><span class="sig-paren">(</span><em>input_charset=DEFAULT_CHARSET</em><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.Charset" title="Permalink to this definition"></a></dt>
<dd><p>Map character sets to their email properties.</p>
<p>This class provides information about the requirements imposed on email for a
specific character set. It also provides convenience routines for converting
between character sets, given the availability of the applicable codecs. Given
a character set, it will do its best to provide information on how to use that
character set in an email message in an RFC-compliant way.</p>
<p>Certain character sets must be encoded with quoted-printable or base64 when used
in email headers or bodies. Certain character sets must be converted outright,
and are not allowed in email.</p>
<p>Optional <em>input_charset</em> is as described below; it is always coerced to lower
case. After being alias normalized it is also used as a lookup into the
registry of character sets to find out the header encoding, body encoding, and
output conversion codec to be used for the character set. For example, if
<em>input_charset</em> is <code class="docutils literal notranslate"><span class="pre">iso-8859-1</span></code>, then headers and bodies will be encoded using
quoted-printable and no output conversion codec is necessary. If
<em>input_charset</em> is <code class="docutils literal notranslate"><span class="pre">euc-jp</span></code>, then headers will be encoded with base64, bodies
will not be encoded, but output text will be converted from the <code class="docutils literal notranslate"><span class="pre">euc-jp</span></code>
character set to the <code class="docutils literal notranslate"><span class="pre">iso-2022-jp</span></code> character set.</p>
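<p>The two lookups described above can be checked directly. This is a minimal sketch; it assumes the encoding constants are read as module-level names (<code class="docutils literal notranslate"><span class="pre">email.charset.QP</span></code> and <code class="docutils literal notranslate"><span class="pre">email.charset.BASE64</span></code>), and the variable names are illustrative only.</p>

```python
import email.charset
from email.charset import Charset

latin1 = Charset("iso-8859-1")
# Headers and bodies are both marked for quoted-printable.
print(latin1.header_encoding == email.charset.QP)
print(latin1.body_encoding == email.charset.QP)

eucjp = Charset("euc-jp")
# Headers use base64, bodies are not encoded, output is converted.
print(eucjp.header_encoding == email.charset.BASE64)
print(eucjp.body_encoding is None)
print(eucjp.output_charset)
```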
<p><a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code class="xref py py-class docutils literal notranslate"><span class="pre">Charset</span></code></a> instances have the following data attributes:</p>
<dl class="attribute">
<dt id="email.charset.Charset.input_charset">
<code class="descname">input_charset</code><a class="headerlink" href="#email.charset.Charset.input_charset" title="Permalink to this definition"></a></dt>
<dd><p>The initial character set specified. Common aliases are converted to
their <em>official</em> email names (e.g. <code class="docutils literal notranslate"><span class="pre">latin_1</span></code> is converted to
<code class="docutils literal notranslate"><span class="pre">iso-8859-1</span></code>). Defaults to 7-bit <code class="docutils literal notranslate"><span class="pre">us-ascii</span></code>.</p>
</dd></dl>
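<p>A short sketch of the normalization applied to <em>input_charset</em>: aliases are mapped to their email names, names are lower-cased, and the default applies when no argument is given. The variable names are illustrative.</p>

```python
from email.charset import Charset

from_alias = Charset("latin_1").input_charset   # alias mapped to its email name
lowered = Charset("UTF-8").input_charset        # coerced to lower case
default = Charset().input_charset               # default character set

print(from_alias, lowered, default)
```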
<dl class="attribute">
<dt id="email.charset.Charset.header_encoding">
<code class="descname">header_encoding</code><a class="headerlink" href="#email.charset.Charset.header_encoding" title="Permalink to this definition"></a></dt>
<dd><p>If the character set must be encoded before it can be used in an email
header, this attribute will be set to <code class="docutils literal notranslate"><span class="pre">Charset.QP</span></code> (for
quoted-printable), <code class="docutils literal notranslate"><span class="pre">Charset.BASE64</span></code> (for base64 encoding), or
<code class="docutils literal notranslate"><span class="pre">Charset.SHORTEST</span></code> for the shortest of QP or BASE64 encoding. Otherwise,
it will be <code class="docutils literal notranslate"><span class="pre">None</span></code>.</p>
</dd></dl>
<dl class="attribute">
<dt id="email.charset.Charset.body_encoding">
<code class="descname">body_encoding</code><a class="headerlink" href="#email.charset.Charset.body_encoding" title="Permalink to this definition"></a></dt>
<dd><p>Same as <em>header_encoding</em>, but describes the encoding for the mail
message's body, which may differ from the header encoding.
<code class="docutils literal notranslate"><span class="pre">Charset.SHORTEST</span></code> is not allowed for <em>body_encoding</em>.</p>
</dd></dl>
<dl class="attribute">
<dt id="email.charset.Charset.output_charset">
<code class="descname">output_charset</code><a class="headerlink" href="#email.charset.Charset.output_charset" title="Permalink to this definition"></a></dt>
<dd><p>Some character sets must be converted before they can be used in email
headers or bodies. If the <em>input_charset</em> is one of them, this attribute
will contain the name of the character set output will be converted to.
Otherwise, it will be <code class="docutils literal notranslate"><span class="pre">None</span></code>.</p>
</dd></dl>
<dl class="attribute">
<dt id="email.charset.Charset.input_codec">
<code class="descname">input_codec</code><a class="headerlink" href="#email.charset.Charset.input_codec" title="Permalink to this definition"></a></dt>
<dd><p>The name of the Python codec used to convert the <em>input_charset</em> to
Unicode. If no conversion codec is necessary, this attribute will be
<code class="docutils literal notranslate"><span class="pre">None</span></code>.</p>
</dd></dl>
<dl class="attribute">
<dt id="email.charset.Charset.output_codec">
<code class="descname">output_codec</code><a class="headerlink" href="#email.charset.Charset.output_codec" title="Permalink to this definition"></a></dt>
<dd><p>The name of the Python codec used to convert Unicode to the
<em>output_charset</em>. If no conversion codec is necessary, this attribute
will have the same value as the <em>input_codec</em>.</p>
</dd></dl>
<p><a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code class="xref py py-class docutils literal notranslate"><span class="pre">Charset</span></code></a> instances also have the following methods:</p>
<dl class="method">
<dt id="email.charset.Charset.get_body_encoding">
<code class="descname">get_body_encoding</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.Charset.get_body_encoding" title="Permalink to this definition"></a></dt>
<dd><p>Return the content transfer encoding used for body encoding.</p>
<p>This is either the string <code class="docutils literal notranslate"><span class="pre">quoted-printable</span></code> or <code class="docutils literal notranslate"><span class="pre">base64</span></code> depending on
the encoding used, or it is a function, in which case you should call the
function with a single argument, the Message object being encoded. The
function should then set the <em class="mailheader">Content-Transfer-Encoding</em>
header itself to whatever is appropriate.</p>
<p>Returns the string <code class="docutils literal notranslate"><span class="pre">quoted-printable</span></code> if <em>body_encoding</em> is <code class="docutils literal notranslate"><span class="pre">QP</span></code>,
returns the string <code class="docutils literal notranslate"><span class="pre">base64</span></code> if <em>body_encoding</em> is <code class="docutils literal notranslate"><span class="pre">BASE64</span></code>, and
returns a conversion function otherwise.</p>
</dd></dl>
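<p>Because the return value may be either a string or a function, calling code usually has to branch on it. The helper below, <code class="docutils literal notranslate"><span class="pre">describe_body_encoding()</span></code>, is a hypothetical name used only for this sketch.</p>

```python
from email.charset import Charset


def describe_body_encoding(name):
    """Summarize how message bodies in charset `name` would be encoded."""
    enc = Charset(name).get_body_encoding()
    if callable(enc):
        # A function result is meant to be called with the Message object,
        # and sets Content-Transfer-Encoding itself.
        return "decided per message by a function"
    return enc


print(describe_body_encoding("iso-8859-1"))
print(describe_body_encoding("utf-8"))
```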
<dl class="method">
<dt id="email.charset.Charset.get_output_charset">
<code class="descname">get_output_charset</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.Charset.get_output_charset" title="Permalink to this definition"></a></dt>
<dd><p>Return the output character set.</p>
<p>This is the <em>output_charset</em> attribute if that is not <code class="docutils literal notranslate"><span class="pre">None</span></code>, otherwise
it is <em>input_charset</em>.</p>
</dd></dl>
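<p>As a small illustration, a character set that requires conversion reports its target, while one that does not reports its own name. The variable names are illustrative.</p>

```python
from email.charset import Charset

converted = Charset("euc-jp").get_output_charset()
unchanged = Charset("iso-8859-1").get_output_charset()

print(converted)
print(unchanged)
```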
<dl class="method">
<dt id="email.charset.Charset.header_encode">
<code class="descname">header_encode</code><span class="sig-paren">(</span><em>string</em><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.Charset.header_encode" title="Permalink to this definition"></a></dt>
<dd><p>Header-encode the string <em>string</em>.</p>
<p>The type of encoding (base64 or quoted-printable) will be based on the
<em>header_encoding</em> attribute.</p>
</dd></dl>
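<p>A minimal sketch of header encoding, round-tripped through <code class="docutils literal notranslate"><span class="pre">email.header.decode_header()</span></code> to confirm nothing is lost. The sample text is arbitrary; a character set with no header encoding is expected to return the string unchanged.</p>

```python
from email.charset import Charset
from email.header import decode_header

text = "Caf\u00e9"
encoded = Charset("iso-8859-1").header_encode(text)
print(encoded)

# Decode the encoded word again and compare with the original text.
raw, charset_name = decode_header(encoded)[0]
print(raw.decode(charset_name) == text)

# us-ascii has no header encoding, so the text passes through as-is.
plain = Charset("us-ascii").header_encode("hello")
print(plain)
```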
<dl class="method">
<dt id="email.charset.Charset.header_encode_lines">
<code class="descname">header_encode_lines</code><span class="sig-paren">(</span><em>string</em>, <em>maxlengths</em><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.Charset.header_encode_lines" title="Permalink to this definition"></a></dt>
<dd><p>Header-encode a <em>string</em> by converting it first to bytes.</p>
<p>This is similar to <a class="reference internal" href="#email.charset.Charset.header_encode" title="email.charset.Charset.header_encode"><code class="xref py py-meth docutils literal notranslate"><span class="pre">header_encode()</span></code></a> except that the string is fit
into maximum line lengths as given by the argument <em>maxlengths</em>, which
must be an iterator: each element returned from this iterator will provide
the next maximum line length.</p>
</dd></dl>
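<p>The sketch below passes <code class="docutils literal notranslate"><span class="pre">itertools.repeat(40)</span></code> as <em>maxlengths</em>, so every line gets the same 40-character budget. The sample text and the budget are arbitrary choices for illustration.</p>

```python
from itertools import repeat
from email.charset import Charset

text = "Gr\u00fc\u00dfe aus K\u00f6ln, " * 4

# repeat(40) is an endless iterator that yields 40 for every line.
lines = Charset("utf-8").header_encode_lines(text, repeat(40))

for line in lines:
    print(line)
```

<p>Passing a plain list here would fail, because the method calls <code class="docutils literal notranslate"><span class="pre">next()</span></code> on the argument; wrap a list in <code class="docutils literal notranslate"><span class="pre">iter()</span></code> if per-line lengths differ.</p>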
<dl class="method">
<dt id="email.charset.Charset.body_encode">
<code class="descname">body_encode</code><span class="sig-paren">(</span><em>string</em><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.Charset.body_encode" title="Permalink to this definition"></a></dt>
<dd><p>Body-encode the string <em>string</em>.</p>
<p>The type of encoding (base64 or quoted-printable) will be based on the
<em>body_encoding</em> attribute.</p>
</dd></dl>
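<p>A minimal sketch of body encoding for one base64 character set and one quoted-printable character set. The results are verified by decoding them with the standard <code class="docutils literal notranslate"><span class="pre">base64</span></code> and <code class="docutils literal notranslate"><span class="pre">quopri</span></code> modules rather than by comparing against fixed strings.</p>

```python
import base64
import quopri
from email.charset import Charset

text = "Caf\u00e9"

b64_body = Charset("utf-8").body_encode(text)        # body_encoding is BASE64
qp_body = Charset("iso-8859-1").body_encode(text)    # body_encoding is QP

# Decode both results to check that they carry the original bytes.
b64_ok = base64.b64decode(b64_body) == text.encode("utf-8")
qp_ok = quopri.decodestring(qp_body.encode("ascii")) == text.encode("iso-8859-1")
print(b64_ok, qp_ok)
```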
<p>The <a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code class="xref py py-class docutils literal notranslate"><span class="pre">Charset</span></code></a> class also provides a number of methods to support
standard operations and built-in functions.</p>
<dl class="method">
<dt id="email.charset.Charset.__str__">
<code class="descname">__str__</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.Charset.__str__" title="Permalink to this definition"></a></dt>
<dd><p>Returns <em>input_charset</em> as a string coerced to lower
case. <a class="reference internal" href="../reference/datamodel.html#object.__repr__" title="object.__repr__"><code class="xref py py-meth docutils literal notranslate"><span class="pre">__repr__()</span></code></a> is an alias for <a class="reference internal" href="#email.charset.Charset.__str__" title="email.charset.Charset.__str__"><code class="xref py py-meth docutils literal notranslate"><span class="pre">__str__()</span></code></a>.</p>
</dd></dl>
<dl class="method">
<dt id="email.charset.Charset.__eq__">
<code class="descname">__eq__</code><span class="sig-paren">(</span><em>other</em><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.Charset.__eq__" title="Permalink to this definition"></a></dt>
<dd><p>This method allows you to compare two <a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code class="xref py py-class docutils literal notranslate"><span class="pre">Charset</span></code></a> instances for
equality.</p>
</dd></dl>
<dl class="method">
<dt id="email.charset.Charset.__ne__">
<code class="descname">__ne__</code><span class="sig-paren">(</span><em>other</em><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.Charset.__ne__" title="Permalink to this definition"></a></dt>
<dd><p>This method allows you to compare two <a class="reference internal" href="#email.charset.Charset" title="email.charset.Charset"><code class="xref py py-class docutils literal notranslate"><span class="pre">Charset</span></code></a> instances for
inequality.</p>
</dd></dl>
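<p>A short sketch of these operations: two instances built from different spellings of the same character set compare equal, because both normalize to the same name. The variable names are illustrative.</p>

```python
from email.charset import Charset

a = Charset("latin_1")
b = Charset("ISO-8859-1")

print(str(a))                    # normalized, lower-case name
print(a == b)                    # same character set after normalization
print(a != Charset("utf-8"))     # different character sets
```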
</dd></dl>
<p>The <a class="reference internal" href="#module-email.charset" title="email.charset: Character Sets"><code class="xref py py-mod docutils literal notranslate"><span class="pre">email.charset</span></code></a> module also provides the following functions for adding
new entries to the global character set, alias, and codec registries:</p>
<dl class="function">
<dt id="email.charset.add_charset">
<code class="descclassname">email.charset.</code><code class="descname">add_charset</code><span class="sig-paren">(</span><em>charset</em>, <em>header_enc=None</em>, <em>body_enc=None</em>, <em>output_charset=None</em><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.add_charset" title="Permalink to this definition"></a></dt>
<dd><p>Add character properties to the global registry.</p>
<p><em>charset</em> is the input character set, and must be the canonical name of a
character set.</p>
<p>Optional <em>header_enc</em> and <em>body_enc</em> are each either <code class="docutils literal notranslate"><span class="pre">Charset.QP</span></code> for
quoted-printable, <code class="docutils literal notranslate"><span class="pre">Charset.BASE64</span></code> for base64 encoding,
<code class="docutils literal notranslate"><span class="pre">Charset.SHORTEST</span></code> for the shortest of quoted-printable or base64 encoding,
or <code class="docutils literal notranslate"><span class="pre">None</span></code> for no encoding. <code class="docutils literal notranslate"><span class="pre">SHORTEST</span></code> is only valid for
<em>header_enc</em>. The default is <code class="docutils literal notranslate"><span class="pre">None</span></code> for no encoding.</p>
<p>Optional <em>output_charset</em> is the character set that the output should be in.
Conversions will proceed from input charset, to Unicode, to the output charset
when the method <code class="xref py py-meth docutils literal notranslate"><span class="pre">Charset.convert()</span></code> is called. The default is to output in
the same character set as the input.</p>
<p>Both <em>charset</em> and <em>output_charset</em> must have Unicode codec entries in the
module's character set-to-codec mapping; use <a class="reference internal" href="#email.charset.add_codec" title="email.charset.add_codec"><code class="xref py py-func docutils literal notranslate"><span class="pre">add_codec()</span></code></a> to add codecs the
module does not know about. See the <a class="reference internal" href="codecs.html#module-codecs" title="codecs: Encode and decode data and streams."><code class="xref py py-mod docutils literal notranslate"><span class="pre">codecs</span></code></a> module's documentation for
more information.</p>
<p>The global character set registry is kept in the module global dictionary
<code class="docutils literal notranslate"><span class="pre">CHARSETS</span></code>.</p>
</dd></dl>
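<p>A minimal sketch of registering properties for a character set. It uses <code class="docutils literal notranslate"><span class="pre">koi8-u</span></code>, chosen here only as an example of a name with an existing Python codec, and it assumes the constants are read as module-level names. Note that the call changes the registry for the whole process.</p>

```python
import email.charset
from email.charset import Charset

# Register koi8-u with base64 for both headers and bodies.
# This mutates the module-wide CHARSETS dictionary.
email.charset.add_charset("koi8-u", email.charset.BASE64, email.charset.BASE64)

cs = Charset("koi8-u")
print(cs.header_encoding == email.charset.BASE64)
print(cs.get_body_encoding())
```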
<dl class="function">
<dt id="email.charset.add_alias">
<code class="descclassname">email.charset.</code><code class="descname">add_alias</code><span class="sig-paren">(</span><em>alias</em>, <em>canonical</em><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.add_alias" title="Permalink to this definition"></a></dt>
<dd><p>Add a character set alias. <em>alias</em> is the alias name, e.g. <code class="docutils literal notranslate"><span class="pre">latin-1</span></code>.
<em>canonical</em> is the character set's canonical name, e.g. <code class="docutils literal notranslate"><span class="pre">iso-8859-1</span></code>.</p>
<p>The global charset alias registry is kept in the module global dictionary
<code class="docutils literal notranslate"><span class="pre">ALIASES</span></code>.</p>
</dd></dl>
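<p>A minimal sketch of adding an alias. The alias name <code class="docutils literal notranslate"><span class="pre">cyrillic-ua</span></code> is made up for illustration. Since input names are lower-cased before the alias lookup, the alias is registered in lower case.</p>

```python
import email.charset
from email.charset import Charset

# "cyrillic-ua" is a hypothetical alias; this mutates the global ALIASES dict.
email.charset.add_alias("cyrillic-ua", "koi8-u")

cs = Charset("Cyrillic-UA")
print(cs.input_charset)
```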
<dl class="function">
<dt id="email.charset.add_codec">
<code class="descclassname">email.charset.</code><code class="descname">add_codec</code><span class="sig-paren">(</span><em>charset</em>, <em>codecname</em><span class="sig-paren">)</span><a class="headerlink" href="#email.charset.add_codec" title="Permalink to this definition"></a></dt>
<dd><p>Add a codec that maps characters in the given character set to and from Unicode.</p>
<p><em>charset</em> is the canonical name of a character set. <em>codecname</em> is the name of a
Python codec, as appropriate for the second argument to the <a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>'s
<a class="reference internal" href="stdtypes.html#str.encode" title="str.encode"><code class="xref py py-meth docutils literal notranslate"><span class="pre">encode()</span></code></a> method.</p>
</dd></dl>
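<p>A minimal sketch of mapping a character set name to a Python codec. The name <code class="docutils literal notranslate"><span class="pre">x-example-latin</span></code> is hypothetical; <code class="docutils literal notranslate"><span class="pre">latin_1</span></code> is an existing Python codec.</p>

```python
import email.charset
from email.charset import Charset

# Map the hypothetical charset name to a real Python codec.
# This mutates the module-wide codec mapping.
email.charset.add_codec("x-example-latin", "latin_1")

cs = Charset("x-example-latin")
print(cs.input_codec, cs.output_codec)

# The registered codec name can be passed straight to str.encode().
data = "Caf\u00e9".encode(cs.output_codec)
print(data)
```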
</div>
</div>
</div>
</div>
<div class="sphinxsidebar" role="navigation" aria-label="main navigation">
<div class="sphinxsidebarwrapper">
<h4>Previous topic</h4>
<p class="topless"><a href="email.header.html"
title="previous chapter"><code class="xref py py-mod docutils literal notranslate"><span class="pre">email.header</span></code>: Internationalized headers</a></p>
<h4>Next topic</h4>
<p class="topless"><a href="email.encoders.html"
title="next chapter"><code class="xref py py-mod docutils literal notranslate"><span class="pre">email.encoders</span></code>: Encoders</a></p>
<div role="note" aria-label="source link">
<h3>This Page</h3>
<ul class="this-page-menu">
<li><a href="../bugs.html">Report a Bug</a></li>
<li>
<a href="https://github.com/python/cpython/blob/3.7/Doc/library/email.charset.rst"
rel="nofollow">Show Source
</a>
</li>
</ul>
</div>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer">
&copy; <a href="../copyright.html">Copyright</a> 2001-2019, Python Software Foundation.
<br />
The Python Software Foundation is a non-profit corporation.
<a href="https://www.python.org/psf/donations/">Please donate.</a>
<br />
Last updated on Jul 13, 2019.
<a href="../bugs.html">Found a bug</a>?
<br />
Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 2.0.1.
</div>
</body>
</html>