<!DOCTYPE html>
|
||
|
||
<html xmlns="http://www.w3.org/1999/xhtml">
|
||
<head>
|
||
<meta charset="utf-8" />
|
||
<title>Sorting HOW TO — Python 3.7.4 documentation</title>
|
||
<link rel="stylesheet" href="../_static/pydoctheme.css" type="text/css" />
|
||
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
|
||
|
||
<script type="text/javascript" id="documentation_options" data-url_root="../" src="../_static/documentation_options.js"></script>
|
||
<script type="text/javascript" src="../_static/jquery.js"></script>
|
||
<script type="text/javascript" src="../_static/underscore.js"></script>
|
||
<script type="text/javascript" src="../_static/doctools.js"></script>
|
||
<script type="text/javascript" src="../_static/language_data.js"></script>
|
||
|
||
<script type="text/javascript" src="../_static/sidebar.js"></script>
|
||
|
||
<link rel="search" type="application/opensearchdescription+xml"
|
||
title="Search within Python 3.7.4 documentation"
|
||
href="../_static/opensearch.xml"/>
|
||
<link rel="author" title="About these documents" href="../about.html" />
|
||
<link rel="index" title="Index" href="../genindex.html" />
|
||
<link rel="search" title="Search" href="../search.html" />
|
||
<link rel="copyright" title="Copyright" href="../copyright.html" />
|
||
<link rel="next" title="Unicode HOWTO" href="unicode.html" />
|
||
<link rel="prev" title="Socket Programming HOWTO" href="sockets.html" />
|
||
<link rel="shortcut icon" type="image/png" href="../_static/py.png" />
|
||
<link rel="canonical" href="https://docs.python.org/3/howto/sorting.html" />
|
||
|
||
<script type="text/javascript" src="../_static/copybutton.js"></script>
|
||
<script type="text/javascript" src="../_static/switchers.js"></script>
|
||
|
||
|
||
|
||
<style>
|
||
@media only screen {
|
||
table.full-width-table {
|
||
width: 100%;
|
||
}
|
||
}
|
||
</style>
|
||
|
||
|
||
</head><body>
|
||
|
||
|
||
|
||
<div class="document">
|
||
<div class="documentwrapper">
|
||
<div class="bodywrapper">
|
||
<div class="body" role="main">
|
||
|
||
<div class="section" id="sorting-how-to">
|
||
<span id="sortinghowto"></span><h1>Sorting HOW TO<a class="headerlink" href="#sorting-how-to" title="Permalink to this headline">¶</a></h1>
|
||
<dl class="field-list simple">
|
||
<dt class="field-odd">Author</dt>
|
||
<dd class="field-odd"><p>Andrew Dalke and Raymond Hettinger</p>
|
||
</dd>
|
||
<dt class="field-even">Release</dt>
|
||
<dd class="field-even"><p>0.1</p>
|
||
</dd>
|
||
</dl>
|
||
<p>Python lists have a built-in <a class="reference internal" href="../library/stdtypes.html#list.sort" title="list.sort"><code class="xref py py-meth docutils literal notranslate"><span class="pre">list.sort()</span></code></a> method that modifies the list
|
||
in-place. There is also a <a class="reference internal" href="../library/functions.html#sorted" title="sorted"><code class="xref py py-func docutils literal notranslate"><span class="pre">sorted()</span></code></a> built-in function that builds a new
|
||
sorted list from an iterable.</p>
|
||
<p>In this document, we explore the various techniques for sorting data using Python.</p>
|
||
<div class="section" id="sorting-basics">
|
||
<h2>Sorting Basics<a class="headerlink" href="#sorting-basics" title="Permalink to this headline">¶</a></h2>
|
||
<p>A simple ascending sort is very easy: just call the <a class="reference internal" href="../library/functions.html#sorted" title="sorted"><code class="xref py py-func docutils literal notranslate"><span class="pre">sorted()</span></code></a> function. It
|
||
returns a new sorted list:</p>
|
||
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nb">sorted</span><span class="p">([</span><span class="mi">5</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span>
|
||
<span class="go">[1, 2, 3, 4, 5]</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>You can also use the <a class="reference internal" href="../library/stdtypes.html#list.sort" title="list.sort"><code class="xref py py-meth docutils literal notranslate"><span class="pre">list.sort()</span></code></a> method. It modifies the list
in-place and returns <code class="docutils literal notranslate"><span class="pre">None</span></code> to avoid confusion. It is usually less convenient
than <a class="reference internal" href="../library/functions.html#sorted" title="sorted"><code class="xref py py-func docutils literal notranslate"><span class="pre">sorted()</span></code></a>, but if you don’t need the original list, it is slightly
more efficient.</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">a</span> <span class="o">=</span> <span class="p">[</span><span class="mi">5</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">]</span>
|
||
<span class="gp">>>> </span><span class="n">a</span><span class="o">.</span><span class="n">sort</span><span class="p">()</span>
|
||
<span class="gp">>>> </span><span class="n">a</span>
|
||
<span class="go">[1, 2, 3, 4, 5]</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>Another difference is that the <a class="reference internal" href="../library/stdtypes.html#list.sort" title="list.sort"><code class="xref py py-meth docutils literal notranslate"><span class="pre">list.sort()</span></code></a> method is only defined for
|
||
lists. In contrast, the <a class="reference internal" href="../library/functions.html#sorted" title="sorted"><code class="xref py py-func docutils literal notranslate"><span class="pre">sorted()</span></code></a> function accepts any iterable.</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nb">sorted</span><span class="p">({</span><span class="mi">1</span><span class="p">:</span> <span class="s1">'D'</span><span class="p">,</span> <span class="mi">2</span><span class="p">:</span> <span class="s1">'B'</span><span class="p">,</span> <span class="mi">3</span><span class="p">:</span> <span class="s1">'B'</span><span class="p">,</span> <span class="mi">4</span><span class="p">:</span> <span class="s1">'E'</span><span class="p">,</span> <span class="mi">5</span><span class="p">:</span> <span class="s1">'A'</span><span class="p">})</span>
|
||
<span class="go">[1, 2, 3, 4, 5]</span>
|
||
</pre></div>
|
||
</div>
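<p>The differences described above can be checked with a short sketch. The variable names are illustrative assumptions, not part of the original examples:</p>

```python
# Illustrative data; the variable names are assumptions made for this sketch.
original = [5, 2, 3, 1, 4]

# sorted() builds a new list and leaves its argument untouched.
ordered_copy = sorted(original)

# list.sort() reorders the list in place and returns None.
in_place = list(original)
result = in_place.sort()

# sorted() accepts any iterable, for example a tuple or a string.
from_tuple = sorted((3, 1, 2))
from_string = sorted("bca")

print(ordered_copy, original, result, in_place, from_tuple, from_string)
```

<p>Because <code class="docutils literal notranslate"><span class="pre">list.sort()</span></code> returns <code class="docutils literal notranslate"><span class="pre">None</span></code>, assigning its result to a variable is a common mistake; use <code class="docutils literal notranslate"><span class="pre">sorted()</span></code> when a new list is wanted.</p>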
|
||
</div>
|
||
<div class="section" id="key-functions">
|
||
<h2>Key Functions<a class="headerlink" href="#key-functions" title="Permalink to this headline">¶</a></h2>
|
||
<p>Both <a class="reference internal" href="../library/stdtypes.html#list.sort" title="list.sort"><code class="xref py py-meth docutils literal notranslate"><span class="pre">list.sort()</span></code></a> and <a class="reference internal" href="../library/functions.html#sorted" title="sorted"><code class="xref py py-func docutils literal notranslate"><span class="pre">sorted()</span></code></a> have a <em>key</em> parameter to specify a
|
||
function to be called on each list element prior to making comparisons.</p>
|
||
<p>For example, here’s a case-insensitive string comparison:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="s2">"This is a test string from Andrew"</span><span class="o">.</span><span class="n">split</span><span class="p">(),</span> <span class="n">key</span><span class="o">=</span><span class="nb">str</span><span class="o">.</span><span class="n">lower</span><span class="p">)</span>
|
||
<span class="go">['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>The value of the <em>key</em> parameter should be a function that takes a single argument
|
||
and returns a key to use for sorting purposes. This technique is fast because
|
||
the key function is called exactly once for each input record.</p>
|
||
<p>A common pattern is to sort complex objects using some of the object’s indices
|
||
as keys. For example:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">student_tuples</span> <span class="o">=</span> <span class="p">[</span>
<span class="gp">... </span>    <span class="p">(</span><span class="s1">'john'</span><span class="p">,</span> <span class="s1">'A'</span><span class="p">,</span> <span class="mi">15</span><span class="p">),</span>
<span class="gp">... </span>    <span class="p">(</span><span class="s1">'jane'</span><span class="p">,</span> <span class="s1">'B'</span><span class="p">,</span> <span class="mi">12</span><span class="p">),</span>
<span class="gp">... </span>    <span class="p">(</span><span class="s1">'dave'</span><span class="p">,</span> <span class="s1">'B'</span><span class="p">,</span> <span class="mi">10</span><span class="p">),</span>
<span class="gp">... </span><span class="p">]</span>
<span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">student_tuples</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">student</span><span class="p">:</span> <span class="n">student</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span>  <span class="c1"># sort by age</span>
<span class="go">[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]</span>
</pre></div>
|
||
</div>
|
||
<p>The same technique works for objects with named attributes. For example:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="k">class</span> <span class="nc">Student</span><span class="p">:</span>
<span class="gp">... </span>    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">grade</span><span class="p">,</span> <span class="n">age</span><span class="p">):</span>
<span class="gp">... </span>        <span class="bp">self</span><span class="o">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">name</span>
<span class="gp">... </span>        <span class="bp">self</span><span class="o">.</span><span class="n">grade</span> <span class="o">=</span> <span class="n">grade</span>
<span class="gp">... </span>        <span class="bp">self</span><span class="o">.</span><span class="n">age</span> <span class="o">=</span> <span class="n">age</span>
<span class="gp">... </span>    <span class="k">def</span> <span class="nf">__repr__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="gp">... </span>        <span class="k">return</span> <span class="nb">repr</span><span class="p">((</span><span class="bp">self</span><span class="o">.</span><span class="n">name</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">grade</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">age</span><span class="p">))</span>
</pre></div>
|
||
</div>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">student_objects</span> <span class="o">=</span> <span class="p">[</span>
<span class="gp">... </span>    <span class="n">Student</span><span class="p">(</span><span class="s1">'john'</span><span class="p">,</span> <span class="s1">'A'</span><span class="p">,</span> <span class="mi">15</span><span class="p">),</span>
<span class="gp">... </span>    <span class="n">Student</span><span class="p">(</span><span class="s1">'jane'</span><span class="p">,</span> <span class="s1">'B'</span><span class="p">,</span> <span class="mi">12</span><span class="p">),</span>
<span class="gp">... </span>    <span class="n">Student</span><span class="p">(</span><span class="s1">'dave'</span><span class="p">,</span> <span class="s1">'B'</span><span class="p">,</span> <span class="mi">10</span><span class="p">),</span>
<span class="gp">... </span><span class="p">]</span>
<span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">student_objects</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">student</span><span class="p">:</span> <span class="n">student</span><span class="o">.</span><span class="n">age</span><span class="p">)</span>  <span class="c1"># sort by age</span>
<span class="go">[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]</span>
</pre></div>
|
||
</div>
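<p>The claim that the key function is called exactly once per input record can be observed with a small sketch. The <code class="docutils literal notranslate"><span class="pre">recording_lower</span></code> helper and the <code class="docutils literal notranslate"><span class="pre">calls</span></code> list are hypothetical names introduced only for this illustration:</p>

```python
calls = []

def recording_lower(word):
    """Hypothetical key function: note each call, then return the lowercased word."""
    calls.append(word)
    return word.lower()

words = "This is a test string from Andrew".split()
result = sorted(words, key=recording_lower)

# One key computation per element, however many comparisons the sort performs.
print(len(words), len(calls))
print(result)
```

<p>A comparison-based approach would instead invoke user code once per comparison, which is typically many more calls for the same input.</p>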
|
||
</div>
|
||
<div class="section" id="operator-module-functions">
|
||
<h2>Operator Module Functions<a class="headerlink" href="#operator-module-functions" title="Permalink to this headline">¶</a></h2>
|
||
<p>The key-function patterns shown above are very common, so Python provides
|
||
convenience functions to make accessor functions easier and faster. The
|
||
<a class="reference internal" href="../library/operator.html#module-operator" title="operator: Functions corresponding to the standard operators."><code class="xref py py-mod docutils literal notranslate"><span class="pre">operator</span></code></a> module has <a class="reference internal" href="../library/operator.html#operator.itemgetter" title="operator.itemgetter"><code class="xref py py-func docutils literal notranslate"><span class="pre">itemgetter()</span></code></a>,
|
||
<a class="reference internal" href="../library/operator.html#operator.attrgetter" title="operator.attrgetter"><code class="xref py py-func docutils literal notranslate"><span class="pre">attrgetter()</span></code></a>, and a <a class="reference internal" href="../library/operator.html#operator.methodcaller" title="operator.methodcaller"><code class="xref py py-func docutils literal notranslate"><span class="pre">methodcaller()</span></code></a> function.</p>
|
||
<p>Using those functions, the above examples become simpler and faster:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">operator</span> <span class="k">import</span> <span class="n">itemgetter</span><span class="p">,</span> <span class="n">attrgetter</span>
|
||
</pre></div>
|
||
</div>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">student_tuples</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">itemgetter</span><span class="p">(</span><span class="mi">2</span><span class="p">))</span>
|
||
<span class="go">[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]</span>
|
||
</pre></div>
|
||
</div>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">student_objects</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">attrgetter</span><span class="p">(</span><span class="s1">'age'</span><span class="p">))</span>
|
||
<span class="go">[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>The operator module functions allow multiple levels of sorting. For example, to
|
||
sort by <em>grade</em> then by <em>age</em>:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">student_tuples</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">itemgetter</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">))</span>
|
||
<span class="go">[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]</span>
|
||
</pre></div>
|
||
</div>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">student_objects</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">attrgetter</span><span class="p">(</span><span class="s1">'grade'</span><span class="p">,</span> <span class="s1">'age'</span><span class="p">))</span>
|
||
<span class="go">[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]</span>
|
||
</pre></div>
|
||
</div>
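<p>The <code class="docutils literal notranslate"><span class="pre">methodcaller()</span></code> function mentioned above is not demonstrated by the student examples. As a minimal sketch, with a made-up <code class="docutils literal notranslate"><span class="pre">messages</span></code> list, it can sort strings by the result of calling <code class="docutils literal notranslate"><span class="pre">str.count()</span></code> on each one:</p>

```python
from operator import methodcaller

# Made-up sample data for this sketch.
messages = ['critical!!!', 'hurry!', 'standby', 'immediate!!']

# methodcaller('count', '!') builds a callable that returns s.count('!') for a string s,
# so the messages are ordered by how many exclamation marks they contain.
by_urgency = sorted(messages, key=methodcaller('count', '!'))
print(by_urgency)
```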
|
||
</div>
|
||
<div class="section" id="ascending-and-descending">
|
||
<h2>Ascending and Descending<a class="headerlink" href="#ascending-and-descending" title="Permalink to this headline">¶</a></h2>
|
||
<p>Both <a class="reference internal" href="../library/stdtypes.html#list.sort" title="list.sort"><code class="xref py py-meth docutils literal notranslate"><span class="pre">list.sort()</span></code></a> and <a class="reference internal" href="../library/functions.html#sorted" title="sorted"><code class="xref py py-func docutils literal notranslate"><span class="pre">sorted()</span></code></a> accept a <em>reverse</em> parameter with a
|
||
boolean value. This is used to flag descending sorts. For example, to get the
|
||
student data in reverse <em>age</em> order:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">student_tuples</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">itemgetter</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">reverse</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
|
||
<span class="go">[('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]</span>
|
||
</pre></div>
|
||
</div>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">student_objects</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">attrgetter</span><span class="p">(</span><span class="s1">'age'</span><span class="p">),</span> <span class="n">reverse</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
|
||
<span class="go">[('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]</span>
|
||
</pre></div>
|
||
</div>
|
||
</div>
|
||
<div class="section" id="sort-stability-and-complex-sorts">
|
||
<h2>Sort Stability and Complex Sorts<a class="headerlink" href="#sort-stability-and-complex-sorts" title="Permalink to this headline">¶</a></h2>
|
||
<p>Sorts are guaranteed to be <a class="reference external" href="https://en.wikipedia.org/wiki/Sorting_algorithm#Stability">stable</a>. That means that
|
||
when multiple records have the same key, their original order is preserved.</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">data</span> <span class="o">=</span> <span class="p">[(</span><span class="s1">'red'</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="p">(</span><span class="s1">'blue'</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="p">(</span><span class="s1">'red'</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> <span class="p">(</span><span class="s1">'blue'</span><span class="p">,</span> <span class="mi">2</span><span class="p">)]</span>
|
||
<span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">itemgetter</span><span class="p">(</span><span class="mi">0</span><span class="p">))</span>
|
||
<span class="go">[('blue', 1), ('blue', 2), ('red', 1), ('red', 2)]</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>Notice how the two records for <em>blue</em> retain their original order so that
|
||
<code class="docutils literal notranslate"><span class="pre">('blue',</span> <span class="pre">1)</span></code> is guaranteed to precede <code class="docutils literal notranslate"><span class="pre">('blue',</span> <span class="pre">2)</span></code>.</p>
|
||
<p>This wonderful property lets you build complex sorts in a series of sorting
|
||
steps. For example, to sort the student data by descending <em>grade</em> and then
|
||
ascending <em>age</em>, do the <em>age</em> sort first and then sort again using <em>grade</em>:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">s</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">student_objects</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">attrgetter</span><span class="p">(</span><span class="s1">'age'</span><span class="p">))</span> <span class="c1"># sort on secondary key</span>
|
||
<span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">attrgetter</span><span class="p">(</span><span class="s1">'grade'</span><span class="p">),</span> <span class="n">reverse</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span> <span class="c1"># now sort on primary key, descending</span>
|
||
<span class="go">[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>The <a class="reference external" href="https://en.wikipedia.org/wiki/Timsort">Timsort</a> algorithm used in Python
|
||
does multiple sorts efficiently because it can take advantage of any ordering
|
||
already present in a dataset.</p>
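<p>The multi-pass approach can be wrapped in a small helper. The following is only a sketch: the <code class="docutils literal notranslate"><span class="pre">multisort</span></code> name, its <code class="docutils literal notranslate"><span class="pre">specs</span></code> argument, and the namedtuple stand-in for the student records are assumptions made for this illustration:</p>

```python
from collections import namedtuple
from operator import attrgetter

# Stand-in for the student records used in this HOWTO (an assumption for this sketch).
Student = namedtuple('Student', ['name', 'grade', 'age'])
students = [
    Student('john', 'A', 15),
    Student('jane', 'B', 12),
    Student('dave', 'B', 10),
]

def multisort(records, specs):
    """Sort records in place by several (attribute, reverse) pairs.

    The passes run from the least significant key to the most significant
    one, relying on sort stability to keep the earlier ordering among ties.
    """
    for attribute, reverse in reversed(specs):
        records.sort(key=attrgetter(attribute), reverse=reverse)
    return records

# Descending grade first, then ascending age within each grade.
ranked = multisort(list(students), [('grade', True), ('age', False)])
print([s.name for s in ranked])
```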
|
||
</div>
|
||
<div class="section" id="the-old-way-using-decorate-sort-undecorate">
|
||
<h2>The Old Way Using Decorate-Sort-Undecorate<a class="headerlink" href="#the-old-way-using-decorate-sort-undecorate" title="Permalink to this headline">¶</a></h2>
|
||
<p>This idiom is called Decorate-Sort-Undecorate after its three steps:</p>
|
||
<ul class="simple">
|
||
<li><p>First, the initial list is decorated with new values that control the sort order.</p></li>
|
||
<li><p>Second, the decorated list is sorted.</p></li>
|
||
<li><p>Finally, the decorations are removed, creating a list that contains only the
|
||
initial values in the new order.</p></li>
|
||
</ul>
|
||
<p>For example, to sort the student data by <em>grade</em> using the DSU approach:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">decorated</span> <span class="o">=</span> <span class="p">[(</span><span class="n">student</span><span class="o">.</span><span class="n">grade</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">student</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">student</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">student_objects</span><span class="p">)]</span>
|
||
<span class="gp">>>> </span><span class="n">decorated</span><span class="o">.</span><span class="n">sort</span><span class="p">()</span>
|
||
<span class="gp">>>> </span><span class="p">[</span><span class="n">student</span> <span class="k">for</span> <span class="n">grade</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">student</span> <span class="ow">in</span> <span class="n">decorated</span><span class="p">]</span> <span class="c1"># undecorate</span>
|
||
<span class="go">[('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>This idiom works because tuples are compared lexicographically; the first items
|
||
are compared; if they are the same then the second items are compared, and so
|
||
on.</p>
|
||
<p>It is not strictly necessary in all cases to include the index <em>i</em> in the
|
||
decorated list, but including it gives two benefits:</p>
|
||
<ul class="simple">
|
||
<li><p>The sort is stable – if two items have the same key, their order will be
|
||
preserved in the sorted list.</p></li>
|
||
<li><p>The original items do not have to be comparable because the ordering of the
|
||
decorated tuples will be determined by at most the first two items. So for
|
||
example the original list could contain complex numbers which cannot be sorted
|
||
directly.</p></li>
|
||
</ul>
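<p>Both benefits of including the index can be seen in a short sketch that sorts complex numbers by magnitude. The sample values and variable names are assumptions chosen for this illustration:</p>

```python
# Sample values chosen so that two pairs share the same magnitude.
values = [3 + 4j, 1 + 0j, 0 + 5j, 0 + 1j]

# Without the index, tied magnitudes force a comparison of the complex
# numbers themselves, which Python does not support.
try:
    sorted((abs(z), z) for z in values)
    needs_index = False
except TypeError:
    needs_index = True

# Decorate with (magnitude, index, value), sort, then undecorate.
decorated = [(abs(z), i, z) for i, z in enumerate(values)]
decorated.sort()
by_magnitude = [z for magnitude, i, z in decorated]

print(needs_index, by_magnitude)
```

<p>Values with equal magnitude keep their original relative order, because the index breaks the tie before the complex numbers are ever compared.</p>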
|
||
<p>Another name for this idiom is
|
||
<a class="reference external" href="https://en.wikipedia.org/wiki/Schwartzian_transform">Schwartzian transform</a>,
|
||
after Randal L. Schwartz, who popularized it among Perl programmers.</p>
|
||
<p>Now that Python sorting provides key-functions, this technique is not often needed.</p>
|
||
</div>
|
||
<div class="section" id="the-old-way-using-the-cmp-parameter">
|
||
<h2>The Old Way Using the <em>cmp</em> Parameter<a class="headerlink" href="#the-old-way-using-the-cmp-parameter" title="Permalink to this headline">¶</a></h2>
|
||
<p>Many constructs given in this HOWTO assume Python 2.4 or later. Before that,
there was no <a class="reference internal" href="../library/functions.html#sorted" title="sorted"><code class="xref py py-func docutils literal notranslate"><span class="pre">sorted()</span></code></a> builtin and <a class="reference internal" href="../library/stdtypes.html#list.sort" title="list.sort"><code class="xref py py-meth docutils literal notranslate"><span class="pre">list.sort()</span></code></a> took no keyword
arguments. Instead, all of the Py2.x versions supported a <em>cmp</em> parameter to
handle user-specified comparison functions.</p>
|
||
<p>In Py3.0, the <em>cmp</em> parameter was removed entirely (as part of a larger effort to
|
||
simplify and unify the language, eliminating the conflict between rich
|
||
comparisons and the <code class="xref py py-meth docutils literal notranslate"><span class="pre">__cmp__()</span></code> magic method).</p>
|
||
<p>In Py2.x, sort accepted an optional function that was called to perform the
comparisons. That function should take two arguments to be compared and then
return a negative value for less-than, zero if they are equal, or
a positive value for greater-than. For example, we can do:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="k">def</span> <span class="nf">numeric_compare</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="n">x</span> <span class="o">-</span> <span class="n">y</span>
<span class="gp">>>> </span><span class="nb">sorted</span><span class="p">([</span><span class="mi">5</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="nb">cmp</span><span class="o">=</span><span class="n">numeric_compare</span><span class="p">)</span>  <span class="c1"># doctest: +SKIP</span>
<span class="go">[1, 2, 3, 4, 5]</span>
</pre></div>
|
||
</div>
|
||
<p>Or you can reverse the order of comparison with:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="k">def</span> <span class="nf">reverse_numeric</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="n">y</span> <span class="o">-</span> <span class="n">x</span>
<span class="gp">>>> </span><span class="nb">sorted</span><span class="p">([</span><span class="mi">5</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="nb">cmp</span><span class="o">=</span><span class="n">reverse_numeric</span><span class="p">)</span>  <span class="c1"># doctest: +SKIP</span>
<span class="go">[5, 4, 3, 2, 1]</span>
</pre></div>
|
||
</div>
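<p>The two calls above are skipped under Python 3 because the <em>cmp</em> parameter no longer exists. As a sketch of one equivalent, assuming Python 3.2 or later, the standard library's <code class="docutils literal notranslate"><span class="pre">functools.cmp_to_key()</span></code> can adapt the same comparison function for use as a <em>key</em>:</p>

```python
from functools import cmp_to_key

def reverse_numeric(x, y):
    # Old-style comparison: negative, zero, or positive result.
    return y - x

# Adapt the comparison function so it can be passed as key=.
descending = sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))

# For this simple case, the reverse flag gives the same ordering directly.
same_result = sorted([5, 2, 4, 1, 3], reverse=True)

print(descending, same_result)
```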
|
||
<p>When porting code from Python 2.x to 3.x, you may have a user-supplied
comparison function that needs to be converted to a key
function. The following wrapper makes that easy to do:</p>
|
||
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">cmp_to_key</span><span class="p">(</span><span class="n">mycmp</span><span class="p">):</span>
    <span class="s1">'Convert a cmp= function into a key= function'</span>
    <span class="k">class</span> <span class="nc">K</span><span class="p">:</span>
        <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">obj</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">):</span>
            <span class="bp">self</span><span class="o">.</span><span class="n">obj</span> <span class="o">=</span> <span class="n">obj</span>
        <span class="k">def</span> <span class="nf">__lt__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
            <span class="k">return</span> <span class="n">mycmp</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">obj</span><span class="p">,</span> <span class="n">other</span><span class="o">.</span><span class="n">obj</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">0</span>
        <span class="k">def</span> <span class="nf">__gt__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
            <span class="k">return</span> <span class="n">mycmp</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">obj</span><span class="p">,</span> <span class="n">other</span><span class="o">.</span><span class="n">obj</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">0</span>
        <span class="k">def</span> <span class="nf">__eq__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
            <span class="k">return</span> <span class="n">mycmp</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">obj</span><span class="p">,</span> <span class="n">other</span><span class="o">.</span><span class="n">obj</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span>
        <span class="k">def</span> <span class="nf">__le__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
            <span class="k">return</span> <span class="n">mycmp</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">obj</span><span class="p">,</span> <span class="n">other</span><span class="o">.</span><span class="n">obj</span><span class="p">)</span> <span class="o">&lt;=</span> <span class="mi">0</span>
        <span class="k">def</span> <span class="nf">__ge__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
            <span class="k">return</span> <span class="n">mycmp</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">obj</span><span class="p">,</span> <span class="n">other</span><span class="o">.</span><span class="n">obj</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="mi">0</span>
        <span class="k">def</span> <span class="nf">__ne__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">):</span>
            <span class="k">return</span> <span class="n">mycmp</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">obj</span><span class="p">,</span> <span class="n">other</span><span class="o">.</span><span class="n">obj</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">0</span>
    <span class="k">return</span> <span class="n">K</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>To convert to a key function, just wrap the old comparison function:</p>
|
||
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="nb">sorted</span><span class="p">([</span><span class="mi">5</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="n">key</span><span class="o">=</span><span class="n">cmp_to_key</span><span class="p">(</span><span class="n">reverse_numeric</span><span class="p">))</span>
|
||
<span class="go">[5, 4, 3, 2, 1]</span>
|
||
</pre></div>
|
||
</div>
|
||
<p>In Python 3.2, the <a class="reference internal" href="../library/functools.html#functools.cmp_to_key" title="functools.cmp_to_key"><code class="xref py py-func docutils literal notranslate"><span class="pre">functools.cmp_to_key()</span></code></a> function was added to the
|
||
<a class="reference internal" href="../library/functools.html#module-functools" title="functools: Higher-order functions and operations on callable objects."><code class="xref py py-mod docutils literal notranslate"><span class="pre">functools</span></code></a> module in the standard library.</p>
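<p>As a minimal sketch, the same conversion can be done with the standard-library helper instead of the hand-written wrapper. The comparison function <code>reverse_numeric</code> is the one from the example above, redefined here only so that the snippet is self-contained; the variable name <code>result</code> is illustrative.</p>

```python
from functools import cmp_to_key


def reverse_numeric(x, y):
    # Old-style comparison function: a negative result means x sorts first,
    # zero means the two are equal, a positive result means y sorts first.
    return y - x


# cmp_to_key() wraps the comparison function so it can be passed as key=.
result = sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))
print(result)  # [5, 4, 3, 2, 1]
```

<p>For a plain descending numeric sort, <code>sorted(data, reverse=True)</code> is simpler; the wrapper is mainly useful when an existing comparison function has to be reused unchanged.</p>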
|
||
</div>
|
||
<div class="section" id="odd-and-ends">
|
||
<h2>Odds and Ends<a class="headerlink" href="#odd-and-ends" title="Permalink to this headline">¶</a></h2>
|
||
<ul>
|
||
<li><p>For locale-aware sorting, use <a class="reference internal" href="../library/locale.html#locale.strxfrm" title="locale.strxfrm"><code class="xref py py-func docutils literal notranslate"><span class="pre">locale.strxfrm()</span></code></a> as a key function or
<a class="reference internal" href="../library/locale.html#locale.strcoll" title="locale.strcoll"><code class="xref py py-func docutils literal notranslate"><span class="pre">locale.strcoll()</span></code></a> as a comparison function.</p></li>
|
||
<li><p>The <em>reverse</em> parameter still maintains sort stability (so that records with
equal keys retain their original order). Interestingly, that effect can be
simulated without the parameter by using the built-in <a class="reference internal" href="../library/functions.html#reversed" title="reversed"><code class="xref py py-func docutils literal notranslate"><span class="pre">reversed()</span></code></a> function
twice:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">operator</span> <span class="kn">import</span> <span class="n">itemgetter</span>
<span class="gp">>>> </span><span class="n">data</span> <span class="o">=</span> <span class="p">[(</span><span class="s1">'red'</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="p">(</span><span class="s1">'blue'</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="p">(</span><span class="s1">'red'</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> <span class="p">(</span><span class="s1">'blue'</span><span class="p">,</span> <span class="mi">2</span><span class="p">)]</span>
<span class="gp">>>> </span><span class="n">standard_way</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">itemgetter</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="n">reverse</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">double_reversed</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="nb">reversed</span><span class="p">(</span><span class="nb">sorted</span><span class="p">(</span><span class="nb">reversed</span><span class="p">(</span><span class="n">data</span><span class="p">),</span> <span class="n">key</span><span class="o">=</span><span class="n">itemgetter</span><span class="p">(</span><span class="mi">0</span><span class="p">))))</span>
<span class="gp">>>> </span><span class="k">assert</span> <span class="n">standard_way</span> <span class="o">==</span> <span class="n">double_reversed</span>
<span class="gp">>>> </span><span class="n">standard_way</span>
<span class="go">[('red', 1), ('red', 2), ('blue', 1), ('blue', 2)]</span>
|
||
</pre></div>
|
||
</div>
|
||
</li>
|
||
<li><p>The sort routines are guaranteed to use <a class="reference internal" href="../reference/datamodel.html#object.__lt__" title="object.__lt__"><code class="xref py py-meth docutils literal notranslate"><span class="pre">__lt__()</span></code></a> when making comparisons
|
||
between two objects. So, it is easy to add a standard sort order to a class by
|
||
defining an <a class="reference internal" href="../reference/datamodel.html#object.__lt__" title="object.__lt__"><code class="xref py py-meth docutils literal notranslate"><span class="pre">__lt__()</span></code></a> method:</p>
|
||
<div class="highlight-python3 notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">Student</span><span class="o">.</span><span class="fm">__lt__</span> <span class="o">=</span> <span class="k">lambda</span> <span class="bp">self</span><span class="p">,</span> <span class="n">other</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">age</span> <span class="o"><</span> <span class="n">other</span><span class="o">.</span><span class="n">age</span>
|
||
<span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">student_objects</span><span class="p">)</span>
|
||
<span class="go">[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]</span>
|
||
</pre></div>
|
||
</div>
|
||
</li>
|
||
<li><p>Key functions need not depend directly on the objects being sorted. A key
|
||
function can also access external resources. For instance, if the student grades
|
||
are stored in a dictionary, they can be used to sort a separate list of student
|
||
names:</p>
|
||
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">students</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'dave'</span><span class="p">,</span> <span class="s1">'john'</span><span class="p">,</span> <span class="s1">'jane'</span><span class="p">]</span>
|
||
<span class="gp">>>> </span><span class="n">newgrades</span> <span class="o">=</span> <span class="p">{</span><span class="s1">'john'</span><span class="p">:</span> <span class="s1">'F'</span><span class="p">,</span> <span class="s1">'jane'</span><span class="p">:</span><span class="s1">'A'</span><span class="p">,</span> <span class="s1">'dave'</span><span class="p">:</span> <span class="s1">'C'</span><span class="p">}</span>
|
||
<span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">students</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">newgrades</span><span class="o">.</span><span class="fm">__getitem__</span><span class="p">)</span>
|
||
<span class="go">['jane', 'dave', 'john']</span>
|
||
</pre></div>
|
||
</div>
|
||
</li>
|
||
</ul>
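<p>The <code>__lt__()</code> item in the list above relies on the <code>Student</code> class and the <code>student_objects</code> list defined earlier in this document. A self-contained sketch might look like the following; the class body shown here is a simplified assumption for illustration, and the name <code>by_age</code> is hypothetical.</p>

```python
class Student:
    """Simplified record type, assumed here for illustration only."""

    def __init__(self, name, grade, age):
        self.name = name
        self.grade = grade
        self.age = age

    def __repr__(self):
        # Display each student as a (name, grade, age) tuple.
        return repr((self.name, self.grade, self.age))

    def __lt__(self, other):
        # The sort routines only need __lt__ to order two objects.
        return self.age < other.age


student_objects = [
    Student('john', 'A', 15),
    Student('jane', 'B', 12),
    Student('dave', 'B', 10),
]

# No key function is needed: sorted() falls back on Student.__lt__.
by_age = sorted(student_objects)
print(by_age)  # [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
```

<p>Defining <code>__lt__()</code> inside the class body, rather than assigning a lambda afterwards, keeps the default ordering next to the data it compares.</p>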
|
||
</div>
|
||
</div>
|
||
|
||
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<div class="sphinxsidebar" role="navigation" aria-label="main navigation">
|
||
<div class="sphinxsidebarwrapper">
|
||
<h3><a href="../contents.html">Table of Contents</a></h3>
|
||
<ul>
|
||
<li><a class="reference internal" href="#">Sorting HOW TO</a><ul>
|
||
<li><a class="reference internal" href="#sorting-basics">Sorting Basics</a></li>
|
||
<li><a class="reference internal" href="#key-functions">Key Functions</a></li>
|
||
<li><a class="reference internal" href="#operator-module-functions">Operator Module Functions</a></li>
|
||
<li><a class="reference internal" href="#ascending-and-descending">Ascending and Descending</a></li>
|
||
<li><a class="reference internal" href="#sort-stability-and-complex-sorts">Sort Stability and Complex Sorts</a></li>
|
||
<li><a class="reference internal" href="#the-old-way-using-decorate-sort-undecorate">The Old Way Using Decorate-Sort-Undecorate</a></li>
|
||
<li><a class="reference internal" href="#the-old-way-using-the-cmp-parameter">The Old Way Using the <em>cmp</em> Parameter</a></li>
|
||
<li><a class="reference internal" href="#odd-and-ends">Odds and Ends</a></li>
|
||
</ul>
|
||
</li>
|
||
</ul>
|
||
|
||
|
||
</div>
|
||
</div>
|
||
<div class="clearer"></div>
|
||
</div>
|
||
|
||
<div class="footer">
|
||
© <a href="../copyright.html">Copyright</a> 2001-2019, Python Software Foundation.
|
||
<br />
|
||
The Python Software Foundation is a non-profit corporation.
|
||
<a href="https://www.python.org/psf/donations/">Please donate.</a>
|
||
<br />
|
||
Last updated on Jul 13, 2019.
|
||
<a href="../bugs.html">Found a bug</a>?
|
||
<br />
|
||
Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 2.0.1.
|
||
</div>
|
||
|
||
</body>
|
||
</html> |