<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<title>statistics — Mathematical statistics functions &#8212; Python 3.7.4 documentation</title>
<link rel="stylesheet" href="../_static/pydoctheme.css" type="text/css" />
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<script type="text/javascript" id="documentation_options" data-url_root="../" src="../_static/documentation_options.js"></script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<script type="text/javascript" src="../_static/language_data.js"></script>
<script type="text/javascript" src="../_static/sidebar.js"></script>
<link rel="search" type="application/opensearchdescription+xml"
title="Search within Python 3.7.4 documentation"
href="../_static/opensearch.xml"/>
<link rel="author" title="About these documents" href="../about.html" />
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="copyright" title="Copyright" href="../copyright.html" />
<link rel="next" title="Functional Programming Modules" href="functional.html" />
<link rel="prev" title="random — Generate pseudo-random numbers" href="random.html" />
<link rel="shortcut icon" type="image/png" href="../_static/py.png" />
<link rel="canonical" href="https://docs.python.org/3/library/statistics.html" />
<script type="text/javascript" src="../_static/copybutton.js"></script>
<script type="text/javascript" src="../_static/switchers.js"></script>
<style>
@media only screen {
table.full-width-table {
width: 100%;
}
}
</style>
</head><body>
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="../genindex.html" title="General Index"
accesskey="I">index</a></li>
<li class="right" >
<a href="../py-modindex.html" title="Python Module Index"
>modules</a> |</li>
<li class="right" >
<a href="functional.html" title="Functional Programming Modules"
accesskey="N">next</a> |</li>
<li class="right" >
<a href="random.html" title="random — Generate pseudo-random numbers"
accesskey="P">previous</a> |</li>
<li><img src="../_static/py.png" alt=""
style="vertical-align: middle; margin-top: -1px"/></li>
<li><a href="https://www.python.org/">Python</a> &#187;</li>
<li>
<span class="language_switcher_placeholder">en</span>
<span class="version_switcher_placeholder">3.7.4</span>
<a href="../index.html">Documentation </a> &#187;
</li>
<li class="nav-item nav-item-1"><a href="index.html" >The Python Standard Library</a> &#187;</li>
<li class="nav-item nav-item-2"><a href="numeric.html" accesskey="U">Numeric and Mathematical Modules</a> &#187;</li>
<li class="right">
<div class="inline-search" style="display: none" role="search">
<form class="inline-search" action="../search.html" method="get">
<input placeholder="Quick search" type="text" name="q" />
<input type="submit" value="Go" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
<script type="text/javascript">$('.inline-search').show(0);</script>
|
</li>
</ul>
</div>
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<div class="section" id="module-statistics">
<span id="statistics-mathematical-statistics-functions"></span><h1><a class="reference internal" href="#module-statistics" title="statistics: mathematical statistics functions"><code class="xref py py-mod docutils literal notranslate"><span class="pre">statistics</span></code></a> — Mathematical statistics functions<a class="headerlink" href="#module-statistics" title="Permalink to this headline"></a></h1>
<div class="versionadded">
<p><span class="versionmodified added">New in version 3.4.</span></p>
</div>
<p><strong>Source code:</strong> <a class="reference external" href="https://github.com/python/cpython/tree/3.7/Lib/statistics.py">Lib/statistics.py</a></p>
<hr class="docutils" />
<p>This module provides functions for calculating mathematical statistics of
numeric (<code class="xref py py-class docutils literal notranslate"><span class="pre">Real</span></code>-valued) data.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Unless explicitly noted otherwise, these functions support <a class="reference internal" href="functions.html#int" title="int"><code class="xref py py-class docutils literal notranslate"><span class="pre">int</span></code></a>,
<a class="reference internal" href="functions.html#float" title="float"><code class="xref py py-class docutils literal notranslate"><span class="pre">float</span></code></a>, <a class="reference internal" href="decimal.html#decimal.Decimal" title="decimal.Decimal"><code class="xref py py-class docutils literal notranslate"><span class="pre">decimal.Decimal</span></code></a> and <a class="reference internal" href="fractions.html#fractions.Fraction" title="fractions.Fraction"><code class="xref py py-class docutils literal notranslate"><span class="pre">fractions.Fraction</span></code></a>.
Behaviour with other types (whether in the numeric tower or not) is
currently unsupported. Mixed types are also undefined and
implementation-dependent. If your input data consists of mixed types,
you may be able to use <a class="reference internal" href="functions.html#map" title="map"><code class="xref py py-func docutils literal notranslate"><span class="pre">map()</span></code></a> to ensure a consistent result, e.g.
<code class="docutils literal notranslate"><span class="pre">map(float,</span> <span class="pre">input_data)</span></code>.</p>
</div>
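<p>As a minimal sketch of the <code class="docutils literal notranslate"><span class="pre">map(float,</span> <span class="pre">input_data)</span></code> suggestion in the note above, the following converts a mixed list to floats before averaging. The contents of <code class="docutils literal notranslate"><span class="pre">input_data</span></code> are made up for illustration.</p>

```python
from fractions import Fraction
from statistics import mean

# Hypothetical input that mixes int, float and Fraction values.
input_data = [1, 2.5, Fraction(7, 2)]

# Convert every value to float first so mean() sees a single type.
result = mean(map(float, input_data))
print(result)
```

<p>Converting to <code class="docutils literal notranslate"><span class="pre">float</span></code> trades exactness for consistency, so it may not suit data that needs exact <code class="docutils literal notranslate"><span class="pre">Fraction</span></code> or <code class="docutils literal notranslate"><span class="pre">Decimal</span></code> arithmetic.</p>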
<div class="section" id="averages-and-measures-of-central-location">
<h2>Averages and measures of central location<a class="headerlink" href="#averages-and-measures-of-central-location" title="Permalink to this headline"></a></h2>
<p>These functions calculate an average or typical value from a population
or sample.</p>
<table class="docutils align-center">
<colgroup>
<col style="width: 34%" />
<col style="width: 66%" />
</colgroup>
<tbody>
<tr class="row-odd"><td><p><a class="reference internal" href="#statistics.mean" title="statistics.mean"><code class="xref py py-func docutils literal notranslate"><span class="pre">mean()</span></code></a></p></td>
<td><p>Arithmetic mean (“average”) of data.</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal" href="#statistics.harmonic_mean" title="statistics.harmonic_mean"><code class="xref py py-func docutils literal notranslate"><span class="pre">harmonic_mean()</span></code></a></p></td>
<td><p>Harmonic mean of data.</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal" href="#statistics.median" title="statistics.median"><code class="xref py py-func docutils literal notranslate"><span class="pre">median()</span></code></a></p></td>
<td><p>Median (middle value) of data.</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal" href="#statistics.median_low" title="statistics.median_low"><code class="xref py py-func docutils literal notranslate"><span class="pre">median_low()</span></code></a></p></td>
<td><p>Low median of data.</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal" href="#statistics.median_high" title="statistics.median_high"><code class="xref py py-func docutils literal notranslate"><span class="pre">median_high()</span></code></a></p></td>
<td><p>High median of data.</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal" href="#statistics.median_grouped" title="statistics.median_grouped"><code class="xref py py-func docutils literal notranslate"><span class="pre">median_grouped()</span></code></a></p></td>
<td><p>Median, or 50th percentile, of grouped data.</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal" href="#statistics.mode" title="statistics.mode"><code class="xref py py-func docutils literal notranslate"><span class="pre">mode()</span></code></a></p></td>
<td><p>Mode (most common value) of discrete data.</p></td>
</tr>
</tbody>
</table>
</div>
<div class="section" id="measures-of-spread">
<h2>Measures of spread<a class="headerlink" href="#measures-of-spread" title="Permalink to this headline"></a></h2>
<p>These functions calculate a measure of how much the population or sample
tends to deviate from the typical or average values.</p>
<table class="docutils align-center">
<colgroup>
<col style="width: 34%" />
<col style="width: 66%" />
</colgroup>
<tbody>
<tr class="row-odd"><td><p><a class="reference internal" href="#statistics.pstdev" title="statistics.pstdev"><code class="xref py py-func docutils literal notranslate"><span class="pre">pstdev()</span></code></a></p></td>
<td><p>Population standard deviation of data.</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal" href="#statistics.pvariance" title="statistics.pvariance"><code class="xref py py-func docutils literal notranslate"><span class="pre">pvariance()</span></code></a></p></td>
<td><p>Population variance of data.</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal" href="#statistics.stdev" title="statistics.stdev"><code class="xref py py-func docutils literal notranslate"><span class="pre">stdev()</span></code></a></p></td>
<td><p>Sample standard deviation of data.</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal" href="#statistics.variance" title="statistics.variance"><code class="xref py py-func docutils literal notranslate"><span class="pre">variance()</span></code></a></p></td>
<td><p>Sample variance of data.</p></td>
</tr>
</tbody>
</table>
</div>
<div class="section" id="function-details">
<h2>Function details<a class="headerlink" href="#function-details" title="Permalink to this headline"></a></h2>
<p>Note: The functions do not require the data given to them to be sorted.
However, for reading convenience, most of the examples show sorted sequences.</p>
<dl class="function">
<dt id="statistics.mean">
<code class="descclassname">statistics.</code><code class="descname">mean</code><span class="sig-paren">(</span><em>data</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.mean" title="Permalink to this definition"></a></dt>
<dd><p>Return the sample arithmetic mean of <em>data</em>, which can be a sequence or iterator.</p>
<p>The arithmetic mean is the sum of the data divided by the number of data
points. It is commonly called “the average”, although it is only one of many
different mathematical averages. It is a measure of the central location of
the data.</p>
<p>If <em>data</em> is empty, <a class="reference internal" href="#statistics.StatisticsError" title="statistics.StatisticsError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">StatisticsError</span></code></a> will be raised.</p>
<p>Some examples of use:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">mean</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span>
<span class="go">2.8</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">mean</span><span class="p">([</span><span class="o">-</span><span class="mf">1.0</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">,</span> <span class="mf">3.25</span><span class="p">,</span> <span class="mf">5.75</span><span class="p">])</span>
<span class="go">2.625</span>
<span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">fractions</span> <span class="k">import</span> <span class="n">Fraction</span> <span class="k">as</span> <span class="n">F</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">mean</span><span class="p">([</span><span class="n">F</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">7</span><span class="p">),</span> <span class="n">F</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">21</span><span class="p">),</span> <span class="n">F</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> <span class="n">F</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)])</span>
<span class="go">Fraction(13, 21)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">decimal</span> <span class="k">import</span> <span class="n">Decimal</span> <span class="k">as</span> <span class="n">D</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">mean</span><span class="p">([</span><span class="n">D</span><span class="p">(</span><span class="s2">&quot;0.5&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;0.75&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;0.625&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;0.375&quot;</span><span class="p">)])</span>
<span class="go">Decimal(&#39;0.5625&#39;)</span>
</pre></div>
</div>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>The mean is strongly affected by outliers and is not a robust estimator
for central location: the mean is not necessarily a typical example of the
data points. For more robust, although less efficient, measures of
central location, see <a class="reference internal" href="#statistics.median" title="statistics.median"><code class="xref py py-func docutils literal notranslate"><span class="pre">median()</span></code></a> and <a class="reference internal" href="#statistics.mode" title="statistics.mode"><code class="xref py py-func docutils literal notranslate"><span class="pre">mode()</span></code></a>. (In this case,
“efficient” refers to statistical efficiency rather than computational
efficiency.)</p>
<p>The sample mean gives an unbiased estimate of the true population mean,
which means that, taken on average over all the possible samples,
<code class="docutils literal notranslate"><span class="pre">mean(sample)</span></code> converges on the true mean of the entire population. If
<em>data</em> represents the entire population rather than a sample, then
<code class="docutils literal notranslate"><span class="pre">mean(data)</span></code> is equivalent to calculating the true population mean μ.</p>
</div>
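<p>The sensitivity to outliers described in the note can be seen in a small experiment. This is only a sketch: the sample values are invented, and the comparison uses <code class="docutils literal notranslate"><span class="pre">median()</span></code> as the robust alternative.</p>

```python
from statistics import mean, median

# Hypothetical sample: five typical values plus one extreme outlier.
typical = [10, 11, 12, 13, 14]
with_outlier = typical + [1000]

# How far each statistic moves when the outlier is added.
mean_shift = mean(with_outlier) - mean(typical)
median_shift = median(with_outlier) - median(typical)
print(mean_shift, median_shift)
```

<p>Here the single outlier moves the mean by more than 100, while the median moves by only half a unit.</p>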
</dd></dl>
<dl class="function">
<dt id="statistics.harmonic_mean">
<code class="descclassname">statistics.</code><code class="descname">harmonic_mean</code><span class="sig-paren">(</span><em>data</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.harmonic_mean" title="Permalink to this definition"></a></dt>
<dd><p>Return the harmonic mean of <em>data</em>, a sequence or iterator of
real-valued numbers.</p>
<p>The harmonic mean, sometimes called the subcontrary mean, is the
reciprocal of the arithmetic <a class="reference internal" href="#statistics.mean" title="statistics.mean"><code class="xref py py-func docutils literal notranslate"><span class="pre">mean()</span></code></a> of the reciprocals of the
data. For example, the harmonic mean of three values <em>a</em>, <em>b</em> and <em>c</em>
will be equivalent to <code class="docutils literal notranslate"><span class="pre">3/(1/a</span> <span class="pre">+</span> <span class="pre">1/b</span> <span class="pre">+</span> <span class="pre">1/c)</span></code>.</p>
<p>The harmonic mean is a type of average, a measure of the central
location of the data. It is often appropriate when averaging quantities
which are rates or ratios, for example speeds. For example:</p>
<p>Suppose an investor purchases an equal value of shares in each of
three companies, with P/E (price/earning) ratios of 2.5, 3 and 10.
What is the average P/E ratio for the investor’s portfolio?</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">harmonic_mean</span><span class="p">([</span><span class="mf">2.5</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">10</span><span class="p">])</span> <span class="c1"># For an equal investment portfolio.</span>
<span class="go">3.6</span>
</pre></div>
</div>
<p>Using the arithmetic mean would give an average of about 5.167, which
is too high.</p>
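<p>To see where 3.6 comes from, the definition can be applied by hand: take the reciprocal of the arithmetic mean of the reciprocals. For an equal amount invested in each company, this is the total price divided by the total earnings. The variable names below are illustrative only.</p>

```python
from statistics import harmonic_mean, mean

pe_ratios = [2.5, 3, 10]

# Reciprocal of the arithmetic mean of the reciprocals.
by_hand = 1 / mean(1 / x for x in pe_ratios)
print(by_hand, harmonic_mean(pe_ratios))
```

<p>Both values should agree up to floating-point rounding.</p>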
<p><a class="reference internal" href="#statistics.StatisticsError" title="statistics.StatisticsError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">StatisticsError</span></code></a> is raised if <em>data</em> is empty, or any element
is less than zero.</p>
<div class="versionadded">
<p><span class="versionmodified added">New in version 3.6.</span></p>
</div>
</dd></dl>
<dl class="function">
<dt id="statistics.median">
<code class="descclassname">statistics.</code><code class="descname">median</code><span class="sig-paren">(</span><em>data</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.median" title="Permalink to this definition"></a></dt>
<dd><p>Return the median (middle value) of numeric data, using the common “mean of
middle two” method. If <em>data</em> is empty, <a class="reference internal" href="#statistics.StatisticsError" title="statistics.StatisticsError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">StatisticsError</span></code></a> is raised.
<em>data</em> can be a sequence or iterator.</p>
<p>The median is a robust measure of central location, and is less affected by
the presence of outliers in your data. When the number of data points is
odd, the middle data point is returned:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">median</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">])</span>
<span class="go">3</span>
</pre></div>
</div>
<p>When the number of data points is even, the median is interpolated by taking
the average of the two middle values:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">median</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">7</span><span class="p">])</span>
<span class="go">4.0</span>
</pre></div>
</div>
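<p>The “mean of middle two” rule can be sketched as follows. This is an illustrative re-implementation under the assumptions described above, not the module’s actual source, and <code class="docutils literal notranslate"><span class="pre">median_sketch</span></code> is a hypothetical name.</p>

```python
from statistics import median

def median_sketch(data):
    """Illustrative re-implementation; not the module's actual source."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("no median for empty data")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the middle element is the median.
        return ordered[mid]
    # Even count: average the two middle elements.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_sketch([1, 3, 5]), median_sketch([7, 1, 5, 3]))
```

<p>Sorting a copy of the input is why the data does not need to be sorted beforehand.</p>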
<p>This is suited for when your data is discrete, and you don’t mind that the
median may not be an actual data point.</p>
<p>If your data is ordinal (supports order operations) but not numeric (doesn’t
support addition), you should use <a class="reference internal" href="#statistics.median_low" title="statistics.median_low"><code class="xref py py-func docutils literal notranslate"><span class="pre">median_low()</span></code></a> or <a class="reference internal" href="#statistics.median_high" title="statistics.median_high"><code class="xref py py-func docutils literal notranslate"><span class="pre">median_high()</span></code></a>
instead.</p>
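<p>As a small sketch of the ordinal case, the letter grades below can be sorted but cannot be added or divided, so only the low and high medians apply. The grade data is hypothetical.</p>

```python
from statistics import median_low, median_high

# Hypothetical ordinal data: grades sort alphabetically but cannot be averaged.
grades = ["A", "B", "B", "C", "D", "D"]

low = median_low(grades)
high = median_high(grades)
print(low, high)
```

<p>With an even number of grades, the two functions return the two middle values of the sorted data, here <code class="docutils literal notranslate"><span class="pre">'B'</span></code> and <code class="docutils literal notranslate"><span class="pre">'C'</span></code>.</p>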
<div class="admonition seealso">
<p class="admonition-title">See also</p>
<p><a class="reference internal" href="#statistics.median_low" title="statistics.median_low"><code class="xref py py-func docutils literal notranslate"><span class="pre">median_low()</span></code></a>, <a class="reference internal" href="#statistics.median_high" title="statistics.median_high"><code class="xref py py-func docutils literal notranslate"><span class="pre">median_high()</span></code></a>, <a class="reference internal" href="#statistics.median_grouped" title="statistics.median_grouped"><code class="xref py py-func docutils literal notranslate"><span class="pre">median_grouped()</span></code></a></p>
</div>
</dd></dl>
<dl class="function">
<dt id="statistics.median_low">
<code class="descclassname">statistics.</code><code class="descname">median_low</code><span class="sig-paren">(</span><em>data</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.median_low" title="Permalink to this definition"></a></dt>
<dd><p>Return the low median of numeric data. If <em>data</em> is empty,
<a class="reference internal" href="#statistics.StatisticsError" title="statistics.StatisticsError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">StatisticsError</span></code></a> is raised. <em>data</em> can be a sequence or iterator.</p>
<p>The low median is always a member of the data set. When the number of data
points is odd, the middle value is returned. When it is even, the smaller of
the two middle values is returned.</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">median_low</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">])</span>
<span class="go">3</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">median_low</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">7</span><span class="p">])</span>
<span class="go">3</span>
</pre></div>
</div>
<p>Use the low median when your data are discrete and you prefer the median to
be an actual data point rather than interpolated.</p>
</dd></dl>
<dl class="function">
<dt id="statistics.median_high">
<code class="descclassname">statistics.</code><code class="descname">median_high</code><span class="sig-paren">(</span><em>data</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.median_high" title="Permalink to this definition"></a></dt>
<dd><p>Return the high median of data. If <em>data</em> is empty, <a class="reference internal" href="#statistics.StatisticsError" title="statistics.StatisticsError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">StatisticsError</span></code></a>
is raised. <em>data</em> can be a sequence or iterator.</p>
<p>The high median is always a member of the data set. When the number of data
points is odd, the middle value is returned. When it is even, the larger of
the two middle values is returned.</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">median_high</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">])</span>
<span class="go">3</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">median_high</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">7</span><span class="p">])</span>
<span class="go">5</span>
</pre></div>
</div>
<p>Use the high median when your data are discrete and you prefer the median to
be an actual data point rather than interpolated.</p>
</dd></dl>
<dl class="function">
<dt id="statistics.median_grouped">
<code class="descclassname">statistics.</code><code class="descname">median_grouped</code><span class="sig-paren">(</span><em>data</em>, <em>interval=1</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.median_grouped" title="Permalink to this definition"></a></dt>
<dd><p>Return the median of grouped continuous data, calculated as the 50th
percentile, using interpolation. If <em>data</em> is empty, <a class="reference internal" href="#statistics.StatisticsError" title="statistics.StatisticsError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">StatisticsError</span></code></a>
is raised. <em>data</em> can be a sequence or iterator.</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">median_grouped</span><span class="p">([</span><span class="mi">52</span><span class="p">,</span> <span class="mi">52</span><span class="p">,</span> <span class="mi">53</span><span class="p">,</span> <span class="mi">54</span><span class="p">])</span>
<span class="go">52.5</span>
</pre></div>
</div>
<p>In the following example, the data are rounded, so that each value represents
the midpoint of data classes, e.g. 1 is the midpoint of the class 0.5–1.5, 2
is the midpoint of 1.5–2.5, 3 is the midpoint of 2.5–3.5, etc. With the data
given, the middle value falls somewhere in the class 3.5–4.5, and
interpolation is used to estimate it:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">median_grouped</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">])</span>
<span class="go">3.7</span>
</pre></div>
</div>
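<p>The interpolation can be sketched with the usual textbook formula for a grouped median, <code class="docutils literal notranslate"><span class="pre">L</span> <span class="pre">+</span> <span class="pre">interval</span> <span class="pre">*</span> <span class="pre">(n/2</span> <span class="pre">-</span> <span class="pre">cf)</span> <span class="pre">/</span> <span class="pre">f</span></code>, where <em>L</em> is the lower limit of the median class, <em>cf</em> is the number of values below that class and <em>f</em> is the number of values in it. The helper name and its internals are assumptions made for illustration, and the sketch assumes non-empty numeric data.</p>

```python
from statistics import median_grouped

def grouped_median_sketch(data, interval=1):
    """Textbook grouped-median interpolation; an illustration only."""
    ordered = sorted(data)
    n = len(ordered)
    x = ordered[n // 2]          # midpoint of the class holding the median
    lower = x - interval / 2     # lower limit of that class
    cf = ordered.index(x)        # number of values below the median class
    f = ordered.count(x)         # number of values in the median class
    return lower + interval * (n / 2 - cf) / f

sample = [1, 2, 2, 3, 4, 4, 4, 4, 4, 5]
print(grouped_median_sketch(sample), median_grouped(sample))
```

<p>For the sample above, the median class is 3.5–4.5 with four values below it and five inside it, which gives 3.5 + (5 - 4) / 5.</p>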
<p>Optional argument <em>interval</em> represents the class interval, and defaults
to 1. Changing the class interval changes the interpolation:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">median_grouped</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">7</span><span class="p">],</span> <span class="n">interval</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="go">3.25</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">median_grouped</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">7</span><span class="p">],</span> <span class="n">interval</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="go">3.5</span>
</pre></div>
</div>
<p>This function does not check whether the data points are at least
<em>interval</em> apart.</p>
<div class="impl-detail compound">
<p><strong>CPython implementation detail:</strong> Under some circumstances, <a class="reference internal" href="#statistics.median_grouped" title="statistics.median_grouped"><code class="xref py py-func docutils literal notranslate"><span class="pre">median_grouped()</span></code></a> may coerce data points to
floats. This behaviour is likely to change in the future.</p>
</div>
<div class="admonition seealso">
<p class="admonition-title">See also</p>
<ul class="simple">
<li><p>“Statistics for the Behavioral Sciences”, Frederick J Gravetter and
Larry B Wallnau (8th Edition).</p></li>
<li><p>The <a class="reference external" href="https://help.gnome.org/users/gnumeric/stable/gnumeric.html#gnumeric-function-SSMEDIAN">SSMEDIAN</a>
function in the Gnome Gnumeric spreadsheet, including <a class="reference external" href="https://mail.gnome.org/archives/gnumeric-list/2011-April/msg00018.html">this discussion</a>.</p></li>
</ul>
</div>
</dd></dl>
<dl class="function">
<dt id="statistics.mode">
<code class="descclassname">statistics.</code><code class="descname">mode</code><span class="sig-paren">(</span><em>data</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.mode" title="Permalink to this definition"></a></dt>
<dd><p>Return the most common data point from discrete or nominal <em>data</em>. The mode
(when it exists) is the most typical value, and is a robust measure of
central location.</p>
<p>If <em>data</em> is empty, or if there is not exactly one most common value,
<a class="reference internal" href="#statistics.StatisticsError" title="statistics.StatisticsError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">StatisticsError</span></code></a> is raised.</p>
<p><code class="docutils literal notranslate"><span class="pre">mode</span></code> assumes discrete data, and returns a single value. This is the
standard treatment of the mode as commonly taught in schools:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">mode</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span>
<span class="go">3</span>
</pre></div>
</div>
<p>The mode is unique in that it is the only statistic which also applies
to nominal (non-numeric) data:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">mode</span><span class="p">([</span><span class="s2">&quot;red&quot;</span><span class="p">,</span> <span class="s2">&quot;blue&quot;</span><span class="p">,</span> <span class="s2">&quot;blue&quot;</span><span class="p">,</span> <span class="s2">&quot;red&quot;</span><span class="p">,</span> <span class="s2">&quot;green&quot;</span><span class="p">,</span> <span class="s2">&quot;red&quot;</span><span class="p">,</span> <span class="s2">&quot;red&quot;</span><span class="p">])</span>
<span class="go">&#39;red&#39;</span>
</pre></div>
</div>
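<p>When the data may have several equally common values, one possible workaround is to count the values directly instead of calling <code class="docutils literal notranslate"><span class="pre">mode()</span></code>. The sketch below uses <code class="docutils literal notranslate"><span class="pre">collections.Counter</span></code>. The helper <code class="docutils literal notranslate"><span class="pre">all_modes</span></code> is hypothetical and not part of this module.</p>

```python
from collections import Counter

def all_modes(data):
    """Hypothetical helper: every value tied for the highest count."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    # Keep the values whose count matches the highest count.
    return [value for value, count in counts.items() if count == top]

print(all_modes([1, 1, 2, 2, 3]))
```

<p>Returning a list leaves it to the caller to decide how to treat ties or empty input.</p>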
</dd></dl>
<dl class="function">
<dt id="statistics.pstdev">
<code class="descclassname">statistics.</code><code class="descname">pstdev</code><span class="sig-paren">(</span><em>data</em>, <em>mu=None</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.pstdev" title="Permalink to this definition"></a></dt>
<dd><p>Return the population standard deviation (the square root of the population
variance). See <a class="reference internal" href="#statistics.pvariance" title="statistics.pvariance"><code class="xref py py-func docutils literal notranslate"><span class="pre">pvariance()</span></code></a> for arguments and other details.</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">pstdev</span><span class="p">([</span><span class="mf">1.5</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">,</span> <span class="mf">2.75</span><span class="p">,</span> <span class="mf">3.25</span><span class="p">,</span> <span class="mf">4.75</span><span class="p">])</span>
<span class="go">0.986893273527251</span>
</pre></div>
</div>
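<p>The relationship to <code>pvariance()</code> can be checked directly. The
following sketch compares the two results with <code>math.isclose()</code>
rather than exact equality, on the assumption that the last digit may differ
because of floating-point rounding.</p>

```python
import math
from statistics import pstdev, pvariance

data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75]

# The population standard deviation is the square root of the population variance.
root_of_variance = math.sqrt(pvariance(data))
agrees = math.isclose(pstdev(data), root_of_variance)

print(agrees)  # True
```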
</dd></dl>
<dl class="function">
<dt id="statistics.pvariance">
<code class="descclassname">statistics.</code><code class="descname">pvariance</code><span class="sig-paren">(</span><em>data</em>, <em>mu=None</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.pvariance" title="Permalink to this definition"></a></dt>
<dd><p>Return the population variance of <em>data</em>, a non-empty iterable of real-valued
numbers. Variance, or second moment about the mean, is a measure of the
variability (spread or dispersion) of data. A large variance indicates that
the data is spread out; a small variance indicates it is clustered closely
around the mean.</p>
<p>If the optional second argument <em>mu</em> is given, it should be the mean of
<em>data</em>. If it is missing or <code class="docutils literal notranslate"><span class="pre">None</span></code> (the default), the mean is
automatically calculated.</p>
<p>Use this function to calculate the variance from the entire population. To
estimate the variance from a sample, the <a class="reference internal" href="#statistics.variance" title="statistics.variance"><code class="xref py py-func docutils literal notranslate"><span class="pre">variance()</span></code></a> function is usually
a better choice.</p>
<p>Raises <a class="reference internal" href="#statistics.StatisticsError" title="statistics.StatisticsError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">StatisticsError</span></code></a> if <em>data</em> is empty.</p>
<p>Examples:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">data</span> <span class="o">=</span> <span class="p">[</span><span class="mf">0.0</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">,</span> <span class="mf">1.25</span><span class="p">,</span> <span class="mf">1.5</span><span class="p">,</span> <span class="mf">1.75</span><span class="p">,</span> <span class="mf">2.75</span><span class="p">,</span> <span class="mf">3.25</span><span class="p">]</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">pvariance</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
<span class="go">1.25</span>
</pre></div>
</div>
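<p>For comparison, the same value can be obtained by hand as the mean of the
squared deviations from the mean. This is only a sketch of the definition, not
the module's implementation, and the names <code>squared_deviations</code> and
<code>by_hand</code> are illustrative.</p>

```python
from statistics import mean, pvariance

data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]

mu = mean(data)  # 1.375

# Population variance: average squared distance of each value from the mean.
squared_deviations = [(x - mu) ** 2 for x in data]
by_hand = sum(squared_deviations) / len(data)

print(by_hand)                     # 1.25
print(by_hand == pvariance(data))  # True
```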
<p>If you have already calculated the mean of your data, you can pass it as the
optional second argument <em>mu</em> to avoid recalculation:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">mu</span> <span class="o">=</span> <span class="n">mean</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">pvariance</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">mu</span><span class="p">)</span>
<span class="go">1.25</span>
</pre></div>
</div>
<p>This function does not attempt to verify that you have passed the actual mean
as <em>mu</em>. Using arbitrary values for <em>mu</em> may lead to invalid or impossible
results.</p>
<p>Decimals and Fractions are supported:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">decimal</span> <span class="k">import</span> <span class="n">Decimal</span> <span class="k">as</span> <span class="n">D</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">pvariance</span><span class="p">([</span><span class="n">D</span><span class="p">(</span><span class="s2">&quot;27.5&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;30.25&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;30.25&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;34.5&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;41.75&quot;</span><span class="p">)])</span>
<span class="go">Decimal(&#39;24.815&#39;)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">fractions</span> <span class="k">import</span> <span class="n">Fraction</span> <span class="k">as</span> <span class="n">F</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">pvariance</span><span class="p">([</span><span class="n">F</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span> <span class="n">F</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">4</span><span class="p">),</span> <span class="n">F</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">)])</span>
<span class="go">Fraction(13, 72)</span>
</pre></div>
</div>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>When called with the entire population, this gives the population variance
σ². When called on a sample instead, this is the biased sample variance
s², also known as variance with N degrees of freedom.</p>
<p>If you somehow know the true population mean μ, you may use this function
to calculate the variance of a sample, giving the known population mean as
the second argument. Provided the data points are representative
(e.g. independent and identically distributed), the result will be an
unbiased estimate of the population variance.</p>
</div>
</dd></dl>
<dl class="function">
<dt id="statistics.stdev">
<code class="descclassname">statistics.</code><code class="descname">stdev</code><span class="sig-paren">(</span><em>data</em>, <em>xbar=None</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.stdev" title="Permalink to this definition"></a></dt>
<dd><p>Return the sample standard deviation (the square root of the sample
variance). See <a class="reference internal" href="#statistics.variance" title="statistics.variance"><code class="xref py py-func docutils literal notranslate"><span class="pre">variance()</span></code></a> for arguments and other details.</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">stdev</span><span class="p">([</span><span class="mf">1.5</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">,</span> <span class="mf">2.75</span><span class="p">,</span> <span class="mf">3.25</span><span class="p">,</span> <span class="mf">4.75</span><span class="p">])</span>
<span class="go">1.0810874155219827</span>
</pre></div>
</div>
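<p>The sample and population standard deviations of the same data differ only by
a factor of the square root of n/(n-1), where n is the number of values. The
sketch below checks this with <code>math.isclose()</code>; the name
<code>rescaled</code> is illustrative.</p>

```python
import math
from statistics import pstdev, stdev

data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75]
n = len(data)

# Rescale the population figure (divisor n) to the sample figure (divisor n - 1).
rescaled = pstdev(data) * math.sqrt(n / (n - 1))

print(math.isclose(stdev(data), rescaled))  # True
```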
</dd></dl>
<dl class="function">
<dt id="statistics.variance">
<code class="descclassname">statistics.</code><code class="descname">variance</code><span class="sig-paren">(</span><em>data</em>, <em>xbar=None</em><span class="sig-paren">)</span><a class="headerlink" href="#statistics.variance" title="Permalink to this definition"></a></dt>
<dd><p>Return the sample variance of <em>data</em>, an iterable of at least two real-valued
numbers. Variance, or second moment about the mean, is a measure of the
variability (spread or dispersion) of data. A large variance indicates that
the data is spread out; a small variance indicates it is clustered closely
around the mean.</p>
<p>If the optional second argument <em>xbar</em> is given, it should be the mean of
<em>data</em>. If it is missing or <code class="docutils literal notranslate"><span class="pre">None</span></code> (the default), the mean is
automatically calculated.</p>
<p>Use this function when your data is a sample from a population. To calculate
the variance from the entire population, see <a class="reference internal" href="#statistics.pvariance" title="statistics.pvariance"><code class="xref py py-func docutils literal notranslate"><span class="pre">pvariance()</span></code></a>.</p>
<p>Raises <a class="reference internal" href="#statistics.StatisticsError" title="statistics.StatisticsError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">StatisticsError</span></code></a> if <em>data</em> has fewer than two values.</p>
<p>Examples:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">data</span> <span class="o">=</span> <span class="p">[</span><span class="mf">2.75</span><span class="p">,</span> <span class="mf">1.75</span><span class="p">,</span> <span class="mf">1.25</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">1.25</span><span class="p">,</span> <span class="mf">3.5</span><span class="p">]</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">variance</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
<span class="go">1.3720238095238095</span>
</pre></div>
</div>
<p>If you have already calculated the mean of your data, you can pass it as the
optional second argument <em>xbar</em> to avoid recalculation:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">m</span> <span class="o">=</span> <span class="n">mean</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">variance</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">m</span><span class="p">)</span>
<span class="go">1.3720238095238095</span>
</pre></div>
</div>
<p>This function does not attempt to verify that you have passed the actual mean
as <em>xbar</em>. Using arbitrary values for <em>xbar</em> can lead to invalid or
impossible results.</p>
<p>Decimal and Fraction values are supported:</p>
<div class="highlight-pycon3 notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">decimal</span> <span class="k">import</span> <span class="n">Decimal</span> <span class="k">as</span> <span class="n">D</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">variance</span><span class="p">([</span><span class="n">D</span><span class="p">(</span><span class="s2">&quot;27.5&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;30.25&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;30.25&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;34.5&quot;</span><span class="p">),</span> <span class="n">D</span><span class="p">(</span><span class="s2">&quot;41.75&quot;</span><span class="p">)])</span>
<span class="go">Decimal(&#39;31.01875&#39;)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">fractions</span> <span class="k">import</span> <span class="n">Fraction</span> <span class="k">as</span> <span class="n">F</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">variance</span><span class="p">([</span><span class="n">F</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">6</span><span class="p">),</span> <span class="n">F</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> <span class="n">F</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">)])</span>
<span class="go">Fraction(67, 108)</span>
</pre></div>
</div>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>This is the sample variance s² with Bessel's correction, also known as
variance with N-1 degrees of freedom. Provided that the data points are
representative (e.g. independent and identically distributed), the result
should be an unbiased estimate of the true population variance.</p>
<p>If you somehow know the actual population mean μ, you should pass it to the
<a class="reference internal" href="#statistics.pvariance" title="statistics.pvariance"><code class="xref py py-func docutils literal notranslate"><span class="pre">pvariance()</span></code></a> function as the <em>mu</em> parameter to get the variance of a
sample.</p>
</div>
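<p>To make the correction concrete, the sketch below computes the sum of squared
deviations by hand and divides it once by N (the biased figure) and once by N-1
(the corrected figure). It illustrates the definition only and is not the
module's implementation; the variable names are illustrative.</p>

```python
import math
from statistics import mean, variance

data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
n = len(data)
xbar = mean(data)

# Sum of squared deviations from the sample mean.
sum_of_squares = sum((x - xbar) ** 2 for x in data)

biased = sum_of_squares / n          # divisor N
unbiased = sum_of_squares / (n - 1)  # divisor N-1 (Bessel's correction)

print(math.isclose(unbiased, variance(data)))  # True
print(unbiased > biased)                       # True
```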
</dd></dl>
</div>
<div class="section" id="exceptions">
<h2>Exceptions<a class="headerlink" href="#exceptions" title="Permalink to this headline"></a></h2>
<p>A single exception is defined:</p>
<dl class="exception">
<dt id="statistics.StatisticsError">
<em class="property">exception </em><code class="descclassname">statistics.</code><code class="descname">StatisticsError</code><a class="headerlink" href="#statistics.StatisticsError" title="Permalink to this definition"></a></dt>
<dd><p>Subclass of <a class="reference internal" href="exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a> for statistics-related exceptions.</p>
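<p>Because it subclasses <code>ValueError</code>, the exception can be caught
under either name. The following sketch wraps <code>variance()</code> in a
hypothetical helper, <code>variance_or_none()</code>, which is not part of the
module.</p>

```python
from statistics import StatisticsError, variance


def variance_or_none(values):
    """Return the sample variance, or None when there are too few values."""
    try:
        return variance(values)
    except StatisticsError:
        # variance() needs at least two data points.
        return None


print(variance_or_none([1.0]))                  # None
print(variance_or_none([1.0, 3.0]))             # 2.0
print(issubclass(StatisticsError, ValueError))  # True
```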
</dd></dl>
</div>
</div>
</div>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer">
&copy; <a href="../copyright.html">Copyright</a> 2001-2019, Python Software Foundation.
<br />
The Python Software Foundation is a non-profit corporation.
<a href="https://www.python.org/psf/donations/">Please donate.</a>
<br />
Last updated on Jul 13, 2019.
<a href="../bugs.html">Found a bug</a>?
<br />
Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 2.0.1.
</div>
</body>
</html>