<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Distance Metrics
</title>
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta name="author" content="Christoph Ruegg, Marcus Cuda, Jurgen Van Gael">
<link rel="stylesheet" id="theme_link" href="https://cdnjs.cloudflare.com/ajax/libs/bootswatch/4.6.0/materia/bootstrap.min.css">
<script src="https://code.jquery.com/jquery-3.4.1.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@4.6.0/dist/js/bootstrap.bundle.min.js" integrity="sha384-Piv4xVNRyMGpqkS2by6br4gNJ7DXjqk09RmUpJ8jgGtD7zP9yug3goQfGII0yAns" crossorigin="anonymous"></script>
<script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.7/MathJax.js?config=TeX-MML-AM_CHTML"></script>
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico">
<link type="text/css" rel="stylesheet" href="https://numerics.mathdotnet.com/content/navbar-fixed-left.css" />
<link type="text/css" rel="stylesheet" href="https://numerics.mathdotnet.com/content/fsdocs-default.css" />
<link type="text/css" rel="stylesheet" href="https://numerics.mathdotnet.com/content/fsdocs-custom.css" />
<script type="text/javascript" src="https://numerics.mathdotnet.com/content/fsdocs-tips.js"></script>
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<!-- BEGIN SEARCH BOX: this adds support for the search box -->
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/JavaScript-autoComplete/1.0.4/auto-complete.css" />
<!-- END SEARCH BOX: this adds support for the search box -->
</head>
<body>
<nav class="navbar navbar-expand-md navbar-light bg-secondary fixed-left" id="fsdocs-nav">
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarsExampleDefault" aria-controls="navbarsExampleDefault" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse navbar-nav-scroll" id="navbarsExampleDefault">
<a href="https://numerics.mathdotnet.com/"><img id="fsdocs-logo" src="/logo.png" /></a>
<!-- BEGIN SEARCH BOX: this adds support for the search box -->
<div id="header">
<div class="searchbox" id="fsdocs-searchbox">
<label for="search-by">
<i class="fas fa-search"></i>
</label>
<input data-search-input="" id="search-by" type="search" placeholder="Search..." />
<span data-search-clear="">
<i class="fas fa-times"></i>
</span>
</div>
</div>
<!-- END SEARCH BOX: this adds support for the search box -->
<ul class="navbar-nav">
<li class="nav-header">Math.NET Numerics</li>
<li class="nav-item"><a class="nav-link" href="Packages.html">NuGet & Binaries</a></li>
<li class="nav-item"><a class="nav-link" href="ReleaseNotes.html">Release Notes</a></li>
<li class="nav-item"><a class="nav-link" href="https://github.com/mathnet/mathnet-numerics/blob/master/LICENSE.md">MIT License</a></li>
<li class="nav-item"><a class="nav-link" href="Compatibility.html">Platform Support</a></li>
<li class="nav-item"><a class="nav-link" href="https://numerics.mathdotnet.com/api/">Class Reference</a></li>
<li class="nav-item"><a class="nav-link" href="https://github.com/mathnet/mathnet-numerics/issues">Issues & Bugs</a></li>
<li class="nav-item"><a class="nav-link" href="Users.html">Who is using Math.NET?</a></li>
<li class="nav-header">Contributing</li>
<li class="nav-item"><a class="nav-link" href="Contributors.html">Contributors</a></li>
<li class="nav-item"><a class="nav-link" href="Contributing.html">Contributing</a></li>
<li class="nav-item"><a class="nav-link" href="Build.html">Build & Tools</a></li>
<li class="nav-item"><a class="nav-link" href="https://github.com/mathnet/mathnet-numerics/discussions/categories/ideas">Your Ideas</a></li>
<li class="nav-header">Getting Help</li>
<li class="nav-item"><a class="nav-link" href="https://discuss.mathdotnet.com/c/numerics">Discuss</a></li>
<li class="nav-item"><a class="nav-link" href="https://stackoverflow.com/questions/tagged/mathdotnet">Stack Overflow</a></li>
<li class="nav-header">Getting Started</li>
<l class="nav-item"i><a class="nav-link" href="/">Getting started</a></li>
<li class="nav-item"><a class="nav-link" href="Constants.html">Constants</a></li>
<li class="nav-item"><a class="nav-link" href="Matrix.html">Matrices and Vectors</a></li>
<li class="nav-item"><a class="nav-link" href="Euclid.html">Euclid & Number Theory</a></li>
<li class="nav-item">Combinatorics</li>
<li class="nav-header">Evaluation</li>
<li class="nav-item"><a class="nav-link" href="Functions.html">Special Functions</a></li>
<li class="nav-item"><a class="nav-link" href="Integration.html">Integration</a></li>
<li class="nav-header">Statistics/Probability</li>
<li class="nav-item"><a class="nav-link" href="DescriptiveStatistics.html">Descriptive Statistics</a></li>
<li class="nav-item"><a class="nav-link" href="Probability.html">Probability Distributions</a></li>
<li class="nav-header">Generation</li>
<li class="nav-item"><a class="nav-link" href="Generate.html">Generating Data</a></li>
<li class="nav-item"><a class="nav-link" href="Random.html">Random Numbers</a></li>
<li class="nav-header">Solving Equations</li>
<li class="nav-item"><a class="nav-link" href="LinearEquations.html">Linear Equation Systems</a></li>
<li class="nav-header">Optimization</li>
<li class="nav-item"><a class="nav-link" href="Distance.html">Distance Metrics</a></li>
<li class="nav-header">Curve Fitting</li>
<li class="nav-item"><a class="nav-link" href="Regression.html">Regression</a></li>
<li class="nav-header">Native Providers</li>
<li class="nav-item"><a class="nav-link" href="MKL.html">Intel MKL</a></li>
<li class="nav-header">Working Together</li>
<li class="nav-item"><a class="nav-link" href="CSV.html">Delimited Text Files (CSV)</a></li>
<li class="nav-item"><a class="nav-link" href="MatrixMarket.html">NIST MatrixMarket</a></li>
<li class="nav-item"><a class="nav-link" href="MatlabFiles.html">MATLAB</a></li>
<li class="nav-item"><a class="nav-link" href="IFSharpNotebook.html">IF# Notebook</a></li>
</ul>
</div>
</nav>
<div class="container">
<div class="masthead">
<h3 class="muted">
<a href="https://numerics.mathdotnet.com">Math.NET Numerics</a> |
<a href="https://www.mathdotnet.com">Math.NET Project</a> |
<a href="https://github.com/mathnet/mathnet-numerics">GitHub</a>
</h3>
</div>
<hr />
<div class="container" id="fsdocs-content">
<h1><a name="Distance-Metrics" class="anchor" href="#Distance-Metrics">Distance Metrics</a></h1>
<p>A metric or distance function is a function <span class="math">\(d(x,y)\)</span> that defines the distance
between elements of a set as a non-negative real number. If the distance is zero, both elements are equivalent
under that specific metric. Distance functions thus provide a way to measure how close two elements are, where elements
do not have to be numbers but can also be vectors, matrices or arbitrary objects. Distance functions are often used
as error or cost functions to be minimized in an optimization problem.</p>
<p>There are multiple ways to define a metric on a set. A typical distance for real numbers is the absolute difference,
<span class="math">\(d : (x, y) \mapsto |x-y|\)</span>. But a scaled version of the absolute difference, or even the discrete metric <span class="math">\(d(x, y) = \begin{cases} 0 &amp;\mbox{if } x = y \\ 1 &amp; \mbox{if } x \ne y \end{cases}\)</span>,
is a valid metric as well. Every normed vector space induces a distance given by <span class="math">\(d(\vec x, \vec y) = \|\vec x - \vec y\|\)</span>.</p>
<p>Math.NET Numerics provides the following distance functions on vectors and arrays:</p>
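<p>The snippets below use the static <code>Distance</code> class from the <code>MathNet.Numerics</code> namespace and assume two sample arrays <code>x</code> and <code>y</code> of equal length. A minimal setup, with made-up example data, could look as follows:</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">using System;
using System.Linq;
using MathNet.Numerics;

// two sample vectors of equal length (made-up example data)
double[] x = { 1.0, 2.0, 3.0, 4.0 };
double[] y = { 1.5, 1.0, 3.0, 6.0 };

// every metric below is a static method of the Distance class and is called the same way
double d = Distance.Euclidean(x, y);
</code></pre></td></tr></table>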
<h2><a name="Sum-of-Absolute-Difference-SAD" class="anchor" href="#Sum-of-Absolute-Difference-SAD">Sum of Absolute Difference (SAD)</a></h2>
<img src="DistanceSAD.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The sum of absolute difference is equivalent to the <span class="math">\(L_1\)</span>-norm of the difference, also known as the Manhattan or Taxicab norm.
The <code>abs</code> function makes this metric a bit harder to deal with analytically, but it is more robust to outliers than SSD.</p>
<p><span class="math">\[d_{\mathbf{SAD}} : (x, y) \mapsto \|x-y\|_1 = \sum_{i=1}^{n} |x_i-y_i|\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.SAD(x, y);
</code></pre></td></tr></table>
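<p>For illustration, the same sum can be written out by hand with LINQ; this sketch reuses the sample arrays from the setup above and should agree with <code>Distance.SAD</code> up to rounding:</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">// sum of absolute differences, element by element
double sad = x.Zip(y, (a, b) =&gt; Math.Abs(a - b)).Sum();
</code></pre></td></tr></table>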
<h2><a name="Sum-of-Squared-Difference-SSD" class="anchor" href="#Sum-of-Squared-Difference-SSD">Sum of Squared Difference (SSD)</a></h2>
<img src="DistanceSSD.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The sum of squared difference is equivalent to the squared <span class="math">\(L_2\)</span>-norm, also known as Euclidean norm.
It is therefore also known as Squared Euclidean distance.
This is the fundamental metric in least squares problems and linear algebra. The absence of the <code>abs</code>
function makes this metric convenient to deal with analytically, but the squares cause it to be very
sensitive to large outliers.</p>
<p><span class="math">\[d_{\mathbf{SSD}} : (x, y) \mapsto \|x-y\|_2^2 = \langle x-y, x-y\rangle = \sum_{i=1}^{n} (x_i-y_i)^2\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.SSD(x, y);
</code></pre></td></tr></table>
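<p>Since SSD is the squared Euclidean distance, the two library calls are related as in the following sketch (continuing from the setup above):</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">double ssd = Distance.SSD(x, y);
double euclidean = Distance.Euclidean(x, y);

// SSD equals the square of the Euclidean distance, up to floating-point rounding
bool consistent = Math.Abs(ssd - euclidean * euclidean) &lt; 1e-12;
</code></pre></td></tr></table>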
<h2><a name="Mean-Absolute-Error-MAE" class="anchor" href="#Mean-Absolute-Error-MAE">Mean-Absolute Error (MAE)</a></h2>
<img src="DistanceMAE.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The mean absolute error is a normalized version of the sum of absolute difference.</p>
<p><span class="math">\[d_{\mathbf{MAE}} : (x, y) \mapsto \frac{d_{\mathbf{SAD}}}{n} = \frac{\|x-y\|_1}{n} = \frac{1}{n}\sum_{i=1}^{n} |x_i-y_i|\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.MAE(x, y);
</code></pre></td></tr></table>
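<p>The normalization is simply a division by the number of entries, as this sketch over the sample arrays illustrates:</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">// MAE is SAD divided by the number of entries
double mae = Distance.SAD(x, y) / x.Length;
// should match Distance.MAE(x, y)
</code></pre></td></tr></table>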
<h2><a name="Mean-Squared-Error-MSE" class="anchor" href="#Mean-Squared-Error-MSE">Mean-Squared Error (MSE)</a></h2>
<img src="DistanceMSE.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The mean squared error is a normalized version of the sum of squared difference.</p>
<p><span class="math">\[d_{\mathbf{MSE}} : (x, y) \mapsto \frac{d_{\mathbf{SSD}}}{n} = \frac{\|x-y\|_2^2}{n} = \frac{1}{n}\sum_{i=1}^{n} (x_i-y_i)^2\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.MSE(x, y);
</code></pre></td></tr></table>
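<p>As with MAE, the normalization is a division by the number of entries (continuing from the setup above):</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">// MSE is SSD divided by the number of entries
double mse = Distance.SSD(x, y) / x.Length;
// should match Distance.MSE(x, y)
</code></pre></td></tr></table>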
<h2><a name="Euclidean-Distance" class="anchor" href="#Euclidean-Distance">Euclidean Distance</a></h2>
<img src="DistanceEuclidean.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Euclidean distance is the <span class="math">\(L_2\)</span>-norm of the difference, a special case of the Minkowski distance with p=2.
It is the natural distance in a geometric interpretation.</p>
<p><span class="math">\[d_{\mathbf{2}} : (x, y) \mapsto \|x-y\|_2 = \sqrt{d_{\mathbf{SSD}}} = \sqrt{\sum_{i=1}^{n} (x_i-y_i)^2}\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Euclidean(x, y);
</code></pre></td></tr></table>
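<p>Because the Euclidean distance is the norm of the difference vector, it can also be computed via the linear algebra types; a sketch using the sample data from above:</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">using MathNet.Numerics.LinearAlgebra;

var vx = Vector&lt;double&gt;.Build.DenseOfArray(x);
var vy = Vector&lt;double&gt;.Build.DenseOfArray(y);

// the L2 norm of the difference vector is the Euclidean distance
double d2 = (vx - vy).L2Norm();
</code></pre></td></tr></table>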
<h2><a name="Manhattan-Distance" class="anchor" href="#Manhattan-Distance">Manhattan Distance</a></h2>
<img src="DistanceManhattan.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Manhattan distance is the <span class="math">\(L_1\)</span>-norm of the difference, a special case of the Minkowski distance with p=1
and equivalent to the sum of absolute difference.</p>
<p><span class="math">\[d_{\mathbf{1}} \equiv d_{\mathbf{SAD}} : (x, y) \mapsto \|x-y\|_1 = \sum_{i=1}^{n} |x_i-y_i|\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Manhattan(x, y);
</code></pre></td></tr></table>
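<p>Equivalently, the L1 norm of the difference vector yields the same value (reusing <code>vx</code> and <code>vy</code> from the Euclidean sketch above):</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">// the L1 norm of the difference vector is the Manhattan (SAD) distance
double d1 = (vx - vy).L1Norm();
</code></pre></td></tr></table>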
<h2><a name="Chebyshev-Distance" class="anchor" href="#Chebyshev-Distance">Chebyshev Distance</a></h2>
<img src="DistanceChebyshev.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Chebyshev distance is the <span class="math">\(L_\infty\)</span>-norm of the difference, a special case of the Minkowski distance
where p goes to infinity. It is also known as Chessboard distance.</p>
<p><span class="math">\[d_{\mathbf{\infty}} : (x, y) \mapsto \|x-y\|_\infty = \lim_{p \rightarrow \infty}\bigg(\sum_{i=1}^{n} |x_i-y_i|^p\bigg)^\frac{1}{p} = \max_{i} |x_i-y_i|\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Chebyshev(x, y);
</code></pre></td></tr></table>
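<p>In terms of the linear algebra types this corresponds to the infinity norm of the difference vector (reusing <code>vx</code> and <code>vy</code> from above):</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">// the infinity norm of the difference vector is the Chebyshev distance
double dInf = (vx - vy).InfinityNorm();
</code></pre></td></tr></table>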
<h2><a name="Minkowski-Distance" class="anchor" href="#Minkowski-Distance">Minkowski Distance</a></h2>
<img src="DistanceMinkowski3.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Minkowski distance is the generalized <span class="math">\(L_p\)</span>-norm of the difference.
The contour plot on the left demonstrates the case of p=3.</p>
<p><span class="math">\[d_{\mathbf{p}} : (x, y) \mapsto \|x-y\|_p = \bigg(\sum_{i=1}^{n} |x_i-y_i|^p\bigg)^\frac{1}{p}\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Minkowski(p, x, y);
</code></pre></td></tr></table>
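<p>For p=1 and p=2 the Minkowski distance reduces to the Manhattan and Euclidean distances; a quick consistency sketch with the sample data from above:</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">double p1 = Distance.Minkowski(1.0, x, y); // matches Distance.Manhattan(x, y)
double p2 = Distance.Minkowski(2.0, x, y); // matches Distance.Euclidean(x, y)
double p3 = Distance.Minkowski(3.0, x, y); // the case shown in the contour plot
</code></pre></td></tr></table>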
<h2><a name="Canberra-Distance" class="anchor" href="#Canberra-Distance">Canberra Distance</a></h2>
<img src="DistanceCanberra.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Canberra distance is a weighted version of the Manhattan distance, introduced and refined in 1967 by Lance, Williams and Adkins.
It is often used for data scattered around an origin, as it is biased towards measures around the origin and very sensitive to values close to zero.</p>
<p><span class="math">\[d_{\mathbf{CAD}} : (x, y) \mapsto \sum_{i=1}^{n} \frac{|x_i-y_i|}{|x_i|+|y_i|}\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Canberra(x, y);
</code></pre></td></tr></table>
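<p>Written out element by element over the sample arrays, the metric looks as follows; in this sketch, terms where both entries are zero are treated as zero, which is the usual convention (check the library's behaviour if your data contains such pairs):</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">// hand-rolled Canberra distance over the sample arrays
double canberra = 0.0;
for (int i = 0; i &lt; x.Length; i++)
{
    double denominator = Math.Abs(x[i]) + Math.Abs(y[i]);
    if (denominator &gt; 0.0)
    {
        canberra += Math.Abs(x[i] - y[i]) / denominator;
    }
}
</code></pre></td></tr></table>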
<h2><a name="Cosine-Distance" class="anchor" href="#Cosine-Distance">Cosine Distance</a></h2>
<img src="DistanceCosine.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The cosine distance is one minus the cosine similarity: the dot product of the two vectors scaled by the product of their Euclidean norms.
It represents the angular distance between two vectors while ignoring their magnitude.</p>
<p><span class="math">\[d_{\mathbf{cos}} : (x, y) \mapsto 1-\frac{\langle x, y\rangle}{\|x\|_2\|y\|_2} = 1-\frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\sqrt{\sum_{i=1}^{n} y_i^2}}\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Cosine(x, y);
</code></pre></td></tr></table>
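<p>The same value can be assembled from a dot product and two norms (reusing <code>vx</code> and <code>vy</code> from the Euclidean sketch above):</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">// one minus the cosine similarity of the two vectors
double dCos = 1.0 - vx.DotProduct(vy) / (vx.L2Norm() * vy.L2Norm());
</code></pre></td></tr></table>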
<h2><a name="Pearson-s-Distance" class="anchor" href="#Pearson-s-Distance">Pearson's Distance</a></h2>
<img src="DistancePearson.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Pearson distance is a correlation distance based on Pearson's product-moment correlation coefficient
of the two sample vectors. Since the correlation coefficient lies in [-1, 1], the Pearson distance
lies in [0, 2] and measures the linear relationship between the two vectors.</p>
<p><span class="math">\[d_{\mathbf{Pearson}} : (x, y) \mapsto 1 - \mathbf{Corr}(x, y)\]</span></p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Pearson(x, y);
</code></pre></td></tr></table>
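<p>The relationship to the correlation coefficient can be checked against the <code>Correlation</code> class from <code>MathNet.Numerics.Statistics</code> (a sketch over the sample arrays from above):</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">using MathNet.Numerics.Statistics;

// Pearson distance is one minus the Pearson correlation coefficient
double dPearson = 1.0 - Correlation.Pearson(x, y);
// should match Distance.Pearson(x, y)
</code></pre></td></tr></table>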
<h2><a name="Hamming-Distance" class="anchor" href="#Hamming-Distance">Hamming Distance</a></h2>
<p>The Hamming distance counts the number of entries in which the two sample vectors differ.
It is a fundamental distance measure in information theory but less relevant in non-integer numerical problems.</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Hamming(x, y);
</code></pre></td></tr></table>
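<p>Counted by hand with LINQ, using the sample arrays from the setup above:</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip"><code lang="csharp">// number of positions at which the two arrays differ
double hamming = x.Zip(y, (a, b) =&gt; a != b ? 1.0 : 0.0).Sum();
</code></pre></td></tr></table>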
</div>
<!-- BEGIN SEARCH BOX: this adds support for the search box -->
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/JavaScript-autoComplete/1.0.4/auto-complete.css" />
<script type="text/javascript">var fsdocs_search_baseurl = 'https://numerics.mathdotnet.com/';</script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/lunr.js/2.3.8/lunr.min.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/JavaScript-autoComplete/1.0.4/auto-complete.min.js"></script>
<script type="text/javascript" src="https://numerics.mathdotnet.com/content/fsdocs-search.js"></script>
<!-- END SEARCH BOX: this adds support for the search box -->
</div>
</body>
</html>