<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<title>Distance Metrics</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<meta name="description" content="Math.NET Numerics, providing methods and algorithms for numerical computations in science, engineering and every day use. .Net 4, .Net 3.5, SL5, Win8, WP8, PCL 47 and 136, Mono, Xamarin Android/iOS."/>
<meta name="author" content="Christoph Ruegg, Marcus Cuda, Jurgen Van Gael"/>

<script src="https://code.jquery.com/jquery-1.8.0.js"></script>
<script src="https://code.jquery.com/ui/1.8.23/jquery-ui.js"></script>
<script src="https://netdna.bootstrapcdn.com/twitter-bootstrap/2.2.1/js/bootstrap.min.js"></script>
<link href="https://netdna.bootstrapcdn.com/twitter-bootstrap/2.2.1/css/bootstrap-combined.min.css" rel="stylesheet"/>

<link type="text/css" rel="stylesheet" href="https://numerics.mathdotnet.com/content/style.css" />
<style>
#main table:not(.pre) {
border: 1px solid #dddddd;
max-width: 100%;
border-style: solid;
border-width: 1px;
border-color: gray;
border-collapse: collapse;
border-right-width: 1px;
border-bottom-width: 1px;
margin-top: 15px;
margin-bottom: 25px;
}
#main table:not(.pre) th, #main table:not(.pre) td {
border: 1px solid #dddddd;
padding: 6px;
}
#main table:not(.pre) th p, #main table:not(.pre) td p {
margin-bottom: 5px;
}
</style>
<script type="text/javascript" src="https://numerics.mathdotnet.com/content/tips.js"></script>
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
<![endif]-->

<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
</head>
<body>
<div class="container">
<div class="masthead">
<ul class="nav nav-pills pull-right">
<li><a href="https://www.mathdotnet.com">Math.NET Project</a></li>
<li><a href="https://numerics.mathdotnet.com">Math.NET Numerics</a></li>
<li><a href="https://github.com/mathnet/mathnet-numerics">GitHub</a></li>
</ul>
<h3 class="muted">Math.NET Numerics</h3>
</div>
<hr />
<div class="row">
<div class="span9" id="main">

<h1><a name="Distance-Metrics" class="anchor" href="#Distance-Metrics">Distance Metrics</a></h1>
<p>A metric or distance function is a function <span class="math">\(d(x,y)\)</span> that defines the distance
between elements of a set as a non-negative real number. If the distance is zero, both elements are equivalent
under that specific metric. Distance functions thus provide a way to measure how close two elements are, where elements
do not have to be numbers but can also be vectors, matrices or arbitrary objects. Distance functions are often used
as error or cost functions to be minimized in an optimization problem.</p>
<p>There are multiple ways to define a metric on a set. A typical distance for real numbers is the absolute difference,
<span class="math">\(d : (x, y) \mapsto |x-y|\)</span>. However, a scaled version of the absolute difference, or even the discrete metric <span class="math">\(d(x, y) = \begin{cases} 0 &\mbox{if } x = y \\ 1 & \mbox{if } x \ne y \end{cases}\)</span>,
is a valid metric as well. Every normed vector space induces a distance given by <span class="math">\(d(\vec x, \vec y) = \|\vec x - \vec y\|\)</span>.</p>
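<p>For reference, the usual textbook definition requires a metric to satisfy the following conditions for all elements <span class="math">\(x, y, z\)</span> of the set. They are stated here as general background, not as a guarantee about any particular function listed on this page:</p>
<p><span class="math">\[d(x, y) \ge 0, \qquad d(x, y) = 0 \iff x = y, \qquad d(x, y) = d(y, x), \qquad d(x, z) \le d(x, y) + d(y, z)\]</span></p>
<p>Not every distance function in common use meets all of them. The squared difference on real numbers, for example, gives <span class="math">\(d(0, 2) = 4\)</span> while <span class="math">\(d(0, 1) + d(1, 2) = 2\)</span>, so it violates the triangle inequality and is a distance only in a looser sense.</p>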
<p>Math.NET Numerics provides the following distance functions on vectors and arrays:</p>
<h2><a name="Sum-of-Absolute-Difference-SAD" class="anchor" href="#Sum-of-Absolute-Difference-SAD">Sum of Absolute Difference (SAD)</a></h2>
<img src="img/DistanceSAD.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The sum of absolute difference is equivalent to the <span class="math">\(L_1\)</span>-norm of the difference, also known as the Manhattan or Taxicab norm.
The <code>abs</code> function makes this metric somewhat awkward to handle analytically, but it is more robust to outliers than SSD.</p>
<p><span class="math">\[d_{\mathbf{SAD}} : (x, y) \mapsto \|x-y\|_1 = \sum_{i=1}^{n} |x_i-y_i|\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.SAD(x, y);
</code></pre></td></tr></table>
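<p>The one-line snippets on this page leave <code>x</code> and <code>y</code> undefined. As a minimal sketch, assuming they are plain <code>double[]</code> arrays of equal length and that <code>Distance</code> is the static class in the <code>MathNet.Numerics</code> namespace, a complete call might look like this. The class name <code>SadExample</code> and the sample values are made up for illustration:</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">using System;
using MathNet.Numerics;

class SadExample
{
    static void Main()
    {
        // Hypothetical sample data; both arrays must have the same length.
        double[] x = { 1.0, 2.0, 3.0 };
        double[] y = { 4.0, 6.0, 3.0 };

        // |1-4| + |2-6| + |3-3| = 3 + 4 + 0
        double d = Distance.SAD(x, y);
        Console.WriteLine(d); // expected: 7
    }
}
</code></pre></td></tr></table>
<p>The same two arrays can be passed to the other functions described on this page.</p>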
<h2><a name="Sum-of-Squared-Difference-SSD" class="anchor" href="#Sum-of-Squared-Difference-SSD">Sum of Squared Difference (SSD)</a></h2>
<img src="img/DistanceSSD.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The sum of squared difference is equivalent to the squared <span class="math">\(L_2\)</span>-norm of the difference, where the <span class="math">\(L_2\)</span>-norm is also known as the Euclidean norm.
It is therefore also known as the squared Euclidean distance.
This is the fundamental metric in least squares problems and linear algebra. The absence of the <code>abs</code>
function makes this metric convenient to deal with analytically, but the squares cause it to be very
sensitive to large outliers.</p>
<p><span class="math">\[d_{\mathbf{SSD}} : (x, y) \mapsto \|x-y\|_2^2 = \langle x-y, x-y\rangle = \sum_{i=1}^{n} (x_i-y_i)^2\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.SSD(x, y);
</code></pre></td></tr></table>
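<p>To make the outlier sensitivity concrete, consider a hypothetical difference vector <span class="math">\(x-y = (1, 1, 10)\)</span> whose last entry is an outlier:</p>
<p><span class="math">\[d_{\mathbf{SAD}} = 1 + 1 + 10 = 12, \qquad d_{\mathbf{SSD}} = 1^2 + 1^2 + 10^2 = 102\]</span></p>
<p>The outlier accounts for <span class="math">\(10/12 \approx 83\%\)</span> of the sum of absolute differences but for <span class="math">\(100/102 \approx 98\%\)</span> of the sum of squared differences.</p>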
<h2><a name="Mean-Absolute-Error-MAE" class="anchor" href="#Mean-Absolute-Error-MAE">Mean Absolute Error (MAE)</a></h2>
<img src="img/DistanceMAE.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The mean absolute error is the sum of absolute difference normalized by the number of entries <span class="math">\(n\)</span>.</p>
<p><span class="math">\[d_{\mathbf{MAE}} : (x, y) \mapsto \frac{d_{\mathbf{SAD}}}{n} = \frac{\|x-y\|_1}{n} = \frac{1}{n}\sum_{i=1}^{n} |x_i-y_i|\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.MAE(x, y);
</code></pre></td></tr></table>
<h2><a name="Mean-Squared-Error-MSE" class="anchor" href="#Mean-Squared-Error-MSE">Mean Squared Error (MSE)</a></h2>
<img src="img/DistanceMSE.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The mean squared error is the sum of squared difference normalized by the number of entries <span class="math">\(n\)</span>.</p>
<p><span class="math">\[d_{\mathbf{MSE}} : (x, y) \mapsto \frac{d_{\mathbf{SSD}}}{n} = \frac{\|x-y\|_2^2}{n} = \frac{1}{n}\sum_{i=1}^{n} (x_i-y_i)^2\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.MSE(x, y);
</code></pre></td></tr></table>
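<p>As a small worked example with made-up vectors <span class="math">\(x = (1, 2, 3)\)</span> and <span class="math">\(y = (4, 6, 3)\)</span>, so that <span class="math">\(n = 3\)</span> and <span class="math">\(x - y = (-3, -4, 0)\)</span>:</p>
<p><span class="math">\[d_{\mathbf{MSE}}(x, y) = \frac{(-3)^2 + (-4)^2 + 0^2}{3} = \frac{25}{3} \approx 8.33\]</span></p>
<p>Dividing by <span class="math">\(n\)</span> keeps the value comparable between vector pairs of different length, which the unnormalized sum of 25 is not.</p>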
<h2><a name="Euclidean-Distance" class="anchor" href="#Euclidean-Distance">Euclidean Distance</a></h2>
<img src="img/DistanceEuclidean.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Euclidean distance is the <span class="math">\(L_2\)</span>-norm of the difference, a special case of the Minkowski distance with <span class="math">\(p=2\)</span>.
It is the natural distance in a geometric interpretation.</p>
<p><span class="math">\[d_{\mathbf{2}} : (x, y) \mapsto \|x-y\|_2 = \sqrt{d_{\mathbf{SSD}}} = \sqrt{\sum_{i=1}^{n} (x_i-y_i)^2}\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Euclidean(x, y);
</code></pre></td></tr></table>
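<p>A quick hand calculation, using the made-up vectors <span class="math">\(x = (1, 2, 3)\)</span> and <span class="math">\(y = (4, 6, 3)\)</span>, reduces to the familiar 3-4-5 right triangle:</p>
<p><span class="math">\[d_{\mathbf{2}}(x, y) = \sqrt{(1-4)^2 + (2-6)^2 + (3-3)^2} = \sqrt{9 + 16 + 0} = 5\]</span></p>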
<h2><a name="Manhattan-Distance" class="anchor" href="#Manhattan-Distance">Manhattan Distance</a></h2>
<img src="img/DistanceManhattan.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Manhattan distance is the <span class="math">\(L_1\)</span>-norm of the difference, a special case of the Minkowski distance with <span class="math">\(p=1\)</span>
and equivalent to the sum of absolute difference.</p>
<p><span class="math">\[d_{\mathbf{1}} \equiv d_{\mathbf{SAD}} : (x, y) \mapsto \|x-y\|_1 = \sum_{i=1}^{n} |x_i-y_i|\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Manhattan(x, y);
</code></pre></td></tr></table>
<h2><a name="Chebyshev-Distance" class="anchor" href="#Chebyshev-Distance">Chebyshev Distance</a></h2>
<img src="img/DistanceChebyshev.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Chebyshev distance is the <span class="math">\(L_\infty\)</span>-norm of the difference, the limiting case of the Minkowski distance
as <span class="math">\(p\)</span> goes to infinity. It is also known as the chessboard distance.</p>
<p><span class="math">\[d_{\mathbf{\infty}} : (x, y) \mapsto \|x-y\|_\infty = \lim_{p \rightarrow \infty}\bigg(\sum_{i=1}^{n} |x_i-y_i|^p\bigg)^\frac{1}{p} = \max_{i} |x_i-y_i|\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Chebyshev(x, y);
</code></pre></td></tr></table>
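<p>The chessboard name can be illustrated with a king that has to travel from square <span class="math">\((1, 1)\)</span> to square <span class="math">\((4, 5)\)</span>, coordinates chosen arbitrarily here. Because a diagonal move advances one file and one rank at once, the number of moves needed is the larger of the two coordinate differences:</p>
<p><span class="math">\[d_{\mathbf{\infty}}\big((1, 1), (4, 5)\big) = \max\big(|1-4|, |1-5|\big) = \max(3, 4) = 4\]</span></p>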
<h2><a name="Minkowski-Distance" class="anchor" href="#Minkowski-Distance">Minkowski Distance</a></h2>
<img src="img/DistanceMinkowski3.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Minkowski distance is the generalized <span class="math">\(L_p\)</span>-norm of the difference.
The contour plot on the left demonstrates the case of <span class="math">\(p=3\)</span>.</p>
<p><span class="math">\[d_{\mathbf{p}} : (x, y) \mapsto \|x-y\|_p = \bigg(\sum_{i=1}^{n} |x_i-y_i|^p\bigg)^\frac{1}{p}\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Minkowski(p, x, y);
</code></pre></td></tr></table>
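<p>To see how the choice of <span class="math">\(p\)</span> changes the result, take the made-up vectors <span class="math">\(x = (1, 2, 3)\)</span> and <span class="math">\(y = (4, 6, 3)\)</span>, whose absolute differences are <span class="math">\((3, 4, 0)\)</span>:</p>
<p><span class="math">\[d_{\mathbf{1}} = 3 + 4 + 0 = 7, \qquad d_{\mathbf{2}} = \sqrt{9 + 16} = 5, \qquad d_{\mathbf{3}} = \sqrt[3]{27 + 64} = \sqrt[3]{91} \approx 4.498, \qquad d_{\mathbf{\infty}} = \max(3, 4, 0) = 4\]</span></p>
<p>The value decreases as <span class="math">\(p\)</span> grows and approaches the largest single difference, which is the Chebyshev limit.</p>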
<h2><a name="Canberra-Distance" class="anchor" href="#Canberra-Distance">Canberra Distance</a></h2>
<img src="img/DistanceCanberra.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Canberra distance is a weighted version of the Manhattan distance, introduced and refined in 1967 by Lance, Williams and Adkins.
It is often used for data scattered around an origin, as it is biased toward measurements near the origin and very sensitive to values close to zero.</p>
<p><span class="math">\[d_{\mathbf{CAD}} : (x, y) \mapsto \sum_{i=1}^{n} \frac{|x_i-y_i|}{|x_i|+|y_i|}\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Canberra(x, y);
</code></pre></td></tr></table>
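<p>The sensitivity near zero follows from each term being a relative difference. With illustrative numbers, a tiny pair of entries contributes exactly as much as a pair a million times larger:</p>
<p><span class="math">\[\frac{|0.001-0.002|}{|0.001|+|0.002|} = \frac{1}{3} = \frac{|1000-2000|}{|1000|+|2000|}\]</span></p>
<p>Each term is bounded by 1, so the total never exceeds <span class="math">\(n\)</span>. A term with <span class="math">\(x_i = y_i = 0\)</span> has the form <span class="math">\(0/0\)</span>; how that case is treated is not stated on this page and is worth checking against the implementation before relying on it.</p>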
<h2><a name="Cosine-Distance" class="anchor" href="#Cosine-Distance">Cosine Distance</a></h2>
<img src="img/DistanceCosine.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The cosine distance is one minus the dot product of the two vectors scaled by the product of their Euclidean norms.
It represents the angular distance between two vectors while ignoring their scale.</p>
<p><span class="math">\[d_{\mathbf{cos}} : (x, y) \mapsto 1-\frac{\langle x, y\rangle}{\|x\|_2\|y\|_2} = 1-\frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\sqrt{\sum_{i=1}^{n} y_i^2}}\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Cosine(x, y);
</code></pre></td></tr></table>
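<p>Scale invariance can be checked by hand with the made-up vectors <span class="math">\(x = (1, 2, 3)\)</span> and <span class="math">\(y = 2x = (2, 4, 6)\)</span>, which point in the same direction:</p>
<p><span class="math">\[d_{\mathbf{cos}}(x, y) = 1 - \frac{2 + 8 + 18}{\sqrt{14}\,\sqrt{56}} = 1 - \frac{28}{\sqrt{784}} = 1 - \frac{28}{28} = 0\]</span></p>
<p>By the same formula, orthogonal vectors such as <span class="math">\((1, 0)\)</span> and <span class="math">\((0, 1)\)</span> are at distance 1, and opposite vectors such as <span class="math">\((1, 0)\)</span> and <span class="math">\((-1, 0)\)</span> are at distance 2. In floating-point arithmetic the first result may come out as a very small number rather than exactly zero.</p>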
<h2><a name="Pearson-s-Distance" class="anchor" href="#Pearson-s-Distance">Pearson's Distance</a></h2>
<img src="img/DistancePearson.png" style="width:87px; height:87px; float:left; margin:10px 10px 10px 0;" />
<p>The Pearson distance is a correlation distance based on Pearson's product-moment correlation coefficient
of the two sample vectors. Since the correlation coefficient lies in [-1, 1], the Pearson distance
lies in [0, 2] and measures the linear relationship between the two vectors.</p>
<p><span class="math">\[d_{\mathbf{Pearson}} : (x, y) \mapsto 1 - \mathbf{Corr}(x, y)\]</span></p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Pearson(x, y);
</code></pre></td></tr></table>
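<p>The three characteristic values of the range can be reproduced by hand with <span class="math">\(x = (1, 2, 3)\)</span> and three made-up choices of <span class="math">\(y\)</span>: a perfectly increasing linear relation, an uncorrelated vector, and a perfectly decreasing linear relation.</p>
<p><span class="math">\[y = 2x + 1 = (3, 5, 7) \Rightarrow \mathbf{Corr} = 1,\ d = 0; \qquad y = (1, 0, 1) \Rightarrow \mathbf{Corr} = 0,\ d = 1; \qquad y = -x = (-1, -2, -3) \Rightarrow \mathbf{Corr} = -1,\ d = 2\]</span></p>
<p>In the middle case the centered vectors are <span class="math">\((-1, 0, 1)\)</span> and <span class="math">\((\tfrac{1}{3}, -\tfrac{2}{3}, \tfrac{1}{3})\)</span>, whose dot product is <span class="math">\(-\tfrac{1}{3} + 0 + \tfrac{1}{3} = 0\)</span>.</p>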
<h2><a name="Hamming-Distance" class="anchor" href="#Hamming-Distance">Hamming Distance</a></h2>
<p>The Hamming distance is the number of positions at which the entries of the two sample vectors differ.
It is a fundamental distance measure in information theory but less relevant in non-integer numerical problems.</p>
<table class="pre"><tr><td class="lines"><pre class="fssnip"><span class="l">1: </span>
</pre></td>
<td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span> d <span class="o">=</span> Distance.Hamming(x, y);
</code></pre></td></tr></table>

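<p>Written in the same style as the other sections, the verbal definition above corresponds to the following count. The subscript name is chosen here for illustration:</p>
<p><span class="math">\[d_{\mathbf{Hamming}} : (x, y) \mapsto \big|\{\, i : x_i \ne y_i \,\}\big|\]</span></p>
<p>For the made-up vectors <span class="math">\(x = (1, 0, 1, 1)\)</span> and <span class="math">\(y = (1, 1, 0, 1)\)</span> the second and third entries differ, so the distance is 2. For floating-point data, two values that differ only by rounding error count as different under exact comparison, which is one reason the measure is less useful outside integer or symbolic data.</p>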
</div>
<div class="span3">
<ul class="nav nav-list" id="menu">

<li class="nav-header">Math.NET Numerics</li>
<li><a href="https://numerics.mathdotnet.com/Packages.html">NuGet & Binaries</a></li>
<li><a href="https://numerics.mathdotnet.com/ReleaseNotes.html">Release Notes</a></li>
<li><a href="https://numerics.mathdotnet.com/License.html">MIT/X11 License</a></li>
<li><a href="https://numerics.mathdotnet.com/Compatibility.html">Platform Support</a></li>
<li><a href="https://numerics.mathdotnet.com/api/">Class Reference</a></li>
<li><a href="https://github.com/mathnet/mathnet-numerics/issues">Issues & Bugs</a></li>
<li><a href="https://numerics.mathdotnet.com/Users.html">Who is using Math.NET?</a></li>

<li class="nav-header">Contributing</li>
<li><a href="https://numerics.mathdotnet.com/Contributors.html">Contributors</a></li>
<li><a href="https://numerics.mathdotnet.com/Contributing.html">Contributing</a></li>
<li><a href="https://numerics.mathdotnet.com/Build.html">Build & Tools</a></li>
<li><a href="http://feedback.mathdotnet.com/forums/2060-math-net-numerics">Your Ideas</a></li>

<li class="nav-header">Getting Help</li>
<li><a href="https://discuss.mathdotnet.com/c/numerics">Discuss</a></li>
<li><a href="https://stackoverflow.com/questions/tagged/mathdotnet">Stack Overflow</a></li>

<li class="nav-header">Getting Started</li>
<li><a href="https://numerics.mathdotnet.com/">Getting started</a></li>
<li><a href="https://numerics.mathdotnet.com/Constants.html">Constants</a></li>
<li>Floating-Point Numbers</li>
<li>Arbitrary Precision Numbers</li>
<li>Complex Numbers</li>
<li><a href="https://numerics.mathdotnet.com/Matrix.html">Matrices and Vectors</a></li>
<li><a href="https://numerics.mathdotnet.com/Euclid.html">Euclid & Number Theory</a></li>
<li>Combinatorics</li>

<li class="nav-header">Evaluation</li>
<li><a href="https://numerics.mathdotnet.com/Functions.html">Special Functions</a></li>
<li>Differentiation</li>
<li><a href="https://numerics.mathdotnet.com/Integration.html">Integration</a></li>

<li class="nav-header">Statistics/Probability</li>
<li><a href="https://numerics.mathdotnet.com/DescriptiveStatistics.html">Descriptive Statistics</a></li>
<li><a href="https://numerics.mathdotnet.com/Probability.html">Probability Distributions</a></li>

<li class="nav-header">Generation</li>
<li><a href="https://numerics.mathdotnet.com/Generate.html">Generating Data</a></li>
<li><a href="https://numerics.mathdotnet.com/Random.html">Random Numbers</a></li>

<li class="nav-header">Transformation</li>
<li>Fourier Transform (FFT)</li>
<li>Filtering & DSP</li>
<li>Window Functions</li>

<li class="nav-header">Solving Equations</li>
<li><a href="https://numerics.mathdotnet.com/LinearEquations.html">Linear Equation Systems</a></li>
<li>Nonlinear Root Finding</li>

<li class="nav-header">Optimization</li>
<li>Linear Least Squares</li>
<li>Nonlinear Optimization</li>
<li><a href="https://numerics.mathdotnet.com/Distance.html">Distance Metrics</a></li>

<li class="nav-header">Curve Fitting</li>
<li><a href="https://numerics.mathdotnet.com/Regression.html">Regression</a></li>
<li>Interpolation</li>
<li>Fourier Approximation</li>

<li class="nav-header">Native Providers</li>
<li><a href="https://numerics.mathdotnet.com/MKL.html">Intel MKL</a></li>

<li class="nav-header">Working Together</li>
<li><a href="https://numerics.mathdotnet.com/CSV.html">Delimited Text Files (CSV)</a></li>
<li><a href="https://numerics.mathdotnet.com/MatrixMarket.html">NIST MatrixMarket</a></li>
<li><a href="https://numerics.mathdotnet.com/MatlabFiles.html">MATLAB</a></li>
<li><a href="https://numerics.mathdotnet.com/IFSharpNotebook.html">IF# Notebook</a></li>
<li>FsLab & Deedle</li>
<li>Microsoft Excel</li>
<li>numl.net machine learning</li>
<li>R-project</li>

</ul>
</div>
</div>
</div>
</body>
</html>