<!DOCTYPE html>
|
|
<html lang="en">
|
|
<head>
|
|
<meta charset="utf-8"/>
|
|
<title>Curve Fitting: Linear Regression</title>
|
|
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
|
|
<meta name="description" content="Math.NET Numerics, providing methods and algorithms for numerical computations in science, engineering and every day use. .Net 4, .Net 3.5, SL5, Win8, WP8, PCL 47 and 136, Mono, Xamarin Android/iOS."/>
|
|
<meta name="author" content="Christoph Ruegg, Marcus Cuda, Jurgen Van Gael"/>
|
|
|
|
<script src="https://code.jquery.com/jquery-1.8.0.js"></script>
|
|
<script src="https://code.jquery.com/ui/1.8.23/jquery-ui.js"></script>
|
|
<script src="https://netdna.bootstrapcdn.com/twitter-bootstrap/2.2.1/js/bootstrap.min.js"></script>
|
|
<link href="https://netdna.bootstrapcdn.com/twitter-bootstrap/2.2.1/css/bootstrap-combined.min.css" rel="stylesheet"/>
|
|
|
|
<link type="text/css" rel="stylesheet" href="https://numerics.mathdotnet.com/content/style.css" />
|
|
<style>
|
|
#main table:not(.pre) {
|
|
border: 1px solid #dddddd;
|
|
max-width: 100%;
|
|
border-style: solid;
|
|
border-width: 1px;
|
|
border-color: gray;
|
|
border-collapse: collapse;
|
|
border-right-width: 1px;
|
|
border-bottom-width: 1px;
|
|
margin-top: 15px;
|
|
margin-bottom: 25px;
|
|
}
|
|
#main table:not(.pre) th, #main table:not(.pre) td {
|
|
border: 1px solid #dddddd;
|
|
padding: 6px;
|
|
}
|
|
#main table:not(.pre) th p, #main table:not(.pre) td p {
|
|
margin-bottom: 5px;
|
|
}
|
|
</style>
|
|
<script type="text/javascript" src="https://numerics.mathdotnet.com/content/tips.js"></script>
|
|
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
|
|
<!--[if lt IE 9]>
|
|
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
|
|
<![endif]-->
|
|
|
|
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
|
|
</head>
|
|
<body>
|
|
<div class="container">
|
|
<div class="masthead">
|
|
<ul class="nav nav-pills pull-right">
|
|
<li><a href="https://www.mathdotnet.com">Math.NET Project</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com">Math.NET Numerics</a></li>
|
|
<li><a href="https://github.com/mathnet/mathnet-numerics">GitHub</a></li>
|
|
</ul>
|
|
<h3 class="muted">Math.NET Numerics</h3>
|
|
</div>
|
|
<hr />
|
|
<div class="row">
|
|
<div class="span9" id="main">
|
|
|
|
<h1><a name="Curve-Fitting-Linear-Regression" class="anchor" href="#Curve-Fitting-Linear-Regression">Curve Fitting: Linear Regression</a></h1>
|
|
<p>Regression is all about fitting a low-order parametric model or curve to data, so we can
reason about it or make predictions on points not covered by the data. Both data and
model are known, but we'd like to find the model parameters that make the model fit the data
best, or at least well enough, according to some metric.</p>
|
|
<p>We may also be interested in how well the data supports the model, or whether we had better
look for another, more appropriate model.</p>
|
|
<p>In a regression, a lot of data is reduced and generalized into a few parameters.
The resulting model can therefore no longer reproduce all the original data exactly;
if you need the data to be reproduced exactly, have a look at interpolation instead.</p>
|
|
<h2><a name="Simple-Regression-Fit-to-a-Line" class="anchor" href="#Simple-Regression-Fit-to-a-Line">Simple Regression: Fit to a Line</a></h2>
|
|
<p>In the simplest yet still common form of regression we would like to fit a line
|
|
<span class="math">\(y : x \mapsto a + b x\)</span> to a set of points <span class="math">\((x_j,y_j)\)</span>, where <span class="math">\(x_j\)</span> and <span class="math">\(y_j\)</span> are scalars.
|
|
Assuming we have two double arrays for x and y, we can use <code>Fit.Line</code> to evaluate the <span class="math">\(a\)</span> and <span class="math">\(b\)</span>
|
|
parameters of the least squares fit:</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span>[] xdata <span class="o">=</span> <span class="k">new</span> <span class="k">double</span>[] { <span class="n">10</span>, <span class="n">20</span>, <span class="n">30</span> };
<span class="k">double</span>[] ydata <span class="o">=</span> <span class="k">new</span> <span class="k">double</span>[] { <span class="n">15</span>, <span class="n">20</span>, <span class="n">25</span> };

Tuple&lt;<span class="k">double</span>, <span class="k">double</span>&gt; p <span class="o">=</span> Fit.Line(xdata, ydata);
<span class="k">double</span> a <span class="o">=</span> p.Item<span class="n">1</span>; <span class="c">// == 10; intercept</span>
<span class="k">double</span> b <span class="o">=</span> p.Item<span class="n">2</span>; <span class="c">// == 0.5; slope</span>
</code></pre></td></tr></table>
|
|
<p>Or in F#:</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="fsharp"><span class="k">let</span> <span class="i">a</span>, <span class="i">b</span> <span class="o">=</span> <span class="i">Fit</span><span class="o">.</span><span class="i">Line</span> ([|<span class="n">10.0</span>;<span class="n">20.0</span>;<span class="n">30.0</span>|], [|<span class="n">15.0</span>;<span class="n">20.0</span>;<span class="n">25.0</span>|])
</code></pre></td></tr></table>
|
|
<p>How well do these parameters fit the data? The data points happen to be positioned
|
|
exactly on a line. Indeed, the <a href="https://en.wikipedia.org/wiki/Coefficient_of_determination">coefficient of determination</a>
|
|
confirms the perfect fit:</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">GoodnessOfFit.RSquared(xdata.Select(x <span class="o">=</span><span class="o">&gt;</span> a+b*x), ydata); <span class="c">// == 1.0</span>
</code></pre></td></tr></table>
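<p>For intuition, the least squares line can also be computed by hand from the closed-form
expressions <span class="math">\(b = \sum_j (x_j-\bar{x})(y_j-\bar{y}) / \sum_j (x_j-\bar{x})^2\)</span>
and <span class="math">\(a = \bar{y} - b \bar{x}\)</span>. The following is a minimal sketch in plain C#;
the <code>SimpleLineFit</code> class and its <code>Line</code> method are hypothetical names introduced
for illustration only and do not claim to mirror how <code>Fit.Line</code> is implemented.</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">using System;
using System.Linq;

// Hypothetical helper, not part of Math.NET Numerics.
public static class SimpleLineFit
{
    // Returns (intercept a, slope b) of the least squares line y = a + b*x.
    public static Tuple&lt;double, double&gt; Line(double[] x, double[] y)
    {
        if (x.Length != y.Length || x.Length &lt; 2)
            throw new ArgumentException("Need two arrays of equal length with at least 2 points.");

        double meanX = x.Average();
        double meanY = y.Average();

        double sxy = 0.0; // sum of (x_j - meanX) * (y_j - meanY)
        double sxx = 0.0; // sum of (x_j - meanX)^2
        for (int j = 0; j &lt; x.Length; j++)
        {
            sxy += (x[j] - meanX) * (y[j] - meanY);
            sxx += (x[j] - meanX) * (x[j] - meanX);
        }

        double b = sxy / sxx;         // slope
        double a = meanY - b * meanX; // intercept
        return Tuple.Create(a, b);
    }
}
</code></pre></td></tr></table>
<p>For the three data points above, <code>SimpleLineFit.Line(xdata, ydata)</code> yields an intercept of 10
and a slope of 0.5, in agreement with the result of <code>Fit.Line</code>.</p>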
|
|
<h2><a name="Linear-Model" class="anchor" href="#Linear-Model">Linear Model</a></h2>
|
|
<p>In practice, a line is often not an adequate model. But if we can choose a model that is linear,
|
|
we can leverage the power of linear algebra; otherwise we have to resort to iterative methods
|
|
(see Nonlinear Optimization).</p>
|
|
<p>A linear model can be described as a linear combination of <span class="math">\(N\)</span> arbitrary but known
|
|
functions <span class="math">\(f_i(x)\)</span>, scaled by the model parameters <span class="math">\(p_i\)</span>. Note that none of the functions
|
|
<span class="math">\(f_i\)</span> depends on any of the <span class="math">\(p_i\)</span> parameters.</p>
|
|
<p><span class="math">\[y : x \mapsto p_1 f_1(x) + p_2 f_2(x) + \cdots + p_N f_N(x)\]</span></p>
|
|
<p>If we have <span class="math">\(M\)</span> data points <span class="math">\((x_j,y_j)\)</span>, then we can write the regression problem as an
|
|
overdetermined system of <span class="math">\(M\)</span> equations:</p>
|
|
<p><span class="math">\[\begin{eqnarray}
|
|
y_1 &=& p_1 f_1(x_1) + p_2 f_2(x_1) + \cdots + p_N f_N(x_1) \\
|
|
y_2 &=& p_1 f_1(x_2) + p_2 f_2(x_2) + \cdots + p_N f_N(x_2) \\
|
|
&\vdots& \\
|
|
y_M &=& p_1 f_1(x_M) + p_2 f_2(x_M) + \cdots + p_N f_N(x_M)
|
|
\end{eqnarray}\]</span></p>
|
|
<p>Or in matrix notation with the predictor matrix <span class="math">\(X\)</span> and the response <span class="math">\(y\)</span>:</p>
|
|
<p><span class="math">\[\begin{eqnarray}
|
|
\mathbf y &=& \mathbf X \mathbf p \\
|
|
\begin{bmatrix}y_1\\y_2\\ \vdots \\y_M\end{bmatrix} &=&
|
|
\begin{bmatrix}f_1(x_1) & f_2(x_1) & \cdots & f_N(x_1)\\f_1(x_2) & f_2(x_2) & \cdots & f_N(x_2)\\ \vdots & \vdots & \ddots & \vdots\\f_1(x_M) & f_2(x_M) & \cdots & f_N(x_M)\end{bmatrix}
|
|
\begin{bmatrix}p_1\\p_2\\ \vdots \\p_N\end{bmatrix}
|
|
\end{eqnarray}\]</span></p>
|
|
<p>Provided the dataset is small enough, this system can be transformed to the normal equation
<span class="math">\(\mathbf{X}^T\mathbf y = \mathbf{X}^T\mathbf X \mathbf p\)</span>, which can be solved efficiently with the
Cholesky decomposition (do not use matrix inversion!).</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">Vector&lt;<span class="k">double</span>&gt; p <span class="o">=</span> MultipleRegression.NormalEquations(X, y);
</code></pre></td></tr></table>
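<p>As a short derivation sketch, using the same symbols as above: the normal equation is what we get when we
minimize the sum of squared residuals <span class="math">\(S(\mathbf p)\)</span> (a symbol introduced here for
illustration) and set its gradient with respect to <span class="math">\(\mathbf p\)</span> to zero.</p>
<p><span class="math">\[S(\mathbf p) = \lVert \mathbf y - \mathbf X \mathbf p \rVert^2 = (\mathbf y - \mathbf X \mathbf p)^T (\mathbf y - \mathbf X \mathbf p)\]</span></p>
<p><span class="math">\[\nabla_{\mathbf p} S = -2\,\mathbf{X}^T (\mathbf y - \mathbf X \mathbf p) = \mathbf 0\]</span></p>
<p><span class="math">\[\mathbf{X}^T\mathbf y = \mathbf{X}^T\mathbf X \mathbf p\]</span></p>
<p>If the columns of <span class="math">\(\mathbf X\)</span> are linearly independent, <span class="math">\(\mathbf{X}^T\mathbf X\)</span>
is symmetric and positive definite, which is the property the Cholesky decomposition relies on.</p>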
|
|
<p>Using normal equations is comparatively fast, as it can dramatically reduce the size of the linear algebra problem
to be solved, but that comes at the cost of reduced precision, since forming <span class="math">\(\mathbf{X}^T\mathbf X\)</span>
squares the condition number of the problem. If you need more precision, try using
<code>MultipleRegression.QR</code> or <code>MultipleRegression.Svd</code> instead, with the same arguments.</p>
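<p>To make the role of the predictor matrix concrete, here is a minimal sketch that builds
<span class="math">\(\mathbf X\)</span> for the line model <span class="math">\(f_1(x) = 1\)</span>,
<span class="math">\(f_2(x) = x\)</span> and passes it to all three solvers. It assumes the MathNet.Numerics package is
referenced and a compiler that supports top-level statements (C# 9 or later); the variable names are chosen for
illustration.</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">using MathNet.Numerics.LinearAlgebra;
using MathNet.Numerics.LinearRegression;

// Data of the simple line example.
double[] xdata = { 10, 20, 30 };
double[] ydata = { 15, 20, 25 };

// Predictor matrix: one row per data point j, one column per function f_i.
// Column 0 holds f_1(x_j) = 1, column 1 holds f_2(x_j) = x_j.
Matrix&lt;double&gt; X = Matrix&lt;double&gt;.Build.Dense(
    xdata.Length, 2, (j, i) =&gt; i == 0 ? 1.0 : xdata[j]);
Vector&lt;double&gt; y = Vector&lt;double&gt;.Build.Dense(ydata);

// Same arguments, three different decompositions.
Vector&lt;double&gt; pNormal = MultipleRegression.NormalEquations(X, y);
Vector&lt;double&gt; pQr = MultipleRegression.QR(X, y);
Vector&lt;double&gt; pSvd = MultipleRegression.Svd(X, y);
</code></pre></td></tr></table>
<p>Because the data lies exactly on a line, all three results should be close to the intercept 10 and slope 0.5
found earlier; differences between the methods only become visible on badly conditioned problems.</p>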
|
|
<h2><a name="Polynomial-Regression" class="anchor" href="#Polynomial-Regression">Polynomial Regression</a></h2>
|
|
<p>To fit to a polynomial we can choose the following linear model with <span class="math">\(f_i(x) := x^i\)</span>:</p>
|
|
<p><span class="math">\[y : x \mapsto p_0 + p_1 x + p_2 x^2 + \cdots + p_N x^N\]</span></p>
|
|
<p>The predictor matrix of this model is the <a href="https://en.wikipedia.org/wiki/Vandermonde_matrix">Vandermonde matrix</a>.
|
|
There is a special function in the <code>Fit</code> class for regressions to a polynomial,
|
|
but note that regression to high order polynomials is numerically problematic.</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span>[] p <span class="o">=</span> Fit.Polynomial(xdata, ydata, <span class="n">3</span>); <span class="c">// polynomial of order 3</span>
</code></pre></td></tr></table>
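<p>The returned array holds the coefficients in ascending order, <span class="math">\(p_0, p_1, \ldots, p_N\)</span>,
matching the model above. As a minimal sketch of how such an array can be evaluated at a point, the hypothetical
helper below (not part of the library) applies Horner's scheme:</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">// Hypothetical helper, not part of Math.NET Numerics.
public static class PolynomialModel
{
    // Evaluates p[0] + p[1]*x + ... + p[N]*x^N using Horner's scheme,
    // which avoids computing the powers of x explicitly.
    public static double Evaluate(double[] p, double x)
    {
        double result = 0.0;
        for (int i = p.Length - 1; i &gt;= 0; i--)
        {
            result = result * x + p[i];
        }
        return result;
    }
}
</code></pre></td></tr></table>
<p>For example, <code>PolynomialModel.Evaluate(new[] { 1.0, 2.0, 3.0 }, 2.0)</code> evaluates
<span class="math">\(1 + 2 \cdot 2 + 3 \cdot 2^2\)</span> and returns 17.</p>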
|
|
<h2><a name="Multiple-Regression" class="anchor" href="#Multiple-Regression">Multiple Regression</a></h2>
|
|
<p>The <span class="math">\(x\)</span> in the linear model can also be a vector <span class="math">\(\mathbf x = [x^{(1)}\; x^{(2)} \cdots x^{(k)}]\)</span>
|
|
and the arbitrary functions <span class="math">\(f_i(\mathbf x)\)</span> can accept vectors instead of scalars.</p>
|
|
<p>If we use <span class="math">\(f_i(\mathbf x) := x^{(i)}\)</span> and add an intercept term <span class="math">\(f_0(\mathbf x) := 1\)</span>
|
|
we end up at the simplest form of ordinary multiple regression:</p>
|
|
<p><span class="math">\[y : \mathbf x \mapsto p_0 + p_1 x^{(1)} + p_2 x^{(2)} + \cdots + p_k x^{(k)}\]</span></p>
|
|
<p>For example, for the data points <span class="math">\((\mathbf{x}_j = [x^{(1)}_j\; x^{(2)}_j], y_j)\)</span> with values
|
|
<code>([1,4],15)</code>, <code>([2,5],20)</code> and <code>([3,2],10)</code> we can evaluate the best fitting parameters with:</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span>[] p <span class="o">=</span> Fit.MultiDim(
    <span class="k">new</span>[] {<span class="k">new</span>[] { <span class="n">1.0</span>, <span class="n">4.0</span> }, <span class="k">new</span>[] { <span class="n">2.0</span>, <span class="n">5.0</span> }, <span class="k">new</span>[] { <span class="n">3.0</span>, <span class="n">2.0</span> }},
    <span class="k">new</span>[] { <span class="n">15.0</span>, <span class="n">20</span>, <span class="n">10</span> },
    intercept: <span class="k">true</span>);
</code></pre></td></tr></table>
|
|
<p>The <code>Fit.MultiDim</code> routine uses normal equations, but you can always choose to explicitly use e.g.
|
|
the QR decomposition for more precision by using the <code>MultipleRegression</code> class directly:</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span>[] p <span class="o">=</span> MultipleRegression.QR(
    <span class="k">new</span>[] {<span class="k">new</span>[] { <span class="n">1.0</span>, <span class="n">4.0</span> }, <span class="k">new</span>[] { <span class="n">2.0</span>, <span class="n">5.0</span> }, <span class="k">new</span>[] { <span class="n">3.0</span>, <span class="n">2.0</span> }},
    <span class="k">new</span>[] { <span class="n">15.0</span>, <span class="n">20</span>, <span class="n">10</span> },
    intercept: <span class="k">true</span>);
</code></pre></td></tr></table>
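<p>With <code>intercept: true</code> the returned array starts with the intercept <span class="math">\(p_0\)</span>,
followed by one coefficient per component of <span class="math">\(\mathbf x\)</span>. A minimal sketch of applying
such an array to a new point could look as follows; <code>MultipleRegressionModel.Predict</code> is a hypothetical
helper written for this illustration, not a library function.</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">using System;

// Hypothetical helper, not part of Math.NET Numerics.
public static class MultipleRegressionModel
{
    // p[0] is the intercept, p[i + 1] is the coefficient of x[i].
    public static double Predict(double[] p, double[] x)
    {
        if (p.Length != x.Length + 1)
            throw new ArgumentException("Expected one parameter per component plus an intercept.");

        double y = p[0];
        for (int i = 0; i &lt; x.Length; i++)
        {
            y += p[i + 1] * x[i];
        }
        return y;
    }
}
</code></pre></td></tr></table>
<p>The three data points above determine the three parameters exactly; solving the system by hand gives
<span class="math">\(p_0 = -1.25\)</span>, <span class="math">\(p_1 = 1.25\)</span> and <span class="math">\(p_2 = 3.75\)</span>,
so <code>Predict(new[] { -1.25, 1.25, 3.75 }, new[] { 1.0, 4.0 })</code> reproduces the first response value, 15.</p>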
|
|
<h2><a name="Arbitrary-Linear-Combination" class="anchor" href="#Arbitrary-Linear-Combination">Arbitrary Linear Combination</a></h2>
|
|
<p>In multiple regression, the functions <span class="math">\(f_i(\mathbf x)\)</span> can also operate on the whole
|
|
vector or mix its components arbitrarily and apply any functions on them, provided they are
|
|
defined at all the data points. For example, let's have a look at the following complicated but still linear
|
|
model in two dimensions:</p>
|
|
<p><span class="math">\[z : (x, y) \mapsto p_0 + p_1 \mathrm{tanh}(x) + p_2 \psi(x y) + p_3 x^y\]</span></p>
|
|
<p>Since we map (x,y) to (z) we need to organize the tuples in two arrays:</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span>[][] xy <span class="o">=</span> <span class="k">new</span>[] { <span class="k">new</span>[]{x<span class="n">1</span>,y<span class="n">1</span>}, <span class="k">new</span>[]{x<span class="n">2</span>,y<span class="n">2</span>}, <span class="k">new</span>[]{x<span class="n">3</span>,y<span class="n">3</span>}, <span class="o">.</span><span class="o">.</span><span class="o">.</span> };
<span class="k">double</span>[] z <span class="o">=</span> <span class="k">new</span>[] { z<span class="n">1</span>, z<span class="n">2</span>, z<span class="n">3</span>, <span class="o">.</span><span class="o">.</span><span class="o">.</span> };
</code></pre></td></tr></table>
|
|
<p>Then we can call <code>Fit.LinearMultiDim</code> with our model, which will return an array with the four best-fitting parameters <span class="math">\(p_0, p_1, p_2, p_3\)</span>:</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span>[] p <span class="o">=</span> Fit.LinearMultiDim(xy, z,
    d <span class="o">=</span><span class="o">&gt;</span> <span class="n">1.0</span>, <span class="c">// p0*1.0</span>
    d <span class="o">=</span><span class="o">&gt;</span> Math.Tanh(d[<span class="n">0</span>]), <span class="c">// p1*tanh(x)</span>
    d <span class="o">=</span><span class="o">&gt;</span> SpecialFunctions.DiGamma(d[<span class="n">0</span>]*d[<span class="n">1</span>]), <span class="c">// p2*psi(x*y)</span>
    d <span class="o">=</span><span class="o">&gt;</span> Math.Pow(d[<span class="n">0</span>], d[<span class="n">1</span>])); <span class="c">// p3*x^y</span>
</code></pre></td></tr></table>
|
|
<h2><a name="Evaluating-the-model-at-specific-data-points" class="anchor" href="#Evaluating-the-model-at-specific-data-points">Evaluating the model at specific data points</a></h2>
|
|
<p>Let's say we have the following model:</p>
|
|
<p><span class="math">\[y : x \mapsto a + b \ln x\]</span></p>
|
|
<p>For this case we can use the <code>Fit.LinearCombination</code> function:</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">double</span>[] p <span class="o">=</span> Fit.LinearCombination(
    <span class="k">new</span>[] {<span class="n">61.0</span>, <span class="n">62.0</span>, <span class="n">63.0</span>, <span class="n">65.0</span>},
    <span class="k">new</span>[] {<span class="n">3.6</span>, <span class="n">3.8</span>, <span class="n">4.8</span>, <span class="n">4.1</span>},
    x <span class="o">=</span><span class="o">&gt;</span> <span class="n">1.0</span>,
    x <span class="o">=</span><span class="o">&gt;</span> Math.Log(x)); <span class="c">// -34.481, 9.316</span>
</code></pre></td></tr></table>
|
|
<p>In order to evaluate the resulting model at specific data points we can manually apply
|
|
the values of p to the model function, or we can use an alternative function with the <code>Func</code>
|
|
suffix that returns a function instead of the model parameters. The returned function
|
|
can then be used to evaluate the parametrized model:</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">Func&lt;<span class="k">double</span>,<span class="k">double</span>&gt; f <span class="o">=</span> Fit.LinearCombinationFunc(
    <span class="k">new</span>[] {<span class="n">61.0</span>, <span class="n">62.0</span>, <span class="n">63.0</span>, <span class="n">65.0</span>},
    <span class="k">new</span>[] {<span class="n">3.6</span>, <span class="n">3.8</span>, <span class="n">4.8</span>, <span class="n">4.1</span>},
    x <span class="o">=</span><span class="o">&gt;</span> <span class="n">1.0</span>,
    x <span class="o">=</span><span class="o">&gt;</span> Math.Log(x));
f(<span class="n">66.0</span>); <span class="c">// 4.548</span>
</code></pre></td></tr></table>
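<p>For comparison, the manual route mentioned above might look like the following sketch. It assumes the
MathNet.Numerics package is referenced and a compiler with top-level statements (C# 9 or later); the names
<code>model</code> and <code>y66</code> are chosen for illustration.</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">using System;
using MathNet.Numerics;

// Fit the parameters a = p[0] and b = p[1] of y = a + b*ln(x).
double[] p = Fit.LinearCombination(
    new[] { 61.0, 62.0, 63.0, 65.0 },
    new[] { 3.6, 3.8, 4.8, 4.1 },
    x =&gt; 1.0,
    x =&gt; Math.Log(x));

// Manually apply the fitted parameters to the model function.
Func&lt;double, double&gt; model = x =&gt; p[0] + p[1] * Math.Log(x);
double y66 = model(66.0); // should agree with f(66.0) from the Func variant
</code></pre></td></tr></table>
<p>The <code>Func</code> variant saves us from repeating the model definition, which avoids the two getting out of sync.</p>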
|
|
<h2><a name="Linearizing-non-linear-models-by-transformation" class="anchor" href="#Linearizing-non-linear-models-by-transformation">Linearizing non-linear models by transformation</a></h2>
|
|
<p>Sometimes it is possible to transform a non-linear model into a linear one.
|
|
For example, the following power function</p>
|
|
<p><span class="math">\[z : (x, y) \mapsto u x^v y^w\]</span></p>
|
|
<p>can be transformed into the following linear model with <span class="math">\(\hat{z} = \ln z\)</span> and <span class="math">\(t = \ln u\)</span></p>
|
|
<p><span class="math">\[\hat{z} : (x, y) \mapsto t + v \ln x + w \ln y\]</span></p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">var</span> xy <span class="o">=</span> <span class="k">new</span>[] {<span class="k">new</span>[] { <span class="n">1.0</span>, <span class="n">4.0</span> }, <span class="k">new</span>[] { <span class="n">2.0</span>, <span class="n">5.0</span> }, <span class="k">new</span>[] { <span class="n">3.0</span>, <span class="n">2.0</span> }};
<span class="k">var</span> z <span class="o">=</span> <span class="k">new</span>[] { <span class="n">15.0</span>, <span class="n">20</span>, <span class="n">10</span> };

<span class="k">var</span> z_hat <span class="o">=</span> z.Select(r <span class="o">=</span><span class="o">&gt;</span> Math.Log(r)).ToArray(); <span class="c">// transform z_hat = ln(z)</span>
<span class="k">double</span>[] p_hat <span class="o">=</span> Fit.LinearMultiDim(xy, z_hat,
    d <span class="o">=</span><span class="o">&gt;</span> <span class="n">1.0</span>,
    d <span class="o">=</span><span class="o">&gt;</span> Math.Log(d[<span class="n">0</span>]),
    d <span class="o">=</span><span class="o">&gt;</span> Math.Log(d[<span class="n">1</span>]));
<span class="k">double</span> u <span class="o">=</span> Math.Exp(p_hat[<span class="n">0</span>]); <span class="c">// transform t = ln(u)</span>
<span class="k">double</span> v <span class="o">=</span> p_hat[<span class="n">1</span>];
<span class="k">double</span> w <span class="o">=</span> p_hat[<span class="n">2</span>];
</code></pre></td></tr></table>
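<p>Once <span class="math">\(u\)</span>, <span class="math">\(v\)</span> and <span class="math">\(w\)</span> are recovered,
predictions are made with the original power function, not the linearized one. The sketch below uses a hypothetical
<code>PowerModel</code> helper to show this. Keep in mind that the logarithms require
<span class="math">\(x\)</span>, <span class="math">\(y\)</span> and <span class="math">\(z\)</span> to be positive, and that
the fit minimizes the squared error of <span class="math">\(\ln z\)</span> rather than of <span class="math">\(z\)</span> itself.</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">using System;

// Hypothetical helper, not part of Math.NET Numerics.
public static class PowerModel
{
    // Evaluates the original (non-linearized) model z = u * x^v * y^w.
    public static double Evaluate(double u, double v, double w, double x, double y)
    {
        return u * Math.Pow(x, v) * Math.Pow(y, w);
    }
}
</code></pre></td></tr></table>
<p>As a quick check of the formula, <code>PowerModel.Evaluate(2.0, 1.0, 1.0, 3.0, 4.0)</code> computes
<span class="math">\(2 \cdot 3 \cdot 4\)</span> and returns 24.</p>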
|
|
<h2><a name="Weighted-Regression" class="anchor" href="#Weighted-Regression">Weighted Regression</a></h2>
|
|
<p>Sometimes the regression error can be reduced by dampening specific data points.
|
|
We can achieve this by introducing a weight matrix <span class="math">\(W\)</span> into the normal equations
|
|
<span class="math">\(\mathbf{X}^T\mathbf{y} = \mathbf{X}^T\mathbf{X}\mathbf{p}\)</span>. Such weight matrices
|
|
are often diagonal, with a separate weight for each data point on the diagonal.</p>
|
|
<p><span class="math">\[\mathbf{X}^T\mathbf{W}\mathbf{y} = \mathbf{X}^T\mathbf{W}\mathbf{X}\mathbf{p}\]</span></p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="k">var</span> p <span class="o">=</span> WeightedRegression.Weighted(X,y,W);
</code></pre></td></tr></table>
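<p>As a minimal sketch of how the arguments could be assembled for a line model with a diagonal weight matrix,
consider the following. The data and weights are made up purely for illustration (the last point plays the role of a
suspected outlier that we dampen); the sketch assumes the MathNet.Numerics package is referenced and a compiler with
top-level statements (C# 9 or later).</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">using MathNet.Numerics.LinearAlgebra;
using MathNet.Numerics.LinearRegression;

// Made-up illustration data: the last point deviates from the line y = 10 + 0.5*x.
double[] xdata = { 10, 20, 30, 40 };
double[] ydata = { 15, 20, 25, 55 };
double[] weights = { 1.0, 1.0, 1.0, 0.1 }; // dampen the last data point

// Predictor matrix with an explicit intercept column of ones.
Matrix&lt;double&gt; X = Matrix&lt;double&gt;.Build.Dense(
    xdata.Length, 2, (j, i) =&gt; i == 0 ? 1.0 : xdata[j]);
Vector&lt;double&gt; y = Vector&lt;double&gt;.Build.Dense(ydata);

// Diagonal weight matrix with one weight per data point.
Matrix&lt;double&gt; W = Matrix&lt;double&gt;.Build.DenseOfDiagonalArray(weights);

Vector&lt;double&gt; p = WeightedRegression.Weighted(X, y, W); // p[0] intercept, p[1] slope
</code></pre></td></tr></table>
<p>Setting all weights to 1 makes <span class="math">\(\mathbf W\)</span> the identity matrix, in which case the weighted
normal equations reduce to the ordinary ones.</p>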
|
|
<p>Weighted regression becomes interesting if we can adapt the weights to the point of interest
and e.g. dampen all data points far away from it. Unfortunately, this way the model parameters
depend on the point of interest <span class="math">\(t\)</span>.</p>
|
|
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp"><span class="c">// warning: preliminary api</span>
<span class="k">var</span> p <span class="o">=</span> WeightedRegression.Local(X,y,t,radius,kernel);
</code></pre></td></tr></table>
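<p>To illustrate the idea of weights that depend on the point of interest, the sketch below computes one weight per
scalar data point from a Gaussian kernel of the distance to <span class="math">\(t\)</span>, scaled by a radius.
<code>LocalWeights.Gaussian</code> is a hypothetical helper in plain C#; it is one possible choice of kernel and is not
meant to describe what <code>WeightedRegression.Local</code> does internally.</p>
<table class="pre"><tr><td class="snippet"><pre class="fssnip highlighted"><code lang="csharp">using System;
using System.Linq;

// Hypothetical helper, not part of Math.NET Numerics.
public static class LocalWeights
{
    // Returns exp(-d^2 / 2) for each point, where d = (x_j - t) / radius.
    // Points at t get weight 1; the weight decays towards 0 with distance.
    public static double[] Gaussian(double[] x, double t, double radius)
    {
        if (radius &lt;= 0.0)
            throw new ArgumentOutOfRangeException(nameof(radius), "Radius must be positive.");

        return x.Select(xj =&gt;
        {
            double d = (xj - t) / radius;
            return Math.Exp(-0.5 * d * d);
        }).ToArray();
    }
}
</code></pre></td></tr></table>
<p>The resulting array can be placed on the diagonal of <span class="math">\(\mathbf W\)</span>; since it has to be
recomputed for every new <span class="math">\(t\)</span>, so do the model parameters.</p>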
|
|
<h2><a name="Regularization" class="anchor" href="#Regularization">Regularization</a></h2>
|
|
<h2><a name="Iterative-Methods" class="anchor" href="#Iterative-Methods">Iterative Methods</a></h2>
|
|
|
|
|
|
</div>
|
|
<div class="span3">
|
|
<ul class="nav nav-list" id="menu">
|
|
|
|
<li class="nav-header">Math.NET Numerics</li>
|
|
<li><a href="https://numerics.mathdotnet.com/Packages.html">NuGet & Binaries</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/ReleaseNotes.html">Release Notes</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/License.html">MIT/X11 License</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/Compatibility.html">Platform Support</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/api/">Class Reference</a></li>
|
|
<li><a href="https://github.com/mathnet/mathnet-numerics/issues">Issues & Bugs</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/Users.html">Who is using Math.NET?</a></li>
|
|
|
|
<li class="nav-header">Contributing</li>
|
|
<li><a href="https://numerics.mathdotnet.com/Contributors.html">Contributors</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/Contributing.html">Contributing</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/Build.html">Build & Tools</a></li>
|
|
<li><a href="http://feedback.mathdotnet.com/forums/2060-math-net-numerics">Your Ideas</a></li>
|
|
|
|
<li class="nav-header">Getting Help</li>
|
|
<li><a href="https://discuss.mathdotnet.com/c/numerics">Discuss</a></li>
|
|
<li><a href="https://stackoverflow.com/questions/tagged/mathdotnet">Stack Overflow</a></li>
|
|
|
|
<li class="nav-header">Getting Started</li>
|
|
<li><a href="https://numerics.mathdotnet.com/">Getting started</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/Constants.html">Constants</a></li>
|
|
<li>Floating-Point Numbers</li>
|
|
<li>Arbitrary Precision Numbers</li>
|
|
<li>Complex Numbers</li>
|
|
<li><a href="https://numerics.mathdotnet.com/Matrix.html">Matrices and Vectors</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/Euclid.html">Euclid & Number Theory</a></li>
|
|
<li>Combinatorics</li>
|
|
|
|
<li class="nav-header">Evaluation</li>
|
|
<li><a href="https://numerics.mathdotnet.com/Functions.html">Special Functions</a></li>
|
|
<li>Differentiation</li>
|
|
<li><a href="https://numerics.mathdotnet.com/Integration.html">Integration</a></li>
|
|
|
|
<li class="nav-header">Statistics/Probability</li>
|
|
<li><a href="https://numerics.mathdotnet.com/DescriptiveStatistics.html">Descriptive Statistics</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/Probability.html">Probability Distributions</a></li>
|
|
|
|
<li class="nav-header">Generation</li>
|
|
<li><a href="https://numerics.mathdotnet.com/Generate.html">Generating Data</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/Random.html">Random Numbers</a></li>
|
|
|
|
<li class="nav-header">Transformation</li>
|
|
<li>Fourier Transform (FFT)</li>
|
|
<li>Filtering & DSP</li>
|
|
<li>Window Functions</li>
|
|
|
|
<li class="nav-header">Solving Equations</li>
|
|
<li><a href="https://numerics.mathdotnet.com/LinearEquations.html">Linear Equation Systems</a></li>
|
|
<li>Nonlinear Root Finding</li>
|
|
|
|
<li class="nav-header">Optimization</li>
|
|
<li>Linear Least Squares</li>
|
|
<li>Nonlinear Optimization</li>
|
|
<li><a href="https://numerics.mathdotnet.com/Distance.html">Distance Metrics</a></li>
|
|
|
|
<li class="nav-header">Curve Fitting</li>
|
|
<li><a href="https://numerics.mathdotnet.com/Regression.html">Regression</a></li>
|
|
<li>Interpolation</li>
|
|
<li>Fourier Approximation</li>
|
|
|
|
<li class="nav-header">Native Providers</li>
|
|
<li><a href="https://numerics.mathdotnet.com/MKL.html">Intel MKL</a></li>
|
|
|
|
<li class="nav-header">Working Together</li>
|
|
<li><a href="https://numerics.mathdotnet.com/CSV.html">Delimited Text Files (CSV)</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/MatrixMarket.html">NIST MatrixMarket</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/MatlabFiles.html">MATLAB</a></li>
|
|
<li><a href="https://numerics.mathdotnet.com/IFSharpNotebook.html">IF# Notebook</a></li>
|
|
<li>FsLab & Deedle</li>
|
|
<li>Microsoft Excel</li>
|
|
<li>numl.net machine learning</li>
|
|
<li>R-project</li>
|
|
|
|
</ul>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</body>
|
|
</html>
|
|
|