Multiple Linear Regression

Date: 01/23/97 at 08:21:22
From: dulud
Subject: Multiple linear regression

Bonjour from Paris,

I am looking for the complete set of formulas for multiple linear
regression. Can you help me?

Thanks

Date: 01/27/97 at 09:10:37
From: Doctor Mitteldorf
Subject: Re: multiple linear regression
Greetings from Philadelphia!
If you understand single-variable linear regression, then multiple
regression is just the same thing with matrices and vectors where you
had numbers before.
Here are the formulas, first for single variable:
Say you have a collection of points (x,y), and you want the best line
through them. The line will be:
y = ax + b
where a = (<xy>-<x><y>) / (<x^2>-<x>^2)
and b = <y> - a<x>
The correlation coefficient r is given by:
r = (<xy>-<x><y>) / sqrt{ (<x^2>-<x>^2) * (<y^2>-<y>^2) }
In the above, the notation <xy> means "average value of xy": in other
words, for each point, multiply x for that point times y for that
point, add up all the products, and divide by the number of points.
Similarly, <x^2> is the mean value of x^2. You'll recognize the
denominator of the expression for a as the variance of x. So you
could rewrite the formulas as:
a = (<xy>-<x><y>) / var(x)
r = a * sqrt{ var(x) / var(y) }
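
As a rough illustration, the single-variable formulas above can be
sketched in Python. The function names and the sample points are made
up for this example; the points lie exactly on y = 2x + 1, so the fit
should recover a = 2, b = 1, and r = 1.

```python
from math import sqrt

def mean(values):
    """Average of a sequence of numbers: the <...> in the formulas."""
    values = list(values)
    return sum(values) / len(values)

def simple_regression(xs, ys):
    """Return (a, b, r) for the best-fit line y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    cov_xy = mean(x * y for x, y in zip(xs, ys)) - mx * my  # <xy> - <x><y>
    var_x = mean(x * x for x in xs) - mx * mx               # <x^2> - <x>^2
    var_y = mean(y * y for y in ys) - my * my               # <y^2> - <y>^2
    a = cov_xy / var_x
    b = my - a * mx
    r = cov_xy / sqrt(var_x * var_y)
    return a, b, r

# Hypothetical sample points lying exactly on y = 2x + 1
a, b, r = simple_regression([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b, r)  # prints: 2.0 1.0 1.0
```

Note that the sketch divides by var(x) and by sqrt(var(x)*var(y)), so
it assumes the x values are not all equal and the y values are not all
equal.
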
Now for the multivariate version of the formulas, you must think of
x as a vector, but y is still a scalar: y is a function of multiple
variables, which together make up the vector x. I'll use capital
letters for vectors (so x becomes X) and "." for the dot product of
two vectors:
A.X means A[1]*X[1] + A[2]*X[2] + ...
We're still looking for a linear relationship between x and y, and
now it's of the form y = A.X + b. Since X is a vector of n numbers,
we look for n coefficients of proportionality, and make scalar a into
vector A.
In the formula for A, the numerator becomes:
(<Xy>-<X><y>)
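This is easy to interpret. X is a vector, y is a scalar. For each
point, every component of X is multiplied by the scalar y, and the
products are averaged over all the points. The i-th component of the
numerator is <X[i]y> - <X[i]><y>.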
But the denominator takes a little more thought. What do we mean by:
(<XX> - <X><X>)
This is a second-rank tensor, which looks like a square matrix. If X
has n components, then <XX> has n^2 components. The (i,j) component of
this object is <X[i]X[j]>, made by averaging the product X[i]*X[j]
over all the points in your sample. <X><X> is the matrix that you make
by multiplying out all possible pairs of components of the mean
vector <X>. The (i,j) component
of <X><X> is given by <X[i]><X[j]>; in other words, separately average
the X[i] components for all points and the X[j] components for all
points, then just multiply those two together.
<XX> and <X><X> are both matrices. Subtract one from the other to
get the "denominator" matrix corresponding to var(X).
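
To make the construction concrete, here is a small Python sketch that
builds this matrix component by component. The names mean_vector and
covariance_matrix are my own labels (this matrix is usually called the
covariance matrix of X), and the four sample points are invented for
the example.

```python
def mean_vector(points):
    """Return <X>: the average of each component over all points."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dim)]

def covariance_matrix(points):
    """Return M with M[i][j] = <X[i]X[j]> - <X[i]><X[j]>."""
    n = len(points)
    dim = len(points[0])
    mx = mean_vector(points)
    return [
        [sum(p[i] * p[j] for p in points) / n - mx[i] * mx[j]
         for j in range(dim)]
        for i in range(dim)
    ]

# Hypothetical sample: the four corners of the unit square
points = [(0, 0), (1, 0), (0, 1), (1, 1)]
M = covariance_matrix(points)
for row in M:
    print(row)
```

For these points each component has variance 0.25 and the two
components are uncorrelated, so the off-diagonal entries come out as
zero. The matrix is always symmetric, since X[i]*X[j] = X[j]*X[i].
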
Then you must "divide" this matrix into the numerator vector. The
way to do this is to invert the matrix, then multiply. Symbolically,
you could write the steps this way:
Let vector V = (<Xy>-<X><y>)
Let matrix M = (<XX>-<X><X>)
Then let vector A = Inv(M) * V
Also, r^2 = (A.V) / var(y), where var(y) = <y^2>-<y>^2
The inverse of the matrix M is another matrix. The product of that
matrix with a vector is another vector.
Finally, b is just a scalar, and the formula for b is just as before,
with A and X becoming vectors:
b = <y> - A.<X>
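
Putting the steps together, a minimal Python sketch of the whole
procedure might look like the following. A few things here are my
assumptions rather than part of the formulas above: the function names
are invented, the sample data is made up (generated from
y = 3*X[1] - 2*X[2] + 5), and instead of forming Inv(M) explicitly the
code solves M*A = V by Gaussian elimination, which gives the same A.
The multiple r^2 is computed as A.V / var(y), which reduces to the
single-variable r^2 when X has one component.

```python
def solve(M, V):
    """Solve M*A = V by Gaussian elimination with partial pivoting."""
    n = len(V)
    aug = [list(row) + [v] for row, v in zip(M, V)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("M is singular: X components are dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    A = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(aug[i][j] * A[j] for j in range(i + 1, n))
        A[i] = (aug[i][n] - s) / aug[i][i]
    return A

def multiple_regression(points, ys):
    """Fit y = A.X + b and return (A, b, r_squared)."""
    n, dim = len(points), len(points[0])
    mean_x = [sum(p[i] for p in points) / n for i in range(dim)]
    mean_y = sum(ys) / n
    # V[i] = <X[i]y> - <X[i]><y>
    V = [sum(p[i] * y for p, y in zip(points, ys)) / n - mean_x[i] * mean_y
         for i in range(dim)]
    # M[i][j] = <X[i]X[j]> - <X[i]><X[j]>
    M = [[sum(p[i] * p[j] for p in points) / n - mean_x[i] * mean_x[j]
          for j in range(dim)] for i in range(dim)]
    A = solve(M, V)                                   # same as Inv(M) * V
    b = mean_y - sum(a_i * m_i for a_i, m_i in zip(A, mean_x))
    var_y = sum(y * y for y in ys) / n - mean_y ** 2  # <y^2> - <y>^2
    r_squared = sum(a_i * v_i for a_i, v_i in zip(A, V)) / var_y
    return A, b, r_squared

# Hypothetical data generated from y = 3*X[1] - 2*X[2] + 5
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [5, 8, 3, 6, 9]
A, b, r_squared = multiple_regression(points, ys)
print(A, b, r_squared)
```

Because the sample data is exactly linear, A should come out very
close to [3, -2], b close to 5, and r^2 close to 1, up to
floating-point rounding. If two components of X are perfectly
correlated, M has no inverse and the sketch raises an error.
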
I hope this helps. Don't hesitate to write again if any part is still
not clear.
-Doctor Mitteldorf, The Math Forum
Ask Dr. Math(TM)
© 1994-2011 The Math Forum
http://mathforum.org/dr.math/