© 2024 Khan Academy

# Proof (part 4) minimizing squared error to regression line

Proof (Part 4) Minimizing Squared Error to Regression Line. Created by Sal Khan.

## Video transcript

So if you've gotten this far,
you've been waiting for several videos to get to the
optimal line that minimizes the squared distance to
all of those points. So let's just get to
the punch line. Let's solve for the
optimal m and b. And just based on what we did
in the last videos, there's two ways to do that. We actually now know two points
that lie on that line. So we can literally find the
slope of that line and then the y-intercept,
the b there. Or, we could just say
it's the solution to this system of equations. And they're actually
mathematically equivalent. So let's solve for m first. And
if we want to solve for m, we want to cancel out the b's. So let me rewrite this top
equation just the way it's written over here. We have m times the mean of the
x squareds plus b times the mean of-- Actually,
we could even do it better than that. One step better than that is to,
based on the work we did in the last video, we can just
subtract this bottom equation from this top equation. So let me subtract it. Or let's add the negatives. So if I make this negative,
this is negative. This is negative. What do we get? We get m times the mean of the
x's minus the mean of the x squareds over the mean of x. The plus b and the negative
b cancel out. Is equal to the mean of the y's
minus the mean of the xy's over the mean of the x's. And then, we can divide both
sides of the equation by this. And so we get m is equal to the
mean of the y's minus the mean of the xy's over the mean
of the x's over this. The mean of the x's minus the
mean of the x squareds over the mean of the x's. Now notice, this is the exact
same thing that you would get if you found the slope between
these two points over here. Change in y, so the difference
between that y and that y, is that right over there. Over the change in x's. The change in that
x minus that x is exactly this over here. Now, to simplify it, we can
multiply both the numerator and the denominator by
the mean of the x's. And I do that just so we don't
have this in the denominator both places. So if we multiply the numerator
by the mean of the x's, we get the mean of the x's
times the mean of the y's minus, this and this will
cancel out, minus the mean of the xy's. All of that over, mean of the
x's times the mean of the x's is just going to be the mean of
the x's, that whole thing squared, minus, over here, you have the mean
of the x squareds. And that's what we get for m. And if we want to solve for
b, we literally can just substitute back into either
equation, but this equation right here is simpler. And so if we wanted to solve for
b there, we can solve for b in terms of m. We just subtract m times the
mean of x's from both sides. We get b is equal to the mean
of the y's minus m times the mean of the x's. So what you do is you take
your data points. You find the mean of the x's,
the mean of the y's, the mean of the xy's, the mean
of the x squareds. You find your m. Once you find your m, then you
can substitute back in here and you find your b. And then you have your
actual optimal line. And we're done. So these are the two big formula
takeaways for our optimal line. What I'm going to do in the next
video, and this is where if anyone was skipping up to
this point, the next video is where they should re-engage,
because we're actually going to use these
formulas for the best fitting line. At least, when you measure the
error by the squared distances from the points. We're going to use these
formulas to actually find the best line for some data.