Equation of the Line
This topic is enjoyable in itself, as an exercise in doing geometry algebraically (while waiting for the airport limousine, or in any other spare moment). It is given here as background for the Correlation Coefficient, so as not to hold up the explanation at that point. You will need your pad of graph paper.
Parameters of the Straight Line
Any straight line function is one where a certain change in one variable (call it x) produces a proportionate change in the other, dependent variable (call it y). If apples (a) cost two dollars (2) each, then the price of a given number of apples will be 2a. We may put this into equation form as
y = 2a
Since not all variables are apples, we may substitute the general variable x, and write all such situations (for items costing two dollars each) as
y = 2x
And since not all prices are two dollars, we may call the general multiplier m, whence
y = mx
Finally, we may have to pay an admission or base charge to get into the store in the first place (to move at once to the general case, let us call the base charge b). Then the base charge b must be added to the price of our purchases (mx) to get our final cost. This applies even if we don't buy anything; we will then still be billed for the base charge. Our adjusted equation, the formula for which all this is true, is therefore:
y = mx + b
where, if we buy nothing (x = 0) and thus our cost for purchases is also mx = 0, we will still incur a final charge in the amount y = b, the amount of the base charge. This formula is the general equation for any straight line. In that equation, as drawn on Cartesian graph paper, the part
y = mx
shows how steeply the line tilts up (or, as it may be, down). In this simpler equation, setting x = 0 gives
y = (m)(0) = 0
which means that at the value x = 0, we also have the value y = 0. The line y = mx therefore contains the point (0, 0); that is, it goes through the origin. On the other hand, if there is a base charge b, the line is raised on the graph by the amount b, and thus cannot go through the origin. It will touch or cross the vertical axis (all the points for which x = 0) somewhere above or below the horizontal axis (the points for which y = 0). Thus if the general equation
y = mx + b
has a value of b other than zero, the line does not go through the origin. It will cross the y-axis (the line on which x = 0) at the point y = b, that is, at the point conventionally labeled (0, b).
We may mention, without examples, that a negative slope, such as m = -5, indicates a line that slopes down as it moves to the right, just as a negative value of b indicates a line that crosses the y-axis somewhere below the x-axis.
So, in our general equation, m is the slope (and has a negative value if the slope is downward) and b is what is called the y-intercept, the point at which the line crosses the y-axis (and it has a negative value if that point is below the x-axis).
The slope and the intercept are the parameters of the line: the only things we need to know in order to determine its equation. When we know those two numbers, we know everything about the line. The line is the locus of all points in the world for which the given equation is true.
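As a minimal sketch of the pricing example in Python (the function name total_cost and its default values are illustrative assumptions, not part of the original discussion), the two parameters m and b are all the function needs to know:

```python
def total_cost(x, m=2, b=0):
    """Return y = m*x + b: unit price m times quantity x, plus base charge b."""
    return m * x + b

# With no base charge, the line passes through the origin.
origin_value = total_cost(0)

# With a base charge of 2, buying nothing still costs b.
intercept_value = total_cost(0, b=2)

# Each extra item raises the cost by exactly m, whatever b happens to be.
step = total_cost(4, b=2) - total_cost(3, b=2)

print(origin_value, intercept_value, step)
```

Changing m tilts the line and changing b raises or lowers it; nothing else about the line can vary.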
Finding the Equation
In the above examples, we were given the values of m (the unit price) and b (the base charge, which raises the line by that amount). Suppose we had only some data points from the resulting line, such as
x   0   1   2   3   4   5   6
y   2   4   6   8  10  12  14
We see from the first given data point that when x = 0, y = 2, meaning that the line crosses the y-axis at 2 (when x is zero, y has the value 2). Then b must be 2. Subtracting that 2 from every y value, we have the revised table
x   0   1   2   3   4   5   6
y   0   2   4   6   8  10  12
from which it is intuitively obvious that each y value is the x value multiplied by 2, so that our multiplier m is 2. Since we previously found out that b = 2, our whole equation must be
y = 2x + 2
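The same inspection can be sketched in Python. This is only an illustration of the table method just described, and it assumes, as here, that the data happen to include the point where x = 0; the variable names are our own.

```python
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 12, 14]

# The y value recorded at x = 0 is the intercept b.
b = ys[xs.index(0)]

# Subtract b from every y value to get the adjusted table.
adjusted = [y - b for y in ys]

# For every nonzero x, the adjusted y divided by x gives the multiplier.
ratios = {ya / x for x, ya in zip(xs, adjusted) if x != 0}

# A single shared ratio means all the points lie on one straight line.
m = ratios.pop() if len(ratios) == 1 else None

print(f"y = {m}x + {b}")
```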
Two points are enough to determine a line. So, given any two data points, we could draw the line on our graph paper, determine its intercept (b) by eye, and find the multiplier by examining the adjusted values for the given points. This works well when m and b are small whole numbers, but not with big numbers (1,728) or fractional numbers (0.006) or irrational numbers (π). For those cases, we need formulas to calculate the equation parameters (the values of m and b) from the data points. Of course the same formulas will also cover the simple cases; they will cover all cases whatever. We will now derive the necessary formulas.
We need only two points. From the above list, take, arbitrarily, the points
(2,6) and (5,12)
Notice that neither of them gives us directly the point for which x = 0 (that is, the y-intercept). So we will have to calculate the y-intercept.
First, however, it is easier to extract the multiplier m. To do this, we ask: For a given change in x, what is the change in y? It is conventional to abbreviate "change in x" as Δx, and similarly for y. If we ask this question of our two data points, we find that a change from 2 to 5 in x (that is, a change of 3 in x) produces a change from 6 to 12 in y (a change of 6 in y). Then a change Δx = 3 corresponds to a change Δy = 6. To get the slope (the rate of change), we divide the change in y (the resulting change) by the change in x (the independent change), or in equation terms
m = Δy/Δx = 6/3 = 2
(Notice that by subtracting the two y values from each other, to get Dy, we eliminate b, the amount by which each y value is raised or lowered; this is what lets us get at the slope, in this way, without interference from b).
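A short Python sketch may make the cancellation of b concrete. The helper name slope is our own, and the shift of 10 is an arbitrary illustrative value: raising both points by the same amount changes b but leaves the slope alone.

```python
def slope(p, q):
    """Return (change in y) / (change in x) for points p = (x1, y1), q = (x2, y2)."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    return dy / dx

# The two points chosen in the text: change in y is 6, change in x is 3.
m = slope((2, 6), (5, 12))

# Raise both points by 10 (a different base charge b).
# The shift cancels in the subtraction, so the slope is unchanged.
shifted = slope((2, 6 + 10), (5, 12 + 10))

print(m, shifted)
```

The order of the two points does not matter, provided the same order is used for both subtractions.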
Now that we know that the slope m has the value 2, we can immediately write the desired equation as
y = 2x + b
leaving only b to be determined. To determine it, we substitute any of our point values in the equation to get the missing b. Take the point (5,12). Then:
12 = (2)(5) + b
12 = 10 + b
2 = b
Or, more conventionally, b = 2. If we had instead used the other point (2,6), we would have had:
6 = (2)(2) + b
6 = 4 + b
2 = b
giving the same answer. Either point works. Once we know m, any point on the line contains enough information to get the value of b. So we now have b = 2, and our whole equation is:
y = 2x + 2
as before. Eyeball inspection was not required; all was done by calculation. We have described a physical object (the line) without reference to anything except its numbers. This, in brief, is the Descartes (or Cartesian) revolution in mathematics, usually called Analytic Geometry. It is one of the most powerful mathematical tools ever invented.
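The two steps (slope first, then intercept by substitution) can be gathered into one small Python function. This is a sketch under our own naming (line_through is not from the original); the guard against a vertical line is our addition, since such a line has Δx = 0 and cannot be written in the form y = mx + b.

```python
def line_through(p, q):
    """Return (m, b) for the line y = m*x + b through points p and q.

    Raises ValueError for a vertical line, which has no such form.
    """
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("vertical line: change in x is zero, slope undefined")
    m = (y2 - y1) / (x2 - x1)   # slope: change in y over change in x
    b = y1 - m * x1             # rearranged from y1 = m*x1 + b
    return m, b

m, b = line_through((2, 6), (5, 12))
print(f"y = {m}x + {b}")
```

Substituting the first point to find b is an arbitrary choice; as shown above, either point gives the same intercept.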
Additional Example
Start over. We are given the two data points (-3, -2) and (2, 5). Remember that in subtracting the second y value (5) from the first y value (-2) we will have -2 - 5 = -7, and likewise for the x values, -3 - 2 = -5. Then applying our rule,
m = Δy/Δx = -7/-5 = 7/5 = 1 2/5 = 1.4
and substituting this value m = 1.4 in the less attractive of our two points, namely (-3, -2), we get
-2 = (1.4)(-3) + b
-2 = -4.2 + b
transposing, we get 4.2 - 2 = b
and so 2.2 = b
Then b = 2.2, and with the previously discovered m = 1.4, we have the complete equation
y = 1.4x + 2.2
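To check this arithmetic without any decimal rounding, here is a sketch using Python's standard fractions module. The choice of exact fractions is ours, made only so that 7/5 and 11/5 appear exactly; it is not a required part of the method.

```python
from fractions import Fraction

x1, y1 = -3, -2
x2, y2 = 2, 5

# Slope: (-7)/(-5), which Fraction reduces to 7/5.
m = Fraction(y1 - y2, x1 - x2)

# Intercept from the first point: b = y1 - m*x1 = -2 + 21/5 = 11/5.
b = y1 - m * x1

# Both given points should satisfy y = m*x + b exactly.
assert m * x1 + b == y1
assert m * x2 + b == y2

print(m, b, float(m), float(b))
```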
If we want to visualize this, we argue that, since the intercept b is positive, the line will cross the y-axis somewhere above the horizontal x-axis, and since the slope m is also positive, the line will run uphill to the right. Verify this physical interpretation by plotting the points on your graph paper and connecting them with your ruler. Your drawing should show a line that rises to the right and crosses the y-axis a little above 2.
Notice that, though the general situation is clear from your drawing, it might have been difficult, just from your drawing, to be sure of the exact decimal values of m and b.
If we don't like decimals in our equation, we could multiply all terms by 5 to get the form
5y = 7x + 11
but this does not directly reflect the slope (m = 1.4) or the intercept (b = 2.2). That is why the previous form of the equation is better: it is more intelligible. It is more transparent to the conditions (the parameters) which are generating the line.
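As a small check, sketched in Python with coefficient names A, B, C of our own choosing, both data points satisfy the whole-number form, and dividing through by the coefficient of y recovers the slope and the intercept:

```python
from fractions import Fraction

points = [(-3, -2), (2, 5)]

# Whole-number form from the text: 5y = 7x + 11, written as A*y = B*x + C.
A, B, C = 5, 7, 11

# Both given points should satisfy the whole-number form.
on_line = [A * y == B * x + C for x, y in points]

# Dividing every term by A gives back y = m*x + b.
m = Fraction(B, A)   # 7/5
b = Fraction(C, A)   # 11/5

print(on_line, float(m), float(b))
```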
Summary
- The general equation of a line is y = mx + b, where m (multiplier) is the slope of the line, and b (the base constant) is the y-intercept of the line.
- Two points determine a line, thus
- The equation of a line can be calculated from any two points on that line.
- To calculate the m or multiplier from two data points, use m = Δy/Δx.
- Then substitute one of the data points in the equation to find the base constant b.
14 Jan 2006