Introduction 
In this article I present a nice formula for the regression line through a set of points in a plane.
  
Given are points (xi, yi), where i = 1, 2, ..., n.
We seek the line y = ax + b that deviates the least from these points.

A common measure for the deviation, in the case of n points, is the sum of the squares of the differences:

$$\sum_{i=1}^{n} \bigl( y_i - (a x_i + b) \bigr)^2$$

For point i, the difference between yi and the line is yi − (a xi + b).

Before continuing, first some definitions and rules:
  
Definition
sum:

$$\sum x_i = x_1 + x_2 + \dots + x_n$$

average:

$$\bar{x} = \frac{\sum x_i}{n}$$
 
Arithmetic rules
$$\sum (x_i + y_i) = \sum x_i + \sum y_i$$

if c is a constant:

$$\sum c\,x_i = c \sum x_i \qquad \text{and} \qquad \sum c = n\,c$$

from the average we conclude:

$$\sum x_i = n\,\bar{x}$$
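As a quick numerical check of these rules (not part of the article; Python and the sample numbers are my own choices, purely for illustration):

import statistics

x = [1.0, 2.0, 4.0, 7.0]      # arbitrary sample data
y = [3.0, 5.0, 6.0, 10.0]
c = 2.5                       # an arbitrary constant
n = len(x)
x_bar = statistics.mean(x)    # the average of the x values

# sum(xi + yi) = sum(xi) + sum(yi)
assert abs(sum(a + b for a, b in zip(x, y)) - (sum(x) + sum(y))) < 1e-12
# sum(c*xi) = c * sum(xi)   and   the sum of n copies of c is n*c
assert abs(sum(c * a for a in x) - c * sum(x)) < 1e-12
assert abs(sum(c for _ in x) - n * c) < 1e-12
# sum(xi) = n * x_bar
assert abs(sum(x) - n * x_bar) < 1e-12
print("all rules hold on the sample data")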
 
The formulas for a and b of the regression line y = ax + b

Applying the rules above: the function f(a,b), the sum of the squared deviations of points 1..n, is

$$f(a,b) = \sum_{i=1}^{n} \bigl( y_i - (a x_i + b) \bigr)^2$$
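As an aside, f(a,b) translates directly into a few lines of code; this is only a sketch of my own (the function name and the example numbers are not from the article):

def sum_of_squares(a, b, xs, ys):
    """f(a,b): sum of squared deviations of the points from the line y = a*x + b"""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# example: deviation of three points from the line y = 2x + 1
print(sum_of_squares(2.0, 1.0, [0.0, 1.0, 2.0], [1.5, 2.5, 5.5]))   # prints 0.75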
f(a,b) is differentiated first with respect to a, then with respect to b.

differentiation with respect to a:

$$f'_a(a,b) = \sum 2 \bigl( y_i - (a x_i + b) \bigr) \cdot (-x_i)$$

differentiation with respect to b:

$$f'_b(a,b) = \sum 2 \bigl( y_i - (a x_i + b) \bigr) \cdot (-1)$$
For the best fit, both derivatives must be zero.
Dividing out the factor −2 and expanding, this yields the following system of equations:

$$\sum \bigl( x_i y_i - a x_i^2 - b x_i \bigr) = 0 \qquad (1)$$

$$\sum \bigl( y_i - a x_i - b \bigr) = 0 \qquad (2)$$
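Note that (1) and (2) are linear in a and b, so the system can also be solved numerically as a 2×2 linear system; a minimal sketch, assuming numpy is available (the sample points are my own invention):

import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([2.1, 3.9, 8.2, 13.8])
n = len(x)

# (1): a*sum(xi^2) + b*sum(xi) = sum(xi*yi)
# (2): a*sum(xi)   + b*n       = sum(yi)
A = np.array([[np.sum(x * x), np.sum(x)],
              [np.sum(x),     n]])
rhs = np.array([np.sum(x * y), np.sum(y)])
a, b = np.linalg.solve(A, rhs)
print("a =", a, " b =", b)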
 
from (2) we see

$$\sum y_i - a \sum x_i - b\,n = 0$$

and after dividing by n:

$$\bar{y} - a\,\bar{x} - b = 0$$

$$b = \bar{y} - a\,\bar{x} \qquad (3)$$
 
substitute the result for b into (1):

$$\sum \bigl( x_i y_i - a x_i^2 - (\bar{y} - a\,\bar{x})\, x_i \bigr) = 0$$

$$\sum \bigl( x_i y_i - a x_i^2 - \bar{y}\, x_i + a\,\bar{x}\, x_i \bigr) = 0$$

$$\sum x_i y_i - a \sum x_i^2 - \bar{y} \sum x_i + a\,\bar{x} \sum x_i = 0$$

$$\sum x_i y_i - a \Bigl( \sum x_i^2 - \bar{x} \sum x_i \Bigr) - \bar{y} \sum x_i = 0$$

$$a \Bigl( \sum x_i^2 - \bar{x} \sum x_i \Bigr) = \sum x_i y_i - \bar{y} \sum x_i$$

$$a = \frac{\sum x_i y_i - \bar{y} \sum x_i}{\sum x_i^2 - \bar{x} \sum x_i}$$
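This closed form for a, with b following from (3), is easy to evaluate directly; again a sketch of my own, using numpy and the same invented sample points as above:

import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([2.1, 3.9, 8.2, 13.8])
x_bar = x.mean()
y_bar = y.mean()

# a = (sum(xi*yi) - y_bar*sum(xi)) / (sum(xi^2) - x_bar*sum(xi))
a = (np.sum(x * y) - y_bar * np.sum(x)) / (np.sum(x * x) - x_bar * np.sum(x))
b = y_bar - a * x_bar            # equation (3)
print("a =", a, " b =", b)       # matches the solution of the 2x2 system above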
 
Formally, we have now found the formulas for a and b.
The value of a above can be substituted into (3) to obtain b.
However, with some manipulation the formula can be converted into a more elegant form.
We treat the numerator and the denominator separately.
  
1. the numerator

$$\sum x_i y_i - \bar{y} \sum x_i = \sum x_i y_i - n\,\bar{x}\,\bar{y} = \sum (x_i - \bar{x})(y_i - \bar{y})$$

since expanding the last sum gives $\sum x_i y_i - \bar{y} \sum x_i - \bar{x} \sum y_i + n\,\bar{x}\,\bar{y} = \sum x_i y_i - n\,\bar{x}\,\bar{y}$.

2. the denominator

$$\sum x_i^2 - \bar{x} \sum x_i = \sum x_i^2 - n\,\bar{x}^2 = \sum (x_i - \bar{x})^2$$

since expanding the last sum gives $\sum x_i^2 - 2\,\bar{x} \sum x_i + n\,\bar{x}^2 = \sum x_i^2 - n\,\bar{x}^2$.

summarizing:

$$a = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \qquad\qquad b = \bar{y} - a\,\bar{x}$$
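Putting the elegant form to work as a small routine (a sketch under my own naming; the comparison with numpy.polyfit is only a sanity check and not part of the article):

import numpy as np

def regression_line(x, y):
    """Return (a, b) of the least-squares line y = a*x + b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar = x.mean()
    y_bar = y.mean()
    a = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b = y_bar - a * x_bar
    return a, b

x = [1.0, 2.0, 4.0, 7.0]
y = [2.1, 3.9, 8.2, 13.8]
a, b = regression_line(x, y)
print("a =", a, " b =", b)
print(np.polyfit(x, y, 1))       # degree-1 fit returns [a, b] for the same line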
 
Note: 
please look [here] for an article about the best polynomial through a set of points.
It is a nice application of linear algebra.
   
 	