acclivity

pyLinearRegression

Feb 13th, 2021 (edited)
# Linear Regression Analysis

# Mike Kerry - Feb 2021 - acclivity2@gmail.com

# This was a fun project. We are given the prices of pizzas at 5 different sizes (6, 8, 10, 14 and 18 inches).
# The task is to fit a straight line through these points, finding the line that minimises the
# sum of squared differences from the actual prices.
# Having found this line, we use it to predict the price of a 12" pizza.

# I started by finding the slope of the given graph between each adjacent pair of points.
# I found the minimum and maximum slope, and I assumed that my final prediction line would lie between these slopes.

# Using these min and max slopes, I made initial "inspired guesses" at the minimum and maximum price of a 6" pizza
# on my least-sum-of-squares line.

# I then computed the sum-of-squares for 500 starting prices (between the min and max 6" pizza price) and for
# 500 slopes for each of these starting prices. I recorded the line that gave the least sum-of-squares of price errors.

# Using my best-fit line, I computed the price of a 12" pizza. This came out as $13.68 (which was the right answer!)

# All this was done without using any Python library.

# (Interestingly, this is a simplified version of the first computer program I ever wrote,
# which was a Multiple Regression Analysis written while I was at college in 1961)

x = [6, 8, 10, 14, 18]          # Sizes of pizzas in inches
y = [7, 9, 13, 17.5, 18]        # Price of pizza in $ at each of the given sizes

num = len(x)                    # Number of points in the given graph
slopes = []                     # a list of the slopes of the graph between adjacent points

for i in range(1, num):         # look at each x interval
    xdiff = x[i] - x[i-1]
    ydiff = y[i] - y[i-1]
    slopes.append(ydiff/xdiff)      # compute the slope and append to list of slopes

minslope = min(slopes)
maxslope = max(slopes)
# we will iterate between these two slope values

mid = num // 2                      # Find a pizza size and price around the middle of the range
# Apply our min and max slopes to this mid-point pizza, and hence find a min and max price for a 6" (starting) pizza
midxdif = x[mid] - x[0]
maxystart = y[mid] - (midxdif * minslope)
minystart = y[mid] - (midxdif * maxslope)
# We will iterate between these starting prices

deltastart = (maxystart - minystart) / 500          # Step in starting price (Y at x[0]): 500 steps across the range
deltaslope = (maxslope - minslope) / 500            # Step in slope: 500 steps across the range, per starting price

leastssq = float('inf')                             # This will be the least sum-of-squares value over the whole grid
bestystart = 0.0
bestslope = 0.0
# Step by integer index rather than repeatedly adding the delta, so that floating-point
# rounding cannot change how many grid values are visited
for si in range(501):                               # iterate over the start values (500 steps, both ends included)
    ystart = minystart + si * deltastart
    for sj in range(501):                           # iterate over the slope values (500 steps, both ends included)
        aslope = minslope + sj * deltaslope
        sumssq = 0.0
        for i in range(num):
            predicted_y = ystart + (x[i] - x[0]) * aslope
            sumssq += (predicted_y - y[i]) ** 2
        if sumssq < leastssq:
            leastssq = sumssq                       # record the least sum-of-squares so far
            bestystart = ystart                     # record the corresponding starting price
            bestslope = aslope                      # and the corresponding graph slope

# ----------------------------------------------------------------------------------

print('Best ystart = $%.2f   Best slope = %.2f   Least SSQ = %.2f' % (bestystart, bestslope, leastssq))
print()

# compute best fit Y values for each value of X
# (Not actually required, unless we wanted to plot this line)
bestylist = []
for j in range(num):
    dxj = x[j] - x[0]
    yj = bestystart + dxj * bestslope
    bestylist.append(yj)

# Compute prediction for x = 12  (price for 12" pizza)
px = 12
dx = px - x[0]
predict_y = bestystart + (dx * bestslope)

print('Predicted price of 12" pizza: $%.2f' % predict_y)
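
The script finds its line by brute-force search over a grid of starting prices and slopes. As a cross-check, the same least-squares line can be computed directly from the standard closed-form formulas, still without any library. The sketch below is not part of the original program: `closed_form_fit`, `sizes`, `prices` and `price_12` are names made up for illustration. Note that it returns the intercept at x = 0, whereas the script's `bestystart` is the price at x[0] (the 6" pizza).

```python
# Closed-form least-squares fit of a straight line, using no libraries.
# closed_form_fit is a hypothetical helper, not part of the original script.
def closed_form_fit(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x     # value of the line at x = 0
    return slope, intercept


sizes = [6, 8, 10, 14, 18]          # same data as the script above
prices = [7, 9, 13, 17.5, 18]
slope, intercept = closed_form_fit(sizes, prices)
price_12 = intercept + slope * 12   # price on the fitted line for a 12" pizza

print('Closed-form slope = %.4f   intercept = %.4f' % (slope, intercept))
print('Closed-form price of 12" pizza: $%.2f' % price_12)
```

On the pizza data this gives a slope of about 0.976 and a 12" price of $13.68, which agrees with the figure quoted in the header comments. The grid search can only get as close to this line as its step sizes allow.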