ML::02

Model Representation
---------------------

To establish notation for future use, we’ll use
  x^(i) to denote the “input” variables (living area in the example), also called input features, and
  y^(i) to denote the “output” or target variable that we are trying to predict (price).

Training example: A pair (x^(i), y^(i)). (Note that the superscript “(i)” in the notation is simply an index into the training set, and has nothing to do with exponentiation.)

Training set: (x^(i), y^(i)); i = 1, ..., m, where m is the number of training examples. (The dataset that we’ll be using to learn.)

X: the space of input values
Y: the space of output values. In this example, X = Y = ℝ.

Goal: given a training set, learn a function h : X → Y so that h(x) is a “good” predictor for the corresponding value of y.

Training Set --> Learning Algorithm --> (h:X->Y)

"hypothesis": the function h (named so due to historical reasons)

When the target variable is continuous: "regression problem"
When y can take on only a small number of discrete values: "classification problem"
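
The notation above can be sketched in a few lines of Python. This is only an illustration: the data values are made up, the names training_set and make_hypothesis are hypothetical, and the linear form of h is an assumption borrowed from the cost function section of these notes.

```python
# Toy training set of (x^(i), y^(i)) pairs. The values are invented for
# illustration (think: living area, price), not real data.
training_set = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# m is the number of training examples.
m = len(training_set)

# The superscript (i) is an index, not an exponent. Python lists are
# 0-indexed, so x^(1), y^(1) is training_set[0].
x_1, y_1 = training_set[0]


def make_hypothesis(theta_0, theta_1):
    """Return a hypothesis h : X -> Y (assumed linear here)."""
    def h(x):
        return theta_0 + theta_1 * x
    return h


# A learning algorithm would choose the parameters; here they are fixed by hand.
h = make_hypothesis(0.0, 2.0)
predictions = [h(x) for x, _ in training_set]
```

Because y is continuous in this toy data, it would count as a regression problem.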


Cost Function
--------------

Measure the accuracy of our hypothesis function by using a cost function.

The cost is half the mean of the squared differences between the hypothesis's predictions h(x^(i)) and the actual outputs y^(i), taken over all m training examples. (The extra factor of 1/2 is a convention that simplifies later derivative calculations.)

J(theta_0, theta_1) = (1 / (2m)) * ∑_{i=1}^{m} ( h(x^(i)) - y^(i) )^2

where h(x) = theta_0 + theta_1 * x

J: "Squared error function", or "Mean squared error"
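
A minimal sketch of J in Python, assuming the linear hypothesis h(x) = theta_0 + theta_1 * x. The function names and the toy data are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def hypothesis(theta_0, theta_1, x):
    """Linear hypothesis h(x) = theta_0 + theta_1 * x."""
    return theta_0 + theta_1 * x


def cost(theta_0, theta_1, xs, ys):
    """Squared error cost: (1 / (2m)) * sum of (h(x^(i)) - y^(i))^2."""
    m = len(xs)
    squared_errors = (
        (hypothesis(theta_0, theta_1, x) - y) ** 2 for x, y in zip(xs, ys)
    )
    return sum(squared_errors) / (2 * m)


# Toy data lying exactly on the line y = x (invented for illustration).
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

perfect_fit = cost(0.0, 1.0, xs, ys)  # every prediction matches, so J is 0
worse_fit = cost(0.0, 0.5, xs, ys)    # (0.25 + 1.0 + 2.25) / 6, about 0.583
```

The closer the line is to the data, the smaller J becomes. It reaches 0 only when every prediction equals its target.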


Objective: choose theta_0 and theta_1 that minimize J(theta_0, theta_1) over the training set (x^(i), y^(i)), i = 1, ..., m.
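
To make the objective concrete, here is a rough sketch that tries a coarse grid of candidate parameters and keeps the pair with the smallest J. Brute-force search is used purely for illustration and is not a method these notes prescribe. The data, the candidate grid, and the names cost and candidates are all assumptions.

```python
def cost(theta_0, theta_1, xs, ys):
    """Squared error cost for the linear hypothesis theta_0 + theta_1 * x."""
    m = len(xs)
    return sum((theta_0 + theta_1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Toy data generated from y = 1 + x (invented for illustration).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 3.0, 4.0]

# Candidate values -2.0, -1.5, ..., 2.0 for each parameter.
candidates = [i * 0.5 for i in range(-4, 5)]

# Evaluate J at every (theta_0, theta_1) pair and keep the minimizer.
best = min(
    ((t0, t1) for t0 in candidates for t1 in candidates),
    key=lambda t: cost(t[0], t[1], xs, ys),
)
best_cost = cost(best[0], best[1], xs, ys)
```

On this toy data the search picks theta_0 = 1.0, theta_1 = 1.0, the line the data was generated from, with a cost of 0. A grid this coarse only finds the exact minimum because the true parameters happen to lie on it.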


Terms
-----
Contour plot of cost function: a graph made up of many contour lines. A contour line of a two-variable function is a curve along which the function takes a constant value c (like isobars on a weather map). For J, every point (theta_0, theta_1) on the same contour line has the same cost.
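
The contour idea can be checked numerically without drawing anything. In this sketch two different parameter pairs give the same value of J, so they would sit on the same contour line. The toy data and the specific parameter values are assumptions picked so the arithmetic is exact.

```python
def cost(theta_0, theta_1, xs, ys):
    """Squared error cost for the linear hypothesis theta_0 + theta_1 * x."""
    m = len(xs)
    return sum((theta_0 + theta_1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Toy data lying exactly on y = x, so the minimum of J is at (0.0, 1.0).
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

# Two slopes equally far from the best slope, on opposite sides of it.
j_a = cost(0.0, 0.5, xs, ys)
j_b = cost(0.0, 1.5, xs, ys)
j_min = cost(0.0, 1.0, xs, ys)

# Equal cost means both points belong to the same contour line J = c.
same_contour = abs(j_a - j_b) < 1e-12
```

Here j_a and j_b are equal while j_min is smaller, which matches the picture of closed contour lines surrounding the minimum.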