Somehow even the Wikipedia article https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff

does not explain clearly what it is, due to unclear math notation.

The video gives a more precise meaning of bias and variance.

# Expected error of an algorithm

The goal of the algorithm is to reduce the total error when we make predictions (generalization).

Thus we need to calculate the expected error of the algorithm over the prediction data set.

In math, the expected error of an algorithm, where (x, y) is drawn from the samples used for testing/prediction and D is the training set, decomposes as:

$$\mathbb{E}_{D,\varepsilon}\Big[\big(y - \hat{f}(x; D)\big)^2\Big] = \Big(\mathrm{Bias}_D\big[\hat{f}(x; D)\big]\Big)^2 + \mathrm{Var}_D\big[\hat{f}(x; D)\big] + \sigma^2$$

where

$$\mathrm{Bias}_D\big[\hat{f}(x; D)\big] = \mathbb{E}_D\big[\hat{f}(x; D)\big] - f(x), \qquad \mathrm{Var}_D\big[\hat{f}(x; D)\big] = \mathbb{E}_D\Big[\big(\mathbb{E}_D[\hat{f}(x; D)] - \hat{f}(x; D)\big)^2\Big]$$

Here E[f^(x, D)] means the expected fitted function/classifier: for one training set D we can train the model and get one f^(x); for another training set D we get another f^(x); when we average over all those training sets D, we get E[f^(x, D)].
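A minimal sketch of this averaging idea in Python (the simulated sine-curve data, the noise level, and scikit-learn's DecisionTreeRegressor are all illustrative assumptions, not taken from the video or the Wikipedia article):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def true_f(x):
    return np.sin(x)                       # the "real" regression function f(x)

def draw_training_set(n=50):
    x = rng.uniform(0, 2 * np.pi, n)
    y = true_f(x) + rng.normal(0, 0.3, n)  # noisy labels
    return x.reshape(-1, 1), y

x_test = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)

# Fit the same algorithm on many different training sets D,
# keeping each fitted f^(x, D) evaluated on the prediction points.
predictions = []
for _ in range(100):
    X, y = draw_training_set()
    model = DecisionTreeRegressor(max_depth=3).fit(X, y)
    predictions.append(model.predict(x_test))
predictions = np.array(predictions)        # shape (100, 200): one row per training set D

# Averaging over all the training sets D approximates E[f^(x, D)].
expected_fit = predictions.mean(axis=0)
```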

## Variance

The **variance** is the difference between the particular classifier/regression function fitted from one particular training set D and the expected regressor/classifier, i.e. how much f^(x, D) scatters around E[f^(x, D)].

Low variance means we get almost the same regression function even when the training set changes, for example a simple linear model.
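Continuing the sketch above (reusing `predictions` and `expected_fit`), the variance can be estimated as the average squared gap between each individual fit and the averaged fit:

```python
# Var[f^(x, D)]: how far each fitted f^(x, D) scatters around E[f^(x, D)],
# estimated per prediction point, then averaged over the test grid.
variance = ((predictions - expected_fit) ** 2).mean(axis=0)
print("average variance:", variance.mean())
```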

## Bias

The **bias** is the difference between the expected regressor/classifier and the real regression function/classifier. The expected one is the best this algorithm can do, so the bias captures the limitation of the algorithm.

Thus the bias is a property of the model/algorithm, not of any single training set.

High bias means something like using a linear function to fit a curve: no matter what we do, some bias remains.
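Continuing the same sketch, the bias compares the averaged fit against the real function (here the sine curve we simulated from):

```python
# Bias[f^(x, D)]: gap between the averaged fit E[f^(x, D)] and the real f(x).
bias = expected_fit - true_f(x_test.ravel())
print("average squared bias:", (bias ** 2).mean())
```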

## The irreducible error

Since all three terms are non-negative, the irreducible error forms a lower bound on the expected error on unseen samples. It is the noise variance σ² of the labels themselves, which no algorithm can remove.
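Continuing the sketch, the irreducible error is just the label noise variance we simulated with (0.3² here), and the three terms should roughly add up to the measured test error:

```python
sigma2 = 0.3 ** 2                          # irreducible noise variance in the simulation
# Measure the test MSE with fresh noisy labels for each fitted model,
# so the average runs over both the training sets D and the label noise.
mse = np.mean([
    (p - (true_f(x_test.ravel()) + rng.normal(0, 0.3, len(x_test)))) ** 2
    for p in predictions
])
print("measured test MSE        :", round(mse, 3))
print("bias^2 + variance + noise:",
      round((bias ** 2).mean() + variance.mean() + sigma2, 3))
```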

## Key to understanding

The key to understanding it is to think of using multiple (more than two, for better intuition) different training data sets, and to ask what the bias and variance terms contribute to the error over the prediction set.

For bias: if it is low, it means the regressor/classifier averaged over all training sets stays close to the real one.

For variance: if it is high, it means different training sets give quite different regressors/classifiers.

# General graph:

For each algorithm, we can calculate the total error, bias, and variance (over all the training sets D and the prediction set (x, y)), and thus we get the graph above.
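A minimal sketch of how such a graph could be produced (continuing the simulated setup above, with tree depth standing in for model complexity; matplotlib is just one plotting choice):

```python
import matplotlib.pyplot as plt

depths = range(1, 12)
bias2_curve, var_curve, total_curve = [], [], []

for depth in depths:
    # Refit the algorithm of this complexity on many training sets D.
    preds = np.array([
        DecisionTreeRegressor(max_depth=depth)
        .fit(*draw_training_set())
        .predict(x_test)
        for _ in range(100)
    ])
    avg_fit = preds.mean(axis=0)
    bias2 = ((avg_fit - true_f(x_test.ravel())) ** 2).mean()
    var = ((preds - avg_fit) ** 2).mean()
    bias2_curve.append(bias2)
    var_curve.append(var)
    total_curve.append(bias2 + var + 0.3 ** 2)   # add the irreducible noise term

plt.plot(depths, bias2_curve, label="bias^2")
plt.plot(depths, var_curve, label="variance")
plt.plot(depths, total_curve, label="total error")
plt.xlabel("model complexity (tree depth)")
plt.ylabel("error")
plt.legend()
plt.show()
```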

More details at:

https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff
