Model Assessment and Selection

Understanding the Bias-Variance Tradeoff

For the bias-variance tradeoff of k-NN, see ref 5 (http://lcsl.mit.edu/courses/mlcc/classes/Lecture2_MemoryBasedLearning.pdf).
For a schematic of the behavior of bias and variance, see ref 6 (http://www.math.ku.dk/~richard/download/courses/stat_learn_2007/lecture12032007.pdf).
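For concreteness, the form the tradeoff takes for a k-nearest-neighbor regression fit at a point $x_0$ (the standard decomposition from ESL Chapter 7, stated under the assumption $Y = f(X) + \varepsilon$ with $\text{Var}(\varepsilon) = \sigma_\varepsilon^2$) is

$$
\text{Err}(x_0) = \sigma_\varepsilon^2 + \Bigl[f(x_0) - \frac{1}{k}\sum_{\ell=1}^{k} f(x_{(\ell)})\Bigr]^2 + \frac{\sigma_\varepsilon^2}{k},
$$

where $x_{(1)}, \dots, x_{(k)}$ are the k nearest neighbors of $x_0$: increasing k shrinks the variance term $\sigma_\varepsilon^2/k$ but typically inflates the squared bias.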

The Wrong and Right Way to Do Cross-Validation

A typical wrong strategy for analysis might be as follows:

  1. Screen the predictors: find a subset of "good" predictors that show fairly strong (univariate) correlations with the class labels
  2. Using just this subset of predictors, build a multivariate classifier
  3. Use cross-validation to estimate the unknown tuning parameters and to estimate the prediction error of the final model.

Leaving samples out after the variables have been selected does not correctly mimic the application of the classifier to a completely independent test set, since these predictors "have already seen" the left-out samples; the sketch below illustrates the contrast.
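A minimal sketch of the two approaches in Python with scikit-learn (the SelectKBest screening, the logistic-regression classifier, and the pure-noise data are illustrative choices, not part of the original notes). On data whose labels are independent of the predictors, screening outside cross-validation reports optimistically low error, while screening inside each fold reports error near chance, as it should.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
# Pure-noise data: labels are independent of the predictors,
# so the true error rate of any classifier is 50%.
X = rng.standard_normal((50, 5000))
y = rng.integers(0, 2, 50)

# WRONG: screen predictors on the full data, then cross-validate.
X_sel = SelectKBest(f_classif, k=100).fit_transform(X, y)
wrong_cv = cross_val_score(LogisticRegression(max_iter=1000), X_sel, y, cv=5).mean()

# RIGHT: put the screening step inside the CV loop, so the held-out
# fold never influences variable selection.
pipe = Pipeline([("screen", SelectKBest(f_classif, k=100)),
                 ("clf", LogisticRegression(max_iter=1000))])
right_cv = cross_val_score(pipe, X, y, cv=5).mean()

print(f"wrong CV accuracy ~ {wrong_cv:.2f}  (optimistic)")
print(f"right CV accuracy ~ {right_cv:.2f}  (near chance)")
```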

Bootstrap Methods

As with cross-validation, the bootstrap seeks to estimate the conditional error $\text{Err}_\tau$, the extra-sample prediction error, but typically it estimates well only the expected prediction error $\text{Err}$.
Denote the training set by $Z = (z_1, z_2, \dots, z_N)$ and let $Z^{*b}$, $b = 1, \dots, B$, be bootstrap datasets drawn from $Z$ with replacement. From the bootstrap sampling we can estimate any aspect of the distribution of a quantity $S(Z)$ computed from the data, for example, its variance

$$
\widehat{\text{Var}}[S(Z)] = \frac{1}{B-1}\sum_{b=1}^{B}\bigl(S(Z^{*b}) - \bar{S}^*\bigr)^2
$$

where $\bar{S}^* = \sum_b S(Z^{*b})/B$.
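A short sketch of this estimator in Python (the choice of statistic, the median, and the sample size are arbitrary illustrations, not from the original notes):

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_variance(z, statistic, B=1000):
    """Bootstrap estimate of Var[S(Z)]: draw B resamples with replacement,
    recompute the statistic on each, and take the sample variance of the
    B replicates (divisor B - 1, matching the formula above)."""
    n = len(z)
    s_star = np.array([statistic(z[rng.integers(0, n, n)]) for _ in range(B)])
    return np.sum((s_star - s_star.mean()) ** 2) / (B - 1)

# Illustration: variance of the sample median of 100 standard-normal draws.
z = rng.standard_normal(100)
print(bootstrap_variance(z, np.median))
```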

The leave-one-out bootstrap estimate of prediction error, which scores each observation only with the bootstrap fits that did not see it, is

$$
\widehat{\text{Err}}^{(1)} = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{|C^{-i}|}\sum_{b \in C^{-i}} L\bigl(y_i, \hat{f}^{*b}(x_i)\bigr).
$$

Here $C^{-i}$ is the set of indices of the bootstrap samples $b$ that do not contain observation $i$, and $|C^{-i}|$ is the number of such samples.
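A sketch of $\widehat{\text{Err}}^{(1)}$ in Python with 0-1 loss (the k-NN classifier and the synthetic data below are illustrative assumptions, not from the original notes):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def loo_bootstrap_error(X, y, make_model, B=200):
    """Leave-one-out bootstrap estimate of prediction error (0-1 loss).

    Each observation i is scored only by models fit on bootstrap samples
    that do not contain i (the set C^{-i} in the text)."""
    n = len(y)
    losses = [[] for _ in range(n)]
    for _ in range(B):
        idx = rng.integers(0, n, n)                   # bootstrap sample Z*b
        model = make_model().fit(X[idx], y[idx])
        held_out = np.setdiff1d(np.arange(n), idx)    # observations not in Z*b
        if held_out.size:
            pred = model.predict(X[held_out])
            for i, p in zip(held_out, pred):
                losses[i].append(p != y[i])
    # Average per observation, skipping any i that every bootstrap sample contained.
    return float(np.mean([np.mean(l) for l in losses if l]))

X = rng.standard_normal((80, 5))
y = (X[:, 0] + 0.5 * rng.standard_normal(80) > 0).astype(int)
print(loo_bootstrap_error(X, y, lambda: KNeighborsClassifier(3)))
```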

References:

  1. Understanding the Bias-Variance Tradeoff (http://scott.fortmann-roe.com/docs/BiasVariance.html)
  2. Bias-Variance Decomposition (http://www.cedar.buffalo.edu/~srihari/CSE574/Chap3/Bias-Variance.pdf)
  3. Bias/Variance Tradeoff (http://www.cs.cornell.edu/courses/cs578/2005fa/CS578.bagging.boosting.lecture.pdf)
  4. Ensemble Methods (http://www.inf.ed.ac.uk/teaching/courses/dme/2012/slides/ensemble.pdf)
  5. Machine Learning Crash Course (http://lcsl.mit.edu/courses/mlcc/classes/Lecture2_MemoryBasedLearning.pdf)
  6. AIC vs. BIC (http://methodology.psu.edu/eresources/ask/sp07)
  7. Is there any reason to prefer the AIC or BIC over the other? (http://stats.stackexchange.com/questions/577/is-there-any-reason-to-prefer-the-aic-or-bic-over-the-other)
  8. Elements of Statistical Learning: Schedule & Associated Material (http://www.csc.kth.se/utbildning/kth/kurser/DD3364/Schedule.php)
  9. Chapter 7: Model Assessment and Selection (http://www.csc.kth.se/utbildning/kth/kurser/DD3364/Lectures/Lecture6.pdf)

Other Links:
Practical machine learning: methods and algorithmics (http://www.cbcb.umd.edu/~hcorrada/PracticalML/)
Slides and R code
Bias--Variance Theory (http://web.engr.oregonstate.edu/~tgd/classes/534/slides/part9.pdf)


