Model Assessment and Selection
Understanding the Bias-Variance Tradeoff
For the bias-variance tradeoff of k-NN, see ref 5.
For a schematic of the behavior of bias and variance, see ref 6 (http://www.math.ku.dk/~richard/download/courses/stat_learn_2007/lecture12032007.pdf).
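As a reminder of what that tradeoff looks like for k-NN, a standard decomposition is sketched below. It assumes a model $Y = f(X) + \varepsilon$ with $\text{Var}(\varepsilon) = \sigma_\varepsilon^2$ and treats the training inputs as fixed. The symbols $f$, $\sigma_\varepsilon^2$, $x_0$ and $x_{(\ell)}$ (the $\ell$-th nearest neighbor of $x_0$) are not defined in these notes and are introduced here only for illustration.

$$
\text{Err}(x_0) = \sigma_\varepsilon^2 + \left[f(x_0) - \frac{1}{k}\sum_{\ell=1}^{k} f(x_{(\ell)})\right]^2 + \frac{\sigma_\varepsilon^2}{k}
$$

The three terms are the irreducible error, the squared bias and the variance. The squared bias typically grows with $k$, because more distant neighbors are averaged in, while the variance term shrinks as $1/k$.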
The Wrong and Right Way to Do Cross-validation
A typical wrong strategy for analysis might be as follows:
- Screen the predictors: find a subset of "good" predictors that show fairly strong (univariate) correlations with the class labels
- Using just this subset of predictors, build a multivariate classifier
- Use cross-validation to estimate the unknown tuning parameters and to estimate the prediction error of the final model.
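Leaving samples out after the variables have been selected does not correctly mimic the application of the classifier to a completely independent test set, since these predictors "have already seen" the left-out samples. The right way is to carry out the screening inside the cross-validation loop: for each fold, select the predictors and build the classifier using only the samples outside that fold, then predict the held-out samples.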
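The effect can be reproduced with a small simulation. The sketch below is illustrative only. It generates pure-noise predictors with labels that are unrelated to them, so any classifier has a true error rate of 50%. It then compares screening once on all samples before cross-validation with screening again inside each fold. The nearest-centroid classifier, the sizes `N`, `P`, `KEEP`, `K` and all function names are assumptions chosen for this example, not something prescribed by the notes.

```python
import random

def abs_corr(col, labels):
    """Absolute Pearson correlation between one predictor and the labels."""
    n = len(col)
    mx, my = sum(col) / n, sum(labels) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(col, labels))
    sxx = sum((a - mx) ** 2 for a in col)
    syy = sum((b - my) ** 2 for b in labels)
    return abs(sxy) / (sxx * syy) ** 0.5

def screen(X, y, rows, keep):
    """Indices of the `keep` predictors most correlated with y, using only `rows`."""
    labels = [y[i] for i in rows]
    scores = [abs_corr([X[i][j] for i in rows], labels) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:keep]

def cv_error(X, y, folds, keep, screen_inside):
    """K-fold CV error of a nearest-centroid classifier on screened predictors."""
    rows = range(len(y))
    # Wrong way: screen once, on all samples, before any sample is held out.
    global_cols = None if screen_inside else screen(X, y, list(rows), keep)
    mistakes = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in rows if i not in held_out]
        # Right way: redo the screening on the training part of each fold.
        cols = screen(X, y, train, keep) if screen_inside else global_cols
        centroids = {}
        for label in (0, 1):
            members = [i for i in train if y[i] == label]
            centroids[label] = [sum(X[i][j] for i in members) / len(members)
                                for j in cols]
        for i in test:
            dist = {label: sum((X[i][j] - c) ** 2 for j, c in zip(cols, cen))
                    for label, cen in centroids.items()}
            mistakes += min(dist, key=dist.get) != y[i]
    return mistakes / len(y)

rng = random.Random(0)
N, P, KEEP, K = 50, 1000, 20, 5          # hypothetical sizes for the demo
X = [[rng.gauss(0.0, 1.0) for _ in range(P)] for _ in range(N)]  # pure noise
y = [i % 2 for i in range(N)]            # labels unrelated to X
order = list(range(N))
rng.shuffle(order)
folds = [order[k::K] for k in range(K)]

err_wrong = cv_error(X, y, folds, KEEP, screen_inside=False)
err_right = cv_error(X, y, folds, KEEP, screen_inside=True)
print(f"screen before CV: {err_wrong:.2f}   screen inside CV: {err_right:.2f}")
```

Because the labels carry no signal, an honest estimate should land near 0.5. The estimate from screening before cross-validation should come out well below that, which is exactly the optimism described above.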
Bootstrap Methods
As with cross-validation, the bootstrap seeks to estimate the conditional error $\text{Err}_\tau$ (the extra-sample prediction error given the training set $\tau$), but typically estimates well only the expected prediction error $\text{Err}$.
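From the bootstrap sampling we can estimate any aspect of the distribution of $S(Z)$, where $S(Z)$ is any quantity computed from the data $Z$, for example its variance:
$$
\widehat{\text{Var}}[S(Z)] = \frac{1}{B-1}\sum_{b=1}^{B}\left(S(Z^{*b}) - \bar{S}^*\right)^2,
$$
where $\bar{S}^* = \sum_b S(Z^{*b})/B$ and $Z^{*b}$, $b = 1, \dots, B$, are the bootstrap datasets drawn with replacement from $Z$.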
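A minimal sketch of this variance estimate follows, using only the Python standard library. The function name `bootstrap_variance`, the default of 1000 replicates and the toy Gaussian data are assumptions made for illustration. The statistic $S$ is passed in as an ordinary function, so the same code works for the mean, the median or any other quantity computed from a sample.

```python
import random
from statistics import median, variance

def bootstrap_variance(data, statistic, n_boot=1000, seed=0):
    """Bootstrap estimate of Var[S(Z)] for S = `statistic`, Z = `data`."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Z*b: draw n observations from the data with replacement.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(sample))
    s_bar = sum(replicates) / n_boot                  # S-bar-star
    return sum((s - s_bar) ** 2 for s in replicates) / (n_boot - 1)

# Toy data (assumed for the demo): 200 standard-normal draws.
rng = random.Random(1)
z = [rng.gauss(0.0, 1.0) for _ in range(200)]

var_mean = bootstrap_variance(z, lambda s: sum(s) / len(s))
var_median = bootstrap_variance(z, median)
plug_in = variance(z) / len(z)   # textbook variance of the sample mean

print(f"bootstrap Var[mean]:   {var_mean:.5f}  (s^2/n = {plug_in:.5f})")
print(f"bootstrap Var[median]: {var_median:.5f}")
```

For the sample mean, the bootstrap value should be close to the familiar $s^2/n$, which gives a quick sanity check. The median has no equally simple formula, which is where the bootstrap is more useful.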
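The leave-one-out bootstrap estimate of prediction error is
$$
\widehat{\text{Err}}^{(1)} = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{|C^{-i}|}\sum_{b\in C^{-i}} L(y_i, \hat{f}^{*b}(x_i)).
$$
Here $C^{-i}$ is the set of indices of the bootstrap samples $b$ that do not contain observation $i$, and $|C^{-i}|$ is the number of such samples.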
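The leave-one-out bootstrap estimate can be sketched in a few lines. Everything concrete here is an assumption for illustration: the function names, the 1-nearest-neighbor learner on one-dimensional inputs, the 0-1 loss and the toy two-class data. Observations that happen to appear in every bootstrap sample have an empty $C^{-i}$, and this sketch simply leaves their terms out of the average.

```python
import random

def loo_bootstrap_error(xs, ys, fit, loss, n_boot=200, seed=0):
    """Average loss at each point over bootstrap fits that did not see it."""
    rng = random.Random(seed)
    n = len(xs)
    losses = [[] for _ in range(n)]   # losses[i] has one entry per b in C^{-i}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]          # bootstrap sample b
        predict = fit([xs[i] for i in idx], [ys[i] for i in idx])
        in_sample = set(idx)
        for i in range(n):
            if i not in in_sample:                          # b belongs to C^{-i}
                losses[i].append(loss(ys[i], predict(xs[i])))
    # Skip observations with empty C^{-i} (a choice made in this sketch).
    per_obs = [sum(l) / len(l) for l in losses if l]
    return sum(per_obs) / len(per_obs)

def fit_1nn(train_x, train_y):
    """1-nearest-neighbor classifier on scalar inputs (hypothetical learner)."""
    def predict(x):
        j = min(range(len(train_x)), key=lambda k: abs(train_x[k] - x))
        return train_y[j]
    return predict

def zero_one(y, y_hat):
    return float(y != y_hat)

# Toy data (assumed): two overlapping Gaussian classes, 30 points each.
rng = random.Random(2)
xs = [rng.gauss(0.0, 1.0) for _ in range(30)] + [rng.gauss(2.0, 1.0) for _ in range(30)]
ys = [0] * 30 + [1] * 30

full_fit = fit_1nn(xs, ys)
train_err = sum(zero_one(y, full_fit(x)) for x, y in zip(xs, ys)) / len(xs)
err1 = loo_bootstrap_error(xs, ys, fit_1nn, zero_one)
print(f"training error: {train_err:.2f}   leave-one-out bootstrap: {err1:.2f}")
```

The 1-NN training error is zero because every point is its own nearest neighbor. The leave-one-out bootstrap estimate avoids that overlap between training and evaluation points, which is the reason for restricting the inner sum to $C^{-i}$.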
References:
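1. Understanding the Bias-Variance Tradeoff (http://scott.fortmann-roe.com/docs/BiasVariance.html)
2. Bias-Variance Decomposition (http://www.cedar.buffalo.edu/~srihari/CSE574/Chap3/Bias-Variance.pdf)
3. Bias/Variance Tradeoff (http://www.cs.cornell.edu/courses/cs578/2005fa/CS578.bagging.boosting.lecture.pdf)
4. Ensemble Methods (http://www.inf.ed.ac.uk/teaching/courses/dme/2012/slides/ensemble.pdf)
5. Machine Learning Crash Course (http://lcsl.mit.edu/courses/mlcc/classes/Lecture2_MemoryBasedLearning.pdf)
6. AIC vs. BIC (http://methodology.psu.edu/eresources/ask/sp07)
7. Is there any reason to prefer the AIC or BIC over the other? (http://stats.stackexchange.com/questions/577/is-there-any-reason-to-prefer-the-aic-or-bic-over-the-other)
8. Elements of Statistical Learning: Schedule & Associated Material (http://www.csc.kth.se/utbildning/kth/kurser/DD3364/Schedule.php)
9. Chapter 7: Model Assessment and Selection (http://www.csc.kth.se/utbildning/kth/kurser/DD3364/Lectures/Lecture6.pdf)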
Other Links:
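- Practical machine learning: methods and algorithmics, slides and R code (http://www.cbcb.umd.edu/~hcorrada/PracticalML/)
- Bias-Variance Theory (http://web.engr.oregonstate.edu/~tgd/classes/534/slides/part9.pdf)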