In k-fold cross-validation, the original sample is randomly partitioned into **k equal-sized sub-samples**. Of the k sub-samples, **a single sub-sample is retained as the validation data for testing** the model, and the remaining **k − 1 sub-samples are used as training data**. The **cross-validation process is then repeated k times** (the folds), with each of the k sub-samples used exactly once as the validation data. The k results from the folds can then be **averaged** (or otherwise combined) **to produce a single estimate**.

The **advantage** of this method over repeated random sub-sampling is that all observations are used for both training and validation, and each observation is used for validation exactly once.

The **disadvantage** of this method is that the training algorithm has to be **re-run from scratch k times**, which means it takes **k times as much computation to make an evaluation**.

The **error** of the classifier is the **average testing error** across the k testing parts.
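The procedure above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the toy one-dimensional dataset, the 1-nearest-neighbour "classifier", and the choice k = 5 are all assumptions made for the example.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly partition indices 0..n-1 into k (nearly) equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nearest_neighbour_predict(train, x):
    """Predict the label of the closest training point (a toy 1-NN classifier)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def cross_validate(data, k=5):
    """Average testing error across the k folds."""
    folds = k_fold_indices(len(data), k)
    errors = []
    for i in range(k):
        # Fold i is held out as validation data; the other k-1 folds train.
        test = [data[j] for j in folds[i]]
        train = [data[j] for f in folds[:i] + folds[i + 1:] for j in f]
        wrong = sum(nearest_neighbour_predict(train, x) != y for x, y in test)
        errors.append(wrong / len(test))
    # Combine the k fold results into a single estimate.
    return sum(errors) / k

# Toy 1-D dataset: points below 5 are class 0, the rest class 1.
data = [(x, 0) for x in range(5)] + [(x, 1) for x in range(5, 10)]
print(cross_validate(data, k=5))
```

Note that each index lands in exactly one fold, so every observation is used for validation exactly once, and the model is retrained from scratch on each of the k iterations, which is precisely the computational cost discussed above.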