> Validation sets are merely one technique among many to gauge how well your predictions are doing
In Andrew Ng's "Machine Learning" offering on Coursera he talks about having three sets of data:
1. Training data. He uses this for fitting most model parameters.
2. A second set for "more general" analyses -- judging the effects of additional data, regularisation parameters, neural-network topology etc. Performance on this data is used to decide which model to use and how to use it.
3. A third set to estimate how good the choice of model is.
The theory is that the parameters in #1 are fitted to the training data, and the model choice is "fitted" to the data in #2. Even though we think (hope?) that the inferences made in those two steps will generalise reasonably well, we should still expect measures of fit from those analyses to be optimistic. We need a set that has not been used for calibration to reliably estimate how good our model will be on data in the field.
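The three-way split described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a prescribed recipe: the 60/20/20 proportions and the synthetic data are assumptions for the example, and the comments map each slice to the roles in the list above.

```python
import numpy as np

# Hypothetical dataset: 1000 examples, 5 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = rng.integers(0, 2, size=1000)

# Shuffle once, then carve the indices into 60% / 20% / 20%.
idx = rng.permutation(len(X))
train_idx, val_idx, test_idx = np.split(idx, [600, 800])

X_train, y_train = X[train_idx], y[train_idx]  # 1: fit model parameters here
X_val, y_val = X[val_idx], y[val_idx]          # 2: choose model / hyperparameters here
X_test, y_test = X[test_idx], y[test_idx]      # 3: touch once, for the final estimate

print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```

The key property is that the three index sets are disjoint, so the test-set score is computed on data that influenced neither the fitted parameters nor the model choice.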