Training and Test sets

A PCA or PLS model is calculated from a number of observations that belong to the imported DataSet. The observations used to create the model make up the training set. Because the model has already seen these observations, validating it with training set observations often leads to over-optimistic results. Therefore, a test set is usually constructed from observations that were omitted from the model calculations. The test set observations are used to test how well the model predicts unknown observations, i.e. those not used for building the model. An observation can belong to both sets, but in most cases it should belong to only one of them.
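The idea can be illustrated outside the application with a minimal sketch. It assumes NumPy and synthetic random data in place of an imported DataSet. The function names (split_observations, fit_pca, residual_ss) are hypothetical and are not part of the software. The PCA model is fitted on the training set only and then applied to both sets, so the two residuals can be compared.

```python
import numpy as np

def split_observations(n_obs, test_fraction=0.25, seed=0):
    """Return disjoint, sorted index arrays (train_idx, test_idx)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_obs)
    n_test = int(round(n_obs * test_fraction))
    return np.sort(order[n_test:]), np.sort(order[:n_test])

def fit_pca(X_train, n_components):
    """Fit PCA on the training set only, via SVD of the centered data."""
    mean = X_train.mean(axis=0)
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:n_components].T  # loadings: (n_variables, n_components)

def residual_ss(X, mean, loadings):
    """Per-observation sum of squared residuals after projection onto the model."""
    centered = X - mean
    scores = centered @ loadings
    residuals = centered - scores @ loadings.T
    return (residuals ** 2).sum(axis=1)

# Assumption: synthetic stand-in for a DataSet with 40 observations, 6 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 6))

train_idx, test_idx = split_observations(len(X))
mean, loadings = fit_pca(X[train_idx], n_components=2)

train_error = residual_ss(X[train_idx], mean, loadings).mean()
test_error = residual_ss(X[test_idx], mean, loadings).mean()
print(f"mean residual, training set: {train_error:.3f}")
print(f"mean residual, test set:     {test_error:.3f}")
```

The training set residual is typically the smaller of the two, because the components were chosen to fit exactly those observations. The test set residual is usually the more realistic estimate of how the model behaves on new data.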

When viewing the observations of a DataSet, you can choose which observations should belong to the Training set and which should belong to the Test set.

Predictions can be made for both PCA and PLS models. The predictions appear in the Data Tree under the corresponding model.