Introduction to Multivariate Data Analysis and Data Sets
A typical experimental study gathers information about a number of samples or observations. The gathered information comprises different properties or measurements, also known as variables, which can take different values from sample to sample. For example, one could gather information about the age and shoe size of different people. Each person would then represent an observation, and age and shoe size would be two different variables. When observations and variables are assembled together, a data set is obtained. A multivariate data set is characterized by having at least two variables or dimensions.
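As an illustration of these terms, the following minimal Python sketch arranges the age and shoe size example as a data set with one row per observation and one column per variable. The values and the helper name `is_multivariate` are hypothetical and chosen only for illustration; they do not describe how Evince stores data.

```python
# Hypothetical example values: each row is one observation (a person),
# each column is one variable (age in years, shoe size).
variables = ["age", "shoe_size"]
observations = [
    [8, 33],
    [12, 37],
    [25, 42],
    [40, 43],
    [63, 41],
]

n_observations = len(observations)
n_variables = len(variables)


def is_multivariate(n_vars):
    """A data set is multivariate when it has at least two variables."""
    return n_vars >= 2


def column(data, names, name):
    """Return the values of one variable across all observations."""
    j = names.index(name)
    return [row[j] for row in data]


print(n_observations, "observations,", n_variables, "variables")
print("multivariate:", is_multivariate(n_variables))
print("ages:", column(observations, variables, "age"))
```

Reading along a row gives everything known about one observation, while reading down a column gives one variable across all observations.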

When data sets are large, it can be difficult to get an overview by looking at one or a few variables at a time. Another problem arises when several variables carry the same information. Multivariate data analysis methods address both problems: they extract the information found in all variables and summarize it in a few new variables, each holding unique information about the data. Evince is software that applies such methods to a data set, creating multivariate models of the data. A multivariate model is a representation of the studied data set and consists of a number of significant variables, or components. The model components can be used in various kinds of plots, which represent the original multidimensional data in just a few dimensions. The Evince workflow is described by the following scheme:

Two kinds of multivariate models are available in Evince: PCA (Principal Component Analysis) and PLS (Partial Least Squares). PCA analyzes a single matrix X, while PLS uses two matrices, X and Y, to construct a regression model between them.
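The difference between the two model types can be sketched in a few lines of plain Python. This is a simplified illustration, not the algorithm Evince uses: it extracts only the first component, assumes mean-centered data without scaling, and treats Y as a single response variable. The function names and the example values are hypothetical.

```python
import math


def column_center(X):
    """Subtract the column mean from every variable."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def scores(X, w):
    """Project each observation (row of X) onto the direction w."""
    return [sum(x * wj for x, wj in zip(row, w)) for row in X]


def xt_times(X, t):
    """Compute X^T t, giving one value per variable."""
    return [sum(row[j] * ti for row, ti in zip(X, t)) for j in range(len(X[0]))]


def first_pca_component(X, iterations=200):
    """PCA uses X alone: find the direction of largest variance.

    Uses power iteration, which assumes the starting vector is not
    orthogonal to the dominant direction.
    """
    Xc = column_center(X)
    w = normalize([1.0] * len(Xc[0]))
    for _ in range(iterations):
        w = normalize(xt_times(Xc, scores(Xc, w)))
    return w, scores(Xc, w)


def first_pls_component(X, y):
    """PLS uses X and y: find the X direction with largest covariance with y."""
    Xc = column_center(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    w = normalize(xt_times(Xc, yc))
    t = scores(Xc, w)
    # Regression coefficient of the centered response on the score vector.
    q = sum(a * b for a, b in zip(yc, t)) / sum(a * a for a in t)
    return w, t, q


# Hypothetical data: two strongly correlated X variables and one response.
X = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.0]]
y = [1.1, 2.0, 2.9, 4.2, 5.0]

pca_w, pca_t = first_pca_component(X)
pls_w, pls_t, pls_q = first_pls_component(X, y)

# Share of the total variation in X captured by the first PCA component.
total_ss = sum(v * v for row in column_center(X) for v in row)
explained = sum(t * t for t in pca_t) / total_ss

print("PCA direction:", pca_w, "explained share:", explained)
print("PLS direction:", pls_w, "regression coefficient:", pls_q)
```

Because the two X variables in this example share most of their information, a single component captures nearly all of the variation in X. The PLS component is chosen with the response in mind, so its score vector can be used directly to predict y.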
Multivariate analysis methods can be applied to data in many research fields and application areas, such as:
- Biotechnical Research. Analysis of proteomic and metabonomic data.
- Process and Forest Industries. Analysis of process variables and measured properties of process material.
- Pharmaceutical Industries. Analysis of spectroscopic measurements of tablets and chemical descriptors.
- Environmental Science Research. Analysis of environmental variables and climate.
- Remote Sensing. Analysis of hyperspectral images.
See the Multivariate data analysis sections for more specific information.