A note on correlation and covariance matrices

In the neural network literature, the matrix $\bC_{xx}$ in equation 3 is often called a correlation matrix. This can be confusing, since $\bC_{xx}$ does not contain the correlations between the variables in the statistical sense, but rather the expected values of their products. The correlation between $x_i$ and $x_j$ is defined as

\begin{displaymath}
\rho_{ij} =
\frac{E[(x_i - \bar{x}_i)(x_j - \bar{x}_j)]}
{\sqrt{E[(x_i - \bar{x}_i)^2]\,E[(x_j - \bar{x}_j)^2]}}
\end{displaymath} (18)

see for example [1]; that is, the covariance between $x_i$ and $x_j$ normalized by the geometric mean of the variances of $x_i$ and $x_j$ (where $\bar{x} = E[x]$). Hence, the correlation is bounded: $-1 \leq \rho_{ij} \leq 1$. In this tutorial, correlation matrices are denoted $\bR$.
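
As a concrete illustration of the distinction, here is a minimal numerical sketch in Python with NumPy (not part of the tutorial; the sample data, dimensions, and variable names are hypothetical). It estimates the second-moment matrix $\bC_{xx}$, the covariance matrix and the correlation matrix $\bR$ from the same sample, and shows that the entries of $\bR$ are bounded while $\bC_{xx}$ differs from the covariance whenever the mean is nonzero.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample: N observations of a 3-dimensional variable x,
# drawn with a nonzero mean so that C_xx and the covariance differ.
N = 10000
x = rng.normal(loc=2.0, scale=1.0, size=(N, 3))
x[:, 1] += 0.5 * x[:, 0]            # introduce some dependence

# Second-moment matrix C_xx = E[x x^T], estimated by a sample average.
C_xx = x.T @ x / N

# Covariance matrix: the same average applied to the centered variable.
xc = x - x.mean(axis=0)
C = xc.T @ xc / N

# Correlation matrix R: the covariance normalized, entry by entry, by
# the geometric mean of the variances, as in equation 18.
d = np.sqrt(np.diag(C))
R = C / np.outer(d, d)

print(np.abs(R).max())              # <= 1 (up to rounding): bounded
print(np.allclose(C_xx, C))         # False: E[x] != 0, so they differ
\end{verbatim}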

The diagonal terms of $\bC_{xx}$ are the second-order origin moments, $E[x_i^2]$, of $x_i$. The diagonal terms of a covariance matrix are the variances, or second-order central moments, $E[(x_i - \bar{x}_i)^2]$, of $x_i$.
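
The two diagonals are related by the identity $E[x_i^2] = E[(x_i - \bar{x}_i)^2] + \bar{x}_i^2$, i.e. the origin moment equals the central moment plus the squared mean. A short sketch checking this numerically (again NumPy on hypothetical data):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=3.0, scale=2.0, size=100000)  # hypothetical sample

origin_moment = np.mean(x ** 2)                  # E[x^2], diag of C_xx
central_moment = np.mean((x - x.mean()) ** 2)    # variance, diag of cov.

# The identity holds exactly for sample averages as well.
print(np.isclose(origin_moment, central_moment + x.mean() ** 2))  # True
\end{verbatim}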

The maximum likelihood estimator of $\rho$ is obtained by replacing the expectation operators in equation 18 with averages over the samples. This estimator is sometimes called the Pearson correlation coefficient, after K. Pearson [16].
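
A minimal sketch of this estimator (again assuming NumPy; the correlated data pair is made up for illustration). The $1/N$ factors from the sample averages cancel between numerator and denominator, so plain sums suffice:

\begin{verbatim}
import numpy as np

def pearson_r(x, y):
    """Sample estimate of rho: equation 18 with expectations
    replaced by averages over the samples (the 1/N factors cancel)."""
    xm, ym = x - x.mean(), y - y.mean()
    return np.sum(xm * ym) / np.sqrt(np.sum(xm ** 2) * np.sum(ym ** 2))

rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = 0.7 * x + rng.normal(size=500)    # hypothetical correlated pair

print(np.isclose(pearson_r(x, y), np.corrcoef(x, y)[0, 1]))   # True
\end{verbatim}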

