
Derivation of standardised principal components

Data

Let $\pmb{Y}$ be an $A \times L$ matrix of values, where $a = 1, \cdots, A$ indexes age group and $l = 1, \cdots, L$ indexes some combination of classifying variables, such as country crossed with time. The values are real numbers, including negative numbers, such as log-transformed rates or logit-transformed probabilities.
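As a concrete illustration, the sketch below simulates a small matrix of this form. The data, dimensions, and object names are purely hypothetical choices for the example, not part of the derivation.

```r
## Purely illustrative simulated data: log mortality rates for
## A = 10 age groups (rows) and L = 60 country-time combinations (columns)
set.seed(0)
A <- 10
L <- 60
age_effect <- seq(from = -6, to = -2, length.out = A)   # baseline log rates by age
col_effect <- rnorm(L, mean = 0, sd = 0.3)              # country-time level shifts
Y <- outer(age_effect, col_effect, "+") +
  matrix(rnorm(A * L, sd = 0.1), nrow = A)              # add some noise
dim(Y)  # 10 x 60
```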

Singular value decomposition

We perform a singular value decomposition on $\pmb{Y}$, and retain only the first $C < A$ components, to obtain
\begin{equation}
  \pmb{Y} \approx \pmb{U} \pmb{D} \pmb{V}^\top.
\end{equation}

$\pmb{U}$ is an $A \times C$ matrix whose columns are left singular vectors. $\pmb{D}$ is a $C \times C$ diagonal matrix holding the singular values. $\pmb{V}$ is an $L \times C$ matrix whose columns are right singular vectors.
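Continuing the toy example above, the truncated decomposition can be computed with base R's `svd()`. The choice $C = 3$ here is arbitrary.

```r
## Continuing the toy example: truncated SVD of Y
C <- 3
svd_Y <- svd(Y)
U <- svd_Y$u[, 1:C]             # A x C matrix of left singular vectors
D <- diag(svd_Y$d[1:C])         # C x C diagonal matrix of singular values
V <- svd_Y$v[, 1:C]             # L x C matrix of right singular vectors
max(abs(Y - U %*% D %*% t(V)))  # truncation error; small when C components suffice
```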

Standardising

Let $\pmb{m}_V$ be a vector, the $c$th element of which is the mean of the $c$th right singular vector, $\sum_{l=1}^L v_{lc} / L$. Similarly, let $\pmb{s}_V$ be a vector, the $c$th element of which is the standard deviation of the $c$th right singular vector, $\sqrt{\sum_{l=1}^L (v_{lc} - m_c)^2 / (L-1)}$. Then define
\begin{align}
  \pmb{M}_V & = \pmb{1} \pmb{m}_V^\top \\
  \pmb{S}_V & = \text{diag}(\pmb{s}_V),
\end{align}
where $\pmb{1}$ is an $L$-vector of ones. Let $\tilde{\pmb{V}}$ be a standardised version of $\pmb{V}$,
\begin{equation}
  \tilde{\pmb{V}} = (\pmb{V} - \pmb{M}_V) \pmb{S}_V^{-1}.
\end{equation}
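Continuing the toy example, the standardisation can be carried out with `scale()`, which centres each column of $\pmb{V}$ on its mean and divides by its standard deviation.

```r
## Continuing: standardise the columns of V
m_V <- colMeans(V)                              # means of the right singular vectors
s_V <- apply(V, 2, sd)                          # standard deviations (denominator L - 1)
V_tilde <- scale(V, center = m_V, scale = s_V)  # (V - M_V) S_V^{-1}
round(colMeans(V_tilde), 10)                    # all zero
round(apply(V_tilde, 2, sd), 10)                # all one
```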

We can now express $\pmb{Y}$ as
\begin{align}
  \pmb{Y} & \approx \pmb{U} \pmb{D} (\tilde{\pmb{V}} \pmb{S}_V + \pmb{M}_V)^\top \\
   & = \pmb{U} \pmb{D} \pmb{S}_V \tilde{\pmb{V}}^\top + \pmb{U} \pmb{D} \pmb{M}_V^\top \\
   & = \pmb{A} \tilde{\pmb{V}}^\top + \pmb{B},
\end{align}
where $\pmb{A} = \pmb{U} \pmb{D} \pmb{S}_V$ and $\pmb{B} = \pmb{U} \pmb{D} \pmb{M}_V^\top$.

Furthermore, we can express the matrix $\pmb{B}$ as
\begin{align}
  \pmb{B} & = \pmb{U} \pmb{D} \pmb{M}_V^\top \\
   & = \pmb{U} \pmb{D} \pmb{m}_V \pmb{1}^\top \\
   & = \pmb{b} \pmb{1}^\top,
\end{align}
where $\pmb{b} = \pmb{U} \pmb{D} \pmb{m}_V$.
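Continuing the toy example, $\pmb{A}$, $\pmb{b}$, and $\pmb{B}$ can be formed directly; the reconstruction error is identical to that of the truncated SVD. (The matrix $\pmb{A}$ is called `A_mat` below only to avoid clashing with the scalar `A` used earlier for the number of age groups.)

```r
## Continuing: form A = U D S_V and b = U D m_V
A_mat <- U %*% D %*% diag(s_V)            # the matrix A in the derivation
b <- U %*% D %*% m_V                      # the vector b, as an A x 1 matrix
B <- b %*% t(rep(1, L))                   # B = b 1^T
max(abs(Y - (A_mat %*% t(V_tilde) + B)))  # same error as the truncated SVD
```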

Result

Consider a randomly selected row $\tilde{\pmb{v}}_l$ from $\tilde{\pmb{V}}$. By construction, each column of $\tilde{\pmb{V}}$ has mean zero and unit variance, and, because the columns of $\pmb{V}$ are orthogonal, distinct columns of $\tilde{\pmb{V}}$ are approximately uncorrelated (exactly so when the column means $m_c$ are zero). We therefore obtain
\begin{equation}
  \text{E}[\tilde{\pmb{v}}_l] = \pmb{0}
\end{equation}
and, at least approximately,
\begin{equation}
  \text{Var}[\tilde{\pmb{v}}_l] = \pmb{I}.
\end{equation}
This implies that if we set
\begin{equation}
  \pmb{y}' = \pmb{A} \pmb{z} + \pmb{b},
\end{equation}
where
\begin{equation}
  \pmb{z} \sim \text{N}(\pmb{0}, \pmb{I}),
\end{equation}
then $\pmb{y}'$ will look like a randomly chosen column from $\pmb{Y}$.
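To illustrate, the final step of the toy example draws $\pmb{z}$ from a standard multivariate normal and generates a synthetic age profile, which can be compared with observed columns of `Y`.

```r
## Continuing: generate a synthetic column of Y
z <- rnorm(C)                        # z ~ N(0, I)
y_new <- as.vector(A_mat %*% z + b)  # y' = A z + b
round(cbind(y_new, Y[, 1:3]), 2)     # compare with a few observed columns
```

In this simulated example the synthetic profile declines with the column index of `age_effect` in the same way as the observed columns, differing only in its overall level and in noise.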
