
Transformation Method

1. Transformation Method

Suppose we have a \(\mathbb{Y}\)-valued r.v. \(Y \sim q\) which we can simulate and some other \(\mathbb{X}\)-valued r.v. \(X \sim \pi\) which we want to simulate. The transformation method is to find a function \(\varphi: \mathbb{Y} \to \mathbb{X}\) with the property that if we simulate \(Y \sim q\) and set \(X=\varphi(Y)\), then we get \(X \sim \pi\).

The inversion method is a special case of the transformation method in which \(Y\) is uniformly distributed on \((0,1)\) and \(\varphi\) is the generalized inverse of the CDF of \(\pi\).
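As an illustration of the inversion special case, here is a minimal sketch in Python with NumPy. It assumes the target is the exponential distribution with rate \(\lambda\), whose CDF \(F(x)=1-e^{-\lambda x}\) has inverse \(F^{-1}(u)=-\log(1-u)/\lambda\). The function name `sample_exponential` and its parameters are hypothetical, chosen only for this example.

```python
import numpy as np

def sample_exponential(rate, size, rng):
    """Draw Exp(rate) samples by inversion: X = F^{-1}(U) with U ~ U(0, 1)."""
    u = rng.random(size)           # uniform draws in [0, 1)
    # Inverse CDF of Exp(rate); 1 - u lies in (0, 1], so the log is finite.
    return -np.log1p(-u) / rate

rng = np.random.default_rng(0)
x = sample_exponential(rate=2.0, size=100_000, rng=rng)
print(x.mean())  # should be close to 1 / rate = 0.5
```

Here \(Y=U\) is the r.v. we can simulate and \(\varphi=F^{-1}\) is the transformation.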

2. Examples

Example 2.1 (Gamma Distribution)    Let \(Y_i\), \(i=1, \ldots, \alpha\) for \(\alpha \in \mathbb{N}\), be i.i.d. r.v. with \(Y_i \sim \mathcal{E}(1)\) and \[X=\frac{1}{\beta}\sum_{i=1}^\alpha Y_i\] with \(\beta \in \mathbb{R}^+\), then \(X \sim \mathcal{G}(\alpha, \beta)\), where \(\mathcal{G}(\alpha, \beta)\) is the Gamma distribution of density \(\pi(x) \propto x^{\alpha-1}e^{-\beta x}\).
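The construction in Example 2.1 can be sketched in Python with NumPy as follows. This is only an illustration for integer \(\alpha\), with \(\beta\) acting as a rate; the function name `sample_gamma_int` is hypothetical, and the \(\mathcal{E}(1)\) draws are taken from NumPy's generator as an assumed source of exponential variates.

```python
import numpy as np

def sample_gamma_int(alpha, beta, size, rng):
    """Draw G(alpha, beta) samples for integer alpha as (1/beta) * sum of alpha Exp(1) draws."""
    y = rng.exponential(scale=1.0, size=(size, alpha))  # each row holds Y_1, ..., Y_alpha
    return y.sum(axis=1) / beta

rng = np.random.default_rng(0)
x = sample_gamma_int(alpha=3, beta=2.0, size=100_000, rng=rng)
print(x.mean(), x.var())  # should be close to alpha/beta = 1.5 and alpha/beta**2 = 0.75
```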

Example 2.2 (Beta Distribution)    Let \(X_1 \sim \mathcal{G}(\alpha, 1)\) and \(X_2 \sim \mathcal{G}(\beta, 1)\) be independent, then \[\frac{X_1}{X_1+X_2} \sim \mathcal{B}(\alpha, \beta),\] where \(\mathcal{B}(\alpha, \beta)\) is the Beta distribution of density \(\pi(x) \propto x^{\alpha-1}(1-x)^{\beta-1}\).
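A minimal sketch of Example 2.2 in Python with NumPy is given below. It assumes the two Gamma variates are available from NumPy's generator (with unit scale, matching \(\mathcal{G}(\cdot, 1)\)); the function name `sample_beta` is hypothetical.

```python
import numpy as np

def sample_beta(alpha, beta, size, rng):
    """Draw B(alpha, beta) samples as X1 / (X1 + X2) with independent Gamma variates."""
    x1 = rng.gamma(shape=alpha, scale=1.0, size=size)  # X1 ~ G(alpha, 1)
    x2 = rng.gamma(shape=beta, scale=1.0, size=size)   # X2 ~ G(beta, 1), independent of X1
    return x1 / (x1 + x2)

rng = np.random.default_rng(0)
x = sample_beta(alpha=2.0, beta=5.0, size=100_000, rng=rng)
print(x.mean())  # should be close to alpha / (alpha + beta) = 2/7
```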

2.1. Change of Variables Formula

For continuous r.v.s, another useful tool is the change of variables formula for probability density functions (PDFs). Recall that if \(Y \sim q\) has a density, i.e. \(\text{d}q(y)=q(y)\text{d}y\), and \(\varphi\) is a differentiable bijection with differentiable inverse, then \(X=\varphi(Y)\) has density \[\pi(x)=q \circ \varphi^{-1}(x)\,|\text{det}(D\varphi^{-1}(x))|=q \circ \varphi^{-1}(x)\,|\text{det}(D\varphi(\varphi^{-1}(x)))|^{-1}.\]
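As a small one-dimensional check of the formula (an illustrative choice, not taken from the examples in this note), let \(Y \sim \mathcal{U}(0,1)\), so that \(q(y)=1\) on \((0,1)\), and take \(\varphi(y)=-\log y\). Then \(\varphi^{-1}(x)=e^{-x}\) and \(D\varphi^{-1}(x)=-e^{-x}\), so for \(x>0\) \[\pi(x)=q(e^{-x})\,|-e^{-x}|=e^{-x},\] i.e. \(X=-\log Y \sim \mathcal{E}(1)\).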

Example 2.3 (Multivariate Gaussian Distribution)    Let \(Z=(Z_1, \ldots, Z_d)\) be a vector of \(d\) independent standard normal r.v.s, and let \(L\) be a real invertible \(d \times d\) matrix satisfying \(LL^\top=\Sigma\). If \(X=LZ+\boldsymbol{\mu}\), then \(X \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma)\).

Proof. Since \(X=\varphi(Z)=LZ+\boldsymbol{\mu}\), we have \(Z=\varphi^{-1}(X)=L^{-1}(X-\boldsymbol{\mu})\). The density of \(Z\) is \[q(\mathbf{z})=\frac{1}{(2\pi)^{d/2}}\exp\left(-\frac{1}{2}\mathbf{z}^\top \mathbf{z}\right),\] and \((L^{-1})^\top L^{-1}=(LL^\top)^{-1}=\Sigma^{-1}\), so \[q(\varphi^{-1}(\mathbf{x}))=\frac{1}{(2\pi)^{d/2}}\exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right).\] Moreover, \(\displaystyle D\varphi=\frac{\partial \varphi}{\partial Z}=L\). Since \(\text{det}(L)=\text{det}(L^\top)\) and \(\text{det}(\Sigma)=\text{det}(LL^\top)=\text{det}(L)\text{det}(L^\top)=\text{det}(L)^2\), we get \(|\text{det}(D\varphi)|=|\text{det}(L)|=|\Sigma|^{1/2}\), where \(|\Sigma|\) denotes \(\text{det}(\Sigma)\). Thus \(|\text{det}(D\varphi^{-1})|=|\text{det}(D\varphi)|^{-1}=|\Sigma|^{-1/2}\). Therefore, \[\pi(\mathbf{x})=\frac{1}{\sqrt{(2\pi)^d|\Sigma|}}\exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right),\] i.e. \(X \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma)\).

\(\square\)

In practice, we use the Cholesky factorization \(\Sigma=LL^\top\), where \(L\) is a lower triangular matrix; it exists whenever \(\Sigma\) is symmetric positive definite.
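The Cholesky-based construction can be sketched in Python with NumPy as follows. This is a minimal illustration, assuming \(\Sigma\) is symmetric positive definite so that `np.linalg.cholesky` succeeds; the function name `sample_mvn` and the example values of \(\boldsymbol{\mu}\) and \(\Sigma\) are hypothetical.

```python
import numpy as np

def sample_mvn(mu, sigma, size, rng):
    """Draw N(mu, sigma) samples as X = L Z + mu, with sigma = L L^T (Cholesky)."""
    mu = np.asarray(mu, dtype=float)
    L = np.linalg.cholesky(np.asarray(sigma, dtype=float))  # lower triangular factor
    z = rng.standard_normal((size, mu.shape[0]))            # rows are i.i.d. N(0, I_d) vectors
    return mu + z @ L.T                                      # row-wise version of L z + mu

rng = np.random.default_rng(0)
mu = [1.0, -2.0]
sigma = [[2.0, 0.6], [0.6, 1.0]]
x = sample_mvn(mu, sigma, size=200_000, rng=rng)
print(x.mean(axis=0))            # should be close to mu
print(np.cov(x, rowvar=False))   # should be close to sigma
```

Samples are stored as rows, so the column-vector identity \(X=LZ+\boldsymbol{\mu}\) becomes `mu + z @ L.T`.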

3. Comment

The transformation method requires a representation of \(\pi\) in terms of simpler distributions that can be readily sampled.