Importance Sampling

Last Update: May 18, 2024

Monte Carlo Integration Method

The Monte Carlo integration method is a numerical method for evaluating definite integrals. Suppose we need to compute the integral

$$
I = \int_a^b f(x) \, \mathrm{d}x .
$$

First, select a continuous random variable $X$ with a probability density function $p(x)$ that satisfies

$$
p(x) > 0, \quad \forall x \in [a, b], \qquad \int_a^b p(x) \, \mathrm{d}x = 1 .
$$

Then let $Y = g(X) = \frac{f(X)}{p(X)}$. Note that

$$
\mathrm{E}[Y] = \int_a^b \frac{f(x)}{p(x)} \, p(x) \, \mathrm{d}x = \int_a^b f(x) \, \mathrm{d}x = I ,
$$

so we can transform the task of solving the definite integral $I$ into estimating the expectation $\mathrm{E}[Y]$ of the random variable $Y$.

A common approach to estimating $\mathrm{E}[Y]$ is to perform simple random sampling on $Y$ and then use the sample mean

$$
\bar{Y} = \frac{1}{N} \sum_{i=1}^{N} Y_i
$$

as the estimator of $\mathrm{E}[Y]$. The specific scheme is as follows:

Let $X_i \sim p(x) \ (i = 1, 2, \cdots, N)$ be $N$ independent and identically distributed random variables; then $Y_i = \frac{f(X_i)}{p(X_i)}$ are also $N$ independent and identically distributed random variables. Define

$$
I_N = \frac{1}{N} \sum_{i=1}^{N} Y_i = \frac{1}{N} \sum_{i=1}^{N} \frac{f(X_i)}{p(X_i)} ,
$$

and use $I_N$ as the estimator of $\mathrm{E}[Y] = I$.

Statistics tells us that the sample mean is a consistent and unbiased estimator of the population mean. Here, we can verify this conclusion directly. First, the expectation of $I_N$ is

$$
\mathrm{E}[I_N] = \frac{1}{N} \sum_{i=1}^{N} \mathrm{E}[Y_i] = \mathrm{E}[Y] = I ,
$$

which shows that $I_N$ is unbiased. Second, the variance of $I_N$ is

$$
\mathrm{Var}[I_N] = \frac{1}{N^2} \sum_{i=1}^{N} \mathrm{Var}[Y_i] = \frac{1}{N} \mathrm{Var}[Y] \to 0 \quad (N \to \infty),
$$

which, together with unbiasedness and Chebyshev's inequality, shows that $I_N$ is consistent. Proof completed.
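
Here is a minimal Python sketch of the estimator $I_N$ (the function names and the example integrand are my own illustrative choices): it estimates $\int_0^1 x^2 \, \mathrm{d}x = \frac{1}{3}$ with a uniform $p(x)$.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_estimate(f, sample_p, pdf_p, n):
    """I_N = (1/N) * sum of f(X_i) / p(X_i), with X_i ~ p."""
    x = sample_p(n)
    return np.mean(f(x) / pdf_p(x))

# Estimate I = integral of x^2 over [0, 1] (= 1/3) with p(x) = 1.
f = lambda x: x ** 2
i_n = mc_estimate(f,
                  lambda n: rng.uniform(0.0, 1.0, n),
                  lambda x: np.ones_like(x),
                  100_000)
print(i_n)  # ~0.3333; the standard error shrinks like 1/sqrt(N)
```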

Importance Sampling

For an estimator, we always hope to reduce its variance. The variance of the estimator $I_N$ is given by

$$
\mathrm{Var}[I_N] = \frac{1}{N} \mathrm{Var}\!\left[ \frac{f(X)}{p(X)} \right] = \frac{1}{N} \left( \int_a^b \frac{f^2(x)}{p(x)} \, \mathrm{d}x - I^2 \right).
$$

It is known that when $p(x) = kf(x)$ (assuming $f(x) \ge 0$; normalization then forces $k = \frac{1}{I}$, i.e., $p(x) = \frac{1}{I} f(x)$), the variance attains its minimum value of $0$. However, this choice cannot be used directly, because computing $I$ is exactly the goal of the Monte Carlo integration method. What we can do is make the shape of $p(x)$ as close to that of $f(x)$ as possible. This variance-reduction technique is called importance sampling: the regions where $f(x)$ takes larger values contribute more to the integral and are therefore more important, so they deserve more samples.

If we simply let $X$ follow a uniform distribution over $[a, b]$, i.e. $p(x) = \frac{1}{b - a}$, the estimator becomes

$$
I_N = \frac{b - a}{N} \sum_{i=1}^{N} f(X_i).
$$

Intuitively, this means dividing the integration interval into $N$ segments and using $N$ rectangles with areas $\frac{b - a}{N} \cdot f(X_i)$ to approximate the area under the curve. If we want to double the number of samples in some subinterval, that subinterval must be halved, replacing one rectangle with two rectangles of width $\frac{b - a}{2N}$ each. More generally, the higher the sampling probability in a region, the smaller the weight of each individual sample there. This is why each term of $I_N$ is divided by $p(X_i)$.
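
The following sketch makes the variance reduction concrete (the integrand and the density $p(x) = 2x$ are illustrative choices): both estimators target $I = \int_0^1 x^2 \, \mathrm{d}x = \frac{1}{3}$, but the density whose shape is closer to $f$ gives a markedly smaller variance.

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials = 1_000, 1_000

f = lambda x: x ** 2

# Uniform sampling on [0, 1]: p(x) = 1.
est_uniform = [np.mean(f(rng.uniform(0.0, 1.0, N))) for _ in range(trials)]

# Importance sampling with p(x) = 2x, whose shape is closer to f(x) = x^2.
# Inverse-CDF sampling: the CDF is x^2, so x = sqrt(u) with u ~ U(0, 1);
# using 1 - u keeps x strictly positive.
def importance_estimate():
    x = np.sqrt(1.0 - rng.uniform(0.0, 1.0, N))
    return np.mean(f(x) / (2.0 * x))

est_importance = [importance_estimate() for _ in range(trials)]

# Both means are ~1/3 (unbiased); the importance-sampled variance is
# smaller by a factor of about 6.4 (= (4/45) / (1/72)).
print(np.mean(est_uniform), np.var(est_uniform))
print(np.mean(est_importance), np.var(est_importance))
```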

Cosine Sampling

Given a normal vector $\mathbf{n}$, we use $H^2_{\mathbf{n}}$ to denote the upper hemisphere whose bottom great circle is perpendicular to $\mathbf{n}$. We consider performing importance sampling for the integral

$$
I = \int_{H^2_{\mathbf{n}}} f(\boldsymbol{\omega}) \cos\theta \, \mathrm{d}\boldsymbol{\omega} ,
$$

where $\theta$ is the angle between the direction $\boldsymbol{\omega}$ and $\mathbf{n}$.
Note that

$$
\int_{H^2_{\mathbf{n}}} \cos\theta \, \mathrm{d}\boldsymbol{\omega} = \int_0^{2\pi} \int_0^{\pi/2} \cos\theta \sin\theta \, \mathrm{d}\theta \, \mathrm{d}\phi = \pi ,
$$

so we can define $\eta_{\mathbf{n}}$ as a probability distribution on the hemisphere $H^2_{\mathbf{n}}$ with the probability density function

$$
p(\boldsymbol{\omega}) = \frac{\cos\theta}{\pi} .
$$

We use $\eta_{\mathbf{n}}$ for importance sampling, yielding the estimator for $I$:

$$
I_N = \frac{1}{N} \sum_{k=1}^{N} \frac{f(\boldsymbol{\omega}^{(k)}) \cos\theta^{(k)}}{p(\boldsymbol{\omega}^{(k)})} = \frac{\pi}{N} \sum_{k=1}^{N} f(\boldsymbol{\omega}^{(k)}) ,
$$

where $N$ is the number of samples. Below, we describe how to generate samples for $\eta_{\mathbf{n}}$.

First, we discuss how to generate samples for $\eta_{(0,0,1)}$.

Using the inverse transform method, it can be proven that if $\xi_1$, $\xi_2$ are independent and uniformly distributed over $U(0,1)$, then

$$
\boldsymbol{\omega}_0 = (\sin\theta \cos\phi, \ \sin\theta \sin\phi, \ \cos\theta), \qquad \theta = \arccos\sqrt{1 - \xi_1}, \quad \phi = 2\pi \xi_2 ,
$$

is distributed according to $\eta_{(0,0,1)}$.
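
Here is a minimal sketch of this scheme (function names are my own): it draws cosine-weighted directions around $(0, 0, 1)$ and checks that $\mathrm{E}[\cos\theta] = \int_{H^2} \cos\theta \cdot \frac{\cos\theta}{\pi} \, \mathrm{d}\boldsymbol{\omega} = \frac{2}{3}$.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_cosine_hemisphere(n):
    """Cosine-weighted directions around (0, 0, 1) via inverse transform."""
    xi1 = rng.uniform(size=n)
    xi2 = rng.uniform(size=n)
    theta = np.arccos(np.sqrt(1.0 - xi1))  # cos(theta) = sqrt(1 - xi1)
    phi = 2.0 * np.pi * xi2
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)

w = sample_cosine_hemisphere(100_000)
# E[cos(theta)] under the density cos(theta)/pi equals 2/3.
print(w[:, 2].mean())  # ~0.6667
```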

For a general $\mathbf{n}$, we provide two methods for sampling $\eta_{\mathbf{n}}$: the coordinate transformation method and the quaternion rotation method.

Coordinate Transformation Method:

Let $\mathbf{a}$ be a fixed auxiliary vector that is not parallel to $\mathbf{n}$, and define the tangent and bitangent vectors

$$
\mathbf{t} = \frac{\mathbf{a} \times \mathbf{n}}{\| \mathbf{a} \times \mathbf{n} \|}, \qquad \mathbf{b} = \mathbf{n} \times \mathbf{t} .
$$

The matrix $T_{\mathbf{n}}$ representing the basis vectors of the tangent space in world space can be expressed as

$$
T_{\mathbf{n}} = \begin{pmatrix} \mathbf{t} & \mathbf{b} & \mathbf{n} \end{pmatrix},
$$

where $\mathbf{t}$, $\mathbf{b}$, $\mathbf{n}$ are written as column vectors.
Thus, we have the sampling scheme for $\eta_{\mathbf{n}}$.

Sampling Scheme: Let $\xi_1$, $\xi_2$ be independent and uniformly distributed over $U(0,1)$. Then

$$
\boldsymbol{\omega} = T_{\mathbf{n}} \begin{pmatrix} \sin\theta \cos\phi \\ \sin\theta \sin\phi \\ \cos\theta \end{pmatrix}, \qquad \theta = \arccos\sqrt{1 - \xi_1}, \quad \phi = 2\pi \xi_2 ,
$$

is distributed according to $\eta_{\mathbf{n}}$.
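
A minimal sketch of this method (the auxiliary-vector choice is an assumption; any fixed vector not parallel to $\mathbf{n}$ works):

```python
import numpy as np

def tangent_frame(n):
    """Columns (t, b, n): an orthonormal basis whose third axis is n."""
    # Pick an auxiliary vector that is not parallel to n.
    a = np.array([0.0, 0.0, 1.0]) if abs(n[2]) < 0.999 else np.array([1.0, 0.0, 0.0])
    t = np.cross(a, n)
    t /= np.linalg.norm(t)
    b = np.cross(n, t)
    return np.column_stack([t, b, n])

# Usage: map a tangent-space sample (around +z) into world space (around n).
n = np.array([0.0, 1.0, 0.0])
T = tangent_frame(n)
print(T @ np.array([0.0, 0.0, 1.0]))  # = n
```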

Quaternion Rotation Method:

  • When $\mathbf{n} = (0,0,1)$, we can use inverse transform sampling. Let $\xi_1$, $\xi_2$ be independent and uniformly distributed over $U(0,1)$. Then

$$
\boldsymbol{\omega}_0 = (\sin\theta \cos\phi, \ \sin\theta \sin\phi, \ \cos\theta), \qquad \theta = \arccos\sqrt{1 - \xi_1}, \quad \phi = 2\pi \xi_2 ,
$$

is distributed according to $\eta_{(0,0,1)}$.

  • When $\mathbf{n} = (0,0,-1)$, $-\boldsymbol{\omega}_0$ is distributed according to $\eta_{(0,0,-1)}$.

  • When $\mathbf{n} = (n_x, n_y, n_z) \ne (0,0,\pm 1)$, the quaternion that rotates the vector $(0,0,1)$ to $\mathbf{n}$ around the axis

$$
\mathbf{u} = \frac{(0,0,1) \times \mathbf{n}}{\| (0,0,1) \times \mathbf{n} \|} = \frac{(-n_y, \ n_x, \ 0)}{\sqrt{n_x^2 + n_y^2}}
$$

is given by

$$
q = \cos\frac{\alpha}{2} + \sin\frac{\alpha}{2} \, \mathbf{u} ,
$$

where $\alpha$ is the rotation angle. Since

$$
\cos\alpha = (0,0,1) \cdot \mathbf{n} = n_z ,
$$

we have

$$
\cos\frac{\alpha}{2} = \sqrt{\frac{1 + n_z}{2}}, \qquad \sin\frac{\alpha}{2} = \sqrt{\frac{1 - n_z}{2}} .
$$

Thus, the rotation matrix corresponding to the quaternion rotation $\mathrm{rot}_q: p \mapsto qpq^{-1}$ is given by

$$
R_{\mathbf{n}} = \begin{pmatrix}
1 - \dfrac{n_x^2}{1 + n_z} & -\dfrac{n_x n_y}{1 + n_z} & n_x \\
-\dfrac{n_x n_y}{1 + n_z} & 1 - \dfrac{n_y^2}{1 + n_z} & n_y \\
-n_x & -n_y & n_z
\end{pmatrix}.
$$

It is evident that $R_{\mathbf{n}} \boldsymbol{\omega}_0$ is distributed according to $\eta_{\mathbf{n}}$.
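
A sketch of this matrix (the function name is my own; the closed form is valid for any unit $\mathbf{n}$ with $n_z > -1$, which covers the case $\mathbf{n} \ne (0,0,\pm 1)$ above):

```python
import numpy as np

def rotation_to(n):
    """Rotation matrix R_n that maps (0, 0, 1) onto the unit vector n."""
    nx, ny, nz = n
    k = 1.0 / (1.0 + nz)  # requires nz > -1
    return np.array([
        [1.0 - nx * nx * k, -nx * ny * k,      nx],
        [-nx * ny * k,      1.0 - ny * ny * k, ny],
        [-nx,               -ny,               nz],
    ])

n = np.array([0.6, 0.0, 0.8])
R = rotation_to(n)
print(R @ np.array([0.0, 0.0, 1.0]))  # = n
print(np.allclose(R @ R.T, np.eye(3)))  # True: R is a rotation
```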

Sampling GGX Normal Distribution

Estimator

Given a normal vector $\mathbf{n}$, let $H^2_{\mathbf{n}}$ denote the upper hemisphere with its bottom great circle perpendicular to $\mathbf{n}$. We consider performing importance sampling for the integral

$$
I = \int_{H^2_{\mathbf{n}}} f(\boldsymbol{\omega}_i) \, \mathrm{d}\boldsymbol{\omega}_i ,
$$

where the integrand $f$ contains the GGX normal distribution term

$$
D(\mathbf{h}) = \frac{\alpha^2}{\pi \left( (\mathbf{n} \cdot \mathbf{h})^2 (\alpha^2 - 1) + 1 \right)^2},
$$

$\alpha$ is the roughness, and $\mathbf{h} = \frac{\boldsymbol{\omega}_i + \boldsymbol{\omega}_o}{\| \boldsymbol{\omega}_i + \boldsymbol{\omega}_o \|}$ is the half vector between the fixed outgoing direction $\boldsymbol{\omega}_o$ and the integration variable $\boldsymbol{\omega}_i$.
According to the relation between differential solid angles

$$
\mathrm{d}\boldsymbol{\omega}_i = 4 (\boldsymbol{\omega}_o \cdot \mathbf{h}) \, \mathrm{d}\boldsymbol{\omega}_{\mathbf{h}} ,
$$

$I$ can be rewritten as an integral over $\mathbf{h}$:

$$
I = \int_{H^2_{\mathbf{n}}} f\!\left( \boldsymbol{\omega}_i(\mathbf{h}) \right) 4 (\boldsymbol{\omega}_o \cdot \mathbf{h}) \, \mathrm{d}\boldsymbol{\omega}_{\mathbf{h}} , \qquad \boldsymbol{\omega}_i(\mathbf{h}) = 2 (\boldsymbol{\omega}_o \cdot \mathbf{h}) \mathbf{h} - \boldsymbol{\omega}_o .
$$

Note that the normal distribution function satisfies

$$
\int_{H^2_{\mathbf{n}}} D(\mathbf{h}) \, (\mathbf{n} \cdot \mathbf{h}) \, \mathrm{d}\boldsymbol{\omega}_{\mathbf{h}} = 1 ,
$$

so we can define $\eta_{\mathbf{n}}$ as a probability distribution on the hemisphere $H^2_{\mathbf{n}}$ with the probability density function

$$
p(\mathbf{h}) = D(\mathbf{h}) \, (\mathbf{n} \cdot \mathbf{h}) .
$$

We use $\eta_{\mathbf{n}}$ for importance sampling, yielding the estimator for $I$:

$$
I_N = \frac{1}{N} \sum_{k=1}^{N} \frac{4 (\boldsymbol{\omega}_o \cdot \mathbf{h}^{(k)}) \, f(\boldsymbol{\omega}_i^{(k)})}{D(\mathbf{h}^{(k)}) \, (\mathbf{n} \cdot \mathbf{h}^{(k)})} ,
$$

where $N$ is the number of samples, and

$$
\boldsymbol{\omega}_i^{(k)} = 2 (\boldsymbol{\omega}_o \cdot \mathbf{h}^{(k)}) \mathbf{h}^{(k)} - \boldsymbol{\omega}_o .
$$

Next, we describe how to generate samples $\mathbf{h}^{(k)}$ for $\eta_{\mathbf{n}}$.

Sampling Method

To apply the inverse transform sampling method, parameterize $\mathbf{h}^{(k)}$ in the tangent space at $\mathbf{n}$ using spherical coordinates:

$$
\mathbf{h}^{(k)} = (\sin\theta \cos\phi, \ \sin\theta \sin\phi, \ \cos\theta).
$$

Let $\xi_1$, $\xi_2$ be independent and uniformly distributed over $U(0,1)$. From

$$
p(\theta, \phi) = D(\mathbf{h}) \cos\theta \sin\theta = \underbrace{\frac{2 \alpha^2 \cos\theta \sin\theta}{\left( \cos^2\theta \, (\alpha^2 - 1) + 1 \right)^2}}_{p(\theta)} \cdot \underbrace{\frac{1}{2\pi}}_{p(\phi)}
$$

it is clear that $\theta$ and $\phi$ are independent. Thus, we can derive the marginal distributions of $\theta$ and $\phi$.

Using the inverse transform of the CDF, we obtain the sampling scheme for $\mathbf{h}^{(k)}$ in spherical coordinates:

$$
\theta = \arctan\left( \alpha \sqrt{\frac{\xi_1}{1 - \xi_1}} \right), \qquad \phi = 2\pi \xi_2 .
$$

Since the spherical coordinates above are generated in the tangent space, we first convert them to Cartesian coordinates and then use the coordinate transformation method to convert them to world-space coordinates. Using the previously defined $T_{\mathbf{n}}$, we obtain the sampling scheme for $\mathbf{h}^{(k)}$ in world space.

Sampling Scheme: Let $\xi_1$, $\xi_2$ be independent and uniformly distributed over $U(0,1)$; then

$$
\mathbf{h}^{(k)} = T_{\mathbf{n}} \begin{pmatrix} \sin\theta \cos\phi \\ \sin\theta \sin\phi \\ \cos\theta \end{pmatrix}, \qquad \theta = \arctan\left( \alpha \sqrt{\frac{\xi_1}{1 - \xi_1}} \right), \quad \phi = 2\pi \xi_2 ,
$$

follows the distribution $\eta_{\mathbf{n}}$ with the probability density function

$$
p(\mathbf{h}) = D(\mathbf{h}) \, (\mathbf{n} \cdot \mathbf{h}) .
$$
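
Here is a sketch of the tangent-space part of this scheme (names and the roughness value are illustrative): it draws GGX half vectors and, as a sanity check, estimates $\int_{H^2} (\mathbf{n} \cdot \mathbf{h}) \, \mathrm{d}\boldsymbol{\omega}_{\mathbf{h}} = \pi$ via $\mathrm{E}\!\left[ (\mathbf{n} \cdot \mathbf{h}) / p(\mathbf{h}) \right] = \mathrm{E}\!\left[ 1 / D(\mathbf{h}) \right]$.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_ggx_tangent(alpha, n_samples):
    """GGX half vectors around (0, 0, 1), pdf p(h) = D(h) * cos(theta)."""
    xi1 = rng.uniform(size=n_samples)
    xi2 = rng.uniform(size=n_samples)
    theta = np.arctan(alpha * np.sqrt(xi1 / (1.0 - xi1)))
    phi = 2.0 * np.pi * xi2
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)

def ggx_D(cos_theta, alpha):
    """GGX normal distribution function."""
    d = cos_theta ** 2 * (alpha ** 2 - 1.0) + 1.0
    return alpha ** 2 / (np.pi * d ** 2)

alpha = 0.5
h = sample_ggx_tangent(alpha, 200_000)
# With p(h) = D(h)(n.h), E[(n.h)/p(h)] = E[1/D(h)] = integral of (n.h) = pi.
print(np.mean(1.0 / ggx_D(h[:, 2], alpha)))  # ~3.14159
```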

Multiple Importance Sampling

Suppose we need to compute the following integral:

$$
I = \int_a^b f(x) \, g(x) \, \mathrm{d}x ,
$$

and we know that the probability density functions $p_X(x)$ and $p_Y(x)$ can approximate the shapes of $f(x)$ and $g(x)$ well, respectively, meaning they can perform good importance sampling for

$$
\int_a^b f(x) \, \mathrm{d}x \qquad \text{and} \qquad \int_a^b g(x) \, \mathrm{d}x .
$$

So, how do we perform importance sampling for $I$?

An approach that comes to mind easily is to use a density proportional to $p_X(x) p_Y(x)$ for importance sampling. However, it is often difficult to generate samples for such a product distribution. In such cases, multiple importance sampling is a good choice.

Let $X_i \sim p_X(x)$ and $Y_i \sim p_Y(x)$ $(i = 1, 2, \cdots, N)$ be independent. Multiple importance sampling uses the following estimator:

$$
I_N = \frac{1}{N} \sum_{i=1}^{N} \left[ w_f(X_i) \, \frac{f(X_i) \, g(X_i)}{p_X(X_i)} + w_g(Y_i) \, \frac{f(Y_i) \, g(Y_i)}{p_Y(Y_i)} \right],
$$

where $w_f(x)$ and $w_g(x)$ are weight functions that satisfy $w_f(x) + w_g(x) = 1$. It can be verified that

$$
\mathrm{E}[I_N] = \int_a^b w_f(x) f(x) g(x) \, \mathrm{d}x + \int_a^b w_g(x) f(x) g(x) \, \mathrm{d}x = \int_a^b f(x) g(x) \, \mathrm{d}x = I .
$$

That is, $I_N$ is indeed an unbiased estimator of $I$. In practice, the following form of heuristic weight functions is often used:

$$
w_f(x) = \frac{p_X(x)^{\beta}}{p_X(x)^{\beta} + p_Y(x)^{\beta}}, \qquad w_g(x) = \frac{p_Y(x)^{\beta}}{p_X(x)^{\beta} + p_Y(x)^{\beta}} .
$$

This is known as the power heuristic, and it is often found that setting $\beta = 2$ works well.
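
Finally, a sketch of the estimator with the power heuristic (the toy factors and their matched densities are illustrative assumptions): it estimates $I = \int_0^1 x^2 (1 - x) \, \mathrm{d}x = \frac{1}{12}$ using $p_X(x) = 3x^2$ matched to $f$ and $p_Y(x) = 2(1 - x)$ matched to $g$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, beta = 100_000, 2.0

f = lambda x: x ** 2           # shape matched by p_X(x) = 3x^2
g = lambda x: 1.0 - x          # shape matched by p_Y(x) = 2(1 - x)
p_X = lambda x: 3.0 * x ** 2
p_Y = lambda x: 2.0 * (1.0 - x)

# Inverse-CDF sampling for each density (using 1 - u keeps the samples
# away from the degenerate endpoints).
X = (1.0 - rng.uniform(size=N)) ** (1.0 / 3.0)   # X_i ~ p_X
Y = 1.0 - np.sqrt(1.0 - rng.uniform(size=N))     # Y_i ~ p_Y

def w_f(x):
    """Power heuristic weight for the p_X strategy; w_g = 1 - w_f."""
    return p_X(x) ** beta / (p_X(x) ** beta + p_Y(x) ** beta)

I_N = np.mean(w_f(X) * f(X) * g(X) / p_X(X)) \
    + np.mean((1.0 - w_f(Y)) * f(Y) * g(Y) / p_Y(Y))
print(I_N)  # ~1/12 = 0.08333...
```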


Copyright Notice: All articles in this blog are licensed under CC BY-SA 4.0 unless otherwise stated.