Importance Sampling

Last Update: May 18, 2024

Monte Carlo Integration Method

The Monte Carlo integration method is a numerical method for evaluating definite integrals. Suppose we need to compute the integral

$$
I = \int_a^b f(x) \, \mathrm{d}x .
$$

First, select a continuous random variable $X$ with a probability density function $p(x)$ that satisfies

$$
p(x) > 0, \quad \forall x \in [a, b], \qquad \int_a^b p(x) \, \mathrm{d}x = 1 .
$$

Then let $Y = g(X) = \frac{f(X)}{p(X)}$. Note that

$$
\mathrm{E}[Y] = \int_a^b \frac{f(x)}{p(x)} \, p(x) \, \mathrm{d}x = \int_a^b f(x) \, \mathrm{d}x = I ,
$$

so we can transform the task of solving the definite integral $I$ into estimating the expectation $\mathrm{E}[Y]$ of the random variable $Y$.

A common approach to estimating $\mathrm{E}[Y]$ is to perform simple random sampling on $Y$ and then use the sample mean

$$
\bar{Y} = \frac{1}{N} \sum_{i=1}^{N} Y_i
$$

as the estimator of $\mathrm{E}[Y]$. The specific scheme is as follows:

Let $X_i \sim p(x) \ (i = 1, 2, \cdots, N)$ be $N$ independent and identically distributed random variables; then $Y_i = \frac{f(X_i)}{p(X_i)}$ are also $N$ independent and identically distributed random variables. Define

$$
I_N = \frac{1}{N} \sum_{i=1}^{N} Y_i = \frac{1}{N} \sum_{i=1}^{N} \frac{f(X_i)}{p(X_i)} ,
$$

and use $I_N$ as the estimator of $\mathrm{E}[Y] = I$.

Statistics tells us that the sample mean is a consistent and unbiased estimator of the population mean. Here, we can verify this conclusion directly. First, the expectation of $I_N$ is

$$
\mathrm{E}[I_N] = \frac{1}{N} \sum_{i=1}^{N} \mathrm{E}[Y_i] = \mathrm{E}[Y] = I ,
$$

which shows that $I_N$ is unbiased. Second, the variance of $I_N$ is

$$
\mathrm{Var}[I_N] = \frac{1}{N^2} \sum_{i=1}^{N} \mathrm{Var}[Y_i] = \frac{1}{N} \mathrm{Var}[Y] \to 0 \quad (N \to \infty),
$$

which, together with unbiasedness and Chebyshev's inequality, shows that $I_N$ is consistent. Proof completed.
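
Here is a minimal Python sketch of the estimator $I_N$ (the function names and the example integrand are my own illustrative choices): it estimates $\int_0^1 x^2 \, \mathrm{d}x = \frac{1}{3}$ with a uniform $p(x)$.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_estimate(f, sample_p, pdf_p, n):
    """I_N = (1/N) * sum of f(X_i) / p(X_i), with X_i ~ p."""
    x = sample_p(n)
    return np.mean(f(x) / pdf_p(x))

# Estimate I = integral of x^2 over [0, 1] (= 1/3) with p(x) = 1.
f = lambda x: x ** 2
i_n = mc_estimate(f,
                  lambda n: rng.uniform(0.0, 1.0, n),
                  lambda x: np.ones_like(x),
                  100_000)
print(i_n)  # ~0.3333; the standard error shrinks like 1/sqrt(N)
```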

Importance Sampling

For an estimator, we always hope to reduce its variance. The variance of the estimator $I_N$ is given by

$$
\mathrm{Var}[I_N] = \frac{1}{N} \mathrm{Var}\!\left[ \frac{f(X)}{p(X)} \right] = \frac{1}{N} \left( \int_a^b \frac{f^2(x)}{p(x)} \, \mathrm{d}x - I^2 \right).
$$

It is known that when $p(x) = kf(x)$ (assuming $f(x) \ge 0$; normalization then forces $k = \frac{1}{I}$, i.e., $p(x) = \frac{1}{I} f(x)$), the variance attains its minimum value of $0$. However, this choice cannot be used directly, because computing $I$ is exactly the goal of the Monte Carlo integration method. What we can do is make the shape of $p(x)$ as close to that of $f(x)$ as possible. This variance-reduction technique is called importance sampling: the regions where $f(x)$ takes larger values contribute more to the integral and are therefore more important, so they deserve more samples.

If we simply let $X$ follow a uniform distribution over $[a, b]$, i.e. $p(x) = \frac{1}{b - a}$, the estimator becomes

$$
I_N = \frac{b - a}{N} \sum_{i=1}^{N} f(X_i).
$$

Intuitively, this means dividing the integration interval into $N$ segments and using $N$ rectangles with areas $\frac{b - a}{N} \cdot f(X_i)$ to approximate the area under the curve. If we want to double the number of samples in some subinterval, that subinterval must be halved, replacing one rectangle with two rectangles of width $\frac{b - a}{2N}$ each. More generally, the higher the sampling probability in a region, the smaller the weight of each individual sample there. This is why each term of $I_N$ is divided by $p(X_i)$.
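
The following sketch makes the variance reduction concrete (the integrand and the density $p(x) = 2x$ are illustrative choices): both estimators target $I = \int_0^1 x^2 \, \mathrm{d}x = \frac{1}{3}$, but the density whose shape is closer to $f$ gives a markedly smaller variance.

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials = 1_000, 1_000

f = lambda x: x ** 2

# Uniform sampling on [0, 1]: p(x) = 1.
est_uniform = [np.mean(f(rng.uniform(0.0, 1.0, N))) for _ in range(trials)]

# Importance sampling with p(x) = 2x, whose shape is closer to f(x) = x^2.
# Inverse-CDF sampling: the CDF is x^2, so x = sqrt(u) with u ~ U(0, 1);
# using 1 - u keeps x strictly positive.
def importance_estimate():
    x = np.sqrt(1.0 - rng.uniform(0.0, 1.0, N))
    return np.mean(f(x) / (2.0 * x))

est_importance = [importance_estimate() for _ in range(trials)]

# Both means are ~1/3 (unbiased); the importance-sampled variance is
# smaller by a factor of about 6.4 (= (4/45) / (1/72)).
print(np.mean(est_uniform), np.var(est_uniform))
print(np.mean(est_importance), np.var(est_importance))
```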

Cosine Sampling

Given a normal vector $\mathbf{n}$, we use $H^2_{\mathbf{n}}$ to denote the upper hemisphere whose bottom great circle is perpendicular to $\mathbf{n}$. We consider performing importance sampling for the integral

$$
I = \int_{H^2_{\mathbf{n}}} f(\boldsymbol{\omega}) \cos\theta \, \mathrm{d}\boldsymbol{\omega} ,
$$

where $\theta$ is the angle between the direction $\boldsymbol{\omega}$ and $\mathbf{n}$.
Note that

$$
\int_{H^2_{\mathbf{n}}} \cos\theta \, \mathrm{d}\boldsymbol{\omega} = \int_0^{2\pi} \int_0^{\pi/2} \cos\theta \sin\theta \, \mathrm{d}\theta \, \mathrm{d}\phi = \pi ,
$$

so we can define $\eta_{\mathbf{n}}$ as a probability distribution on the hemisphere $H^2_{\mathbf{n}}$ with the probability density function

$$
p(\boldsymbol{\omega}) = \frac{\cos\theta}{\pi} .
$$

We use $\eta_{\mathbf{n}}$ for importance sampling, yielding the estimator for $I$:

$$
I_N = \frac{1}{N} \sum_{k=1}^{N} \frac{f(\boldsymbol{\omega}^{(k)}) \cos\theta^{(k)}}{p(\boldsymbol{\omega}^{(k)})} = \frac{\pi}{N} \sum_{k=1}^{N} f(\boldsymbol{\omega}^{(k)}) ,
$$

where $N$ is the number of samples. Below, we describe how to generate samples for $\eta_{\mathbf{n}}$.

First, we discuss how to generate samples for $\eta_{(0,0,1)}$.

Using the inverse transform method, it can be proven that if $\xi_1$, $\xi_2$ are independent and uniformly distributed over $U(0,1)$, then

$$
\boldsymbol{\omega}_0 = (\sin\theta \cos\phi, \ \sin\theta \sin\phi, \ \cos\theta), \qquad \theta = \arccos\sqrt{1 - \xi_1}, \quad \phi = 2\pi \xi_2 ,
$$

is distributed according to $\eta_{(0,0,1)}$.
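
Here is a minimal sketch of this scheme (function names are my own): it draws cosine-weighted directions around $(0, 0, 1)$ and checks that $\mathrm{E}[\cos\theta] = \int_{H^2} \cos\theta \cdot \frac{\cos\theta}{\pi} \, \mathrm{d}\boldsymbol{\omega} = \frac{2}{3}$.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_cosine_hemisphere(n):
    """Cosine-weighted directions around (0, 0, 1) via inverse transform."""
    xi1 = rng.uniform(size=n)
    xi2 = rng.uniform(size=n)
    theta = np.arccos(np.sqrt(1.0 - xi1))  # cos(theta) = sqrt(1 - xi1)
    phi = 2.0 * np.pi * xi2
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)

w = sample_cosine_hemisphere(100_000)
# E[cos(theta)] under the density cos(theta)/pi equals 2/3.
print(w[:, 2].mean())  # ~0.6667
```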

For a general $\mathbf{n}$, we provide two methods for sampling $\eta_{\mathbf{n}}$: the coordinate transformation method and the quaternion rotation method.

Coordinate Transformation Method:

Let $\mathbf{a}$ be a fixed auxiliary vector that is not parallel to $\mathbf{n}$, and define the tangent and bitangent vectors

$$
\mathbf{t} = \frac{\mathbf{a} \times \mathbf{n}}{\| \mathbf{a} \times \mathbf{n} \|}, \qquad \mathbf{b} = \mathbf{n} \times \mathbf{t} .
$$

The matrix $T_{\mathbf{n}}$ representing the basis vectors of the tangent space in world space can be expressed as

$$
T_{\mathbf{n}} = \begin{pmatrix} \mathbf{t} & \mathbf{b} & \mathbf{n} \end{pmatrix},
$$

where $\mathbf{t}$, $\mathbf{b}$, $\mathbf{n}$ are written as column vectors.
Thus, we have the sampling scheme for $\eta_{\mathbf{n}}$.

Sampling Scheme: Let $\xi_1$, $\xi_2$ be independent and uniformly distributed over $U(0,1)$. Then

$$
\boldsymbol{\omega} = T_{\mathbf{n}} \begin{pmatrix} \sin\theta \cos\phi \\ \sin\theta \sin\phi \\ \cos\theta \end{pmatrix}, \qquad \theta = \arccos\sqrt{1 - \xi_1}, \quad \phi = 2\pi \xi_2 ,
$$

is distributed according to $\eta_{\mathbf{n}}$.
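
A minimal sketch of this method (the auxiliary-vector choice is an assumption; any fixed vector not parallel to $\mathbf{n}$ works):

```python
import numpy as np

def tangent_frame(n):
    """Columns (t, b, n): an orthonormal basis whose third axis is n."""
    # Pick an auxiliary vector that is not parallel to n.
    a = np.array([0.0, 0.0, 1.0]) if abs(n[2]) < 0.999 else np.array([1.0, 0.0, 0.0])
    t = np.cross(a, n)
    t /= np.linalg.norm(t)
    b = np.cross(n, t)
    return np.column_stack([t, b, n])

# Usage: map a tangent-space sample (around +z) into world space (around n).
n = np.array([0.0, 1.0, 0.0])
T = tangent_frame(n)
print(T @ np.array([0.0, 0.0, 1.0]))  # = n
```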

Quaternion Rotation Method:

  • When $\mathbf{n} = (0,0,1)$, we can use inverse transform sampling. Let $\xi_1$, $\xi_2$ be independent and uniformly distributed over $U(0,1)$. Then

$$
\boldsymbol{\omega}_0 = (\sin\theta \cos\phi, \ \sin\theta \sin\phi, \ \cos\theta), \qquad \theta = \arccos\sqrt{1 - \xi_1}, \quad \phi = 2\pi \xi_2 ,
$$

is distributed according to $\eta_{(0,0,1)}$.

  • When $\mathbf{n} = (0,0,-1)$, $-\boldsymbol{\omega}_0$ is distributed according to $\eta_{(0,0,-1)}$.

  • When $\mathbf{n} = (n_x, n_y, n_z) \ne (0,0,\pm 1)$, the quaternion that rotates the vector $(0,0,1)$ to $\mathbf{n}$ around the axis

$$
\mathbf{u} = \frac{(0,0,1) \times \mathbf{n}}{\| (0,0,1) \times \mathbf{n} \|} = \frac{(-n_y, \ n_x, \ 0)}{\sqrt{n_x^2 + n_y^2}}
$$

is given by

$$
q = \cos\frac{\alpha}{2} + \sin\frac{\alpha}{2} \, \mathbf{u} ,
$$

where $\alpha$ is the rotation angle. Since

$$
\cos\alpha = (0,0,1) \cdot \mathbf{n} = n_z ,
$$

we have

$$
\cos\frac{\alpha}{2} = \sqrt{\frac{1 + n_z}{2}}, \qquad \sin\frac{\alpha}{2} = \sqrt{\frac{1 - n_z}{2}} .
$$

Thus, the rotation matrix corresponding to the quaternion rotation $\mathrm{rot}_q: p \mapsto qpq^{-1}$ is given by

$$
R_{\mathbf{n}} = \begin{pmatrix}
1 - \dfrac{n_x^2}{1 + n_z} & -\dfrac{n_x n_y}{1 + n_z} & n_x \\
-\dfrac{n_x n_y}{1 + n_z} & 1 - \dfrac{n_y^2}{1 + n_z} & n_y \\
-n_x & -n_y & n_z
\end{pmatrix}.
$$

It is evident that $R_{\mathbf{n}} \boldsymbol{\omega}_0$ is distributed according to $\eta_{\mathbf{n}}$.
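
A sketch of this matrix (the function name is my own; the closed form is valid for any unit $\mathbf{n}$ with $n_z > -1$, which covers the case $\mathbf{n} \ne (0,0,\pm 1)$ above):

```python
import numpy as np

def rotation_to(n):
    """Rotation matrix R_n that maps (0, 0, 1) onto the unit vector n."""
    nx, ny, nz = n
    k = 1.0 / (1.0 + nz)  # requires nz > -1
    return np.array([
        [1.0 - nx * nx * k, -nx * ny * k,      nx],
        [-nx * ny * k,      1.0 - ny * ny * k, ny],
        [-nx,               -ny,               nz],
    ])

n = np.array([0.6, 0.0, 0.8])
R = rotation_to(n)
print(R @ np.array([0.0, 0.0, 1.0]))  # = n
print(np.allclose(R @ R.T, np.eye(3)))  # True: R is a rotation
```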

Sampling GGX Normal Distribution

Estimator

Given a normal vector $\mathbf{n}$, let $H^2_{\mathbf{n}}$ denote the upper hemisphere with its bottom great circle perpendicular to $\mathbf{n}$. We consider performing importance sampling for the integral

$$
I = \int_{H^2_{\mathbf{n}}} f(\boldsymbol{\omega}_i) \, \mathrm{d}\boldsymbol{\omega}_i ,
$$

where the integrand $f$ contains the GGX normal distribution term

$$
D(\mathbf{h}) = \frac{\alpha^2}{\pi \left( (\mathbf{n} \cdot \mathbf{h})^2 (\alpha^2 - 1) + 1 \right)^2},
$$

$\alpha$ is the roughness, and $\mathbf{h} = \frac{\boldsymbol{\omega}_i + \boldsymbol{\omega}_o}{\| \boldsymbol{\omega}_i + \boldsymbol{\omega}_o \|}$ is the half vector between the fixed outgoing direction $\boldsymbol{\omega}_o$ and the integration variable $\boldsymbol{\omega}_i$.
According to the relation between differential solid angles

$$
\mathrm{d}\boldsymbol{\omega}_i = 4 (\boldsymbol{\omega}_o \cdot \mathbf{h}) \, \mathrm{d}\boldsymbol{\omega}_{\mathbf{h}} ,
$$

$I$ can be rewritten as an integral over $\mathbf{h}$:

$$
I = \int_{H^2_{\mathbf{n}}} f\!\left( \boldsymbol{\omega}_i(\mathbf{h}) \right) 4 (\boldsymbol{\omega}_o \cdot \mathbf{h}) \, \mathrm{d}\boldsymbol{\omega}_{\mathbf{h}} , \qquad \boldsymbol{\omega}_i(\mathbf{h}) = 2 (\boldsymbol{\omega}_o \cdot \mathbf{h}) \mathbf{h} - \boldsymbol{\omega}_o .
$$

Note that the normal distribution function satisfies

$$
\int_{H^2_{\mathbf{n}}} D(\mathbf{h}) \, (\mathbf{n} \cdot \mathbf{h}) \, \mathrm{d}\boldsymbol{\omega}_{\mathbf{h}} = 1 ,
$$

so we can define $\eta_{\mathbf{n}}$ as a probability distribution on the hemisphere $H^2_{\mathbf{n}}$ with the probability density function

$$
p(\mathbf{h}) = D(\mathbf{h}) \, (\mathbf{n} \cdot \mathbf{h}) .
$$

We use $\eta_{\mathbf{n}}$ for importance sampling, yielding the estimator for $I$:

$$
I_N = \frac{1}{N} \sum_{k=1}^{N} \frac{4 (\boldsymbol{\omega}_o \cdot \mathbf{h}^{(k)}) \, f(\boldsymbol{\omega}_i^{(k)})}{D(\mathbf{h}^{(k)}) \, (\mathbf{n} \cdot \mathbf{h}^{(k)})} ,
$$

where $N$ is the number of samples, and

$$
\boldsymbol{\omega}_i^{(k)} = 2 (\boldsymbol{\omega}_o \cdot \mathbf{h}^{(k)}) \mathbf{h}^{(k)} - \boldsymbol{\omega}_o .
$$

Next, we describe how to generate samples $\mathbf{h}^{(k)}$ for $\eta_{\mathbf{n}}$.

Sampling Method

To apply the inverse transform sampling method, parameterize $\mathbf{h}^{(k)}$ in the tangent space at $\mathbf{n}$ using spherical coordinates:

$$
\mathbf{h}^{(k)} = (\sin\theta \cos\phi, \ \sin\theta \sin\phi, \ \cos\theta).
$$

Let $\xi_1$, $\xi_2$ be independent and uniformly distributed over $U(0,1)$. From

$$
p(\theta, \phi) = D(\mathbf{h}) \cos\theta \sin\theta = \underbrace{\frac{2 \alpha^2 \cos\theta \sin\theta}{\left( \cos^2\theta \, (\alpha^2 - 1) + 1 \right)^2}}_{p(\theta)} \cdot \underbrace{\frac{1}{2\pi}}_{p(\phi)}
$$

it is clear that $\theta$ and $\phi$ are independent. Thus, we can derive the marginal distributions of $\theta$ and $\phi$.

Using the inverse transform of the CDF, we obtain the sampling scheme for $\mathbf{h}^{(k)}$ in spherical coordinates:

$$
\theta = \arctan\left( \alpha \sqrt{\frac{\xi_1}{1 - \xi_1}} \right), \qquad \phi = 2\pi \xi_2 .
$$

Since the spherical coordinates above are generated in the tangent space, we first convert them to Cartesian coordinates and then use the coordinate transformation method to convert them to world-space coordinates. Using the previously defined $T_{\mathbf{n}}$, we obtain the sampling scheme for $\mathbf{h}^{(k)}$ in world space.

Sampling Scheme: Let $\xi_1$, $\xi_2$ be independent and uniformly distributed over $U(0,1)$; then

$$
\mathbf{h}^{(k)} = T_{\mathbf{n}} \begin{pmatrix} \sin\theta \cos\phi \\ \sin\theta \sin\phi \\ \cos\theta \end{pmatrix}, \qquad \theta = \arctan\left( \alpha \sqrt{\frac{\xi_1}{1 - \xi_1}} \right), \quad \phi = 2\pi \xi_2 ,
$$

follows the distribution $\eta_{\mathbf{n}}$ with the probability density function

$$
p(\mathbf{h}) = D(\mathbf{h}) \, (\mathbf{n} \cdot \mathbf{h}) .
$$
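
Here is a sketch of the tangent-space part of this scheme (names and the roughness value are illustrative): it draws GGX half vectors and, as a sanity check, estimates $\int_{H^2} (\mathbf{n} \cdot \mathbf{h}) \, \mathrm{d}\boldsymbol{\omega}_{\mathbf{h}} = \pi$ via $\mathrm{E}\!\left[ (\mathbf{n} \cdot \mathbf{h}) / p(\mathbf{h}) \right] = \mathrm{E}\!\left[ 1 / D(\mathbf{h}) \right]$.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_ggx_tangent(alpha, n_samples):
    """GGX half vectors around (0, 0, 1), pdf p(h) = D(h) * cos(theta)."""
    xi1 = rng.uniform(size=n_samples)
    xi2 = rng.uniform(size=n_samples)
    theta = np.arctan(alpha * np.sqrt(xi1 / (1.0 - xi1)))
    phi = 2.0 * np.pi * xi2
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)

def ggx_D(cos_theta, alpha):
    """GGX normal distribution function."""
    d = cos_theta ** 2 * (alpha ** 2 - 1.0) + 1.0
    return alpha ** 2 / (np.pi * d ** 2)

alpha = 0.5
h = sample_ggx_tangent(alpha, 200_000)
# With p(h) = D(h)(n.h), E[(n.h)/p(h)] = E[1/D(h)] = integral of (n.h) = pi.
print(np.mean(1.0 / ggx_D(h[:, 2], alpha)))  # ~3.14159
```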

Multiple Importance Sampling

Suppose we need to compute the following integral:

$$
I = \int_a^b f(x) \, g(x) \, \mathrm{d}x ,
$$

and we know that the probability density functions $p_X(x)$ and $p_Y(x)$ can approximate the shapes of $f(x)$ and $g(x)$ well, respectively, meaning they can perform good importance sampling for

$$
\int_a^b f(x) \, \mathrm{d}x \qquad \text{and} \qquad \int_a^b g(x) \, \mathrm{d}x .
$$

So, how do we perform importance sampling for $I$?

An approach that comes to mind easily is to use a density proportional to $p_X(x) p_Y(x)$ for importance sampling. However, it is often difficult to generate samples for such a product distribution. In such cases, multiple importance sampling is a good choice.

Let $X_i \sim p_X(x)$ and $Y_i \sim p_Y(x)$ $(i = 1, 2, \cdots, N)$ be independent. Multiple importance sampling uses the following estimator:

$$
I_N = \frac{1}{N} \sum_{i=1}^{N} \left[ w_f(X_i) \, \frac{f(X_i) \, g(X_i)}{p_X(X_i)} + w_g(Y_i) \, \frac{f(Y_i) \, g(Y_i)}{p_Y(Y_i)} \right],
$$

where $w_f(x)$ and $w_g(x)$ are weight functions that satisfy $w_f(x) + w_g(x) = 1$. It can be verified that

$$
\mathrm{E}[I_N] = \int_a^b w_f(x) f(x) g(x) \, \mathrm{d}x + \int_a^b w_g(x) f(x) g(x) \, \mathrm{d}x = \int_a^b f(x) g(x) \, \mathrm{d}x = I .
$$

That is, $I_N$ is indeed an unbiased estimator of $I$. In practice, the following form of heuristic weight functions is often used:

$$
w_f(x) = \frac{p_X(x)^{\beta}}{p_X(x)^{\beta} + p_Y(x)^{\beta}}, \qquad w_g(x) = \frac{p_Y(x)^{\beta}}{p_X(x)^{\beta} + p_Y(x)^{\beta}} .
$$

This is known as the power heuristic, and it is often found that setting $\beta = 2$ works well.
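
Finally, a sketch of the estimator with the power heuristic (the toy factors and their matched densities are illustrative assumptions): it estimates $I = \int_0^1 x^2 (1 - x) \, \mathrm{d}x = \frac{1}{12}$ using $p_X(x) = 3x^2$ matched to $f$ and $p_Y(x) = 2(1 - x)$ matched to $g$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, beta = 100_000, 2.0

f = lambda x: x ** 2           # shape matched by p_X(x) = 3x^2
g = lambda x: 1.0 - x          # shape matched by p_Y(x) = 2(1 - x)
p_X = lambda x: 3.0 * x ** 2
p_Y = lambda x: 2.0 * (1.0 - x)

# Inverse-CDF sampling for each density (using 1 - u keeps the samples
# away from the degenerate endpoints).
X = (1.0 - rng.uniform(size=N)) ** (1.0 / 3.0)   # X_i ~ p_X
Y = 1.0 - np.sqrt(1.0 - rng.uniform(size=N))     # Y_i ~ p_Y

def w_f(x):
    """Power heuristic weight for the p_X strategy; w_g = 1 - w_f."""
    return p_X(x) ** beta / (p_X(x) ** beta + p_Y(x) ** beta)

I_N = np.mean(w_f(X) * f(X) * g(X) / p_X(X)) \
    + np.mean((1.0 - w_f(Y)) * f(Y) * g(Y) / p_Y(Y))
print(I_N)  # ~1/12 = 0.08333...
```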


Copyright Notice: All articles in this blog are licensed under CC BY-SA 4.0 unless otherwise stated.