Latin hypercube sampling (LHS) is one realization of the stratified sampling methodology. The motivation for stratified sampling is that a probability distribution *P(x)* with an irregular shape (i.e., far from the uniform distribution) is not sampled evenly: some regions of the distribution are sampled more frequently than others, and some are not sampled at all if the number of trials is not sufficiently large. To overcome this drawback, the probability distribution is stratified into segments with equal areas under the curve, and a single point is sampled from each stratum in a given trial. In LHS the number of trials is equal to the number of segments, and, in contrast to the standard stratified sampling method, just one point is sampled from each segment during the whole sampling process.

### Taking an example

Let’s assume that one wants to compute some statistics for a function which depends on *M* variables. First, the probability distribution for each dimension is stratified into *L* regions such that each of them has the same area, i.e., they are equiprobable. How to do that? We use the corresponding cumulative distribution function (CDF), *F(x)*. For a given value *x0*, the CDF returns *P(x < x0)*, that is, the area under the curve of *P(x)* for *-∞ < x < x0*. Given that *P(x)* is normalized, the output of *F(x)* ranges from 0 to 1. So, dividing the range of *F(x)* into *L* segments of equal length corresponds to stratifying *P(x)* into regions with equal areas under the curve. The corresponding division points on the *x* axis can be found by applying the inverse CDF, *F⁻¹*, to the division points on the *y* axis. For example, if one wants *L = 5* strata, the range of *F(x)* is split at 0.2, 0.4, 0.6, and 0.8. A point is sampled uniformly from each of these intervals and passed to *F⁻¹* to identify the corresponding point on the *x* axis.
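The inverse-CDF stratification above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the function name `stratified_samples` is my own, and the example assumes a standard normal distribution, using SciPy’s `norm.ppf` as *F⁻¹*.

```python
import numpy as np
from scipy.stats import norm


def stratified_samples(L, ppf, rng=None):
    """Draw one value from each of L equiprobable strata of a
    distribution, given its inverse CDF (percent-point function)."""
    rng = np.random.default_rng() if rng is None else rng
    # One uniform draw inside each interval [i/L, (i+1)/L) on the y axis
    u = (np.arange(L) + rng.random(L)) / L
    # Map back to the x axis via F^-1
    return ppf(u)


# Example: L = 5 strata of a standard normal distribution.
# The strata boundaries on the x axis are norm.ppf([0.2, 0.4, 0.6, 0.8]).
samples = stratified_samples(5, norm.ppf, np.random.default_rng(0))
```

Applying the CDF to the result recovers the uniform draws, so each sample is guaranteed to land in its own stratum.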

### Moving to the next step

As I mentioned above, this procedure is done for all *M* dimensions. In LHS a single point has to be sampled from each segment in each dimension, so the number of trials equals *L*, the number of strata, which is also the number of points sampled for a given variable/dimension. The next step is to randomly combine these values of the independent variables so that the function of interest can be evaluated. To do that, let’s draw *L* random numbers from the uniform distribution for each dimension (in total, we need *L × M* random numbers). The sequence of these random numbers defines the ordering of the sampled values as follows: within a given dimension, the value sampled from the first segment takes the rank of the highest random number in that dimension’s sequence, the value sampled from the second segment takes the rank of the next highest random number, and so on. This ranking is simply a way of generating a random permutation of the *L* sampled values in each dimension. This way we end up with *L* combinations of values of the *M* variables; in other words, we get the coordinates of *L* points in the *M*-dimensional hypercube. These are the points at which the function of interest is evaluated.
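The whole construction can be sketched as follows. This is a sketch under the assumptions of the text: each marginal is taken to be uniform on [0, 1] (apply *F⁻¹* per dimension for other distributions), and the rank-based shuffling is implemented with `argsort` over *L* uniform draws per dimension, which is equivalent to the ranking described above. The function name `latin_hypercube` is my own.

```python
import numpy as np


def latin_hypercube(L, M, rng=None):
    """Latin hypercube sample: L points in the M-dimensional unit cube,
    with exactly one point in each of the L strata of every dimension."""
    rng = np.random.default_rng() if rng is None else rng
    # One value per stratum in each dimension, listed in stratum order
    u = (np.arange(L)[:, None] + rng.random((L, M))) / L
    # Ranking L uniform random numbers yields a random permutation;
    # shuffle the strata independently for each dimension
    for j in range(M):
        u[:, j] = u[rng.random(L).argsort(), j]
    return u


# L = 5 trials for a function of M = 2 variables
points = latin_hypercube(5, 2, np.random.default_rng(1))
```

Each row of `points` is one combination of the *M* variables, i.e., one evaluation point of the function of interest.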

### Using visual thinking

It’s instructive to think visually about how the sampled points are distributed in the case of LHS. Let’s consider a two-dimensional case. Our function of interest has two arguments, *g(x1, x2)*, so both *x1* and *x2* are stratified into, say, 3 segments. Assume that *x1* is the *x* axis and *x2* is the *y* axis. Since we divided both dimensions into three equiprobable intervals, we get a *3 × 3* grid on the 2-D space (see figures above). In LHS a single value is sampled from each segment of a given variable during the whole sampling process (that is, each value of the coordinate *xi* is unique), and the chance of generating equal random numbers (remember, we need *L* random numbers from the uniform distribution to define the ordering of the sampled points) is infinitesimally small. This means that if we first sample, for example, the point in the upper-left corner of the square on the left, then no further points will be sampled from the cells in the same row or column. In the case of the standard stratified sampling, by contrast, there is no such constraint (square on the right).