Random hyperplanes

In the previous post, while explaining LSH, I needed to generate random vectors in a d-dimensional space, with the requirement that every direction be equally likely.

To achieve this, the references drew the weight along each of the d dimensions of the random vector \vec{r} from a normal distribution.
At first I didn’t understand why they didn’t just draw each weight uniformly from [0, 1]. Intuitively, one would assume that this procedure also generates vectors evenly distributed across all directions.
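The normal-distribution recipe described above can be sketched in a few lines of NumPy. This is only an illustration under my own assumptions: the helper name `random_direction`, the dimension 5 and the seed are made up for the example, and the normalisation step is optional when only the sign of a dot product with \vec{r} is needed.

```python
import numpy as np

def random_direction(d, rng):
    """Return a unit vector in R^d built from independent normal weights.

    Hypothetical helper for illustration; not taken from the references.
    """
    r = rng.standard_normal(d)       # one independent N(0, 1) weight per dimension
    return r / np.linalg.norm(r)     # rescale to unit length, direction unchanged

rng = np.random.default_rng(0)       # arbitrary seed, only for reproducibility
r = random_direction(5, rng)
print(r, np.linalg.norm(r))
```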

I decided to understand why the normal (Gaussian) distribution was required. For simplicity, I considered a 2-dimensional space.
I drew a large number of random pairs (X, Y) such that X and Y were uniformly distributed over [-1, 1]. For each pair I computed the angle of the vector (X, Y) and plotted the distribution of all the angles. I then did the same thing with X and Y drawn from a normal distribution. This is what I got:

Distribution of the angle. Left: X and Y drawn from a uniform distribution. Right: X and Y drawn from a normal distribution. Continuous lines are Seaborn (Python) density fits.

On the left, the vectors are biased towards 45º, so uniform weights do not make every direction equally likely and can’t be used.
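The experiment can be reproduced roughly as follows. This is a sketch, not the original script: it assumes the angle is computed as arctan(Y/X), which lies between -90º and 90º, and it uses plain NumPy histograms instead of Seaborn plots. The sample size, bin count and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed, only for reproducibility
n = 200_000                          # number of random (X, Y) pairs

def angle_histogram(x, y, bins=36):
    """Histogram density (per degree) of the angle arctan(y / x)."""
    theta = np.degrees(np.arctan(y / x))   # assumption: angle in (-90, 90) degrees
    density, edges = np.histogram(theta, bins=bins, range=(-90, 90), density=True)
    return density, edges

uniform_density, edges = angle_histogram(rng.uniform(-1, 1, n), rng.uniform(-1, 1, n))
normal_density, _ = angle_histogram(rng.standard_normal(n), rng.standard_normal(n))

flat = 1 / 180                       # density if every angle were equally likely
print("uniform weights, min and max relative to flat:",
      uniform_density.min() / flat, uniform_density.max() / flat)
print("normal weights,  min and max relative to flat:",
      normal_density.min() / flat, normal_density.max() / flat)
```

With normal weights the ratios should stay close to 1, while with uniform weights the bins next to ±45º are clearly over-represented.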
After a quick search, I found that someone else had asked a similar question.

Out of curiosity, I tried to derive this distribution for the left-hand case, where:

f_{X}(x) = \begin{cases} \frac{1}{2} & \text{if } -1 \leq x \leq 1 \\ 0 & \text{otherwise} \end{cases} \qquad \textrm{and} \qquad f_{Y}(y) = \begin{cases} \frac{1}{2} & \text{if } -1 \leq y \leq 1 \\ 0 & \text{otherwise} \end{cases}

The angle is a function of the two random variables X and Y:

\Theta = \arctan\left(\frac{Y}{X}\right)

Following the method of transformations [1], the PDF of \Theta can be computed:

\begin{cases} \Theta = \arctan\left(\frac{Y}{X}\right) \\ W = X \end{cases} \Leftrightarrow \begin{cases} Y = W\tan(\Theta) = h_1(\Theta, W) \\ X = W = h_2(\Theta, W)\end{cases}

Where W = X is defined so that the method can be applied.
The method of transformations states:

f_{\Theta W} = f_{XY}(h_1(\theta, w), h_2(\theta, w))|J|

f_{\Theta W} is the joint PDF of the random variables \Theta and W.
f_{XY} is the joint PDF of the random variables X and Y.
|J| = \left| \det \left[\begin{smallmatrix}\frac{\partial h_1}{\partial \theta}&\frac{\partial h_1}{\partial w}\\\frac{\partial h_2}{\partial \theta}&\frac{\partial h_2}{\partial w}\end{smallmatrix}\right] \right| is the absolute value of the Jacobian determinant.

In this case |J| = |w|\sec^2(\theta).
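
As a quick check of this value, the four partial derivatives of h_1(\theta, w) = w\tan\theta and h_2(\theta, w) = w can be written out explicitly:

\frac{\partial h_1}{\partial \theta} = w\sec^2\theta, \quad \frac{\partial h_1}{\partial w} = \tan\theta, \quad \frac{\partial h_2}{\partial \theta} = 0, \quad \frac{\partial h_2}{\partial w} = 1 \quad \Rightarrow \quad \det = w\sec^2\theta \cdot 1 - \tan\theta \cdot 0 = w\sec^2\theta

Taking the absolute value gives |w|\sec^2(\theta), because \sec^2(\theta) is never negative.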
Since X and Y are independent we also have:

f_{XY}(x, y) = f_{X}(x) f_Y(y)

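Strictly, x = h_2 = w and y = h_1 = w\tan\theta, but X and Y are identically distributed, so swapping the two arguments leaves the product unchanged and we get:

f_{\Theta W}(\theta, w) = f_{X}(w \tan\theta) f_{Y}(w)|w|\sec^2(\theta)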

But what we want is the marginal PDF [2] of \Theta:

f_{\Theta} = \int_{-\infty}^{\infty}f_{\Theta W} dw

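Since f_{Y}(w) = \frac{1}{2} for -1 \leq w \leq 1 and 0 otherwise, the integral reduces to the range [-1, 1]:

f_{\Theta} = \frac{1}{2}\sec^2(\theta)\int_{-1}^{1}f_{X} (w\tan\theta) \sqrt{w^2}dw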

Using the substitution u = w\tan\theta (so that dw = du / \tan\theta and, for \tan\theta > 0, |w| = |u| / \tan\theta) we get:

f_{\Theta} = \frac{\sec^2(\theta)}{2\tan^2\theta}\int_{-\tan\theta}^{\tan\theta}f_{X} (u)\sqrt{u^2}du

Since X is uniformly distributed between -1 and 1, f_{X}(u) vanishes for |u| > 1, which gives two cases:

0 \leq \tan\theta \leq 1 or 0 \leq \theta \leq 45^{\circ}:

f_{\Theta} = \frac{\sec^2(\theta)}{2\tan^2\theta}\int_{-\tan\theta}^{\tan\theta}f_{X} (u)\sqrt{u^2}du = \frac{1}{4\cos^2\theta}

\tan\theta > 1 or 45^{\circ} < \theta < 90^{\circ}:

f_{\Theta} = \frac{\sec^2(\theta)}{2\tan^2\theta}\int_{-1}^{1}\frac{1}{2}\sqrt{u^2}du = \frac{1}{4\sin^2\theta}
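
For reference, the two integrals can be evaluated explicitly. Writing t = \tan\theta (a shorthand introduced only for this step):

\int_{-t}^{t}\frac{1}{2}\sqrt{u^2}\,du = \frac{t^2}{2} \quad (0 \leq t \leq 1) \qquad \textrm{and} \qquad \int_{-1}^{1}\frac{1}{2}\sqrt{u^2}\,du = \frac{1}{2} \quad (t > 1)

Multiplying by the prefactor \frac{\sec^2(\theta)}{2\tan^2\theta} gives \frac{\sec^2(\theta)}{4} = \frac{1}{4\cos^2\theta} in the first case and \frac{\sec^2(\theta)}{4\tan^2\theta} = \frac{1}{4\sin^2\theta} in the second.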

Density distribution of the angle of a vector in 2d. Computational results when X and Y are drawn from a uniform distribution. Green line represents the analytical derivation.
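
A comparison of this kind can be sketched numerically as below. This is not the original plotting code. It assumes the angle is arctan(Y/X) in radians, extends the two cases to negative angles by symmetry, and uses the fact that both cases can be written as 1 / (4 max(cos²θ, sin²θ)). The function name `analytic_density`, the sample size, the bin count and the seed are my own choices.

```python
import numpy as np

def analytic_density(theta):
    """Derived density of Theta = arctan(Y/X) for X, Y uniform on [-1, 1].

    For |theta| <= 45 degrees cos^2 is the larger term, giving 1/(4 cos^2);
    beyond 45 degrees sin^2 is larger, giving 1/(4 sin^2).
    """
    return 1.0 / (4.0 * np.maximum(np.cos(theta) ** 2, np.sin(theta) ** 2))

rng = np.random.default_rng(1)       # arbitrary seed, only for reproducibility
n = 500_000
x = rng.uniform(-1, 1, n)
y = rng.uniform(-1, 1, n)
theta = np.arctan(y / x)             # angle in radians, in (-pi/2, pi/2)

density, edges = np.histogram(theta, bins=90,
                              range=(-np.pi / 2, np.pi / 2), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
bin_width = edges[1] - edges[0]

# Largest gap between the simulated histogram and the analytical curve.
max_abs_error = np.max(np.abs(density - analytic_density(centers)))
# The analytical density should integrate to roughly 1 (midpoint rule).
total_probability = np.sum(analytic_density(centers)) * bin_width
print(max_abs_error, total_probability)
```

A small maximum error, together with a total probability close to 1, suggests that the simulation and the derivation agree.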


1. Functions of Two Continuous Random Variables
2. Joint Probability Density Function
