In the previous post, while explaining LSH, I needed to generate random vectors in a d-dimensional space. The requirement was that every direction be equally likely.
To achieve this, the references used a normal distribution to obtain the weights along each of the d dimensions of the random vector.
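As a rough illustration of that recipe, here is a minimal NumPy sketch. The function name `random_directions` and the final normalisation step are my own assumptions for the example (the references only describe drawing the weights from a normal distribution); normalising just makes it explicit that only the direction matters.

```python
import numpy as np

def random_directions(n, d, rng=None):
    """Return n random unit vectors in d dimensions, as an array of shape (n, d)."""
    rng = np.random.default_rng() if rng is None else rng
    # One independent standard-normal weight per dimension.
    vectors = rng.standard_normal((n, d))
    # Assumed extra step: rescale to unit length so only the direction remains.
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

directions = random_directions(5, 3, np.random.default_rng(0))
print(directions)
```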
At first I didn’t understand why they didn’t just draw a uniform random number from [-1, 1] for the weights. Intuitively, one would assume that this procedure generates vectors evenly distributed across all directions.
I decided to find out why the Gaussian was required. For simplicity I considered a 2-d space.
I drew a large number of random pairs (X, Y) such that X and Y were evenly distributed across [-1, 1]. For each pair I computed the angle and plotted the distribution of all angles. I did the same thing with a normal distribution. This is what I got:
On the left (uniform weights) the vectors are biased towards 45º, so this approach can’t be used. The normal weights on the right show no such bias.
After a quick search, I found someone who had asked a similar question:
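The experiment can be sketched along these lines. This is not the original script: the sample size, the seed, the number of bins and the choice of `arctan(y / x)` for the angle (which lives in (-π/2, π/2)) are all assumptions made for the example. It compares each histogram with the flat density 1/π that equally likely directions would produce.

```python
import numpy as np

def angle_density(x, y, bins=18):
    """Normalised histogram of arctan(y / x) over (-pi/2, pi/2)."""
    theta = np.arctan(y / x)
    density, _ = np.histogram(
        theta, bins=bins, range=(-np.pi / 2, np.pi / 2), density=True
    )
    return density

rng = np.random.default_rng(0)
n = 200_000

# Weights drawn uniformly from [-1, 1] versus from a standard normal.
uniform_density = angle_density(rng.uniform(-1, 1, n), rng.uniform(-1, 1, n))
normal_density = angle_density(rng.standard_normal(n), rng.standard_normal(n))

# Equally likely directions would give a flat density of 1/pi.
flat = 1 / np.pi
uniform_dev = np.abs(uniform_density - flat).max()
normal_dev = np.abs(normal_density - flat).max()
print("uniform weights, max deviation from flat:", uniform_dev)
print("normal weights,  max deviation from flat:", normal_dev)
```

The deviation for the uniform weights should come out much larger than for the normal weights, with the excess concentrated in the bins around ±45º.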
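Out of curiosity, I tried to prove this distribution for the left case, where:

$$X \sim U(-1, 1), \qquad Y \sim U(-1, 1)$$

The angle is a function of the two random variables X and Y:

$$\Theta = \arctan\left(\frac{Y}{X}\right)$$

Following the method of transformations [1], the p.d.f. of $\Theta$ can be computed:

$$\Theta = \arctan\left(\frac{Y}{X}\right), \qquad W = X$$

where $W$ is an auxiliary variable defined so that the method can be applied.
The method of transformations states:

$$f_{\Theta W}(\theta, w) = f_{XY}\big(x(\theta, w),\, y(\theta, w)\big)\,\lvert J \rvert$$

$f_{\Theta W}$ is the joint PDF of the random variables $\Theta$ and $W$.
$f_{XY}$ is the joint PDF of the random variables X and Y.
$J$ is the Jacobian of the inverse transformation $(\theta, w) \mapsto (x, y)$.
In this case $x = w$ and $y = w \tan\theta$, so that $\lvert J \rvert = \lvert w \rvert \sec^2\theta$.
Since X and Y are independent we also have:

$$f_{XY}(x, y) = f_X(x)\, f_Y(y)$$

So that we get:

$$f_{\Theta W}(\theta, w) = f_X(w)\, f_Y(w \tan\theta)\, \lvert w \rvert \sec^2\theta$$

But what we want is the marginal PDF [2] of $\Theta$:

$$f_\Theta(\theta) = \int_{-\infty}^{\infty} f_{\Theta W}(\theta, w)\, dw$$

Using the substitution variable $w = x$ we get:

$$f_\Theta(\theta) = \sec^2\theta \int_{-\infty}^{\infty} \lvert x \rvert\, f_X(x)\, f_Y(x \tan\theta)\, dx$$

Since X and Y are uniformly distributed between -1 and 1, the product $f_X(x)\, f_Y(x \tan\theta)$ equals $1/4$ when $\lvert x \rvert \le \min(1, 1/\lvert \tan\theta \rvert)$ and zero otherwise, so we get the two cases:

$$f_\Theta(\theta) = \begin{cases} \dfrac{1}{4\cos^2\theta}, & \lvert\theta\rvert \le \dfrac{\pi}{4} \\[2ex] \dfrac{1}{4\sin^2\theta}, & \dfrac{\pi}{4} < \lvert\theta\rvert < \dfrac{\pi}{2} \end{cases}$$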
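As a sanity check on the two cases, here is a small sketch that compares a closed-form density with a simulation. It assumes the angle is defined as arctan(Y/X), in which case working the cases through gives 1/(4cos²θ) for |θ| ≤ π/4 and 1/(4sin²θ) for π/4 < |θ| < π/2. The function name `angle_pdf`, the sample size, the seed and the bin count are choices made for this example only.

```python
import numpy as np

def angle_pdf(theta):
    """Density of arctan(Y / X) for independent X, Y ~ U(-1, 1)."""
    theta = np.asarray(theta, dtype=float)
    inner = np.abs(theta) <= np.pi / 4
    # np.where evaluates both branches, so silence the division by sin(0) = 0.
    with np.errstate(divide="ignore"):
        return np.where(
            inner, 0.25 / np.cos(theta) ** 2, 0.25 / np.sin(theta) ** 2
        )

rng = np.random.default_rng(0)
n = 1_000_000
theta = np.arctan(rng.uniform(-1, 1, n) / rng.uniform(-1, 1, n))

# 60 bins of 3 degrees each, so that +-45 degrees falls exactly on a bin edge.
density, edges = np.histogram(
    theta, bins=60, range=(-np.pi / 2, np.pi / 2), density=True
)
centres = (edges[:-1] + edges[1:]) / 2

max_error = np.abs(density - angle_pdf(centres)).max()
print("largest gap between histogram and formula:", max_error)
```

If the formula is right, the gap should be small compared with the density itself, which ranges from 1/4 at θ = 0 to 1/2 at |θ| = π/4.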