Central Limit Theorem and Convergence

Andresa de Andrade Data Science, Inference May 20, 2015

This is my first post (yaaay \o/).
Sometimes it’s hard to understand what the CLT and convergence really mean. I’ve noticed that many people, when they first hear about these concepts, don’t actually know what they are. And they stay that way until they learn (if they learn) more advanced theory.

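One of the first things that we learn in school is the Central Limit Theorem. Basically, for any distribution with finite mean $\mu$ and variance $\sigma^2$ (symmetric or not), when you have a large number $n$ of independent observations, the distribution of their mean gets close to a normal with mean $\mu$ and variance $\sigma^2/n$.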

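Some distributions get there faster than others, depending on how skewed or heavy-tailed they are. Common examples to try this on are the exponential, Student’s t (with more than 2 degrees of freedom, so the variance is finite), and binomial distributions.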

But that’s all old news. Here I’m going to show an example of convergence in practice.

Exponential Distribution

The CLT says that if you have $X_1, X_2, ..., X_n$ independent random variables, identically distributed as $exp(\lambda)$, then when $n$ is large enough the distribution of their mean is approximately normal with mean $1/\lambda$ and variance $1/(n\lambda^2)$.
Now, to illustrate this, we can generate $n$ = 40 observations from an exponential distribution with $\lambda$ = 0.2 and take their mean. Then we replicate this experiment k times.
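
Just to have concrete numbers to compare against, here is the arithmetic for this setup. It only assumes the standard facts that an exponential with rate $\lambda$ has mean $1/\lambda$ and variance $1/\lambda^2$, and it writes $n$ = 40 for the number of observations in each sample and $\bar{X}$ for their mean:

$$
E[\bar{X}] = \frac{1}{\lambda} = \frac{1}{0.2} = 5, \qquad
Var(\bar{X}) = \frac{1}{n\lambda^2} = \frac{1}{40 \times 0.04} = 0.625, \qquad
sd(\bar{X}) = \sqrt{0.625} \approx 0.79
$$

So if the normal approximation is doing its job, the simulated means should pile up around 5, with most of them within about 1.6 (two standard deviations) of it.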

The R code for that is:

    k = 10        # number of replications
    n = 40        # size of each vector
    lambda = 0.2
    list_of_exponential = array(1:k)  # variable to keep all the calculated means

    # loop to run the replication
    for (i in 1:k) {
      list_of_exponential[i] = mean(rexp(n, lambda))
    }
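
For what it’s worth, the same loop can be sketched in one line with base R’s `replicate()`, and it is handy to keep the theoretical values next to the simulated ones. This is only a sketch: the names `means`, `theoretical_mean` and `theoretical_sd` and the seed are my own choices for illustration, not part of the code above.

```r
set.seed(2015)  # assumed seed, only so the run is reproducible

k = 10        # number of replications
n = 40        # size of each sample
lambda = 0.2

# replicate() evaluates the expression k times and collects the k results
means = replicate(k, mean(rexp(n, lambda)))

# Values the CLT predicts for the mean of n exponential draws
theoretical_mean = 1 / lambda              # 1 / 0.2 = 5
theoretical_sd = 1 / (lambda * sqrt(n))    # 5 / sqrt(40), about 0.79

# Put the simulated and theoretical values side by side
c(simulated = mean(means), theoretical = theoretical_mean)
c(simulated = sd(means), theoretical = theoretical_sd)
```

With only 10 replications the simulated values can land noticeably away from the theoretical ones, and they will change with a different seed.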

As you can see, the k here is very small (10), so what does the distribution of the mean look like for these iid variables?