Are there any examples where the central limit theorem does not hold?



Wikipedia says:

In probability theory, the central limit theorem (CLT) establishes that, in most situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (informally a "bell curve") even if the original variables themselves are not normally distributed...

"Çoğu durumda" deyince, hangi durumlarda merkezi limit teoremi işe yaramaz?

Answers:



To understand this, you first need to state a version of the Central Limit Theorem. Here is the "typical" statement of the central limit theorem:

Lindeberg–Lévy CLT. Suppose $X_1, X_2, \dots$ is a sequence of i.i.d. random variables with $E[X_i] = \mu$ and $\mathrm{Var}[X_i] = \sigma^2 < \infty$. Let $S_n := \frac{X_1 + \dots + X_n}{n}$. Then as $n$ approaches infinity, the random variables $\sqrt{n}(S_n - \mu)$ converge in distribution to a normal $N(0, \sigma^2)$:

$$\sqrt{n}\left(\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) - \mu\right) \ \xrightarrow{d}\ N(0, \sigma^2).$$
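This convergence is easy to check by simulation. The sketch below (Python/NumPy; the sample size, replication count, and choice of Exponential(1) draws are all illustrative, not part of the theorem) scales the sample means of a decidedly non-normal distribution and confirms they behave like $N(0, \sigma^2)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 1_000, 5_000  # sample size and number of replications (arbitrary)

# Exponential(1) draws: mean mu = 1, variance sigma^2 = 1, decidedly non-normal
x = rng.exponential(scale=1.0, size=(reps, n))

# sqrt(n) * (sample mean - mu) for each replication
z = np.sqrt(n) * (x.mean(axis=1) - 1.0)

# By Lindeberg-Levy, z should be approximately N(0, sigma^2) = N(0, 1)
print(z.mean(), z.var())
```

The printed mean and variance come out close to 0 and 1, as the theorem predicts.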

So, how does this differ from the informal description, and what are the gaps? There are several differences between your informal description and this statement, some of which are discussed in the other answers but not fully explained. We can turn this into three specific questions:

  • What happens if the variables are not identically distributed?
  • What happens if the variables have infinite variance or infinite mean?
  • How important is independence?

Taking these one at a time:

Not identically distributed: The best general results are the Lindeberg and Lyapunov versions of the central limit theorem. Basically, as long as the standard deviations don't grow too wildly, you can still get a good central limit theorem.

Lyapunov CLT. Suppose $X_1, X_2, \dots$ is a sequence of independent random variables, each with finite expected value $\mu_i$ and variance $\sigma_i^2$. Define $s_n^2 = \sum_{i=1}^{n} \sigma_i^2$.

If for some $\delta > 0$ Lyapunov's condition $\lim_{n\to\infty} \frac{1}{s_n^{2+\delta}} \sum_{i=1}^{n} E\left[|X_i - \mu_i|^{2+\delta}\right] = 0$ is satisfied, then the sum of $(X_i - \mu_i)/s_n$ converges in distribution to a standard normal random variable as $n$ goes to infinity:

$$\frac{1}{s_n}\sum_{i=1}^{n}(X_i - \mu_i) \ \xrightarrow{d}\ N(0, 1).$$
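As an illustrative sketch (Python/NumPy; the particular family of bounded, varying half-widths `a` is a hypothetical choice of mine under which Lyapunov's condition clearly holds), independent but non-identical variables standardized by $s_n$ still come out approximately standard normal:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 1_000, 5_000  # arbitrary simulation sizes

# Independent but NOT identically distributed: X_i ~ Uniform(-a_i, a_i),
# with half-widths a_i that vary but stay bounded (a hypothetical choice
# under which Lyapunov's condition holds)
a = 1.0 + 0.5 * np.sin(np.arange(1, n + 1))
s_n = np.sqrt((a**2 / 3.0).sum())  # Var[Uniform(-a, a)] = a^2 / 3

x = rng.uniform(-a, a, size=(reps, n))  # each column has its own distribution
z = x.sum(axis=1) / s_n                 # all mu_i = 0 here

# z should be approximately standard normal
print(z.mean(), z.var())
```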

Infinite variance: Theorems similar to the central limit theorem exist for variables with infinite variance, but the conditions are significantly narrower than for the usual central limit theorem. Essentially, the tail of the probability distribution must be asymptotic to $|x|^{-\alpha-1}$ for $0 < \alpha < 2$. In this case, appropriately scaled sums converge to a Lévy alpha-stable distribution.

Importance of independence: There are many different central limit theorems for non-independent sequences of $X_i$. They are all highly contextual. As Batman points out, there is one for martingales. This question is an ongoing area of research, with many different variations depending on the specific context of interest. This question on Math StackExchange is another post related to this question.


I have removed a stray ">" from a formula that I think has crept in because of the quoting system - feel free to reverse my edit if it was intentional!
Silverfish

A triangular array CLT is probably a more representative CLT than the one stated. As for non-independence, martingale CLTs are a reasonably commonly used case.
Batman

@Batman, what's an example of a triangular array CLT? Feel free to edit my response, to add it. I'm not familiar with that one.
John


"as long as the standard deviations don't grow too wildly" Or shrink (eg: σi2=σi12/2)
leonbloy


Although I'm pretty sure that it has been answered before, here's another one:

There are several versions of the central limit theorem. The most general says that, given arbitrary probability density functions, the sum of the variables will be approximately normally distributed, with a mean equal to the sum of the individual means and a variance equal to the sum of the individual variances.

A very important and relevant constraint is that the mean and the variance of the given pdfs have to exist and must be finite.

So, just take any pdf without a mean value or variance, and the central limit theorem no longer holds. Take a Lorentzian (Cauchy) distribution, for example.
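For instance (an illustrative Python/NumPy sketch; the replication counts and sample sizes are arbitrary choices of mine), the spread of the sample mean of Lorentzian draws does not shrink at all as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(3)
reps = 4_000  # replications per sample size (arbitrary)

# Interquartile range of the sample mean of standard Cauchy (Lorentzian)
# draws: that mean is itself standard Cauchy (quartiles at -1 and +1),
# so the IQR stays near 2 instead of shrinking like 1/sqrt(n).
for n in (100, 2_000):
    means = rng.standard_cauchy(size=(reps, n)).mean(axis=1)
    q25, q75 = np.percentile(means, [25, 75])
    print(n, q75 - q25)
```

Both printed IQRs hover around 2, whereas for a finite-variance distribution the second would be roughly $\sqrt{20}$ times smaller than the first.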


+1 Or take a distribution with an infinite variance, like the distribution of a random walk.
Alexis

@Alexis - assuming you are looking at a random walk at a finite point in time, I would have thought it would have a finite variance, being the sum of n i.i.d. steps each with finite variance
Henry

@Henry: Nope, I am not assuming a point in time, but the variance of the distribution of all possible random walks of infinite length.
Alexis

@Alexis If each step $X_i$ of the random walk is $+1$ or $-1$ i.i.d. with equal probability and the positions are $Y_n = \sum_{i=1}^{n} X_i$, then the Central Limit Theorem correctly implies that as $n \to \infty$, the distribution of $\sqrt{n}\left(\frac{1}{n}Y_n\right) = Y_n/\sqrt{n}$ converges in distribution to $N(0,1)$
Henry

@Alexis Doesn't matter for the CLT, because each individual distribution still has a finite variance.
Cubic


No, CLT always holds when its assumptions hold. Qualifications such as "in most situations" are informal references to the conditions under which CLT should be applied.

For instance, a linear combination of independent variables from a Cauchy distribution will not add up to a normally distributed variable. One of the reasons is that the variance is undefined for the Cauchy distribution, while the CLT puts certain conditions on the variance, e.g. that it has to be finite. An interesting implication is that since Monte Carlo simulation is motivated by the CLT, you have to be careful with Monte Carlo simulations when dealing with fat-tailed distributions, such as the Cauchy.

Note that there is a generalized version of the CLT. It works for infinite or undefined variances, as with the Cauchy distribution. Unlike for many well-behaved distributions, the properly normalized sum of Cauchy numbers remains Cauchy; it does not converge to a Gaussian.
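This can be checked numerically. The hedged sketch below (Python/NumPy; sizes are arbitrary) uses the fact that the sample mean of standard Cauchy draws is again standard Cauchy, so $P(|\bar X| \le 1)$ stays at exactly $1/2$ for every $n$, whereas under a CLT that probability would tend to 1:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 1_000, 10_000  # arbitrary sizes

# The sample mean of n standard Cauchy draws is again standard Cauchy,
# so P(|mean| <= 1) stays at 1/2 regardless of n; if a CLT applied,
# this probability would approach 1 as n grows.
means = rng.standard_cauchy(size=(reps, n)).mean(axis=1)
frac = np.mean(np.abs(means) <= 1.0)
print(frac)
```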

By the way, not only the Gaussian but many other distributions have bell-shaped PDFs, e.g. Student's t. That's why the description you quoted is quite liberal and imprecise, perhaps on purpose.



Here is an illustration of cherub's answer: a histogram of 1e5 draws of the scaled (by $\sqrt{n}$) sample mean of a t-distribution with two degrees of freedom, whose variance does not exist.

If the CLT did apply, the histogram for $n$ as large as $n=1000$ should resemble the density of a standard normal distribution (which, e.g., has density $1/\sqrt{2\pi} \approx 0.4$ at its peak), which it evidently does not.

[Histogram of samples.from.t: far more dispersed than a standard normal density]

library(MASS)  # for truehist()
n <- 1000
# 1e5 replications of sqrt(n) * (sample mean of n draws from t with 2 df);
# the t(2) distribution has infinite variance, so the CLT does not apply
samples.from.t <- replicate(1e5, sqrt(n) * mean(rt(n, df = 2)))
truehist(samples.from.t, xlim = c(-10, 10), col = "salmon")

You have to be slightly careful here: if you did this with a t-distribution with, say, 3 degrees of freedom, then the Central Limit Theorem would apply, but your graph would not have a peak density around 0.4 but instead around $1/\sqrt{6\pi} \approx 0.23$, because the original variance would be 3, not 1
Henry

That is a good point; one might standardize the mean by sd(x) to get something which, if the CLT works, converges by Slutsky's theorem to a N(0,1) variate. I wanted to keep the example simple, but you are of course right.
Christoph Hanck


A simple case where the CLT cannot hold, for very practical reasons, is when the sequence of random variables approaches its probability limit strictly from one side. This is encountered, for example, in estimators of quantities that lie on a boundary.

The standard example here is perhaps the estimation of $\theta$ from a sample of i.i.d. Uniform $U(0,\theta)$ variables. The maximum likelihood estimator is the maximum order statistic, and it necessarily approaches $\theta$ only from below: naively, since its probability limit is $\theta$, the estimator cannot have a distribution "around" $\theta$, and the CLT is gone.

The estimator, properly scaled, does have a limiting distribution, but not of the "CLT variety".
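A small simulation sketch of this limiting behaviour (Python/NumPy; the values of $\theta$, $n$, and the replication count are illustrative choices of mine): for $U(0,\theta)$, the quantity $n(\theta - \hat\theta_{\text{MLE}})$ is always nonnegative and has an exponential-looking, one-sided limit with mean $\theta$, rather than a normal one:

```python
import numpy as np

rng = np.random.default_rng(4)
theta, n, reps = 2.0, 1_000, 10_000  # all values illustrative

# MLE of theta for i.i.d. Uniform(0, theta) is the sample maximum; it can
# only approach theta from below, so no normal limit "around" theta exists.
mle = rng.uniform(0.0, theta, size=(reps, n)).max(axis=1)

# The proper scaling is n * (theta - mle); its limit is Exponential with
# mean theta -- strictly one-sided, not normal.
scaled = n * (theta - mle)
print(scaled.mean(), scaled.min())
```

The printed minimum is nonnegative, confirming the one-sided approach, and the mean is close to $\theta$.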



You can find a quick solution here.

Exceptions to the central-limit theorem arise

  1. When there are multiple maxima of the same height, and
  2. Where the second derivative vanishes at the maximum.

There are certain other exceptions which are outlined in the answer of @cherub.


The same question has already been asked on math.stackexchange. You can check the answers there.


By "maxima", do you mean modes? Being bimodal has nothing to do with failing to satisfy CLT.
Acccumulation

@Acccumulation: The wording here is confusing because it actually refers to the PGF of a discrete r.v., $M(z) = \sum_{n=-\infty}^{\infty} P(X=n) z^n$
Alex R.

@AlexR. The answer doesn't make sense at all without reading through the link, and is far from clear even with the link. I'm leaning towards downvoting as being even worse than a link-only answer.
Acccumulation
Licensed under cc by-sa 3.0 with attribution required.