Sampling distribution from two independent Bernoulli populations



Suppose we have samples from two independent Bernoulli random variables, $\mathrm{Ber}(\theta_1)$ and $\mathrm{Ber}(\theta_2)$.

How can we prove that
$$\frac{(\bar X_1 - \bar X_2) - (\theta_1 - \theta_2)}{\sqrt{\dfrac{\theta_1(1-\theta_1)}{n_1} + \dfrac{\theta_2(1-\theta_2)}{n_2}}} \;\xrightarrow{d}\; N(0,1)\,?$$

Assume $n_1 \neq n_2$.
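For intuition, here is a minimal simulation sketch, assuming NumPy is available; the parameters $\theta_1, \theta_2$ and the sample sizes below are arbitrary illustrative choices, not taken from the question. If the claim holds, the standardized difference should look approximately standard normal:

```python
# Monte Carlo check of the claimed limit; all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
theta1, theta2 = 0.3, 0.7          # assumed Bernoulli parameters
n1, n2, reps = 400, 650, 100_000   # deliberately unequal sample sizes

# Sample means of n1 draws from Ber(theta1) and n2 draws from Ber(theta2).
x1_bar = rng.binomial(n1, theta1, size=reps) / n1
x2_bar = rng.binomial(n2, theta2, size=reps) / n2

se = np.sqrt(theta1 * (1 - theta1) / n1 + theta2 * (1 - theta2) / n2)
stat = ((x1_bar - x2_bar) - (theta1 - theta2)) / se

# Should print roughly 0 and 1 if the standardized difference is ~ N(0, 1).
print(stat.mean(), stat.std())
```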


$Z_i = X_{1i} - X_{2i}$ is a sequence of iid rv's with finite mean and variance. Hence it satisfies the Lévy–Lindeberg central limit theorem, from which your result follows. Or are you asking for a proof of the CLT itself?
Three Diag

@ThreeDiag How do you apply the LL version of the CLT? I don't think that's right. Write it up as an answer so I can check the details.
An old man in the sea.

All the details are already there. To apply LL you need a sequence of iid rv's with finite mean and variance. The variable $Z_i = X_{1i} - X_{2i}$ meets all three requirements: independence follows from the independence of the two original Bernoulli variables, and by applying the standard properties of $E$ and $V$ we see that $E(Z_i)$ and $V(Z_i)$ are finite.
Three Diag

"iki bağımsız Bernoulli rasgele değişken örnekleri" - yanlış ifade. Olması gereken: "Bernoulli dağılımından iki bağımsız örnek".
Viktor

Please add "as $n_1, n_2 \to \infty$".
Viktor

Answers:



Put $a = \sqrt{\dfrac{\theta_1(1-\theta_1)}{n_1}}$, $b = \sqrt{\dfrac{\theta_2(1-\theta_2)}{n_2}}$, $A = (\bar X_1 - \theta_1)/a$, $B = (\bar X_2 - \theta_2)/b$. We have $A \xrightarrow{d} N(0,1)$, $B \xrightarrow{d} N(0,1)$. In terms of characteristic functions this means
$$\phi_A(t) \equiv E e^{itA} \to e^{-t^2/2}, \qquad \phi_B(t) \to e^{-t^2/2}.$$
We want to prove that
$$D := \frac{a}{\sqrt{a^2+b^2}}\,A - \frac{b}{\sqrt{a^2+b^2}}\,B \;\xrightarrow{d}\; N(0,1).$$

Since $A$ and $B$ are independent,
$$\phi_D(t) = \phi_A\!\left(\frac{a}{\sqrt{a^2+b^2}}\,t\right)\phi_B\!\left(-\frac{b}{\sqrt{a^2+b^2}}\,t\right) \to e^{-t^2/2},$$
as we wish it to be.

This proof is incomplete: here we need some estimates for uniform convergence of the characteristic functions. However, in the case under consideration we can do explicit calculations. Put $p = \theta_1$, $m = n_1$. Then
$$\begin{aligned}
\phi_{X_{1,1}}(t) &= 1 + p\,(e^{it}-1),\\
\phi_{\bar X_1}(t) &= \bigl(1 + p\,(e^{it/m}-1)\bigr)^m,\\
\phi_{\bar X_1 - \theta_1}(t) &= \bigl(1 + p\,(e^{it/m}-1)\bigr)^m e^{-ipt},\\
\phi_A(t) &= \bigl(1 + p\,(e^{it/\sqrt{mp(1-p)}}-1)\bigr)^m e^{-ipt\sqrt{m/(p(1-p))}}\\
&= \Bigl(\bigl(1 + p\,(e^{it/\sqrt{mp(1-p)}}-1)\bigr)\,e^{-ipt/\sqrt{mp(1-p)}}\Bigr)^m\\
&= \Bigl(1 - \frac{t^2}{2m} + O\bigl(t^3 m^{-3/2}\bigr)\Bigr)^m
\end{aligned}$$
as $t^3 m^{-3/2} \to 0$. Thus, for a fixed $t$,
$$\phi_D(t) = \left(1 - \frac{a^2 t^2}{2(a^2+b^2)\,n_1} + O\bigl(n_1^{-3/2}\bigr)\right)^{n_1} \left(1 - \frac{b^2 t^2}{2(a^2+b^2)\,n_2} + O\bigl(n_2^{-3/2}\bigr)\right)^{n_2} \to e^{-t^2/2}$$
(even if $a \to 0$ or $b \to 0$), since $\bigl|e^{-y} - (1-y/m)^m\bigr| \le y^2/(2m)$ when $|y|/m < 1/2$ (see /math/2566469/uniform-bounds-for-1-y-nn-exp-y/ ).

Note that similar calculations may be done for arbitrary (not necessarily Bernoulli) distributions with finite second moments, using the expansion of the characteristic function in terms of the first two moments.
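As a numerical sanity check of $\phi_D(t) \to e^{-t^2/2}$, one can evaluate the exact characteristic-function product above directly. A minimal sketch, assuming NumPy; the parameters and sample sizes are arbitrary illustrative choices:

```python
# Evaluate phi_D(t) from the exact Bernoulli characteristic functions and
# compare with exp(-t^2/2); parameters and sample sizes are illustrative.
import numpy as np

def phi_mean(t, p, n):
    """Characteristic function of the mean of n iid Ber(p) variables."""
    return (1 + p * (np.exp(1j * t / n) - 1)) ** n

theta1, theta2 = 0.3, 0.7
t = np.linspace(-3.0, 3.0, 13)

for n1, n2 in [(10, 15), (100, 150), (10_000, 15_000)]:
    s = np.sqrt(theta1 * (1 - theta1) / n1 + theta2 * (1 - theta2) / n2)
    # D = ((Xbar1 - Xbar2) - (theta1 - theta2)) / s; by independence its
    # characteristic function factors into the three terms below.
    phi_D = (np.exp(-1j * t * (theta1 - theta2) / s)
             * phi_mean(t / s, theta1, n1)
             * phi_mean(-t / s, theta2, n2))
    # Maximum deviation from the standard normal limit shrinks with n1, n2.
    print(n1, n2, np.max(np.abs(phi_D - np.exp(-t ** 2 / 2))))
```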


This seems correct. I'll get back to you later on, when I have time to check everything. ;)
An old man in the sea.


Proving your statement is equivalent to proving the (Lévy–Lindeberg) Central Limit Theorem, which states:

If $\{Z_i\}_{i=1}^n$ is a sequence of i.i.d. random variables with finite mean $E(Z_i) = \mu$ and finite variance $V(Z_i) = \sigma^2$, then
$$\sqrt{n}\,(\bar Z - \mu) \;\xrightarrow{d}\; N(0, \sigma^2).$$

Here $\bar Z = \sum_i Z_i / n$, that is, the sample mean.

Then it is easy to see that if we put
$$Z_i = X_{1i} - X_{2i},$$
with $X_{1i}, X_{2i}$ following $\mathrm{Ber}(\theta_1)$ and $\mathrm{Ber}(\theta_2)$ respectively, the conditions of the theorem are satisfied; in particular

$$E(Z_i) = \theta_1 - \theta_2 = \mu$$

and

$$V(Z_i) = \theta_1(1-\theta_1) + \theta_2(1-\theta_2) = \sigma^2.$$

(There's a last passage, and you have to adjust this a bit for the general case where $n_1 \neq n_2$, but I have to go now; I will finish tomorrow, or you can edit the question with the final passage as an exercise.)
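For what it's worth, in the equal-sample-size case $n_1 = n_2 = n$ the reduction is immediate, since then $\bar Z = \bar X_1 - \bar X_2$ and $\sigma/\sqrt{n}$ is exactly the denominator in the question (the $n_1 \neq n_2$ case is the adjustment mentioned above):
$$\frac{\sqrt{n}\,(\bar Z - \mu)}{\sigma} = \frac{(\bar X_1 - \bar X_2) - (\theta_1 - \theta_2)}{\sqrt{\dfrac{\theta_1(1-\theta_1)}{n} + \dfrac{\theta_2(1-\theta_2)}{n}}} \;\xrightarrow{d}\; N(0,1).$$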


I could not obtain exactly what I wanted, because of the possibility that $n_1 \neq n_2$.
An old man in the sea.

I will show it later if you can't get it. Hint: compute the variance of the sample mean of $Z$ and use that as the variable in the theorem.
Three Diag

Three, could you please add the details for when $n_1 \neq n_2$? Thanks.
An old man in the sea.

Will do as soon as I find a little time. There was in fact a subtlety that prevents using the LL CLT without adjustment. There are three ways to go, the simplest of which is invoking the fact that for large $n_1$ and $n_2$, $\bar X_1$ and $\bar X_2$ converge in distribution to normals, and a linear combination of independent normals is also normal. This is a property of normals that you can take as given; otherwise you can prove it by characteristic functions.
Three Diag
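Spelled out, that first route runs roughly as follows: for large $n_1$ and $n_2$,
$$\bar X_1 \approx N\!\left(\theta_1, \frac{\theta_1(1-\theta_1)}{n_1}\right), \qquad \bar X_2 \approx N\!\left(\theta_2, \frac{\theta_2(1-\theta_2)}{n_2}\right)$$
in distribution, so by independence
$$\bar X_1 - \bar X_2 \approx N\!\left(\theta_1 - \theta_2,\ \frac{\theta_1(1-\theta_1)}{n_1} + \frac{\theta_2(1-\theta_2)}{n_2}\right),$$
and standardizing yields exactly the statistic in the question; the characteristic-function argument in the answer above is what makes this heuristic precise.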

The other two require either a different CLT (possibly Lyapunov) or, alternatively, treating $n_1 = i$ and $n_2 = i + k$. Then for large $i$ you can essentially disregard $k$ and go back to applying LL (but it will still require some care to nail the right variance).
Three Diag