Basic Concept of Probability Distributions 3: Geometric Distribution

PMF

Suppose that independent trials, each with success probability $p$, $0 < p < 1$, are performed until a success occurs. If we let $X$ be the number of failures before the first success, then the probability mass function of the geometric distribution is $$f(x; p) = \Pr(X=x) = (1-p)^{x}p$$ for $x=0, 1, 2, \cdots$.

Proof that $f$ is a valid PMF (the probabilities sum to 1):

$$ \begin{align*} \sum_{x=0}^{\infty}f(x; p) &= \sum_{x=0}^{\infty}(1-p)^{x}p\\ &= p\sum_{x=0}^{\infty}(1-p)^{x}\\ &= p\cdot {1\over 1-(1-p)}\\ &= 1 \end{align*} $$
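As a quick numerical sanity check, the following R sketch compares the PMF formula with R's built-in `dgeom`, which uses the same number-of-failures convention, and confirms that a truncated sum is very close to 1. The value `p = 0.3` and the truncation point 200 are arbitrary choices for illustration, not taken from the text above.

```r
p <- 0.3     # illustrative success probability (an assumption)
x <- 0:200   # truncation point chosen so that the remaining tail is negligible

pmf_formula <- (1 - p)^x * p   # f(x; p) = (1 - p)^x * p
pmf_builtin <- dgeom(x, p)     # dgeom counts failures before the first success

max_diff <- max(abs(pmf_formula - pmf_builtin))  # largest disagreement
total    <- sum(pmf_formula)                     # partial sum of the PMF

print(max_diff)
print(total)
```

The partial sum equals $1 - (1-p)^{201}$, so it differs from 1 only by a negligible tail.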

Mean

The expected value is $$\mu = E[X] = {1-p\over p}$$

Proof:

Firstly, for any $q$ with $0 < q < 1$, the geometric series gives $$\sum_{x=0}^{\infty}q^x = {1\over 1-q}$$

Differentiating both sides with respect to $q$ yields

$$ \begin{align*} {d\over dq}\sum_{x=0}^{\infty}q^x &= \sum_{x=1}^{\infty}xq^{x-1}\\ &= {1\over(1-q)^2} \end{align*} $$

Setting $q = 1-p$, the expected value is

$$ \begin{align*} E[X] &= \sum_{x=0}^{\infty}x(1-p)^{x}p\\ &=p(1-p)\sum_{x=1}^{\infty}x(1-p)^{x-1}\\ &= p(1-p){1\over(1-(1-p))^2}\\ &= {1-p\over p} \end{align*} $$
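The mean formula can also be checked numerically. This is a minimal sketch that approximates $E[X]$ by a truncated sum over the PMF; `p = 0.3` and the cutoff 500 are illustrative assumptions.

```r
p <- 0.3     # illustrative success probability (an assumption)
x <- 0:500   # cutoff chosen so that the omitted tail is negligible

mean_sum     <- sum(x * dgeom(x, p))   # truncated version of sum x * f(x; p)
mean_formula <- (1 - p) / p            # closed form derived above

print(c(mean_sum, mean_formula))
```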

Variance

The variance is $$\sigma^2 = \mbox{Var}(X) = {1-p\over p^2}$$

Proof:

$$ \begin{align*} E\left[X^2\right] &=\sum_{x=0}^{\infty}x^2(1-p)^{x}p\\ &= (1-p)\sum_{x=1}^{\infty}x^2(1-p)^{x-1}p \end{align*} $$

Rewrite the right-hand summation as

$$ \begin{align*} \sum_{x=1}^{\infty} x^2(1-p)^{x-1}p&= \sum_{x=1}^{\infty} (x-1+1)^2(1-p)^{x-1}p\\ &= \sum_{x=1}^{\infty} (x-1)^2(1-p)^{x-1}p + \sum_{x=1}^{\infty} 2(x-1)(1-p)^{x-1}p + \sum_{x=1}^{\infty} (1-p)^{x-1}p\\ &= E\left[X^2\right] + 2E[X] + 1\\ &= E\left[X^2\right] + {2-p\over p} \end{align*} $$

Thus

$$E\left[X^2\right] = (1-p)E\left[X^2\right] + {(1-p)(2-p) \over p}$$

That is

$$E\left[X^2\right]= {(1-p)(2-p)\over p^2}$$

So the variance is

$$ \begin{align*} \mbox{Var}(X) &= E\left[X^2\right] - E[X]^2\\ &= {(1-p)(2-p)\over p^2} - {(1-p)^2\over p^2}\\ &= {1-p\over p^2} \end{align*} $$
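The second moment and the variance can be verified in the same way. The sketch below uses truncated sums over the PMF; `p = 0.3` and the cutoff 1000 are illustrative assumptions.

```r
p <- 0.3      # illustrative success probability (an assumption)
x <- 0:1000   # cutoff chosen so that the omitted tail is negligible
f <- dgeom(x, p)

ex      <- sum(x * f)     # E[X]
ex2     <- sum(x^2 * f)   # E[X^2]
var_sum <- ex2 - ex^2     # Var(X) = E[X^2] - E[X]^2

ex2_formula <- (1 - p) * (2 - p) / p^2   # closed form for E[X^2]
var_formula <- (1 - p) / p^2             # closed form for Var(X)

print(c(ex2, ex2_formula))
print(c(var_sum, var_formula))
```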

Examples

1. Let $X$ be geometrically distributed with probability parameter $p={1\over2}$. Determine the expected value $\mu$, the standard deviation $\sigma$, and the probability $P\left(|X-\mu| \geq 2\sigma\right)$. Compare with Chebyshev's Inequality.

Solution:

The probability mass function of the geometric distribution is $$f(x; p) = (1-p)^{x}p, \quad x=0, 1, 2, \cdots$$ The expected value is $$\mu = {1-p\over p} = 1$$ The standard deviation is $$\sigma = \sqrt{1-p\over p^2} = 1.414214$$ The probability that $X$ lies at least two standard deviations from $\mu$ is $$P\left(|X-1| \geq 2.828427\right) = P(X\geq 4) = 0.0625$$ where only the upper tail contributes because $X \geq 0$. R code:

# P(X >= 4) = 1 - P(X <= 3), summing the PMF over x = 0, ..., 3
1 - sum(dgeom(0:3, 1/2))
# [1] 0.0625

Chebyshev's Inequality gives the weaker bound $$P\left(|X - \mu| \geq 2\sigma\right) \leq {1\over4} = 0.25$$
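The comparison can be reproduced with the cumulative distribution function `pgeom` instead of summing the PMF. This is a sketch for this example only; the variable names are illustrative.

```r
p     <- 1/2
mu    <- (1 - p) / p            # expected value, 1
sigma <- sqrt((1 - p) / p^2)    # standard deviation, sqrt(2)

upper <- ceiling(mu + 2 * sigma)   # smallest integer at least 2 sigma above mu
lower <- floor(mu - 2 * sigma)     # largest integer at least 2 sigma below mu (negative here)

# P(X <= lower) + P(X >= upper); the first term is 0 because X >= 0
exact     <- pgeom(lower, p) + pgeom(upper - 1, p, lower.tail = FALSE)
chebyshev <- 1 / 2^2               # Chebyshev bound 1 / k^2 with k = 2

print(c(exact, chebyshev))
```

The exact tail probability is four times smaller than the Chebyshev bound, which holds for any distribution with finite variance and is therefore conservative.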

2. A die is thrown until one gets a 6. Let $V$ be the number of throws used. What is the expected value of $V$? What is the variance of $V$?

Solution:

The PMF of the geometric distribution is $$f(x; p) = (1-p)^xp, \quad x = 0, 1, 2, \cdots$$ where $p = {1\over 6}$. Let $X = V-1$ be the number of throws that fail to show a 6, so the expected value of $V$ is

$$ \begin{align*} E[V] &= E[X+1]\\ &= E[X] + 1\\ &= {1-p\over p} + 1\\ &= {1-{1\over6} \over {1\over6}} + 1\\ &= 6 \end{align*} $$

The variance of $V$ is

$$ \begin{align*} \mbox{Var}(V) &= \mbox{Var}(X+1)\\ &= \mbox{Var}(X)\\ &= {1-p\over p^2}\\ &= {1-{1\over 6} \over \left({1\over6}\right)^2}\\ &= 30 \end{align*} $$

Note that $V$ follows another form of the geometric distribution, the so-called shifted geometric distribution, in which the random variable counts the number of trials required (including the final success). By the above process we can see that the expected value of the shifted geometric distribution is $$\mu = {1\over p}$$ and the variance of the shifted geometric distribution is $$\sigma^2 = {1-p\over p^2}$$
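The shifted form can be checked numerically by adding 1 to the failure count. The sketch below uses truncated sums; the cutoff 2000 is an illustrative assumption.

```r
p <- 1/6
x <- 0:2000    # number of failures; cutoff chosen so the tail is negligible
v <- x + 1     # number of throws, V = X + 1
f <- dgeom(x, p)

mean_v <- sum(v * f)                # E[V], expected to be 1 / p
var_v  <- sum(v^2 * f) - mean_v^2   # Var(V), expected to be (1 - p) / p^2

print(c(mean_v, var_v))
```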

3. Assume $W$ is geometrically distributed with probability parameter $p$. What is $P(W < n)$?

Solution:

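$$ \begin{align*} P(W < n) &= 1 - P(W \geq n)\\ &= 1-(1-p)^n \end{align*} $$

Here $P(W \geq n) = (1-p)^n$ because, with $W$ counting failures, at least $n$ failures occur exactly when the first $n$ trials all fail.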
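Since $W$ is integer-valued, $P(W < n) = P(W \leq n-1)$, which is what `pgeom(n - 1, p)` returns. The following sketch compares it with the closed form; `p = 0.2` and the range of `n` are illustrative assumptions.

```r
p <- 0.2    # illustrative success probability (an assumption)
n <- 0:10   # illustrative values of n

closed_form <- 1 - (1 - p)^n      # result derived above
builtin     <- pgeom(n - 1, p)    # P(W <= n - 1) = P(W < n)

max_diff <- max(abs(closed_form - builtin))
print(max_diff)
```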

4. In order to test whether a given die is fair, it is thrown until a 6 appears, and the number $n$ of throws is counted. How large must $n$ be before we can reject the null hypothesis $$H_0: \mbox{the die is fair}$$ against the alternative hypothesis $$H_1: \mbox{the probability of having a 6 is less than 1/6}$$ at significance level $5\%$?

Solution:

The probability of having to use at least $n$ throws given $H_0$ (i.e. the significance probability) is the probability that the first $n-1$ throws all fail to show a 6: $$P = \left(1 - {1\over 6}\right)^{n-1}$$ We will reject $H_0$ if $P < 0.05$. R code:

n = 1
while (TRUE) {
  # P(at least n throws) = P(first n - 1 throws show no 6)
  p = (5/6)^(n - 1)
  if (p < 0.05) break
  n = n + 1
}
n
# [1] 18

That is, we reject $H_0$ if $n$ is at least 18.


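Author: 赵胤 (Zhao Yin)
Source: http://www.cnblogs.com/zhaoyin/
Copyright of this article is shared by the author and 博客园 (cnblogs). Reposting is welcome, but unless the author agrees otherwise this notice must be retained and a link to the original article must be displayed prominently on the page; otherwise the right to pursue legal liability is reserved.

Original article: https://www.cnblogs.com/zhaoyin/p/4200760.html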