UC Berkeley Stat2.3x Inference Study Notes: Section 2 Testing Statistical Hypotheses

The Stat2.3x Inference course was taught by the University of California, Berkeley on the edX platform in 2014.

Summary

Test of Hypotheses $$\text{Null}: H_0$$ $$\text{Alternative}: H_A$$ Assuming the null is true, the P-value is the chance of getting data like the data in the sample, or data even more in the direction of the alternative. If $P$ is small (below the chosen cutoff), choose the alternative. Otherwise, stay with the null.
Significance Level and Power
- Significance level is the probability, under $H_0$, that the test concludes $H_A$. This is an error probability, so it should be small.
- Power is the probability, under $H_A$, that the test concludes $H_A$. This is the probability of a correct conclusion, so it should be large.
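The two definitions above can be sketched in R for a one-sided binomial test. This is only an illustration: the helper `reject_prob`, the sample size, the cutoff, and the alternative value `p = 0.65` are all assumptions made up for the example, not part of the course material.

```r
# Hypothetical helper (not from the course): probability that the rule
# "conclude H_A when X >= cutoff" fires, when X ~ Binomial(n, p).
reject_prob <- function(cutoff, n, p) {
  pbinom(cutoff - 1, n, p, lower.tail = FALSE)  # P(X >= cutoff)
}

n <- 100      # assumed number of trials
cutoff <- 59  # assumed decision rule: conclude H_A if X >= 59

alpha <- reject_prob(cutoff, n, p = 0.5)   # significance level, under H_0: p = 0.5
power <- reject_prob(cutoff, n, p = 0.65)  # power, under the assumed alternative p = 0.65
c(alpha = alpha, power = power)
```

The same rule is evaluated twice; only the value of `p` changes, which is exactly the difference between significance level and power.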

ADDITIONAL PRACTICE PROBLEMS FOR EXERCISE SET 2

PROBLEM 1

To test whether “red” comes up 18/38 of the time in spins of a roulette wheel, the wheel is spun 3800 times; the result is “red” 1720 times. Is the wheel biased against “red”? Answer the question in the following steps:

a) Formulate null and alternative hypotheses about p, the chance with which the wheel shows “red”.

b) Calculate an exact P-value or its normal approximation.

c) State the conclusion of the test.

Solution

a) $$\text{Null}: p=\frac{18}{38}$$ $$\text{Alternative}: p < \frac{18}{38}$$

b) Binomial distribution (exact), with $X$ the number of reds and $n=3800, p=\frac{18}{38}, k=0:1720$: $$P(X \le 1720)=\sum_{k=0}^{1720}C_{3800}^{k}\cdot p^k\cdot(1-p)^{3800-k}=0.004871166$$ R code:

sum(dbinom(0:1720, 3800, 18/38))
[1] 0.004871166

Normal distribution (approximate, with continuity correction): $\mu=n\cdot p=1800, \sigma=\sqrt{n\cdot p\cdot(1-p)}$: $$Z=\frac{1720.5-\mu}{\sigma}\Rightarrow P(X \le 1720)\approx 0.004898679$$ R code:

n = 3800; p = 18 / 38; mu = n * p; sigma = sqrt(n * p * (1 - p))
z = (1720.5 - mu) / sigma
pnorm(z)
[1] 0.004898679

c) $P$ is very small, so reject the null: the wheel is biased against "red".
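As a cross-check on the exact P-value, the sketch below uses the cumulative function `pbinom` and the built-in `binom.test` from R's `stats` package. The variable names are chosen here for illustration only.

```r
n <- 3800; p0 <- 18 / 38; reds <- 1720

# P(X <= 1720) under H_0, via the binomial CDF instead of summing dbinom
p_exact <- pbinom(reds, n, p0)

# Same one-sided test through binom.test (alternative: p < 18/38)
p_test <- binom.test(reds, n, p = p0, alternative = "less")$p.value

c(p_exact = p_exact, p_test = p_test)
```

Both values should agree with the sum over `dbinom(0:1720, 3800, 18/38)` in the solution.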

PROBLEM 2

In a “blind taste test” during a nationally televised football game, each of 100 “loyal Budweiser drinkers” was given two unmarked beer containers and asked to say which one they liked better. One of the containers had Budweiser and the other had Schlitz. Of the 100 participants, 46 said they liked the Schlitz better. Schlitz said this was an impressive showing. But maybe the subjects just couldn’t tell one beer from another. Test whether the results are or aren’t like tossing a coin, by providing:

a) the null and alternative hypotheses

b) the P-value

c) the conclusion of the test

Solution

a) $$\text{Null}: p=0.5$$ $$\text{Alternative}: p \neq 0.5$$

b) Binomial distribution: $$\sum_{i=0}^{46}C_{100}^{i}\cdot p^i\cdot(1-p)^{100-i}+\sum_{j=54}^{100}C_{100}^{j}\cdot p^j\cdot(1-p)^{100-j}=0.4841184$$ R code:

sum(dbinom(0:46, 100, 0.5)) + sum(dbinom(54:100, 100, 0.5))
[1] 0.4841184

Normal approximation: $$\mu=100\times0.5=50, \sigma=\sqrt{100\times0.5\times0.5}=5$$ $$\Rightarrow Z=\frac{46.5-\mu}{\sigma}, P=0.4839273$$ R code:

n = 100; p = 0.5;
mu = n * p; sigma = sqrt(n * p * (1 - p))
z = (46.5 - mu) / sigma
2 * pnorm(z)
[1] 0.4839273

c) $P$ is large, so stay with the null: the results look like tossing a coin.
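The two-tailed P-value can also be obtained from the binomial CDF or from `binom.test`, whose default alternative is two-sided. This is a sketch for checking the sum above; the variable names are illustrative.

```r
n <- 100; p0 <- 0.5; schlitz <- 46

# Lower tail P(X <= 46) plus upper tail P(X >= 54)
p_tails <- pbinom(schlitz, n, p0) +
  pbinom(n - schlitz - 1, n, p0, lower.tail = FALSE)

# binom.test is two-sided by default
p_test <- binom.test(schlitz, n, p = p0)$p.value

c(p_tails = p_tails, p_test = p_test)
```

Because the null distribution is symmetric when $p=0.5$, the two approaches give the same value here; for other values of $p$, `binom.test` defines the two-sided P-value slightly differently.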

PROBLEM 3

I have a bag of 100 M&M’s (known as Smarties in some countries; students who recognize neither should please think of them as colored pieces of candy). I think 20 of them are red, and my friend thinks that more than 20 are red. In order to decide between these two hypotheses, we are going to take a simple random sample of 40 M&M’s from the bag. If more than 10 M&M’s in the sample are red, we’ll choose my friend’s hypothesis, and otherwise we’ll choose mine.

a) State the null hypothesis that is being tested.

b) The significance level of the test is exactly (pick one option and fill in the blanks):

(i) binomial n = _____, p = ______, k in the range _______.

(ii) hypergeometric N = ____, G = _____, n = _______, g in the range _____.

c) Suppose that in fact there are 30 red M&M’s in the bag. The power of the test against this alternative is exactly (pick one option and fill in the blanks):

(i) binomial n = _____, p = ______, k in the range _______.

(ii) hypergeometric N = ______, G = _______, n = _________, g in the range ______.

Solution

a) $H_0:$ there are 20 red in the bag.

b) The significance level is the probability, under $H_0$, that the test concludes $H_A$ (rejects $H_0$). This is the probability of a Type 1 error, so it should be small. Hypergeometric distribution: $$N=100, G=20, n=40, g=11:20$$ $$\Rightarrow P=\frac{\sum_{g=11}^{20}C_{20}^{g}\cdot C_{80}^{40-g}}{C_{100}^{40}}=0.1017439$$ R code:

sum(dhyper(11:20, 20, 80, 40))
[1] 0.1017439

c) The power is the probability, under $H_A$, that the test concludes $H_A$. This is the probability of a correct conclusion, so it should be large. Hypergeometric distribution: $$N=100, G=30, n=40, g=11:30$$ $$\Rightarrow P=\frac{\sum_{g=11}^{30}C_{30}^{g}\cdot C_{70}^{40-g}}{C_{100}^{40}}=0.7466689$$ R code:

sum(dhyper(11:30, 30, 70, 40))
[1] 0.7466689
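Both quantities come from the same rule ("more than 10 red in the sample"), evaluated under two different bag compositions. A minimal sketch with the hypergeometric CDF `phyper` makes that explicit; the helper name `reject_prob` is hypothetical.

```r
# Hypothetical helper: P(more than 10 red in a sample of 40)
# when the bag of 100 contains `red` red candies.
reject_prob <- function(red, total = 100, sample_size = 40, cutoff = 10) {
  phyper(cutoff, red, total - red, sample_size, lower.tail = FALSE)
}

alpha <- reject_prob(20)  # significance level: bag has 20 red (H_0)
power <- reject_prob(30)  # power: bag has 30 red (the stated alternative)
c(alpha = alpha, power = power)
```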

PROBLEM 4

In order to test whether or not a random number generator is producing the digit “0” in the correct proportion (1/10), the generator will be run 5,000 times. You can assume that the runs are mutually independent and that each has the same probability p of producing “0”. Construct a test that has a significance level of approximately 1%. [Note: “Construct a test” means “Come up with a decision rule.” In the context of this problem, that means you have to say how you will use the number of 0’s among your 5,000 results to decide between your hypotheses.]

Solution $$H_0: p=0.1, H_A: p\ne0.1$$ A significance level of $1\%$ means that, assuming $H_0$ is true, the test concludes $H_A$ with probability $1\%$. Under $H_0$, by normal approximation: $$n=5000, p=0.1\Rightarrow\mu=n\cdot p=500, \sigma=\sqrt{n\cdot p\cdot(1-p)}$$ This is a two-tailed test, so each tail has area 0.005 and $Z=\pm2.575829$. R code:

qnorm(1 - 0.005)
[1] 2.575829

Thus, the cutoffs are $\mu\pm Z\cdot\sigma=[445.3584, 554.6416]$. R code:

n = 5000; p = 0.1
sigma = sqrt(n * p * (1 - p)); mu = n * p
z = qnorm(1 - 0.005)
mu - z * sigma
[1] 445.3584
mu + z * sigma
[1] 554.6416

The test is: choose $H_A$ if the number of 0's is 445 or less, or 555 or more; otherwise stay with $H_0$.
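Since the cutoffs come from a normal approximation and are rounded to integers, it is worth checking what significance level the rule actually has. The sketch below computes it from the exact binomial distribution under $H_0$; it is a sanity check added here, not part of the original solution.

```r
n <- 5000; p0 <- 0.1

# Exact probability, under H_0, that the rule concludes H_A:
# P(X <= 445) + P(X >= 555) for X ~ Binomial(5000, 0.1)
level <- pbinom(445, n, p0) + pbinom(554, n, p0, lower.tail = FALSE)
level
```

The result should be close to the 1% target, though not exactly equal to it, because the number of 0's is discrete.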

EXERCISE SET 2

If a problem asks for an approximation, please use the methods described in the video lecture segments. Unless the problem says otherwise, please give answers correct to one decimal place according to those methods. Some of the problems below are about simple random samples. If the population size is not given, you can assume that the correction factor for standard errors is close enough to 1 that it does not need to be computed. Please use the 5% cutoff for P-values unless otherwise instructed in the problem.

PROBLEM 1

A die is rolled 600 times. The face with six spots appears 112 times. Is the die biased towards that face, or is this just chance variation? Answer the question in the steps outlined in Problems 1A-1F.

1A The null hypothesis is

b. The die is biased towards the face with six spots.

c. The chance that the face with six spots appears is greater than 1/6, and the face appeared 112 times in the sample just by chance.

d. The chance that the face with six spots appears is equal to 1/6, and the face appeared 112 times in the sample just by chance.

e. The die is biased towards the faces that don’t show six spots.

f. The chance that the face with six spots appears is equal to 112/600.

g. The proportion of times the face with six spots appears is equal to 112/600.

1B The alternative hypothesis is

a. The die is biased.

b. The chance that the face with six spots appears is greater than 1/6.

c. The chance that the face with six spots appears is equal to 112/600.

d. The proportion of times the face with six spots appears is equal to 112/600.

1C If the null hypothesis were true, the expected number of times the face with six spots appeared would be: 112 / 100

1D If the null hypothesis were true, the standard error of the number of times the face with six spots appeared would be _____.

1E The P-value of the test is _____%. [Please be careful to enter your answer as a percent; that is, if your answer is 50% you should enter 50 in the blank, not 50%, nor 0.5, nor 1/2, etc.]

1F “The test concludes that the die is biased towards the face with six spots.” True / False

Solution

1A) d is correct. $$H_0: p=\frac{1}{6}$$

1B) b is correct. $$H_A: p > \frac{1}{6}$$

1C) $$n\cdot p=600\times\frac{1}{6}=100$$

1D) $$\sigma=\sqrt{n\cdot p\cdot(1-p)}=9.128709$$

1E) Binomial distribution: $$\sum_{k=112}^{600}C_{600}^{k}\cdot p^k\cdot(1-p)^{600-k}=10.50586\%$$ R code:

sum(dbinom(112:600, 600, 1/6))
[1] 0.1050586

1F) Because $P > 5\%$, the test does not reject $H_0$. The answer is False.
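The exercise instructions ask for approximations, so the normal approximation to the same P-value is sketched below for comparison with the exact binomial figure. The continuity correction (using 111.5) is an assumption about the intended method.

```r
n <- 600; p0 <- 1 / 6; sixes <- 112

mu <- n * p0                    # expected count under H_0
se <- sqrt(n * p0 * (1 - p0))   # standard error of the count under H_0

# One-sided P-value P(X >= 112), with continuity correction
z <- (sixes - 0.5 - mu) / se
p_approx <- pnorm(z, lower.tail = FALSE)
c(mu = mu, se = se, z = z, p_approx = p_approx)
```

The approximate and exact P-values are both above the 5% cutoff, so the conclusion does not depend on which one is used.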

PROBLEM 2

A statistics student hands each of 300 classmates 2 cookies side by side on a plate. Of the 300 students, 171 choose the cookie that’s on their right hand side, and the remaining 129 choose the cookie that’s on their left. The student says, “That’s just like tossing a coin.” The student’s friend says, “No, it’s not.” Help them settle their argument by performing a one-sample z test in Problems 2A-2C.

2A The test should be: one-tailed / two-tailed

2B The P-value of the test is _____%.

2C The test concludes: “That’s just like tossing a coin.” / “No, it’s not.”

Solution

2A) $$H_0: p=0.5, H_A: p\neq0.5$$ Thus it is two-tailed.

2B) Binomial distribution: $n=300, p=0.5$, with $k=0:129$ and $k=171:300$: $$\sum_{i=0}^{129}C_{300}^{i}\cdot p^i\cdot(1-p)^{300-i}+\sum_{j=171}^{300}C_{300}^{j}\cdot p^j\cdot(1-p)^{300-j}=1.777934\%$$ R code:

sum(dbinom(0:129, 300, 0.5)) + sum(dbinom(171:300, 300, 0.5))
[1] 0.01777934

2C) $P < 5\%$, so reject $H_0$. That is, it is not like tossing a coin.
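The problem asks for a one-sample z test, while the solution above uses the exact binomial. A sketch of the z test follows; the continuity correction is an assumption, and the variable names are illustrative.

```r
n <- 300; p0 <- 0.5; right <- 171

mu <- n * p0                    # expected number choosing the right-hand cookie
se <- sqrt(n * p0 * (1 - p0))   # standard error under H_0

# Two-tailed P-value: double the upper tail beyond 171
z <- (right - 0.5 - mu) / se
p_approx <- 2 * pnorm(z, lower.tail = FALSE)
c(z = z, p_approx = p_approx)
```

The z test leads to the same conclusion as the exact calculation: the P-value is below 5%.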

PROBLEM 3

There are two boxes, each with several million tickets marked “1” or “0”. The two boxes have the same number of tickets, but in one of the boxes, 49% of the tickets are marked “1” and in the other box 50.5% of the tickets are marked “1”. Someone hands me one of the boxes but doesn’t tell me which box it is. Consider the following hypotheses:

Null: p = 0.49

Alternative: p = 0.505

Here is my proposed test: I will draw a simple random sample of 10,000 tickets, and if 5,000 or more of them are marked “1” then I will choose the alternative; otherwise I will stay with the null.

3A The significance level of my test is _____%.

3B The power of my test is _____%.

Solution

3A) The significance level is the probability, under $H_0$, that the test concludes $H_A$. Thus, by the binomial distribution with $p=0.49$: $$\sum_{k=5000}^{10000}C_{10000}^{k}\cdot0.49^k\cdot0.51^{10000-k}=2.328171\%$$ R code:

sum(dbinom(5000:10000, 10000, 0.49))
[1] 0.02328171

3B) The power is the probability, under $H_A$, that the test concludes $H_A$. Thus, by the binomial distribution with $p=0.505$: $$\sum_{k=5000}^{10000}C_{10000}^{k}\cdot0.505^k\cdot0.495^{10000-k}=84.37643\%$$ R code:

sum(dbinom(5000:10000, 10000, 0.505))
[1] 0.8437643
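With 10,000 draws, the normal approximation should be very close to the exact binomial values. The sketch below wraps it in a hypothetical helper, `reject_prob_normal`, and applies it under each hypothesis; treating the draws as independent relies on the box being much larger than the sample.

```r
# Hypothetical helper: normal approximation (with continuity correction)
# to P(X >= cutoff) for X ~ Binomial(n, p).
reject_prob_normal <- function(cutoff, n, p) {
  mu <- n * p
  se <- sqrt(n * p * (1 - p))
  pnorm((cutoff - 0.5 - mu) / se, lower.tail = FALSE)
}

alpha_approx <- reject_prob_normal(5000, 10000, 0.49)   # significance level
power_approx <- reject_prob_normal(5000, 10000, 0.505)  # power
c(alpha = alpha_approx, power = power_approx)
```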

PROBLEM 4

There are 21,000 students in a Statistics MOOC. Each student tests the fairness of a coin (yes, the same coin; the instructor somehow gets Tyche’s help in getting the coin to each student in turn). Specifically, each student tests:

Null: p is equal to 0.5

Alternative: p is not equal to 0.5

using the 5% cutoff. Suppose that, unknown to the students, the coin is in fact fair. The expected number of students whose test will conclude that the coin is unfair is __________. [This answer is actually an integer, cleanly calculated; but I’ll allow you an error of ±5.]

Solution

The chance that a single student makes the wrong conclusion under the null is 5%; that’s what the cutoff represents. So the number of students who make the wrong conclusion under the null is binomial with $n=21000, p=0.05$. The expected value is $21000\times0.05=1050$.
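A small simulation can illustrate this. It assumes, as the solution does, that each student's test rejects with probability exactly 5% under the null and that the students' tests are independent; the seed and the number of repetitions are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n_students <- 21000
alpha <- 0.05

expected <- n_students * alpha                       # expected number of false rejections
spread <- sqrt(n_students * alpha * (1 - alpha))     # SD of that count

# Simulate the number of students who wrongly reject, 1000 times over
sim <- rbinom(1000, n_students, alpha)
c(expected = expected, sd = spread, simulated_mean = mean(sim))
```

The count in any single run of the course varies by a few dozen around the expected value, even though every student used the test correctly.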

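Author: 赵胤 (Zhao Yin)
Source: http://www.cnblogs.com/zhaoyin/
Copyright of this article is shared by the author and cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be displayed prominently on the page; otherwise the author reserves the right to pursue legal liability.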