Key

ALCOHOL.REC

(A) The Univariate Description

The mean alcohol consumption score is 3.7 (standard deviation = 3.6, n = 713)
The five-point summary is: 0, 1, 3, 5, 13
The histogram has a pronounced positive (right) skew, with modes at 1 and 3 units. The right tail has a mode at 11. (The graph is not posted for technical reasons. I wish it were, since "a picture is worth a thousand words," or so they say.)
This univariate analysis provides background for the more detailed analyses below.

(B) Alcohol Consumption by Income Group

The research focus is to determine whether alcohol scores differ by income.

1. Summary statistics: alcohol score by income level (the scaling of the alcohol consumption scores is documented in the assignment)

Alcohol Consumption Score	Income Level
Alcohol Consumption Score	1 (n = 46)	2 (n = 88)	3 (n = 140)	4 (n = 250)	5 (n = 189)
Mean	2.8	3.9	4.5	3.5	3.6
SD	3.1	4.2	4.2	3.5	2.9

Mean alcohol scores vary from 2.8 to 4.5, with income level 3 having the highest mean score and income level 1 having the lowest mean score. Standard deviations range from 2.9 to 4.2.

2. Side-by-side quartile plot

Not shown on Web. Graph shows median values of 3 in all five groups. Interquartile ranges vary considerably.

3. Descriptive comparison of group variances

The descriptive statistics (SDs) and side-by-side quartile plot seem to suggest that groups have unequal variances ("heteroscedasticity").

4. Bartlett's Test

H₀: sigma-squared₁ = sigma-squared₂ = sigma-squared₃ = sigma-squared₄ = sigma-squared₅ (population variances are equal; I can't get Greek sigmas to print on the Web)
H₁: at least one population variance differs
Let alpha = .05.
Chi-square(4, N = 713) = 25.85, p = .000034.
Reject H₀.

Comment: This test confirms the exploratory results.

5. Mean ± se estimates (graph not shown on Web)

Alcohol Consumption Score	Income Level
Alcohol Consumption Score	1 (n = 46)	2 (n = 88)	3 (n = 140)	4 (n = 250)	5 (n = 189)
Mean	2.8	3.9	4.5	3.5	3.6
se	0.46	0.44	0.35	0.22	0.21

Comment: Standard error calculations do not assume homoscedasticity.

After drawing and viewing the graph, an interesting pattern emerges . . . don't you think?

6. Test of Means or Medians

Since data are assumed to be heteroscedastic, the Kruskal-Wallis test is used.

H₀ can be stated in multiple ways, i.e., directed toward either population (a) means, (b) medians, or (c) locations in general. In this key, let us state the hypotheses as follows. H₀: M₁ = M₂ = M₃ = M₄ = M₅ vs. H₁: at least one population median differs.
Let alpha = .05.
K-W test results: Chi-square(4, N = 713) = 7.79, p = .10.
Therefore, the null hypothesis is retained and we conclude that the observed differences are not significant (i.e., could be due to random sampling error).

Comment: Perhaps we should do a power analysis to determine the power of the analysis (?).
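
The p values reported for the chi-square statistics in this section can be checked by hand. The following is a minimal Python sketch, not the software used for this key; the function name chi2_upper_tail is made up for illustration. It relies on the closed-form upper-tail probability of the chi-square distribution, which exists for 1 df and for even df (other df are not handled here).

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses closed forms that exist only for df = 1 and for even df.
    """
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df % 2 == 0:
        half = x / 2
        terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * terms
    raise ValueError("this sketch supports only df = 1 or even df")


# Statistics reported above for the income groups (4 df each)
p_bartlett = chi2_upper_tail(25.85, 4)  # Bartlett's test
p_kw = chi2_upper_tail(7.79, 4)         # Kruskal-Wallis test

print(f"Bartlett p = {p_bartlett:.6f}")
print(f"K-W p = {p_kw:.2f}")
```

The two printed values can be compared with the p values stated in steps 4 and 6.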

7. Summary

Some points to consider:

Variability of alcohol scores within groups varied considerably. For example, the standard deviation of alcohol scores was 2.9 in Group 5 and 4.2 in Groups 2 and 3. These differences were statistically significant (p = .000034 by Bartlett's test).
Although mean alcohol scores vary from 2.8 (Group 1) to 4.5 (Group 3), this difference was not significant (p = .10, by the K-W test). The power of this test was not evaluated.

(C) Alcohol Consumption Score by Age Group

The research focus is to determine whether alcohol scores differ by age group.

1. Summary Statistics: Alcohol Scores by Age Group

	Age Group (Years)
	20 to 29 (n = 234)	30 to 42 (n = 231)	43+ (n = 248)
Mean	4.8	3.6	2.8
Standard Dev.	3.9	3.3	3.2

Comment: Note trend.

2. Side-by-side quartile plot

Not shown on Web. Trend in medians noted. Interquartile ranges: some variability, difficult to evaluate.

3. Descriptive comparison of group variances

Hard to evaluate -- some discrepancy in SDs and inter-quartile ranges (as seen in side-by-side quartile plots), but these are modest.

4. Bartlett's Test

H₀: "sigma-squared"₁ = "sigma-squared"₂ = "sigma-squared"₃ vs. H₁: H₀ false
Let alpha = .05.
Chi-square(2, N = 713) = 8.89, p = .012.
Conclusion: reject the null hypothesis; significant difference in variances.

5. Mean ± Standard Error Estimates

	Age Group (Years)
	20 to 29 (n = 234)	30 to 42 (n = 231)	43+ (n = 248)
Mean	4.83	3.55	2.80
Standard Error	sqrt (15.072 / 234) = 0.25	sqrt (11.179 / 231) = 0.22	sqrt (10.556 / 248) = 0.21
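
The standard errors in the table come from each group's own variance, se = sqrt (group variance / group n), with no pooling across groups. As a minimal Python sketch (the variable names are illustrative; the variances and group sizes are the ones printed in the table):

```python
import math

# (sample variance, n) for each age group, as printed in the table
groups = {
    "20 to 29": (15.072, 234),
    "30 to 42": (11.179, 231),
    "43+": (10.556, 248),
}

# Unpooled standard error of each group mean
standard_errors = {
    name: math.sqrt(variance / n) for name, (variance, n) in groups.items()
}

for name, se in standard_errors.items():
    print(f"{name}: se = {se:.2f}")
```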

6. Test of Means or Medians

Let us use the Kruskal-Wallis test so as to avoid a violation of assumptions (in particular, a violation of the equal variance assumption).

H₀: M₁ = M₂ = M₃ vs. H₁: at least one population median differs
Let alpha = .05
Kruskal-Wallis Chi-squared(2, N = 713) = 44.56, p < .0000005.
Reject the null hypothesis; conclude significant difference exists.
Proceed with pairwise comparisons to determine the extent of the difference.

Let alpha_Bonf = .018 (to maintain an "experiment-wise" alpha of .05).
Let us use the K-W test to perform these tests, so as not to violate assumptions.
Results may be summarized as follows:

H₀: M₁ = M₂ vs. H₁: M₁ not equal to M₂; alpha_Bonf = .018; K-W Chi-squared(1, N = 465) = 13.89; p = .00019; Reject H₀.
H₀: M₁ = M₃ vs. H₁: M₁ not equal to M₃; alpha_Bonf = .018; K-W Chi-squared(1, N = 482) = 42.11, p < .0000005; Reject H₀
H₀: M₂ = M₃ vs. H₁: M₂ not equal to M₃; alpha_Bonf = .018; K-W Chi-squared(1, N = 479) = 10.56; p = .0012; Reject H₀

Therefore, significant differences are noted all around.
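
The pairwise decisions can be re-derived from the reported statistics. This is a rough Python sketch under two assumptions: the per-comparison alpha of .018 is taken from this key as given, and each p value is computed from the closed-form upper tail of a chi-square distribution with 1 df. The names used are illustrative only.

```python
import math

ALPHA_BONF = 0.018  # per-comparison alpha, taken from the key as given

# K-W chi-square statistics (1 df each) for the three pairwise tests
pairwise_stats = {"1 vs 2": 13.89, "1 vs 3": 42.11, "2 vs 3": 10.56}


def chi2_1df_p(x):
    """Upper-tail p value for a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(x / 2))


decisions = {}
for pair, stat in pairwise_stats.items():
    p = chi2_1df_p(stat)
    decisions[pair] = p < ALPHA_BONF  # True means "reject H0"
    verdict = "reject H0" if decisions[pair] else "retain H0"
    print(f"Group {pair}: chi-square = {stat}, p = {p:.2g}, {verdict}")
```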

7. Summary

Both the variability and expected values (means) of alcohol scores differ significantly by age group, with average alcohol consumption inversely associated with age, and greater alcohol consumption variability associated with younger age.

DEERMICE

The research focus is to determine whether weight gain differs by diet.

1. Summary Statistics: Weight Gain by Diet

Weight Gain (grams)	Diet A (Standard Diet) n = 5	Diet B (Junk Food) n = 5	Diet C (Health Food) n = 5
Mean	11.14	13.44	9.14
Standard Deviation	1.27	0.62	0.58

2. Side-by-side quartile plot

Not shown on Web.

3. Descriptive comparison of group variances

Hard to evaluate -- seem to differ(?).

4. Test for Inequality of Population Variances

H₀: sigma-squared₁ = sigma-squared₂ = sigma-squared₃ vs. H₁: at least one population variance differs
Let alpha = .05
Bartlett's Chi-squared(2, N = 15) = 2.90, p = .23.
Conclusion: retain H₀; no significant evidence of heteroscedasticity.
Let us assume homoscedasticity. (One could argue that this is a foolhardy assumption.) Therefore, the pooled estimate of variance = Mean Square Within = 0.78.

5. Mean ± standard error estimates

Weight Gain (grams)	Diet A (Standard Diet) n = 5	Diet B (Junk Food) n = 5	Diet C (Health Food) n = 5
Mean Estimate	11.14	13.44	9.14
Standard Error Estimates	sqrt (0.780 / 5) = 0.39	0.39	0.39
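
Because homoscedasticity is assumed here, all three groups share one standard error based on the pooled variance. A minimal Python sketch of that arithmetic, starting from the rounded standard deviations in the summary table (so the last digit may differ slightly from values computed on the raw data):

```python
import math

# Sample standard deviations from the summary table; n = 5 per diet
sds = {"A": 1.27, "B": 0.62, "C": 0.58}
n = 5

# With equal group sizes, the pooled variance (Mean Square Within)
# is the simple average of the group variances.
pooled_variance = sum(sd ** 2 for sd in sds.values()) / len(sds)

# Every group mean gets the same standard error under homoscedasticity
pooled_se = math.sqrt(pooled_variance / n)

print(f"pooled variance = {pooled_variance:.2f}")
print(f"standard error = {pooled_se:.2f}")
```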

6. Test of Means

H₀: µ₁ = µ₂ = µ₃ vs. H₁: at least one population mean differs
Let alpha = .05
ANOVA F(2,12) = 29.69; p = .000089, reject H₀
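
The F statistic can be approximated from the summary table alone. The sketch below is illustrative Python, not the program used for this key; it works from the rounded means and standard deviations, so it lands close to, but not exactly on, the reported F.

```python
# Group means and standard deviations from the summary table; n = 5 per diet
means = [11.14, 13.44, 9.14]
sds = [1.27, 0.62, 0.58]
n = 5
k = len(means)

grand_mean = sum(means) / k  # valid because group sizes are equal

# Mean square between: variation of group means around the grand mean
ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)

# Mean square within: average of the group variances (equal n)
ms_within = sum(sd ** 2 for sd in sds) / k

f_stat = ms_between / ms_within
df_between, df_within = k - 1, k * (n - 1)

print(f"F({df_between},{df_within}) = {f_stat:.2f}")
```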

Proceed with multiple comparisons letting alpha_Bonf = .018 in order to maintain experiment-wise alpha level of .05.
H₀: µ_A = µ_B vs. H₁: µ_A not equal to µ_B; alpha_Bonf = .018; t(8) = 3.63, p = .0066; reject H₀.
H₀: µ_A = µ_C vs. H₁: µ_A not equal to µ_C; alpha_Bonf = .018; t(8) = 3.20, p = .013; reject H₀.
H₀: µ_B = µ_C vs. H₁: µ_B not equal to µ_C; alpha_Bonf = .018; t(8) = 11.29, p < .0005; reject H₀.

Comment: Independent t tests were used above. This is equivalent to ANOVA tests with k = 2.
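
As a check on the t statistics, here is a minimal Python sketch of the equal-variance independent t test computed from summary statistics. Each comparison pools the variances of the two groups being compared, which gives 5 + 5 - 2 = 8 df. The inputs are the rounded table values, so small differences from the reported statistics are expected; the function name pooled_t is illustrative.

```python
import math
from itertools import combinations

# (mean, standard deviation) per diet from the summary table; n = 5 each
diets = {"A": (11.14, 1.27), "B": (13.44, 0.62), "C": (9.14, 0.58)}
n = 5


def pooled_t(mean1, sd1, mean2, sd2, n):
    """Absolute t statistic for two equal-sized groups, pooled variance."""
    pooled_variance = (sd1 ** 2 + sd2 ** 2) / 2
    se_difference = math.sqrt(pooled_variance * 2 / n)
    return abs(mean1 - mean2) / se_difference


t_stats = {
    f"{a} vs {b}": pooled_t(*diets[a], *diets[b], n)
    for a, b in combinations(diets, 2)
}

for pair, t in t_stats.items():
    print(f"Diet {pair}: t({2 * n - 2}) = {t:.2f}")
```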

7. Summary

Average weight gain differs significantly by diet type, with junk food associated with the greatest gain (mean = 13.44 g, sd = 0.62 g) and health food associated with the least gain (mean = 9.14 g, sd = 0.58 g).

11.3 ROOSTER

The research problem is to determine whether testosterone levels differ by rooster strain.

a. Create ROOSTER.REC

b. Summary statistics. Testosterone Levels by Rooster Strain

Testosterone (µg/dl)	Strain A n = 6	Strain B n = 6	Strain C n = 6
Mean ± SD	432.7 ± 274.0	112.8 ± 10.5	102.0 ± 7.4
(minimum, maximum)	(134, 897)	(98, 126)	(89, 110)

c. Test for Inequality of Variances

H₀: sigma-squared₁ = sigma-squared₂ = sigma-squared₃ vs. H₁: at least one population variance differs
Let alpha = .05
Bartlett's Chi-square(2, N = 18) = 47.99, p < .0001.
Conclusion: reject H₀.
The conclusion of the hypothesis test, combined with the widely varying sample standard deviations (see table above), suggests that the groups are heteroscedastic. (This is interesting. I wonder why there's more variability in Strain A than in the other strains.) We will therefore proceed under the assumption of unequal variances.
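
Bartlett's statistic can be computed from group variances and group sizes alone. The following is a sketch in Python, assuming the standard textbook form of the statistic; the sample variances (75077.867, 110.167, 55.20) are the ones used in the confidence interval calculations in this key, and the function name bartlett_statistic is illustrative.

```python
import math


def bartlett_statistic(variances, sizes):
    """Bartlett's chi-square statistic from sample variances and group sizes."""
    k = len(variances)
    total_df = sum(n - 1 for n in sizes)
    pooled = sum((n - 1) * v for v, n in zip(variances, sizes)) / total_df
    numerator = total_df * math.log(pooled) - sum(
        (n - 1) * math.log(v) for v, n in zip(variances, sizes)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / total_df) / (3 * (k - 1))
    return numerator / correction


stat = bartlett_statistic([75077.867, 110.167, 55.20], [6, 6, 6])

# With k - 1 = 2 df, the chi-square upper tail is exactly exp(-x / 2)
p_value = math.exp(-stat / 2)

print(f"Bartlett's chi-square(2) = {stat:.2f}, p = {p_value:.2g}")
```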

d. Test for Inequality of Means

The nonparametric K-W test will be performed. (See comments about heteroscedasticity, above.)
Let alpha = .05
K-W Chi-square(2, N = 18) = 12.55, p = .0019.
Conclusion: reject the null hypothesis of equal means and proceed with pairwise comparisons at alpha_Bonf = .018 (so as to maintain an "experiment-wise" alpha of .05; see Reader pp. 11.7 - 11.9). The Kruskal-Wallis procedure will be used because of the assumed heteroscedasticity. Results are as follows:

Strain A vs. Strain B: K-W: Chi-square(1, N = 12) = 8.34, p = .0039
Strain A vs. Strain C: K-W: Chi-square(1, N = 12) = 8.31, p = .0039
Strain B vs. Strain C: K-W: Chi-square(1, N = 12) = 2.57, p = .11

Conclusion: Strain A differs from Strain B and Strain C, but there is no significant difference between Strain B and Strain C.

e. Power Analysis:

Assumptions: alpha = .05, k = 2, df(between) = 1, df(within) = 18, s²= 100.

i.     For a minimal detectable difference (MDD) of 10, phi = 1.58 and power = .58
ii.    For a MDD of 15, phi = 2.37 and power = .89
iii.   For a MDD of 20, phi = 3.16 and power > .98
iv.   What size samples are needed to achieve a MDD of 10? We know that n (per group) of 10 won't do the trick (see part i.) so, we might try to determine the power when n = 11, n = 12, and so on. One enterprising student did just this, and here are her results:
n phi power
10 1.58 .58
11 1.66 .61
12 1.73 .65
13 1.80 .68
14 1.87 .72
15 1.94 .74
16 2.00 .77
17 2.06 .79
18 2.12 .81

The conclusion, therefore, is to use a sample size of 18 per group to achieve .81 power.

A rough answer can be achieved by assuming df(within) is "big", and then looking up the phi value needed to achieve at least 80% power. In this case, phi is approximately equal to 2. Then, use our sample size formula, n = (2²)(2)(2)(100)/(10²) = 16, which is a good approximation to the more accurate estimates, above.
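
The phi column in the table above can be reproduced by inverting the sample size formula, i.e., phi = sqrt (n × MDD² / (2 × k × s²)). This inversion is inferred from the formula used in this key rather than quoted from the Reader, so treat the Python sketch below as an illustration. The power values themselves still have to be read from power charts or software; they are not computed here.

```python
import math

K = 2        # number of groups
S2 = 100.0   # assumed within-group variance
MDD = 10.0   # minimal detectable difference


def phi(n, mdd=MDD, k=K, s2=S2):
    """Phi for n per group (inverse of the sample size formula)."""
    return math.sqrt(n * mdd ** 2 / (2 * k * s2))


def rough_n(phi_target, mdd=MDD, k=K, s2=S2):
    """Sample size per group needed to reach a target phi."""
    return phi_target ** 2 * 2 * k * s2 / mdd ** 2


phi_table = {n: round(phi(n), 2) for n in range(10, 19)}
for n, value in phi_table.items():
    print(f"n = {n}: phi = {value:.2f}")

print(f"rough n for phi = 2: {rough_n(2):.0f}")
```

Calling phi(10, mdd=15) or phi(10, mdd=20) gives the phi values for parts ii and iii.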

NEW (Assigned in 3/4 Class): Confidence Intervals for Group Means

Notes:

See last equation on 3/4 Handout titled Estimation
Data are heteroscedastic, so group standard errors = sqrt (group variance / group n)
Degrees of freedom = group size minus 1 (i.e., n_i - 1), which in each case is 5.
Use t(5,.025) = 2.57 as "confidence coefficient" for each calculation
General formula is sample mean +/- (confidence coefficient)(standard error of the mean)

95% confidence interval for the mean of Strain A = 432.67 +/- (2.57)(sqrt (75077.867 / 6)) = 432.67 +/- 287.48 = (145.2, 720.2)

95% confidence interval for the mean of Strain B = 112.83 +/- (2.57)(sqrt (110.167 / 6)) = 112.83 +/- 11.01 = (101.82, 123.84)

95% confidence interval for the mean of Strain C = 102.00 +/- (2.57)(sqrt (55.20 / 6)) = 102.00 +/- 7.80 = (94.20, 109.80)
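
The general formula in the notes translates directly into code. This is a minimal Python sketch with an illustrative function name, shown here for Strains A and C using the means and variances from the calculations above and the confidence coefficient t(5,.025) = 2.57.

```python
import math

T_CRIT = 2.57  # t(5, .025), the confidence coefficient from the notes


def mean_ci(mean, variance, n, t_crit=T_CRIT):
    """95% CI for a group mean using that group's own (unpooled) variance."""
    margin = t_crit * math.sqrt(variance / n)
    return mean - margin, mean + margin


ci_a = mean_ci(432.67, 75077.867, 6)  # Strain A
ci_c = mean_ci(102.00, 55.20, 6)      # Strain C

print(f"Strain A: ({ci_a[0]:.1f}, {ci_a[1]:.1f})")
print(f"Strain C: ({ci_c[0]:.2f}, {ci_c[1]:.2f})")
```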
