EXERCISES

ALCOHOL.REC: Alcohol Consumption Survey (Data source: Monder, 1986)

Data come from a survey of alcohol consumption and socioeconomic status. Data are contained in ALCOHOL.REC (713 records of 5 bytes each), with variables coded as follows:
Variable  Type  Description and codes
--------  ----  ---------------------
ALCS      ##    Alcohol consumption score. Codes are as follows:
                00 = non-drinker
                01 = 1 drink per week
                02 = 1-2 drinks per week
                03 = 2 drinks per week
                04 = 2-3 drinks per week
                05 = 3 drinks per week
                06 = 3-4 drinks per week
                07 = 4 drinks per week
                08 = 4-5 drinks per week
                09 = 5 drinks per week
                10 = 5-6 drinks per week
                11 = 6 drinks per week
                12 = 7-11 drinks per week
                13 = 12+ drinks per week
AGE       ##    Age (in years)
INC       #     Income level: 1 = low, 5 = high

(A) "Univariate" description

Before analyzing alcohol consumption scores by group, perform a descriptive analysis of ALCS for all groups combined (command: MEANS ALCS). Report the mean, standard deviation, sample size, and a five-point summary of the outcome (minimum, Q1, median, Q3, and maximum). Also produce a histogram of the variable (HISTOGRAM ALCS), and describe (in words) the distribution's shape and location.
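
As a cross-check outside Epi Info, the descriptive statistics requested above can be sketched in Python. The scores below are hypothetical placeholders, not values from ALCOHOL.REC, and the "inclusive" quartile rule is an assumption; Epi Info may use a different rule and report slightly different quartiles.

```python
import statistics

def describe(scores):
    """Return n, mean, sd, and a five-point summary for a list of scores."""
    # Quartile rule is an assumption; other rules can shift Q1 and Q3 slightly.
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),  # sample sd (n - 1 denominator)
        "min": min(scores),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(scores),
    }

# Hypothetical ALCS codes for illustration only (not the survey data).
example_scores = [0, 1, 2, 3, 4, 5, 6, 7, 8]
summary = describe(example_scores)
print(summary)
```

With the real data, the list would hold the 713 ALCS values exported from Epi Info.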

(B) Alcohol Consumption by Income Level

  1. Report summary statistics (mean, sd, n) by income group. Use APA reporting standards for rounding and reporting.
  2. Draw a side-by-side quartile plot of the data. Concisely describe (in words) what you see.
  3. Based on the above quartile plot, would you say that group variances are homogeneous or heterogeneous? Justify your response.
  4. Perform Bartlett's test for heteroscedasticity. (List the null and alternative hypotheses; let alpha = .05; report the hypothesis testing statistic using APA format; state your conclusion.) Does this analysis support the interpretation you put forward in the graphical analysis provided by number 3, above?
  5. Calculate standard error estimates for the mean of each group. Report these standard errors. Draw a mean ± standard error plot and report what you see.
  6. Perform a test to determine whether group means are homogeneous. (List the null and alternative hypotheses; let alpha = .05; report an appropriate test statistic using APA format; state your conclusion.)
  7. In a brief one or two sentence narrative, summarize your findings.

(C) Alcohol Consumption by Age

Categorize age into 3 age-class intervals as follows: 20- to 29-year-olds, 30- to 42-year-olds, and 43+ year-olds. This can be accomplished with the following commands:
 
EPI6> DEFINE AGEGROUP #
EPI6> IF AGE <= 29 THEN AGEGROUP = 1
EPI6> IF AGE >= 30 AND AGE <= 42 THEN AGEGROUP = 2
EPI6> IF AGE >= 43 THEN AGEGROUP = 3
EPI6> MEANS ALCS AGEGROUP /N
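
For readers checking the recode outside Epi Info, the same three-way classification can be sketched in Python. The function name age_group is hypothetical; the cut points are the ones in the commands above.

```python
def age_group(age):
    """Map age in years to the three age classes used in Part C."""
    if age <= 29:
        return 1  # 20- to 29-year-olds (any younger ages would also land here)
    if age <= 42:
        return 2  # 30- to 42-year-olds
    return 3      # 43 and older

# Boundary ages on either side of each cut point.
codes = [age_group(a) for a in (20, 29, 30, 42, 43, 65)]
print(codes)  # prints [1, 1, 2, 2, 3, 3]
```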

Perform an analysis similar to the one described in Part B above, now directing your analysis toward alcohol consumption scores by age group. Label your analysis 1 - 7, as above.

DEERMICE: Weight Gain in White-Footed Deer Mice (Hampton, 1994, p. 118, modified)

Fifteen white-footed deer mice are randomly assigned to one of three groups. Group A receives a diet of standard mouse food, Group B receives a diet of junk food, and Group C receives a diet of health food. Data are:

REC  DIET WTGAIN
---  ---- ------
  1  A      11.8
  2  A      12.0
  3  A      10.7
  4  A       9.1
  5  A      12.1
  6  B      13.6
  7  B      14.4
  8  B      12.8
  9  B      13.0
 10  B      13.4
 11  C       9.2
 12  C       9.6
 13  C       8.6
 14  C       8.5
 15  C       9.8

Create an Epi Info data set with these data and then perform an analysis similar to the ones described above.
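
The group summaries and the test of means can also be cross-checked by hand or in a short script. The following is a minimal Python sketch, not Epi Info output: it computes n, mean, sd, and se for each diet group and the one-way ANOVA F statistic from the data above. The function names are illustrative, and the p value still has to come from an F table or from Epi Info.

```python
import math
import statistics

# Weight gain by diet group, copied from the table above.
groups = {
    "A": [11.8, 12.0, 10.7, 9.1, 12.1],
    "B": [13.6, 14.4, 12.8, 13.0, 13.4],
    "C": [9.2, 9.6, 8.6, 8.5, 9.8],
}

def summarize(values):
    """Return n, mean, sample sd, and standard error of the mean."""
    n = len(values)
    sd = statistics.stdev(values)
    return {"n": n, "mean": statistics.mean(values), "sd": sd,
            "se": sd / math.sqrt(n)}

def anova_f(groups):
    """One-way ANOVA: return (F, df1, df2)."""
    samples = list(groups.values())
    all_values = [x for s in samples for x in s]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(samples), len(all_values)
    ss_between = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2
                     for s in samples)
    ss_within = sum((len(s) - 1) * statistics.variance(s) for s in samples)
    df1, df2 = k - 1, n_total - k
    return (ss_between / df1) / (ss_within / df2), df1, df2

stats = {name: summarize(values) for name, values in groups.items()}
f_stat, df1, df2 = anova_f(groups)

for name, s in stats.items():
    print(f"{name}: n = {s['n']}, M = {s['mean']:.2f}, "
          f"SD = {s['sd']:.2f}, SE = {s['se']:.2f}")
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```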

ROOSTER: Testosterone Levels in Roosters (Hampton, 1994, p. 147, modified)

A chicken pathologist believes that testosterone levels may differ by rooster strain. To test her hypothesis, testosterone levels are measured in 3 strains of roosters. Data are as follows:
 

REC  TESTOSTERO STRAIN
---  ---------- ------
  1         439 A
  2         568 A
  3         134 A
  4         897 A
  5         229 A
  6         329 A
  7         103 B
  8         115 B
  9          98 B
 10         126 B
 11         115 B
 12         120 B
 13         107 C
 14          99 C
 15         102 C
 16         105 C
 17          89 C
 18         110 C
 
(A) Create an Epi Info data file with these data.
(B) Compute summary statistics by rooster strain.
(C) Test for inequality of variances.
(D) Test for inequality of means.
(E) Suppose we now want to design a new experiment to test whether average testosterone levels differ in Strain B and Strain C. Let us start with the following simplifying assumptions: k = 2 (Strain B vs. Strain C), n (per group) = 10, within-group variance = 100, and alpha = .05. What is the power of this new study to find a minimal detectable difference of 10 units? 15 units? 20 units?
(F) Let us now turn the question on its head by asking how many roosters are needed in each group to find a difference of 10 units. We want 80% power.
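
For part (C), Bartlett's statistic can be cross-checked outside Epi Info. The following is a minimal Python sketch using the standard textbook form of the statistic, which is an assumption here since Epi Info's exact computation is not described in this handout. The result is compared with the chi-square critical value for k - 1 = 2 degrees of freedom at alpha = .05 (5.991).

```python
import math
import statistics

# Testosterone levels by strain, copied from the table above.
groups = {
    "A": [439, 568, 134, 897, 229, 329],
    "B": [103, 115, 98, 126, 115, 120],
    "C": [107, 99, 102, 105, 89, 110],
}

def bartlett_statistic(groups):
    """Return (chi-square statistic, df) for Bartlett's test."""
    samples = list(groups.values())
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    variances = [statistics.variance(s) for s in samples]
    dfs = [len(s) - 1 for s in samples]
    # Pooled (within-group) variance, weighted by degrees of freedom.
    pooled = sum(df * v for df, v in zip(dfs, variances)) / (n_total - k)
    numerator = ((n_total - k) * math.log(pooled)
                 - sum(df * math.log(v) for df, v in zip(dfs, variances)))
    correction = 1 + (sum(1 / df for df in dfs)
                      - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction, k - 1

CRITICAL_CHI2_DF2 = 5.991  # chi-square critical value, df = 2, alpha = .05

chi2, df = bartlett_statistic(groups)
print(f"Bartlett chi-square({df}) = {chi2:.2f}")
print("Reject equal variances" if chi2 > CRITICAL_CHI2_DF2
      else "Retain equal variances")
```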

More Power to You

Assume alpha = .05, s-squared = 100, n = 10, and k = 2.

(A) df1 =
(B) df2 =
(C) Let d represent the minimal detectable difference you wish to uncover. ("Delta" doesn't print well on the web.) Using the above assumed values, determine the power to detect:
(i) d = 5
(ii) d = 10
(iii) d = 15
(iv) d = 20
Show all work.
(D) Determine the sample size required to achieve 80% power to detect a minimal detectable difference of 5. Show all work.


ANSWERS:
(A) df1 = k - 1 = 2 - 1 = 1
(B) df2 = N - k = 20 - 2 = 18
(C) Let d represent the minimal detectable difference you wish to uncover. ("Delta" doesn't print well on the web.) Using the above assumed values, determine the power to detect:
(i) d = 5; phi = 0.79 and power < .30 (by interpolation)
(ii) d = 10; phi = 1.58 and power ~= .55
(iii) d = 15; phi = 2.37 and power ~= .89
(iv) d = 20; phi = 3.16 and power ~= .99
(D) Determine the sample size required to achieve 80% power to detect a minimal detectable difference of 5. Show all work.
phi ~= 2; n = (2^2)(2)(2)(100)/5^2 = 64
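
The phi values and the sample size above can be reproduced with a few lines of Python. This is a sketch under one assumption: the formula phi = sqrt(n d^2 / (2 k s^2)) is inferred from the numbers in these answers and is not stated explicitly in the handout. Power itself is not computed here; it is still read from a power chart at the given phi, df1, and df2.

```python
import math

def phi(n, d, k, s2):
    """Chart parameter phi = sqrt(n * d^2 / (2 * k * s^2))."""
    return math.sqrt(n * d ** 2 / (2 * k * s2))

def n_per_group(phi_target, d, k, s2):
    """Solve the phi formula for n and round up to a whole subject."""
    return math.ceil(phi_target ** 2 * 2 * k * s2 / d ** 2)

# Assumed values from the exercise: s-squared = 100, n = 10, k = 2.
S2, N, K = 100, 10, 2

phis = {d: round(phi(N, d, K, S2), 2) for d in (5, 10, 15, 20)}
print(phis)  # prints {5: 0.79, 10: 1.58, 15: 2.37, 20: 3.16}

# Part (D): phi of about 2 is the chart value used above for 80% power.
required_n = n_per_group(2, 5, K, S2)
print(required_n)  # prints 64
```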