Exercises

(1) EAR.REC: Otitis Media Clinical Trial (Rosner, 1990, p. 68, modified)

Data are from a clinical trial on the treatment of acute otitis media in children. Group 1 received a 14-day trial of cefaclor. Group 2 received a 14-day trial of amoxicillin. This information is contained in the variable AB (1 = cefaclor, 2 = amoxicillin). A total of 278 infected ears were treated, with clearance of infection recorded in the variable CLEAR (1 = yes, 2 = no).

(A) Calculate the clearance rates (p̂1 and p̂2) associated with each antibiotic. (Report relevant counts and percentages.)
(B) Calculate the relative rate of clearance associated with cefaclor. Include a 95% confidence interval for RR.
(C) Perform a test of association. (Report the null and alternative hypotheses, let alpha = .05, report the test statistic in APA format, state your conclusion, and interpret your results.)

(2) PRISON.REC: Human Immunodeficiency Virus Infection in a Women's Correctional Institution (Smith et al., 1991)

A study of HIV infection in women entering the New York State Prison system cross-classified 465 inmates with respect to HIV sero-positivity (variable HIV) and history of intravenous drug use (variable IVDU).

(A) Calculate the prevalence of HIV in each exposure group.
(B) Calculate the prevalence ratio and include a 95% confidence interval for the parameter.
(C) Perform a test of the association.

(3) LABOR.REC: Induction of Labor and Meconium Staining

Induced labor (by administering pitocin and other hormones) in near-term pregnancies is a common obstetrical procedure. Meconium staining during childbirth is a sign of fetal distress. Use LABOR.REC to determine whether there is an association between INDUCE and MECON. In so doing,

(A) Report the incidence of meconium staining in both exposure groups.
(B) Calculate the relative risk for meconium staining associated with induction. (Include a 95% confidence interval.)
(C) Perform a hypothesis test for the problem. (List the null and alternative hypotheses; let alpha = .05; fully report the most appropriate test statistic; clearly state the conclusion.)
(D) Summarize your descriptive and inferential findings.

(4) OSWEGO.REC: Food Poisoning in Oswego, New York (Centers for Disease Control, 1992)

Data from an outbreak of gastrointestinal illness following a church supper in upstate New York are reported in OSWEGO.REC. Variables in the data set are self-explanatory (use the VARIABLES command to see variable names). Based on these data, fill in the table below and determine the most likely source of the agent.

                   Ate Food                Did Not Eat Food        Relative   95% conf.
Food               Ill    Total   %        Ill    Total   %        Risk       int.         p*
Baked Ham          29     46      63.0%    17     29      58.6%    1.1        0.7 - 1.6    .70
Spinach            ___    ___     ___      ___    ___     ___      ___        ___          ___
Mashed Pot.        ___    ___     ___      ___    ___     ___      ___        ___          ___
Cabbage Sal.       ___    ___     ___      ___    ___     ___      ___        ___          ___
Jell-O             ___    ___     ___      ___    ___     ___      ___        ___          ___
Rolls              ___    ___     ___      ___    ___     ___      ___        ___          ___
Brown bread        ___    ___     ___      ___    ___     ___      ___        ___          ___
Milk               ___    ___     ___      ___    ___     ___      ___        ___          ___
Coffee             ___    ___     ___      ___    ___     ___      ___        ___          ___
Water              ___    ___     ___      ___    ___     ___      ___        ___          ___
Cakes              ___    ___     ___      ___    ___     ___      ___        ___          ___
Van. ice cream     ___    ___     ___      ___    ___     ___      ___        ___          ___
Choc. ice cream    ___    ___     ___      ___    ___     ___      ___        ___          ___
Fruit salad        ___    ___     ___      ___    ___     ___      ___        ___          ___

* uncorrected chi-square or Fisher's exact test, as appropriate.
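
As a check on the arithmetic behind the completed Baked Ham row, the following is a minimal Python sketch, not the Epi Info procedure itself. It assumes the confidence interval is the common log-based (Taylor series) interval and that p comes from the uncorrected chi-square test with 1 degree of freedom. The function name risk_ratio_summary is hypothetical.

```python
import math

def risk_ratio_summary(ill_exp, n_exp, ill_unexp, n_unexp, z=1.96):
    """Return (RR, lower, upper, p) from counts in the exposed and unexposed groups."""
    rr = (ill_exp / n_exp) / (ill_unexp / n_unexp)

    # Assumed method: standard error of ln(RR), limits back-transformed.
    se = math.sqrt(1 / ill_exp - 1 / n_exp + 1 / ill_unexp - 1 / n_unexp)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)

    # Uncorrected Pearson chi-square for the 2-by-2 table.
    a, b = ill_exp, n_exp - ill_exp
    c, d = ill_unexp, n_unexp - ill_unexp
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return rr, lower, upper, p

# Baked Ham: 29 ill of 46 who ate it, 17 ill of 29 who did not.
rr, lower, upper, p = risk_ratio_summary(29, 46, 17, 29)
print(f"RR = {rr:.1f}, 95% CI {lower:.1f} - {upper:.1f}, p = {p:.2f}")
```

With the Baked Ham counts this reproduces the values shown in the table (1.1, 0.7 - 1.6, .70). The sketch does not cover Fisher's exact test, which the footnote calls for when the chi-square test is not appropriate.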

(5) FOODBRNE: Foodborne Outbreak X

The instructor will provide the background and data for a foodborne disease outbreak. Enter these data into a computer file and analyze them as in Exercise 4. Your table should look something like this:

                   Ate Food                Did Not Eat Food        Relative   95% conf.
Food               Ill    Total   %        Ill    Total   %        Risk       int.         p*
Food1              ___    ___     ___      ___    ___     ___      ___        ___          ___
Food2              ___    ___     ___      ___    ___     ___      ___        ___          ___
etc.               ___    ___     ___      ___    ___     ___      ___        ___          ___

Identify the most likely source of exposure.

(6) RESTENOS: Restenosis Following Coronary Atherectomy (Zhou et al., 1996)

Each year, cardiologists open many clogged arteries only to have these same arteries restenose following surgery. A study sponsored by the NIH / Heart, Lung and Blood Institute was performed to determine whether silent infection with a common virus (cytomegalovirus) was predictive of the regrowth of arterial plaque. In 21 of the 49 patients with serologic evidence of cytomegalovirus infection, regrowth of arterial plaque was noted. In contrast, 2 of the 26 patients without serologic evidence of cytomegalovirus had plaque regrowth.

(A) Create a 2-by-2 table with these data.
(B) Determine the relative risk of restenosis associated with cytomegalovirus infection. Include a 95% confidence interval and p value for this estimate.
(C) Do these data support the theory that subclinical viral infections may play a role in arteriosclerosis?
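
When only summary counts are reported, as here, the 2-by-2 table has to be built by subtraction before any software can be used. The Python sketch below shows that step and also computes the expected cell counts under independence. It assumes the common rule of thumb that a chi-square test is reasonable when every expected count is at least 5 and that Fisher's exact test is preferred otherwise; that threshold is a convention, not something stated in the exercise. The function name expected_counts is hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: CMV seropositive, CMV seronegative.
# Columns: plaque regrowth, no plaque regrowth.
table = [[21, 49 - 21],
         [2, 26 - 2]]

expected = expected_counts(table)
smallest = min(min(row) for row in expected)

print("observed:", table)
print(f"smallest expected count: {smallest:.1f}")
# Assumed convention: below 5, prefer Fisher's exact test.
print("chi-square reasonable:", smallest >= 5)
```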

(7) PHENFORM: Phenformin and Cardiovascular Death (Osborn, 1979, modified)

In a clinical trial of phenformin for the treatment of diabetes, 26 out of 204 patients treated with phenformin died from cardiovascular disease, whereas 2 of 64 control patients died of cardiovascular disease. Based on these data:

(A) Calculate cardiovascular death rates in each group.
(B) Put these data into a 2-by-2 table and use either EpiTable or STATCALC to calculate the relative risk of cardiovascular death associated with phenformin. Include a 95% confidence interval for RR.
(C) Perform a hypothesis test to determine whether the observed relative risk is significant.
(D) Briefly interpret your results.

(8) SIZE-COH: Cohort Power and Sample Size Exercises

(A) Assume: alpha = .05; power = .8; allocation ratio = 1:1, and background rate (p2) of 25%. What size sample is needed to detect RR = 2? RR = 3? RR = 4?
(B) What is the power of a study looking for RR = 2, assuming n1 = 50, n2 = 100, p2 = 5%, and alpha = .05? What if the true RR = 3? What if RR = 4?
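
For readers who want to see the arithmetic behind such calculations, here is a minimal Python sketch using the standard normal-approximation formulas for comparing two proportions. It makes several assumptions: no continuity correction is applied, n1 is taken to be the exposed group and n2 the unexposed group, and p1 = RR x p2. Programs such as STATCALC may apply a continuity correction and therefore report somewhat larger sample sizes. Both function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def sample_size_per_group(p2, rr, alpha=0.05, power=0.80):
    """Subjects per group for 1:1 allocation (assumed: no continuity correction)."""
    p1 = rr * p2
    if not 0 < p1 <= 1 or p1 == p2:
        raise ValueError("RR * p2 must be a proportion different from p2")
    p_bar = (p1 + p2) / 2
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

def approximate_power(n1, n2, p2, rr, alpha=0.05):
    """Approximate power of a two-sided test (assumed: n1 exposed, n2 unexposed)."""
    p1 = rr * p2
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    se_null = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    se_alt = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return STD_NORMAL.cdf((abs(p1 - p2) - z_alpha * se_null) / se_alt)

for rr in (2, 3, 4):
    n = sample_size_per_group(0.25, rr)
    pw = approximate_power(50, 100, 0.05, rr)
    print(f"RR = {rr}: n per group = {n}, power with n1 = 50, n2 = 100: {pw:.2f}")
```

Results from this sketch are best used as a rough check on the output of EpiTable or STATCALC, since small differences in method change the numbers.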

(9) BI-HELM1.REC: Bicycle Helmet Use in Two Northern California Counties (Perales et al., 1994)

In 1991, 1491 bicyclists were hospitalized for head injuries in California. Forty percent of these injuries were in 0- to 12-year-olds. BI-HELM1 contains bicycle helmet use data for 1651 bicycle riders in two northern California counties: Santa Clara County and Contra Costa County. A data documentation table for the data set is shown below:
Variable    Type     Len  Description
SCHOOL      Real     8    1=Kennedy (Santa Clara County)
                          2=Los Arboles (Santa Clara County)
                          3=Cassell (Santa Clara County)
                          4=Miner (Santa Clara County)
                          5=Sakamoto (Santa Clara County)
                          6=Toyon (Santa Clara County)
                          7=Lietz (Santa Clara County)
                          8=Sedgewick (Santa Clara County)
                          9=Belshaw (Santa Clara County)
                          10=Disco Bay (Contra Costa County)
                          11=Fair Oaks (Contra Costa County)
                          12=Grant (Contra Costa County)
                          13=Walnut Acres (Contra Costa County)
                          14=Strandwood (Contra Costa County)
                          15=Downer (Contra Costa County)
COUNTY      Integer  2    1 = Santa Clara
                          2 = Contra Costa
HELMETUSE   Integer  1    Rider wearing helmet: 1 = yes / 2 = no
MATCHVAR    Integer  1    Matching variable based on percent of school population
                          receiving reduced or free meals at school; a surrogate
                          measure of neighborhood SES. School pairs (Santa Clara
                          school / Contra Costa school) are as follows:
                          3: Miner / Fair Oaks
                          4: Sedgewick / Strandwood
                          5: Sakamoto / Walnut Acres
                          6: Toyon / Disco Bay
                          7: Lietz / Belshaw

Complete the following analyses:

(A) Calculate the helmet-use rates in Santa Clara County (p̂1) and Contra Costa County (p̂2). Report relevant counts and percentages.
(B) Calculate the rate ratio and a 95% confidence interval for RR.
(C) Test whether rates differ significantly. (List the null and alternative hypotheses; let alpha = .05; report the hypothesis testing statistic; state the conclusion to the test.)