Chapter 3

Juan H Klopper (MD), Do Hee Lee (MPH)

The George Washington University

This chapter of Applied Categorical Data Analysis by JH Klopper and DH Lee is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International

1 Introduction

In the previous chapter, we explored measures of association between two binary variables. Our aim was to understand if there was an association between the exposure and outcome variables. We used the risk difference (RD), risk ratio (RR), and odds ratio (OR), together with their confidence intervals, to express the association and direction of the association between the two variables. Finally, we explored the use of Pearson's \chi^{2} test to determine independence between two variables.

To test for independence (or association), we stated the null hypothesis and alternative hypothesis in several ways. In terms of words, we have the following hypotheses for the \chi^{2} test.

  • H_{0}: Outcome level and exposure level are not associated (they are independent)
  • H_{1}: Outcome level and exposure level are associated (they are not independent)

The hypotheses are stated in statistical terms as follows.

  • H_{0}: \; \pi_{ij} = \pi_{i+} \pi_{+j} (independence under the null hypothesis)
  • H_{1}: \; \pi_{ij} \ne \pi_{i+} \pi_{+j}

In terms of the measures of association, we stated the following when expressing confidence intervals.

  • H_{0}: \; \text{RD} = 0 vs. H_{1}: \; \text{RD} \ne 0
  • H_{0}: \; \text{RR} = 1 vs. H_{1}: \; \text{RR} \ne 1
  • H_{0}: \; \text{OR} = 1 vs. H_{1}: \; \text{OR} \ne 1

There are other tests that we can use to test for independence. These include the likelihood ratio test, Fisher's exact test, and Fisher-Boschloo's test (among others). We explore these tests in this chapter, as outlined below.

  • Section 3, The Likelihood Ratio Test: the likelihood ratio test
  • Section 4, Conditional Exact Tests: Fisher's exact test and Fisher-Boschloo's exact test

2 Libraries

Load the required libraries for Chapter 3. If these packages are not already installed, please do so before loading the libraries. Note that some packages used later in the chapter (Exact, ggstatsplot) are accessed with the :: operator, which requires them to be installed but not loaded.

Code
library(rio)
library(DescTools)
library(sjPlot)
library(epitools)
library(epiR)

3 The Likelihood Ratio Test

3.1 The Likelihood Ratio Test as a Determinant of Independence

The likelihood ratio test (LRT) is used as an alternative to Pearson’s \chi^{2} test. Both test independence between two categorical variables, that is to say, no association under the null hypothesis. While the LRT is much more commonly used when comparing models, we introduce it here in the setting of determining the independence between two variables.

We have already explored the likelihood function to determine the probability of observing the data for different values of a specified parameter.

The likelihood-ratio test (LRT) (sometimes called the likelihood-ratio chi-squared test) is a hypothesis test that helps us choose between two competing models. The preferred model is the one that makes the observed data most likely, that is, the one that maximizes the likelihood function.

Likelihood-ratio tests use log-likelihood functions, which are difficult and lengthy to calculate by hand. Most statistical software packages have built-in functions to perform the calculations. Log-likelihood functions pose other serious challenges, such as the difficulty of finding global maxima, which often involves hefty computations with multi-dimensional derivatives.

Both Pearson's \chi^{2} test and the LRT lead to similar conclusions, especially when we have large sample sizes. The LRT is used more often in conjunction with models, and we will use it when we start to explore generalized linear models in later chapters.

In the case of a 2 \times 2 table, we can determine the likelihood function when we know the study (sampling) design (case-control, cohort or trial, or cross-sectional). In the case of a prospective design, as an example, the likelihood function is the product of two binomial distributions with success probabilities P(D \vert E) and P(D \vert \overline{E}).

3.2 Examining the Likelihood Ratio Test

The LRT is based on the concept of a ratio, described by the equation in Equation 1, where we use the symbols, L_{0} and L_{1}, and reference the null hypothesis, H_{0} and the alternative hypothesis, H_{1}.

\begin{align} &\Lambda = \frac{\text{max likelihood when parameters satisfy } H_{0}}{\text{max likelihood when parameters satisfy } H_{1}} \\ &\Lambda = \frac{L_{0}}{L_{1}} \end{align} \tag{1}

If \Lambda is much less than 1.0 (L_{0} much less than L_{1}), then we have strong evidence against H_{0}. If \Lambda is approximately equal to 1.0 (L_{0} \approx L_{1}), then we have no or weak evidence against H_{0}.

In this chapter, we will use R to do these computations for us. Rather than \Lambda itself, we report the G^{2} statistic, shown in Equation 2, where \log denotes the natural logarithm, \ln.

G^{2} = -2 \log{\left( \frac{L_{0}}{L_{1}} \right)} \tag{2}

The G-test of independence is a likelihood ratio test which tests the goodness of fit of the observed frequencies to the frequencies expected if the row and column classifications were independent.

As with Pearson’s \chi^{2} test, we also consider the comparison of the tables of observed, O, and expected, E, values, but in a different way, shown in Equation 3, where we make use of the law of logarithms.

\begin{align} &G^{2} = 2 \sum_{i=1}^{I} \sum_{j=1}^{J} O_{ij} \log{\left( \frac{O_{ij}}{E_{ij}} \right)} \\ &G^{2} = 2 \sum_{i=1}^{I} \sum_{j=1}^{J} O_{ij} \left[ \log{\left( O_{ij} \right) - \log{\left( E_{ij} \right)}} \right] \end{align} \tag{3}

Under H_{0}, and if no more than 20% of the expected values are less than 5, we have that G^{2} \sim \chi^{2}_{(I-1)(J-1)}. In the case of a 2 \times 2 table, this means that all four expected values should be greater than or equal to 5.

Since we are using the \chi^{2} distribution as an approximation for the distribution of the G^{2} test statistic, we can express critical values for \alpha = 0.1, \alpha = 0.05, and \alpha = 0.01 (the values most often used), using the qchisq function.

Code
qchisq(0.1, df = 1, lower.tail = FALSE)
[1] 2.705543
Code
qchisq(0.05, df = 1, lower.tail = FALSE)
[1] 3.841459
Code
qchisq(0.01, df = 1, lower.tail = FALSE)
[1] 6.634897

For a chosen \alpha value, we reject the null hypothesis if G^{2} is larger than these critical values.
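
To obtain a p value instead of comparing against a critical value, we pass the observed test statistic to the pchisq function. As a preview, the G^{2} value of 9.39 derived in the next section returns a p value of approximately 0.0022.

Code
# p value for a G^2 (or X^2) statistic of 9.39 on 1 degree of freedom
pchisq(9.39, df = 1, lower.tail = FALSE)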

3.3 Example

We use the same example as in the previous chapter (the measures_assoc.csv file), imported below.

Code
dfr <- import_dataset("measures_assoc.csv")

We see the table of observed values below, using the table function.

Code
tb <- table(dfr$Exposure, dfr$Disease)
tb
    
      D ND
  E  21 19
  NE  8 32

The table of expected values can be returned using the chisq.test function (passing the table of observed values, tb, as argument), and calling the expected attribute.

Code
chisq.test(tb)$expected
    
        D   ND
  E  14.5 25.5
  NE 14.5 25.5

We note that all four expected values are greater than or equal to 5, and we can use the LRT. The hypotheses are as stated in the introduction to this chapter.

As an exercise, we substitute the values into Equation 3.

Code
2 * ((21 * log(21/14.5)) + (19 * log(19/25.5)) + (8 * log(8/14.5)) + (32 * log(32/25.5)))
[1] 9.390962

We have that G^{2} = 9.39. For 1 degree of freedom, and \alpha = 0.05, we see that G^{2} > 3.841 and (as with Pearson's \chi^{2} test, performed below using the chisq.test function), we reject the null hypothesis.

Code
chisq.test(tb)

    Pearson's Chi-squared test with Yates' continuity correction

data:  tb
X-squared = 7.789, df = 1, p-value = 0.005256

In real-world research analysis, remember to include (the appropriate) measures of association together with their confidence intervals as a more complete description of the conclusion.
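
As an aside, we can obtain the same G^{2} value directly from Equation 2 under the prospective design described above, where the likelihood is a product of two binomial distributions. The sketch below estimates the success probabilities separately per exposure group under H_{1} and uses the pooled probability 29/80 under H_{0}; it reproduces G^{2} = 9.39.

Code
# Log-likelihood under H1: separate success probabilities per group
ll_1 <- dbinom(21, 40, 21/40, log = TRUE) + dbinom(8, 40, 8/40, log = TRUE)
# Log-likelihood under H0: a common (pooled) success probability
ll_0 <- dbinom(21, 40, 29/80, log = TRUE) + dbinom(8, 40, 29/80, log = TRUE)
# G^2 = -2 log(L0 / L1)
-2 * (ll_0 - ll_1)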

3.4 The GTest Function for the Likelihood Ratio Test

The LRT can be performed using the GTest function from the DescTools library. We simply pass the table of observed values as argument.

Code
tst_lrt <- DescTools::GTest(tb, correct = "none")
tst_lrt

    Log likelihood ratio (G-test) test of independence without correction

data:  tb
G = 9.391, X-squared df = 1, p-value = 0.002181

Note that the GTest function uses the symbol G for the test statistic. It is, in fact, the G^{2} test statistic value.

Since p < \alpha for \alpha=0.05, we reject the null hypothesis. From Figure 1, created by the dist_chisq function (sjPlot library), we see that the G^{2} statistic is larger than the critical value for \alpha=0.05. (In fact, it is so far to the right of the critical value that it is not visible.)

Code
sjPlot::dist_chisq(
  chi2 = tst_lrt$statistic,
  deg.f = 1,
  p = 0.05
)

Figure 1: Visual Representation of Test Results

We now have to explore cases in which the assumptions for the use of the \chi^{2} distribution are not met.

4 Conditional Exact Tests

We use the \chi^{2} distribution to approximate the sampling distribution of the X^{2} test statistic (for Pearson’s \chi^{2} test) and the G^{2} test statistic (for the G test as an example of a likelihood ratio test).

The approximation can be used when no more than 20% of the expected values are less than 5. When this condition is not met, we must consider exact tests.

Fisher’s exact test is a conditional exact test that does not have requirements for the expected values. Fisher’s exact test is conservative and may suffer from type II errors (failing to reject a false null hypothesis).

Fisher-Boschloo’s exact test is an unconditional exact test and is arguably more powerful. It likewise makes no requirements for the expected frequencies.

4.1 Fisher’s Exact Test

Fisher’s exact test is conditional on the row and column totals (the marginal totals) being fixed. In the table below, we state a value for n_{11}. Given fixed marginal totals, this value determines all the other cell values.

D \overline{D} Total
E n_{11} n_{1+}
\overline{E} n_{2+}
Total n_{+1} n_{+2} n

Remember that, in some prospective studies, the row totals are fixed (the level of exposure is fixed by the study design). For a retrospective study, the column totals are fixed. In a cross-sectional study, only the grand total is fixed. For Fisher’s exact test, we treat both the row and column totals as fixed (irrespective of the actual study design).

Considering the table above, we actually only need four values for a 2 \times 2 table: n_{11}, n_{1+}, n_{+1}, and n. In the table below, we have two levels of the explanatory variable, E and \overline{E}, and we have success, D, and failure, \overline{D}, as the levels of the response variable.

D \overline{D} TOTAL
E n_{11} n_{1+}
\overline{E}
TOTAL n_{+1} n

Note that these four values uniquely define the table. All the other cells can be calculated, as shown in the next table.

D \overline{D} TOTAL
E n_{11} n_{1+} - n_{11} n_{1+}
\overline{E} n_{+1} - n_{11} n - n_{1+} - (n_{+1} - n_{11}) n - n_{1+}
TOTAL n_{+1} n - n_{+1} n

As mentioned, with the marginal totals fixed, the value of one cell determines the others through simple arithmetic. As an illustration, we concentrate on n_{11}.
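
A minimal sketch makes this concrete. Note that complete_table is a hypothetical helper written for this illustration, not a library function.

Code
# Hypothetical helper: rebuild a 2x2 table from n11 and the fixed margins
complete_table <- function(n11, n1p, np1, n) {
  matrix(
    c(n11, np1 - n11,                    # first column: D
      n1p - n11, n - n1p - (np1 - n11)), # second column: not D
    nrow = 2,
    dimnames = list(c("E", "Not E"), c("D", "Not D"))
  )
}
complete_table(n11 = 3, n1p = 7, np1 = 5, n = 11)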

Under the null hypothesis, and with row and column totals fixed, n_{11} (the only cell free to vary) follows a hypergeometric distribution, shown in Equation 4.

P(n_{11}) = \frac{\begin{pmatrix} n_{1+} \\ n_{11} \end{pmatrix} \begin{pmatrix} n_{2+} \\ n_{+1} - n_{11} \end{pmatrix}}{\begin{pmatrix} n \\ n_{+1} \end{pmatrix}} \tag{4}

Because of the fixed totals, we have the limits for n_{11}, shown in Equation 5.

\max{(0, n_{1+} - n_{+2})} \le n_{11} \le \min{(n_{+1} , n_{1+})} \tag{5}

We can compute the probability of each possible 2 \times 2 table under the null hypothesis. The p value for testing H_{0}: \; \text{OR} = 1 vs. H_{1}: \; \text{OR} \ne 1 is the sum of the table probabilities that are as small as, or smaller than, that of the table we observe. We can also use the risk difference or relative risk (design dependent). In this course, as in R, we use the OR.

4.1.1 Example

We use data from a hypothetical small experimental vaccine trial. Participants received either the new vaccine or a placebo. Data were collected on whether the disease was prevented (denoted by Yes) or not (denoted by No).

Code
dfr <- import_dataset("small_vaccine.csv")

We see the table of observed values below.

Code
# Remember to set the order of the levels
dfr$Group <- factor(dfr$Group, levels = c("Vaccine", "Placebo"))
dfr$Prevented <- factor(dfr$Prevented, levels = c("Yes", "No"))
tb <- table(dfr$Group, dfr$Prevented)
tb
         
          Yes No
  Vaccine   5  2
  Placebo   0  4

The table of expected values is shown in Table 1. All values are less than 5, and we have to use an exact test as our test statistics X^{2} \nsim \chi^{2}_{\text{df}} and G^{2} \nsim \chi^{2}_{\text{df}}.

Table 1: Table of Expected Values
Expected Yes No
Vaccine 3.18 3.82 7
Placebo 1.82 2.18 4
5 6 11
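
Table 1 can be reproduced by calling the expected attribute of the chisq.test function (R warns that the \chi^{2} approximation may be incorrect, which is precisely why we turn to an exact test).

Code
chisq.test(tb)$expected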

If the marginal totals are fixed, then n_{11} can take the values 1, 2, 3, 4, or 5, using Equation 5. The possible tables under the null hypothesis are shown in Table 2, Table 3, Table 4, and Table 5.
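
We can confirm these limits from Equation 5 in R.

Code
# Limits for n11 from Equation 5, with n1+ = 7, n+2 = 6, and n+1 = 5
max(0, 7 - 6) # lower limit
min(5, 7)     # upper limit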

Table 2: First Case
n_{11}=1 Yes No
Vaccine 1 6 7
Placebo 4 0 4
5 6 11

We can use the dhyper function to calculate the probability of the table above, using Equation 4. The arguments are (n_{11}, n_{1+}, n_{2+}, n_{+1}).

Code
dhyper(1, 7, 4, 5)
[1] 0.01515152
Table 3: Second Case
n_{11}=2 Yes No
Vaccine 2 5 7
Placebo 3 1 4
5 6 11

The probability for the table above is shown below.

Code
dhyper(2, 7, 4, 5)
[1] 0.1818182
Table 4: Third Case
n_{11}=3 Yes No
Vaccine 3 4 7
Placebo 2 2 4
5 6 11

The probability for the table above is shown below.

Code
dhyper(3, 7, 4, 5)
[1] 0.4545455
Table 5: Fourth Case
n_{11}=4 Yes No
Vaccine 4 3 7
Placebo 1 3 4
5 6 11

The probability for the table above is shown below.

Code
dhyper(4, 7, 4, 5)
[1] 0.3030303

For the observed data, n_{11}=5, we calculate the probability below.

Code
dhyper(5, 7, 4, 5)
[1] 0.04545455

What are the tables with probabilities equal to or less than p = 0.045 (the probability of the observed table, with n_{11}=5)? There is only one, the table with n_{11} = 1, with p=0.015. We sum these probabilities: p = 0.0455 + 0.0152 = 0.0606. The resultant value is the probability of observing data as extreme as, or more extreme than, the data we observed.
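
This summation can be automated over all admissible values of n_{11}; the small sketch below (the tolerance guards against floating-point comparison error) returns 0.0606, matching the fisher.test output in the next section.

Code
probs <- dhyper(1:5, 7, 4, 5)      # probabilities for n11 = 1, ..., 5
p_obs <- dhyper(5, 7, 4, 5)        # probability of the observed table
sum(probs[probs <= p_obs + 1e-12]) # two-sided p value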

4.1.2 fisher.test Function for the Fisher’s Exact Test

The fisher.test function is used for Fisher’s exact test. We pass the table of observed values as argument.

Code
fisher.test(tb)

    Fisher's Exact Test for Count Data

data:  tb
p-value = 0.06061
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.7331565       Inf
sample estimates:
odds ratio 
       Inf 

Fisher’s exact test can also be calculated as in Equation 6 (using factorials), with the cell values as shown in Table 6.

Table 6: Table for using Factorials
D \overline{D} Total
E a b a+b
\overline{E} c d c+d
Total a+c b+d n

p = \frac{(a+b)! \, (c+d)! \, (a+c)! \, (b+d)! \, }{a! \, b! \, c! \, d! \, n!} \tag{6}

We calculate the results using the code below.

Code
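# Probability of the observed table (a = 5, b = 2, c = 0, d = 4); 0! = 1 is omitted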
(factorial(7) * factorial(4) * factorial(5) * factorial(6)) / (factorial(5) * factorial(2) * factorial(4) * factorial(11))
[1] 0.04545455
Code
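# Probability of the table with n11 = 1 (a = 1, b = 6, c = 4, d = 0); 1! and 0! are omitted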
(factorial(7) * factorial(4) * factorial(5) * factorial(6)) / (factorial(6) * factorial(4) * factorial(11))
[1] 0.01515152

The sum of these probabilities equals the p value from the fisher.test function.

4.2 Fisher-Boschloo’s Test

We only briefly mention Fisher-Boschloo’s test in this chapter.

The conservative nature of Fisher’s exact test is a consequence of the discreteness of the hypergeometric distribution. It may also seem odd to treat both row and column totals as fixed.

A test that addresses these problems is Fisher-Boschloo’s test. It uses the p value from Fisher’s exact test as its test statistic and is at least as powerful as Fisher’s exact test.
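
We can verify this with our vaccine example: the p value computed below reappears as the test statistic in the exact.test output further down.

Code
# Fisher's p value serves as Boschloo's test statistic
fisher.test(tb)$p.value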

In Table 7, we see a reminder of the data from above.

Table 7: Reminder of the Data
Success Failure Total
Treatment 5 2 7
Control 0 4 4
Total 5 6 11

The exact.test function from the Exact library is used to conduct Fisher-Boschloo’s test, as shown in the code chunk below.

Code
Exact::exact.test(
  data = tb,
  method = "boschloo",
  to.plot = FALSE
)

    Boschloo's Exact Test

data:  5 out of 7 vs. 0 out of 4
test statistic = 0.060606, first sample size = 7, second sample size =
4, p-value = 0.03223
alternative hypothesis: true difference in proportion is not equal to 0
sample estimates:
difference in proportion 
               0.7142857 

4.3 Comparison of Tests for Association

We can compare all the tests using our illustrative example, selecting a level of significance of \alpha=0.05. The results are summarized in Table 8.

We start with Pearson’s \chi^{2} test.

Code
chisq.test(tb, correct = FALSE)
Warning in chisq.test(tb, correct = FALSE): Chi-squared approximation may be
incorrect

    Pearson's Chi-squared test

data:  tb
X-squared = 5.2381, df = 1, p-value = 0.0221

Below, we conduct the LRT.

Code
DescTools::GTest(tb, correct = "none")

    Log likelihood ratio (G-test) test of independence without correction

data:  tb
G = 6.7824, X-squared df = 1, p-value = 0.009206

Below we use Fisher’s exact test.

Code
fisher.test(tb)

    Fisher's Exact Test for Count Data

data:  tb
p-value = 0.06061
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.7331565       Inf
sample estimates:
odds ratio 
       Inf 

Finally, we have Fisher-Boschloo’s test.

Code
Exact::exact.test(
  data = tb,
  method = "boschloo",
  to.plot = FALSE
)

    Boschloo's Exact Test

data:  5 out of 7 vs. 0 out of 4
test statistic = 0.060606, first sample size = 7, second sample size =
4, p-value = 0.03223
alternative hypothesis: true difference in proportion is not equal to 0
sample estimates:
difference in proportion 
               0.7142857 

In the table below, we see the results for the observed data for our illustrative example, using all four tests that we have explored.

Table 8: Comparison of Tests
Test Test statistic p value Conclusion
Pearson’s \chi^{2} test \chi^2=5.24 p=0.0221 Reject H_{0}
Likelihood-ratio test G^{2}=6.78 p=0.0092 Reject H_{0}
Fisher’s exact test n_{11} = 5 p=0.0606 Fail to reject H_{0}
Fisher-Boschloo test p_{\text{Fisher}} = 0.0606 p=0.0322 Reject H_{0}

We note that the largest p value belongs to Fisher’s exact test, consistent with the criticism of its conservativeness mentioned at the start of this section.

5 The Association Between Multi-Level Categorical Variables

We have concentrated on the association between binary variables. In this section, we begin our exploration of multi-level categorical variables. From multi-level variables, we create I \times J contingency tables and explore measures of association through confidence intervals for these measures, and hypothesis testing using Pearson’s \chi^{2} test and the likelihood ratio test.

5.1 Illustrative Data

The data that we will use in this section represent a study of education level and contraceptive use (from the National Indonesia Contraceptive Survey of married couples). There are four levels of the exposure variable \texttt{Education}: High, Med-High, Med-Low, and Low, representing four different levels of education. The three levels of the outcome variable, \texttt{Contraception}, are Long-term, Short-term, and No use, representing the use of contraceptives.

Code
dfr <- import_dataset("ed_contra.csv")

We set the levels such that the table of observed values follows the order of levels that we use as standard in this text.

Code
dfr$Education <- factor(
  dfr$Education,
  levels = c("High", "Med-High", "Med-Low", "Low")
)
dfr$Contraception <- factor(
  dfr$Contraception,
  levels = c("Long-term", "Short-term", "No use")
)

Below, we see the table of observed data.

Code
tb <- table(dfr$Education, dfr$Contraception)
tb
          
           Long-term Short-term No use
  High           207        195    175
  Med-High        80        155    175
  Med-Low         37        121    176
  Low              9         40    103

5.2 Reference Levels

For multi-level variables, we choose a level as our reference. Other levels are compared to the reference level.

It makes sense in the case of our illustrative example to use the low education level, Low, and the no contraceptive use level, No use, as reference levels for the two variables. In general, the choice for each variable should be driven by the research question.

From this choice, we can create six different contingency tables, shown in Table 9, Table 10, Table 11, Table 12, Table 13, and Table 14.

Table 9: First Table
Long-term No use
High 207 175
Low 9 103
Table 10: Second Table
Long-term No use
Med-High 80 175
Low 9 103
Table 11: Third Table
Long-term No use
Med-Low 37 176
Low 9 103
Table 12: Fourth Table
Short-term No use
High 195 175
Low 40 103
Table 13: Fifth Table
Short-term No use
Med-High 155 175
Low 40 103
Table 14: Sixth Table
Short-term No use
Med-Low 121 176
Low 40 103
Code
high_long <- matrix(c(207, 9, 175, 103), nrow = 2)
med_high_long <- matrix(c(80, 9, 175, 103), nrow = 2)
med_low_long <- matrix(c(37, 9, 176, 103), nrow = 2)
high_short <- matrix(c(195, 40, 175, 103), nrow = 2)
med_high_short <- matrix(c(155, 40, 175, 103), nrow = 2)
med_low_short <- matrix(c(121, 40, 176, 103), nrow = 2)

We use the epi.2by2 function to calculate the measures of association for each sub-table.

Code
epi.2by2(high_long)
             Outcome +    Outcome -      Total                 Inc risk *
Exposed +          207          175        382     54.19 (49.05 to 59.27)
Exposed -            9          103        112       8.04 (3.74 to 14.71)
Total              216          278        494     43.72 (39.30 to 48.23)

Point estimates and 95% CIs:
-------------------------------------------------------------------
Inc risk ratio                                 6.74 (3.58, 12.70)
Inc odds ratio                                 13.54 (6.65, 27.54)
Attrib risk in the exposed *                   46.15 (39.06, 53.25)
Attrib fraction in the exposed (%)            85.17 (72.07, 92.13)
Attrib risk in the population *                35.69 (29.02, 42.36)
Attrib fraction in the population (%)         81.62 (66.24, 90.00)
-------------------------------------------------------------------
Uncorrected chi2 test that OR = 1: chi2(1) = 74.973 Pr>chi2 = <0.001
Fisher exact test that OR = 1: Pr>chi2 = <0.001
 Wald confidence limits
 CI: confidence interval
 * Outcomes per 100 population units 
Code
epi.2by2(med_high_long)
             Outcome +    Outcome -      Total                 Inc risk *
Exposed +           80          175        255     31.37 (25.73 to 37.46)
Exposed -            9          103        112       8.04 (3.74 to 14.71)
Total               89          278        367     24.25 (19.95 to 28.97)

Point estimates and 95% CIs:
-------------------------------------------------------------------
Inc risk ratio                                 3.90 (2.03, 7.50)
Inc odds ratio                                 5.23 (2.52, 10.87)
Attrib risk in the exposed *                   23.34 (15.74, 30.94)
Attrib fraction in the exposed (%)            74.39 (50.82, 86.66)
Attrib risk in the population *                16.21 (9.54, 22.89)
Attrib fraction in the population (%)         66.86 (40.34, 81.60)
-------------------------------------------------------------------
Uncorrected chi2 test that OR = 1: chi2(1) = 23.071 Pr>chi2 = <0.001
Fisher exact test that OR = 1: Pr>chi2 = <0.001
 Wald confidence limits
 CI: confidence interval
 * Outcomes per 100 population units 
Code
epi.2by2(med_low_long)
             Outcome +    Outcome -      Total                 Inc risk *
Exposed +           37          176        213     17.37 (12.54 to 23.14)
Exposed -            9          103        112       8.04 (3.74 to 14.71)
Total               46          279        325     14.15 (10.55 to 18.42)

Point estimates and 95% CIs:
-------------------------------------------------------------------
Inc risk ratio                                 2.16 (1.08, 4.32)
Inc odds ratio                                 2.41 (1.12, 5.19)
Attrib risk in the exposed *                   9.34 (2.18, 16.49)
Attrib fraction in the exposed (%)            53.74 (7.62, 76.83)
Attrib risk in the population *                6.12 (-0.18, 12.42)
Attrib fraction in the population (%)         43.23 (0.87, 67.48)
-------------------------------------------------------------------
Uncorrected chi2 test that OR = 1: chi2(1) = 5.265 Pr>chi2 = 0.022
Fisher exact test that OR = 1: Pr>chi2 = 0.029
 Wald confidence limits
 CI: confidence interval
 * Outcomes per 100 population units 
Code
epi.2by2(high_short)
             Outcome +    Outcome -      Total                 Inc risk *
Exposed +          195          175        370     52.70 (47.48 to 57.89)
Exposed -           40          103        143     27.97 (20.79 to 36.09)
Total              235          278        513     45.81 (41.43 to 50.23)

Point estimates and 95% CIs:
-------------------------------------------------------------------
Inc risk ratio                                 1.88 (1.42, 2.49)
Inc odds ratio                                 2.87 (1.89, 4.36)
Attrib risk in the exposed *                   24.73 (15.79, 33.68)
Attrib fraction in the exposed (%)            46.92 (29.76, 59.89)
Attrib risk in the population *                17.84 (9.31, 26.36)
Attrib fraction in the population (%)         38.94 (22.88, 51.65)
-------------------------------------------------------------------
Uncorrected chi2 test that OR = 1: chi2(1) = 25.411 Pr>chi2 = <0.001
Fisher exact test that OR = 1: Pr>chi2 = <0.001
 Wald confidence limits
 CI: confidence interval
 * Outcomes per 100 population units 
Code
epi.2by2(med_high_short)
             Outcome +    Outcome -      Total                 Inc risk *
Exposed +          155          175        330     46.97 (41.48 to 52.51)
Exposed -           40          103        143     27.97 (20.79 to 36.09)
Total              195          278        473     41.23 (36.75 to 45.81)

Point estimates and 95% CIs:
-------------------------------------------------------------------
Inc risk ratio                                 1.68 (1.26, 2.24)
Inc odds ratio                                 2.28 (1.49, 3.49)
Attrib risk in the exposed *                   19.00 (9.88, 28.11)
Attrib fraction in the exposed (%)            40.45 (20.66, 55.30)
Attrib risk in the population *                13.25 (4.66, 21.84)
Attrib fraction in the population (%)         32.15 (14.70, 46.03)
-------------------------------------------------------------------
Uncorrected chi2 test that OR = 1: chi2(1) = 14.860 Pr>chi2 = <0.001
Fisher exact test that OR = 1: Pr>chi2 = <0.001
 Wald confidence limits
 CI: confidence interval
 * Outcomes per 100 population units 
Code
epi.2by2(med_low_short)
             Outcome +    Outcome -      Total                 Inc risk *
Exposed +          121          176        297     40.74 (35.10 to 46.57)
Exposed -           40          103        143     27.97 (20.79 to 36.09)
Total              161          279        440     36.59 (32.08 to 41.28)

Point estimates and 95% CIs:
-------------------------------------------------------------------
Inc risk ratio                                 1.46 (1.08, 1.96)
Inc odds ratio                                 1.77 (1.15, 2.73)
Attrib risk in the exposed *                   12.77 (3.53, 22.01)
Attrib fraction in the exposed (%)            31.34 (7.63, 48.96)
Attrib risk in the population *                8.62 (-0.01, 17.24)
Attrib fraction in the population (%)         23.55 (4.41, 38.86)
-------------------------------------------------------------------
Uncorrected chi2 test that OR = 1: chi2(1) = 6.783 Pr>chi2 = 0.009
Fisher exact test that OR = 1: Pr>chi2 = 0.011
 Wald confidence limits
 CI: confidence interval
 * Outcomes per 100 population units 

The more levels there are, the more sub-tables we have to create, and stating the conclusions becomes ever more tedious. For example, for sub-table 1 (Table 9), using the relative risk, we would state that the chance of being a long-term contraceptive user vs. a non-user in the high education group is 574% greater than the chance of being a long-term contraceptive user vs. a non-user in the low education group.
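
To reduce the tedium, we can compute the Wald odds ratio and its 95% confidence interval for all six sub-tables in one pass. The sketch below uses or_wald, a small hypothetical helper (not a library function); it reproduces the Inc odds ratio rows in the epi.2by2 output above.

Code
# Hypothetical helper: Wald odds ratio and 95% CI for a 2x2 matrix
or_wald <- function(m) {
  or <- (m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1])
  se <- sqrt(sum(1 / m)) # standard error of log(OR)
  exp(log(or) + c(OR = 0, lower = -1, upper = 1) * qnorm(0.975) * se)
}

sapply(
  list(
    high_long = high_long, med_high_long = med_high_long,
    med_low_long = med_low_long, high_short = high_short,
    med_high_short = med_high_short, med_low_short = med_low_short
  ),
  or_wald
)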

5.3 Tests Using I \times J Tables

If the assumptions for the expected cell counts are met, we can use the following two approximations for the X^{2} statistic (Pearson’s \chi^{2} test) and the G^{2} statistic (likelihood ratio test (LRT)).

  • X^{2} \sim \chi^{2}_{(I-1)(J-1)}
  • G^{2} \sim \chi^{2}_{(I-1)(J-1)}

We can call the expected attribute of the chisq.test function to verify that no more than 20% of the expected cell values are less than 5.

Code
tb <- table(dfr$Education, dfr$Contraception) # Table of observed values
chisq.test(tb)$expected
          
           Long-term Short-term    No use
  High     130.44196  200.16768 246.39036
  Med-High  92.68839  142.23354 175.07807
  Med-Low   75.50713  115.86830 142.62458
  Low       34.36253   52.73048  64.90699

None of the expected values are less than 5.

5.4 Pearson \chi^{2} Test

For \alpha = 0.05, we state our hypotheses.

  • H_{0}: Education level and contraceptive use are not associated
  • H_{1}: Education level and contraceptive use are associated

We conduct the test using the chisq.test function.

Code
tst_chi <- chisq.test(tb)
tst_chi

    Pearson's Chi-squared test

data:  tb
X-squared = 140.46, df = 6, p-value < 2.2e-16

The test statistic, X^{2}, is very high, with a negligible p value. We reject the null hypothesis and state that there is an association between the variables.

Since we have rejected H_{0}, we present the individual sub-tables. We might also explore cell residuals.

5.5 Cell Residuals

Cell-by-cell comparisons between the observed and expected tables can help us understand departures from the null hypothesis.

In Equation 7 we see raw residuals and in Equation 8, standardized residuals.

O_{ij} - E_{ij} \tag{7}

\frac{O_{ij} - E_{ij}}{\sqrt{E_{ij} (1 - p_{i+}) (1 - p_{+j})}} \tag{8}

In the denominator of Equation 8, we have the standard error (the standard deviation of a sampling distribution) of the numerator under the null hypothesis. Under H_{0}, each standardized residual approximately follows the standard normal distribution. Absolute values larger than 2 suggest evidence against H_{0}. This is usually stated as a lack of fit for H_{0} (referencing a null model, i.e. the data do not fit the null model).
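
To make Equation 8 concrete, the sketch below computes the standardized residuals by hand from the tables of observed and expected values (using the tst_chi object created above); it reproduces the stdres values reported further down.

Code
O <- tst_chi$observed        # table of observed values
E <- tst_chi$expected        # table of expected values
p_row <- rowSums(O) / sum(O) # marginal row proportions
p_col <- colSums(O) / sum(O) # marginal column proportions
(O - E) / sqrt(E * outer(1 - p_row, 1 - p_col))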

Note that the residuals attribute returns the Pearson residuals, (O_{ij} - E_{ij}) / \sqrt{E_{ij}}, a scaled version of the raw residuals in Equation 7.

Code
tst_chi$residuals
          
             Long-term  Short-term      No use
  High      6.70320010 -0.36525742 -4.54807602
  Med-High -1.31793467  1.07045789 -0.00590037
  Med-Low  -4.43146328  0.47673747  2.79466408
  Low      -4.32663201 -1.75312942  4.72824006

The standardized residuals are returned by calling the stdres attribute.

Code
tst_chi$stdres
          
              Long-term   Short-term       No use
  High      9.769638634 -0.579509094 -7.703804154
  Med-High -1.763508559  1.559258659 -0.009175804
  Med-Low  -5.728430053  0.670861142  4.198549467
  Low      -5.193365290 -2.290750441  6.595984604

We interpret the first cell by stating that the observed value is almost 10 standard deviations away from what is expected under the null hypothesis. This cell contributes much to the X^{2} statistic that we found.

If we consider the larger positive standardized residuals, we find the two-category level combinations that are much higher than what would be expected if the variables were not associated. These are listed below.

  • High education level - long-term contraception use (9.77)
  • Medium-low education level - no contraceptive use (4.20)
  • Low education level - no contraception use (6.60)

If we consider the larger negative standardized residuals, we find the two-category level combinations that are much lower than what would be expected if the variables were not associated. These are listed below.

  • High education level - no contraceptive use (-7.7)
  • Medium-low education level - long-term contraceptive use (-5.73)
  • Low education level - long-term contraception use (-5.19)
  • Low education level - short-term contraception use (-2.29)

6 Lab Problems

6.1 Question

You are part of a team conducting a research project investigating the long-term health benefits of financial support to members of an under-privileged community. Two variables are measured: the financial support group to which each individual belongs and whether malnutrition decreased. There are two financial aid groups, Cash and Control. The former are members of the community chosen to receive financial aid and the latter are controls from the community (not receiving financial aid). There are two malnutrition groups, Improved and Not improved. Your task is to answer the research question: Are financial aid group and malnutrition independent at the 5% significance level?

Answer the research question by (1) stating the test hypothesis, (2) generating a table of observed data, (3) determining if the \chi^{2} distribution can be used to approximate the sampling distribution of the test statistic, and by contrasting the results from two appropriate tests for association, using (4) a critical value approach and (5) a p value approach. Lastly (6), use the results from either test to write a comment summarizing your answer to the research question.

Data for the two variables are generated below.

Code
set.seed(12) 
group <- rep(c("Cash", "Control"), each = 20)
malnutrition <- sample(c("Improved", "Not improved"), size = 40, replace = TRUE)

dfr <- data.frame(
  Group = group,
  Malnutrition = malnutrition
)

Before starting any analysis, we consider summary statistics and data visualization. Part 2 already asks for a table of observed data. Below, we visualize the data as a stacked bar chart.

Code
ggstatsplot::ggbarstats(
  dfr,
  y = Group,
  x = Malnutrition,
  bf.message = FALSE,
  results.subtitle = FALSE, 
  palette = "Blues",
  title = "Percentage of malnutrition improvement in the two groups"
)

The null hypothesis states that the \texttt{Group} and \texttt{Malnutrition} variables are independent. The alternative hypothesis is that they are dependent. It can also be written in terms of joint population proportions and marginal population proportions.

  • H_{0}: \; \pi_{ij} = \pi_{i+} \pi_{+j}
  • H_{1}: \; \pi_{ij} \ne \pi_{i+} \pi_{+j}

We can also state measures of association. In this case, we use the odds ratio.

  • H_{0}: \; \text{OR} = 1
  • H_{1}: \; \text{OR} \ne 1

A summary of the results is shown using a table of observed data.

Code
table(dfr$Group, dfr$Malnutrition)
         
          Improved Not improved
  Cash           7           13
  Control       10           10

To use the \chi^{2} distribution as the sampling distribution of our test statistics, we need to confirm that all four expected values are 5 or more.

Code
chisq.test(table(dfr$Group, dfr$Malnutrition))$expected
         
          Improved Not improved
  Cash         8.5         11.5
  Control      8.5         11.5

The critical value \chi^{2}_{1, \, 0.05} is calculated below.

Code
qchisq(0.05, df = 1, lower.tail = FALSE)
[1] 3.841459

We conduct Pearson’s \chi^{2} test below (without continuity correction).

Code
chisq.test(table(dfr$Group, dfr$Malnutrition), correct = FALSE)

    Pearson's Chi-squared test

data:  table(dfr$Group, dfr$Malnutrition)
X-squared = 0.92072, df = 1, p-value = 0.3373

We conduct the G test for independence below (without continuity correction).

Code
DescTools::GTest(table(dfr$Group, dfr$Malnutrition), correct = "none")

    Log likelihood ratio (G-test) test of independence without correction

data:  table(dfr$Group, dfr$Malnutrition)
G = 0.92462, X-squared df = 1, p-value = 0.3363

In both cases the test statistic is less than the critical value and we fail to reject the null hypothesis. We visualize the results below. Since I=2 and J=2, we have (2-1)(2-1)=1 degree of freedom, and both X^{2} and G^{2} approximately follow \chi^{2}_{1} under H_{0}.

Code
sjPlot::dist_chisq(
  chi2 = DescTools::GTest(table(dfr$Group, dfr$Malnutrition), correct = "none")$statistic,
  deg.f = 1
)

In the case of both tests, the p value is greater than \alpha=0.05.

There is no evidence of an association between financial aid group and an improvement in malnutrition at the 5% significance level.

Code
epitools::oddsratio.wald(
  table(dfr$Group, dfr$Malnutrition),
  rev = "both"
)
$data
         
          Not improved Improved Total
  Control           10       10    20
  Cash              13        7    20
  Total             23       17    40

$measure
         odds ratio with 95% C.I.
           estimate     lower    upper
  Control 1.0000000        NA       NA
  Cash    0.5384615 0.1512368 1.917132

$p.value
         two-sided
          midp.exact fisher.exact chi.square
  Control         NA           NA         NA
  Cash     0.3616972    0.5231071   0.337287

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"

The odds of an improvement in malnutrition in the cash support group are 0.538 times the odds of improvement in the control group (95% CI 0.151 to 1.917).

The odds of an improvement in malnutrition are 46.2% lower in the cash group compared to the control group.
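
The percentage follows directly from the odds ratio estimate.

Code
(1 - 0.5384615) * 100 # percentage decrease in the odds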

6.2 Question

You are a member of a research group conducting early research into a novel drug candidate for depression. A total of 10 healthy volunteers are randomly assigned to take either the novel drug or a placebo (5 in each group). The response measures the development of a dry mouth (at the levels Dry mouth or No dry mouth), a concerning side-effect of the novel drug.

Data for the project are generated below.

Code
set.seed(11) # Reproducible results
group <- rep(
  c("Active", "Control"),
  each = 5
)

response <- sample(
  c("Dry mouth", "No dry mouth"),
  size = 10,
  prob = c(0.6, 0.4),
  replace = TRUE
)

dfr <- data.frame(
  Group = group,
  Response = response
)

dfr
     Group     Response
1   Active    Dry mouth
2   Active    Dry mouth
3   Active    Dry mouth
4   Active    Dry mouth
5   Active    Dry mouth
6  Control No dry mouth
7  Control    Dry mouth
8  Control    Dry mouth
9  Control No dry mouth
10 Control    Dry mouth

You are tasked with the following: (1) create a table of observed and a table of expected values, (2) use four different tests for association (independence) between the binary explanatory and the response variables, and (3) comment on the results and use of each.

We start by visualizing the data.

Code
ggstatsplot::ggbarstats(
  dfr,
  y = Group,
  x = Response,
  bf.message = FALSE,
  results.subtitle = FALSE, 
  palette = "Blues",
  title = "Visualize the results"
)

A numerical summary of the data is provided using a table of observed values and a table of expected values.

Code
tb <- table(
  dfr$Group,
  dfr$Response
)

tb
         
          Dry mouth No dry mouth
  Active          5            0
  Control         3            2

Our research question considers an association between drug group and side-effects. The test hypotheses are stated below.

  • H_{0}: There is no association between test group and side-effect
  • H_{1}: There is an association between test group and side-effect

We choose a 5% level of significance.

The assumption for the use of the \chi^{2} distribution as the sampling distribution of our test statistics is checked below.

Code
chisq.test(tb, correct = F)$expected
Warning in chisq.test(tb, correct = F): Chi-squared approximation may be
incorrect
         
          Dry mouth No dry mouth
  Active          4            1
  Control         4            1

The expected values are all less than 5 and we cannot use the \chi^{2} distribution as sampling distribution for Pearson’s \chi^{2} test or the G test for independence.

As per the lab problem question, though, we will conduct these tests nonetheless.

We have (I-1)(J-1)=(2-1)(2-1)=1 degree of freedom. The critical value \chi^{2}_{1, \, 0.05} is calculated below.

Code
qchisq(0.05, 1, lower.tail = FALSE)
[1] 3.841459

We conduct Pearson’s \chi^{2} test, the G test, Fisher’s exact test, and Fisher-Boschloo’s test below.

Code
chisq.test(tb, correct = F)
Warning in chisq.test(tb, correct = F): Chi-squared approximation may be
incorrect

    Pearson's Chi-squared test

data:  tb
X-squared = 2.5, df = 1, p-value = 0.1138
Code
DescTools::GTest(tb, correct = "none")

    Log likelihood ratio (G-test) test of independence without correction

data:  tb
G = 3.2779, X-squared df = 1, p-value = 0.07022
Code
fisher.test(tb)

    Fisher's Exact Test for Count Data

data:  tb
p-value = 0.4444
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.1953744       Inf
sample estimates:
odds ratio 
       Inf 
Code
Exact::exact.test(
  data = tb,
  method = "boschloo",
  to.plot = FALSE
)

    Boschloo's Exact Test

data:  5 out of 5 vs. 3 out of 5
test statistic = 0.44444, first sample size = 5, second sample size =
5, p-value = 0.1874
alternative hypothesis: true difference in proportion is not equal to 0
sample estimates:
difference in proportion 
                     0.4 

Since we did not meet the assumptions for the use of the \chi^{2} distribution, we can only use the results of Fisher’s exact test and Fisher-Boschloo’s test. In both cases, there is not enough evidence in the data to reject the null hypothesis; we cannot conclude that there is an association between the variables.

6.3 Question

You are leading an investigation to determine the response to a new health intervention in a community. The participants in the community are in one of three groups: Younger than 30 (the reference group), Mid age (between 30 and 60 years of age), and Older than 60 years. In a survey question on how they perceive the intervention, they can answer I welcome it or I do not welcome it. You consider whether the two variables are associated (or independent). Data for the research question are simulated below.

Code
set.seed(12) # For reproducible results

group <- rep(
  c("Younger", "Mid age", "Older"),
  each = 30
)

answer <- sample(
  c("I welcome it", "I do not welcome it"),
  size = 90,
  replace = TRUE
)

dfr <- data.frame(
  Group = group,
  Answer = answer
)

You manage the analysis of the data and answer the research question by considering the following and commenting on each answer to target the research question based on your analysis.

  1. Generate a table of observed values.

  2. Determine an appropriate test or tests to use and conduct the test or tests.

  3. Calculate the odds ratios of the subgroups and the Wald confidence intervals of the odds ratios (even if the tests for independence were not significant).

  4. Determine the standardized residuals.

Before commencing the analyses, we visualize the data.

Code
ggstatsplot::ggbarstats(
  dfr,
  y = Group,
  x = Answer,
  bf.message = FALSE,
  results.subtitle = FALSE, 
  palette = "Blues",
  title = "Percentage of acceptance of the intervention for the three groups"
)

The table of observed values follows as a numerical summary of the data.

Code
# Set the levels for the table of observed values
dfr$Group <- factor(dfr$Group, levels = c("Younger", "Mid age", "Older"))
dfr$Answer <- factor(dfr$Answer, levels = c("I welcome it", "I do not welcome it"))

tb <- table(
  dfr$Group,
  dfr$Answer
)
tb
         
          I welcome it I do not welcome it
  Younger           11                  19
  Mid age           15                  15
  Older             15                  15

We assume independence between observations. The table of expected values is calculated below.

Code
chisq.test(tb, correct = FALSE)$expected
         
          I welcome it I do not welcome it
  Younger     13.66667            16.33333
  Mid age     13.66667            16.33333
  Older       13.66667            16.33333

We note that all values are at least 5 and we can use either Pearson’s \chi^{2} test or the G test for independence (the likelihood ratio test). In both cases we will use a 5% significance level. The number of degrees of freedom is (I-1)(J-1)=(3-1)(2-1)=2, and both X^{2} and G^{2} approximately follow \chi^{2}_{2} under H_{0}.

The null hypothesis and the alternative hypothesis are described below.

  • H_{0}: There is no association between age group and acceptance of the intervention
  • H_{1}: There is an association between age group and acceptance of the intervention

The critical value for the test statistic is calculated using the qchisq function.

Code
qchisq(0.05, 2, lower.tail = FALSE)
[1] 5.991465

If we omit the lower.tail argument (which defaults to TRUE), we must use a value of 1 - \alpha to determine the critical value.

Code
qchisq(0.95, 2)
[1] 5.991465

The dist_chisq function from the sjPlot library visualizes the distribution (for 2 degrees of freedom) and the area of significance (using the critical value).

Code
sjPlot::dist_chisq(
  chi2 = qchisq(0.05, 2, lower.tail = FALSE),
  deg.f = 2
)

The chisq.test function performs Pearson’s \chi^{2} test.

Code
pearson <- chisq.test(
  tb,
  correct = FALSE
)

pearson

    Pearson's Chi-squared test

data:  tb
X-squared = 1.4335, df = 2, p-value = 0.4883

The GTest function from the DescTools library conducts the LRT.

Code
DescTools::GTest(tb, correct = "none")

    Log likelihood ratio (G-test) test of independence without correction

data:  tb
G = 1.4473, X-squared df = 2, p-value = 0.485

In both cases X^{2} < \chi^{2}_{\text{critical value for df}=2, \, \alpha=0.05} and G^{2} < \chi^{2}_{\text{critical value for df}=2, \, \alpha=0.05} and we fail to reject the null hypothesis. This is confirmed by p > \alpha in both cases.

We set the older group and the I do not welcome it answer as the baseline levels for the two variables and create sub-tables, before using the epi.2by2 function from the epiR library.

Code
tb
         
          I welcome it I do not welcome it
  Younger           11                  19
  Mid age           15                  15
  Older             15                  15
Code
sub_tb_1 <- matrix(c(11, 15, 19, 15), nrow = 2)
sub_tb_1
     [,1] [,2]
[1,]   11   19
[2,]   15   15
Code
sub_tb_2 <- matrix(c(15, 15, 15, 15), nrow = 2)
sub_tb_2
     [,1] [,2]
[1,]   15   15
[2,]   15   15
Code
epiR::epi.2by2(sub_tb_1)
             Outcome +    Outcome -      Total                 Inc risk *
Exposed +           11           19         30     36.67 (19.93 to 56.14)
Exposed -           15           15         30     50.00 (31.30 to 68.70)
Total               26           34         60     43.33 (30.59 to 56.76)

Point estimates and 95% CIs:
-------------------------------------------------------------------
Inc risk ratio                                 0.73 (0.41, 1.32)
Inc odds ratio                                 0.58 (0.21, 1.62)
Attrib risk in the exposed *                   -13.33 (-38.18, 11.52)
Attrib fraction in the exposed (%)            -36.36 (-146.23, 24.48)
Attrib risk in the population *                -6.67 (-28.51, 15.18)
Attrib fraction in the population (%)         -15.38 (-48.61, 10.41)
-------------------------------------------------------------------
Uncorrected chi2 test that OR = 1: chi2(1) = 1.086 Pr>chi2 = 0.297
Fisher exact test that OR = 1: Pr>chi2 = 0.435
 Wald confidence limits
 CI: confidence interval
 * Outcomes per 100 population units 
Code
epiR::epi.2by2(sub_tb_2)
             Outcome +    Outcome -      Total                 Inc risk *
Exposed +           15           15         30     50.00 (31.30 to 68.70)
Exposed -           15           15         30     50.00 (31.30 to 68.70)
Total               30           30         60     50.00 (36.81 to 63.19)

Point estimates and 95% CIs:
-------------------------------------------------------------------
Inc risk ratio                                 1.00 (0.60, 1.66)
Inc odds ratio                                 1.00 (0.36, 2.75)
Attrib risk in the exposed *                   0.00 (-25.30, 25.30)
Attrib fraction in the exposed (%)            0.00 (-65.87, 39.71)
Attrib risk in the population *                0.00 (-21.91, 21.91)
Attrib fraction in the population (%)         0.00 (-28.79, 22.36)
-------------------------------------------------------------------
Uncorrected chi2 test that OR = 1: chi2(1) = 0.000 Pr>chi2 = 1.000
Fisher exact test that OR = 1: Pr>chi2 = 1.000
 Wald confidence limits
 CI: confidence interval
 * Outcomes per 100 population units 

In both cases, we find that the odds ratio under the null hypothesis, 1, is in the confidence interval and we fail to reject the null hypothesis.

The stdres attribute returns the standardized residuals.

Code
pearson$stdres
         
          I welcome it I do not welcome it
  Younger   -1.1973091           1.1973091
  Mid age    0.5986545          -0.5986545
  Older      0.5986545          -0.5986545

None of the standardized residuals are less than -2 or more than 2.