Chi Square

Author

Thomas Fleetwood

Chi-Square Hypothesis Test

Introduction

In class, you used the chi-square test to determine whether a coin was fair or unfair. For that assessment, you did all the calculations by hand. Now you will let the computer do the math for you using the R programming language, and then interpret the results to show your understanding of the chi-square test. You can download this Quarto document and open it in RStudio, or upload it to Posit Cloud to work in it directly. Download the file here: Chi-Square Problem Set

How to run a chi-square test in R

To perform a chi-square test in R, you need two pieces of information.

  • the experimental results (either in raw table form or already counted)

  • the theoretical/expected outcome

Once you have this information, you can run the function chisq.test to calculate the \(X^2\) statistic and the p-value of the test. I have completed an example below.

# results of flipping a coin 110 times = 72 heads; 38 tails
# the expected outcome is 50/50 or 0.5:0.5
# In the chisq.test function, enter the actual
# results as 'x' and the expected proportions as 'p'
chisq.test(x = c(72, 38), p = c(0.5, 0.5))

    Chi-squared test for given probabilities

data:  c(72, 38)
X-squared = 10.509, df = 1, p-value = 0.001188

In the above example, the results show

  • \(X^2\) = 10.509

  • degrees of freedom = 1

  • p-value = 0.001

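As a result, we see that the p-value is very low: if the coin really were fair (50/50), a result at least this lopsided would turn up only about 0.1% of the time. Because that is well below a common cutoff such as 0.05, we reject the null hypothesis that the coin is fair and conclude that the coin is unfair.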
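To connect this output to the hand calculations from class, here is a minimal sketch that rebuilds the same numbers step by step. The variable names (observed, expected, chi_sq, deg_free, p_value) are my own choices for illustration; only chisq.test and pchisq are built-in R functions.

```r
# Observed counts from the example: 72 heads, 38 tails
observed <- c(72, 38)

# Expected counts for a fair coin: half of the 110 flips each
expected <- sum(observed) * c(0.5, 0.5)

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq <- sum((observed - expected)^2 / expected)

# Degrees of freedom: number of categories minus 1
deg_free <- length(observed) - 1

# p-value: area in the upper tail of the chi-square distribution
p_value <- pchisq(chi_sq, df = deg_free, lower.tail = FALSE)

chi_sq    # about 10.509
p_value   # about 0.001188

# For comparison, the built-in test on the same data
builtin <- chisq.test(x = observed, p = c(0.5, 0.5))
```

Both routes should agree, which is a useful check that you understand what chisq.test is doing for you.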

Words of advice

Remember that you can always read the help file for chisq.test by typing ?chisq.test in the console and pressing Enter.

If you have an expected outcome that isn’t 1:1, let’s say your expected outcome is 4:1, then you have a couple of options to put in the p = c() part of the function.

  • You can convert the expected outcome to decimal proportions (which must add to 1). So in the case of 4:1, it would be p = c(0.8, 0.2) (which is 4/5 and 1/5).

  • Or, you could keep p = c(4, 1) but add the argument rescale.p = TRUE

Here are both examples. You will see they both return the same result. If you try p = c(4, 1) with the rescale.p argument set to FALSE, R will give you an error that the probabilities must sum to 1. The rescale.p argument does that for you automatically when set to TRUE.

chisq.test(x = c(78, 32), p = c(0.8, 0.2))

    Chi-squared test for given probabilities

data:  c(78, 32)
X-squared = 5.6818, df = 1, p-value = 0.01714

chisq.test(x = c(78, 32), p = c(4, 1), rescale.p = TRUE)

    Chi-squared test for given probabilities

data:  c(78, 32)
X-squared = 5.6818, df = 1, p-value = 0.01714
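If you want to see the rescaling and the error for yourself without stopping your script, the sketch below is one way to do it. The names ratio, props, and msg are just illustrative; tryCatch and conditionMessage are base R functions used here to capture the error text instead of halting.

```r
# The expected ratio 4:1, converted to proportions by hand
ratio <- c(4, 1)
props <- ratio / sum(ratio)   # 0.8 0.2

# Without rescale.p = TRUE, a ratio that does not sum to 1 is rejected.
# tryCatch captures the error message so the script keeps running.
msg <- tryCatch(
  chisq.test(x = c(78, 32), p = ratio),
  error = function(e) conditionMessage(e)
)
msg

# Both working versions give the same statistic
by_props   <- chisq.test(x = c(78, 32), p = props)
by_rescale <- chisq.test(x = c(78, 32), p = ratio, rescale.p = TRUE)
```

Dividing by the total is all that rescale.p = TRUE is doing for you here.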

Finally, if you have the raw data rather than the totals, you can use the table function to count your results automatically. For example, if you had a vector named coin_data that recorded each head or tail, you wouldn’t have to count them yourself.

  [1] "heads" "tails" "heads" "heads" "heads" "tails" "heads" "heads" "heads"
 [10] "heads" "heads" "tails" "heads" "tails" "heads" "heads" "tails" "tails"
 [19] "heads" "heads" "heads" "tails" "heads" "heads" "heads" "heads" "heads"
 [28] "heads" "heads" "heads" "heads" "heads" "heads" "tails" "heads" "heads"
 [37] "heads" "heads" "tails" "tails" "heads" "heads" "tails" "heads" "heads"
 [46] "heads" "heads" "heads" "heads" "heads" "tails" "tails" "heads" "heads"
 [55] "heads" "heads" "heads" "heads" "tails" "heads" "heads" "tails" "heads"
 [64] "tails" "tails" "tails" "heads" "heads" "heads" "tails" "heads" "heads"
 [73] "heads" "heads" "tails" "tails" "tails" "heads" "heads" "tails" "tails"
 [82] "heads" "heads" "heads" "tails" "heads" "heads" "tails" "heads" "tails"
 [91] "heads" "heads" "heads" "heads" "heads" "tails" "heads" "tails" "heads"
[100] "heads" "heads" "heads" "heads" "tails" "heads" "heads" "tails" "tails"
[109] "heads" "heads"

Rather than count all those heads and tails, you could just use table like this…

table(coin_data)

coin_data
heads tails 
   78    32 

You can use that table function directly in the chisq.test function…

chisq.test(x = table(coin_data), p = c(4, 1), rescale.p = TRUE)

    Chi-squared test for given probabilities

data:  table(coin_data)
X-squared = 5.6818, df = 1, p-value = 0.01714

And look at that, we got the same results as before when we knew the numbers directly. But this time R counted the heads and tails for us. Nice :-)
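One thing to watch when you hand a table to chisq.test: table() lists the categories in alphabetical order, not in the order they first appear, so the values in p must follow that same order. The sketch below uses a small made-up vector called flips (not the coin_data above) to show this, and also shows how to save the test result and pull out individual pieces.

```r
# Hypothetical data: "tails" appears first, but table() sorts alphabetically
flips <- c(rep("tails", 12), rep("heads", 28))

counts <- table(flips)
names(counts)   # "heads" "tails"

# p must match the table order: heads first (4), then tails (1)
result <- chisq.test(x = counts, p = c(4, 1), rescale.p = TRUE)

# The saved result is a list; you can extract parts of it by name
result$statistic   # the X-squared value
result$p.value     # the p-value
result$expected    # expected counts under the null hypothesis
```

Looking at result$expected is a quick way to confirm that R paired each proportion with the category you intended.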

Your turn

You will complete the following problem set and determine the proper conclusion for each scenario. The problems get progressively more complex, and you may need to consult older students to understand some of the topics (e.g., genetics for problem 3). When you write your conclusions, use the template I showed you in class as a starting point.

Problem 1

A zookeeper hypothesizes that changing the intensity of the light in the primate exhibits will reduce the amount of aggression between the baboons. In exhibit A, with a lower light intensity, he observes 36 incidents of aggression over a one-month period. In exhibit B, with normal lights, he observes 42 incidents of aggression. Should he support or reject his hypothesis?

# Do your work in this box

What is your final conclusion and why?

Problem 2

At a particular high school, students can choose to enter through one of three doors. Custodians noticed that door #3 was always getting broken and suggested that more students use that door than the others because it has a hands-free opener. Science-minded students collected data on the number of students entering each door to see if the custodians were right. The following is the data they collected.

  [1] "door 3" "door 2" "door 2" "door 2" "door 1" "door 2" "door 3" "door 3"
  [9] "door 2" "door 2" "door 1" "door 2" "door 3" "door 1" "door 3" "door 1"
 [17] "door 3" "door 3" "door 3" "door 3" "door 3" "door 3" "door 3" "door 3"
 [25] "door 3" "door 1" "door 2" "door 1" "door 1" "door 3" "door 2" "door 3"
 [33] "door 3" "door 2" "door 3" "door 1" "door 3" "door 3" "door 1" "door 1"
 [41] "door 2" "door 2" "door 3" "door 2" "door 3" "door 2" "door 2" "door 2"
 [49] "door 3" "door 1" "door 3" "door 3" "door 1" "door 2" "door 3" "door 2"
 [57] "door 2" "door 1" "door 3" "door 1" "door 1" "door 3" "door 3" "door 3"
 [65] "door 3" "door 1" "door 3" "door 2" "door 3" "door 2" "door 3" "door 1"
 [73] "door 3" "door 1" "door 3" "door 2" "door 2" "door 3" "door 3" "door 2"
 [81] "door 1" "door 2" "door 3" "door 2" "door 3" "door 1" "door 2" "door 3"
 [89] "door 3" "door 1" "door 3" "door 1" "door 3" "door 3" "door 3" "door 2"
 [97] "door 3" "door 3" "door 3" "door 1" "door 3" "door 2" "door 3" "door 3"
[105] "door 3" "door 3" "door 3" "door 3" "door 2" "door 3" "door 1" "door 3"
[113] "door 1" "door 3" "door 3" "door 1" "door 1" "door 3" "door 3" "door 1"
[121] "door 1" "door 1" "door 1" "door 1" "door 2" "door 3" "door 3" "door 2"
[129] "door 2" "door 3" "door 1" "door 2" "door 3" "door 2" "door 1" "door 2"
[137] "door 1" "door 2" "door 2" "door 1" "door 2" "door 1" "door 3" "door 2"
[145] "door 3" "door 2" "door 1" "door 2" "door 1" "door 2" "door 2" "door 3"
[153] "door 3" "door 1" "door 3" "door 1" "door 2" "door 1" "door 3" "door 2"
[161] "door 2" "door 2" "door 2" "door 3" "door 3" "door 2" "door 2" "door 3"
[169] "door 1" "door 3" "door 1" "door 2" "door 1" "door 2" "door 3" "door 1"
[177] "door 2" "door 2" "door 3" "door 3" "door 2" "door 3" "door 2" "door 3"
[185] "door 1" "door 3" "door 1" "door 2" "door 3" "door 1" "door 2" "door 1"
[193] "door 3" "door 1" "door 1" "door 1" "door 1" "door 1" "door 1" "door 2"
[201] "door 2" "door 2" "door 3" "door 1" "door 2" "door 1" "door 3" "door 2"
[209] "door 3" "door 1"

Were the custodians right? The data above is saved in a vector named door_count. Remember the words of advice above regarding the table function.

# Do your work in this box

What is your final conclusion and why?

Problem 3

A scientist hypothesizes that a congenital birth defect in kittens is caused by a recessive gene in that breed of cat. After performing multiple monohybrid crosses and surveying several litters, he found that 44 out of 128 kittens had the defect. Is his hypothesis correct? (Hint: First you need to know what a monohybrid cross is. Then you need to figure out the expected outcome of a typical monohybrid cross (think Punnett square). This will tell you what your expectation should be if the defect is caused by a recessive gene.)

# Do your work in this box

What is your conclusion and why?

Problem 4

Suppose you take a random sample of 30 students who are using a new math text and a second sample of 30 students who are using a more traditional text. You compare student achievement on the state test given to all students at the end of the course. Based on state test performance, would you recommend the new math book?

               Passed State Test   Failed State Test
New Textbook          26                   4
Old Textbook          22                   8

Hmm, this doesn’t look like our other data on coin flips or student doors. This data is in a contingency table (this one happens to be a 2x2 table, but it could be 3x2 or any size). When you are comparing two categorical variables like this (textbook vs. state test result), we call it a chi-square test of independence. Are the two variables dependent on each other or independent of one another? In this case, you are wondering whether the result on the state test depends on which textbook the student used. The idea is much the same as the chi-square goodness of fit test you were doing above, but the ‘expected’ values are a bit harder to calculate. Luckily, the computer will do it for us.

In a chi-square test of independence, the null hypothesis is that the two variables are independent. This fact will tell you how to interpret the p-value.
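To show where those harder ‘expected’ values come from, here is a small sketch using made-up counts (deliberately not the textbook data, so the problem is still yours to solve). The matrix study and its labels are hypothetical. Under independence, each expected count is (row total × column total) / grand total. Note also that for a 2x2 table chisq.test applies a continuity correction by default; setting correct = FALSE gives the plain statistic you would calculate by hand.

```r
# Hypothetical 2x2 table of counts (not the textbook data)
study <- matrix(c(30, 20,
                  10, 40),
                nrow = 2, byrow = TRUE,
                dimnames = list(group = c("A", "B"),
                                outcome = c("yes", "no")))

# Expected counts by hand: row total * column total / grand total
expected_by_hand <- outer(rowSums(study), colSums(study)) / sum(study)

# Plain chi-square statistic by hand
stat_by_hand <- sum((study - expected_by_hand)^2 / expected_by_hand)

# Default test (with continuity correction for 2x2 tables)
result <- chisq.test(study)
result$expected

# Same test without the correction, to match the hand calculation
uncorrected <- chisq.test(study, correct = FALSE)
uncorrected$statistic
```

The corrected statistic is a little smaller than the uncorrected one, so the two p-values will differ slightly. Either way, a small p-value is evidence against the null hypothesis of independence.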

# The data from the table above is in a data frame called textbook_data
# You can put that in the chisq.test function all by itself
# (without needing to use the table() function)
# and the computer will calculate the expected values automatically. Also nice :-)

# Do your work in this box below this comment

After running the chi-square, what conclusion do you make regarding whether or not the test score was dependent on which textbook the student used? Why?