How to use the CHISQ.TEST function
What is the CHISQ.TEST function?
The CHISQ.TEST function performs a chi-squared test for independence. It returns the probability (p-value) from the chi-squared distribution for the test statistic and the appropriate degrees of freedom. Use this function to check whether hypothesized results are supported by observed data.
The CHITEST function is outdated and is now found in the Compatibility category; it has been replaced by the CHISQ.TEST function.
1. Introduction
What is a test for independence?
A test for independence is a statistical hypothesis test that checks if two categorical variables are related or independent based on observed data, commonly done using Pearson's chi-squared test or G-test to compare observed and expected frequency counts.
What is a chi-squared distribution?
The chi-squared distribution is a theoretical probability distribution that models the sum of squared independent standard normal random variables. It is used in inferential statistics for estimation, confidence intervals, and hypothesis testing.
What is the probability of the chi-squared distribution?
The probability of the chi-squared distribution determines the likelihood that the sum of squared standard normal variables will take on a value less than or equal to a given number, depending on its degrees of freedom parameter.
What is a hypothesis?
In statistics, a hypothesis is an assumption about some aspect of a population parameter or probability model that can be tested using observations and data to determine if there is sufficient evidence in the sample to support the assumed hypothesis.
What is a cumulative chi-squared distribution?
The cumulative chi-squared distribution function gives the probability that the sum of squared standard normals will result in a value less than or equal to a specified number x, giving the accumulated area under the probability density curve.
What is a probability density function of a chi-squared distribution?
A chi-squared probability density function is a function that defines the relative likelihood of different outcomes for the sum of squared standard normals based on its degrees of freedom parameter, integrating to a total area of 1 over the domain.
What is inferential statistics for estimation?
Inferential statistics for estimation involve using a random sample to estimate characteristics and parameters about a larger population using statistical techniques like confidence intervals and point estimation to quantify uncertainty about the estimates.
What is a confidence interval?
A confidence interval provides a range of plausible values for an unknown population parameter centered around a sample estimate, describing the uncertainty around the estimate at a specified level of confidence.
What is the sum of squared standard normal variables?
The sum of squared standard normal variables refers to squaring multiple independent normally distributed random variables, each with a mean of 0 and a variance of 1, and adding the squares together. The result follows a chi-squared distribution that can be used for statistical modeling and analysis.
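To make the definition above concrete, here is a minimal simulation sketch in Python (standard library only). The function name, seed, and sample count are illustrative assumptions, not part of Excel or the worksheet. It relies on the known property that a chi-squared variable has a mean equal to its degrees of freedom.

```python
import random


def simulate_chi_squared(df, samples, seed=0):
    """Draw `samples` values, each the sum of `df` squared standard normals."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(samples)
    ]


df = 3
draws = simulate_chi_squared(df, samples=20000)
sample_mean = sum(draws) / len(draws)

# The sample mean should land close to df, the theoretical mean.
print(f"df = {df}, sample mean = {sample_mean:.3f}")
```

Changing `df` and rerunning should move the sample mean along with it, which is one way to see how the degrees of freedom shape the distribution.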
What are the degrees of freedom?
The degrees of freedom of a chi-squared distribution are the number of standard normal random variables being squared and summed. This parameter determines the shape of the distribution, and in statistical tests it typically equals the sample size minus the number of estimated parameters.
2. CHISQ.TEST function Syntax
CHISQ.TEST(actual_range,expected_range)
3. CHISQ.TEST function Arguments
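actual_range | Required. The range of observed (actual) values to test against the expected values. |
expected_range | Required. The range of expected values, with the same number of data points as actual_range. |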
4. CHISQ.TEST function Example 1
A researcher wants to test if there is a significant difference between the observed frequencies of different blood types in a population and the expected frequencies based on theoretical probabilities. The data consists of the observed and expected frequencies for each blood type.
Observed frequencies of blood types:
Blood Type A: 120
Blood Type B: 95
Blood Type AB: 30
Blood Type O: 155
To calculate the expected frequencies based on theoretical probabilities we need to know the probabilities of each blood type in the general population.
The theoretical probabilities for the ABO blood group system are as follows:
Blood Type A: 0.25 (or 25%)
Blood Type B: 0.25 (or 25%)
Blood Type AB: 0.09 (or 9%)
Blood Type O: 0.41 (or 41%)
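As a rough cross-check outside Excel, the expected frequencies can be sketched in Python by multiplying the total number of observations by each theoretical probability. The dictionary and variable names below are illustrative assumptions and do not correspond to worksheet cells.

```python
# Observed counts and theoretical probabilities from the example above.
observed = {"A": 120, "B": 95, "AB": 30, "O": 155}
probabilities = {"A": 0.25, "B": 0.25, "AB": 0.09, "O": 0.41}

# Total number of observations across all blood types.
total = sum(observed.values())

# Expected frequency = total observations * theoretical probability.
expected = {
    blood_type: total * probability
    for blood_type, probability in probabilities.items()
}

for blood_type, value in expected.items():
    print(f"Blood Type {blood_type}: expected {value:.1f}, observed {observed[blood_type]}")
```

Because the probabilities sum to 1, the expected frequencies add up to the same total as the observed frequencies.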
Do the observed frequencies differ significantly from the expected frequencies?
The image above shows the observed frequencies in cell range B19:C22 and the expected frequencies in cell range B27:D30. There are four categories of blood types (A, B, AB, and O), so the degrees of freedom are 3 (4 categories - 1).
Formula in cell C23:
Formula in cell D27:
Copy cell D27 to cells below as far as needed.
Formula in cell C32:
The CHISQ.TEST function returns 0.0934952080113268
Formula in cell C33:
Formula in cell C34:
The χ2 statistic for the observed frequencies is approx. 6.41 with 3 degrees of freedom. This is lower than the decision rule calculated in cell C34. The observed frequencies do not differ significantly from the expected frequencies.
5. CHISQ.TEST function Example 2
A quality control manager at a manufacturing plant wants to test if the frequencies of defective products across different production lines are significantly different from the expected frequencies based on historical data. The data includes the observed and expected frequencies of defective products for each production line.
Observed frequencies of defective products:
Production Line 1: 18
Production Line 2: 25
Production Line 3: 12
Production Line 4: 30
Production Line 5: 19
Expected frequencies of defective products based on historical data:
Production Line 1: 22
Production Line 2: 24
Production Line 3: 20
Production Line 4: 28
Production Line 5: 10
Do the observed frequencies of defective products differ significantly from the expected frequencies?
The image above shows the observed frequencies in cell range C19:C23 and the expected frequencies in cell range C28:C32. There are five categories of production lines (1, 2, 3, 4, and 5), so the degrees of freedom are 4 (5 categories - 1).
Formula in cell C34:
The CHISQ.TEST function returns 0.0158438623483097
Formula in cell C35:
Formula in cell C36:
The χ2 statistic for the observed frequencies is approx. 12.21 with 4 degrees of freedom. This is higher than the decision rule calculated in cell C36. The observed frequencies do differ significantly from the expected frequencies.
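The numbers in Example 2 can be cross-checked outside Excel. The following is a minimal Python sketch using only the standard library; the helper names `chi_squared_statistic` and `chi_squared_right_tail` are hypothetical, and the right-tail routine is one possible way to evaluate the chi-squared probability for whole-number degrees of freedom, not a description of how Excel computes it. The 0.05 significance level is also an assumption.

```python
import math


def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_squared_right_tail(x, df):
    """Right-tail probability P(X >= x) for integer df >= 1.

    Starts from the closed form for df = 2 (even df) or df = 1 (odd df)
    and adds one term per step until the requested df is reached.
    """
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # df = 1
    while 2 * a < df:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Data from Example 2 (production lines 1 to 5).
observed = [18, 25, 12, 30, 19]
expected = [22, 24, 20, 28, 10]

statistic = chi_squared_statistic(observed, expected)
df = len(observed) - 1
p_value = chi_squared_right_tail(statistic, df)

alpha = 0.05  # assumed significance level
print(f"chi-squared = {statistic:.4f}, df = {df}, p-value = {p_value:.6f}")
print("significant difference" if p_value < alpha else "no significant difference")
```

With the Example 2 data this should reproduce a statistic of about 12.21 and a p-value of about 0.0158, in line with the CHISQ.TEST result above.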
6. CHISQ.TEST function not working
A low χ2 value indicates independence.
The CHISQ.TEST function returns the #N/A error value if the number of data points in actual_range and expected_range doesn't match.
7. How is the CHISQ.TEST function calculated
CHISQ.TEST function equation:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(A_{ij} - E_{ij})^2}{E_{ij}}$$
Aij = actual frequency in the i-th row, j-th column
Eij = expected frequency in the i-th row, j-th column
r = number of rows
c = number of columns
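Using the symbols defined above, the statistic adds up (Aij - Eij)^2 / Eij over every cell of an r by c table. The sketch below illustrates that double sum in Python, together with the degrees-of-freedom rule Microsoft documents for CHISQ.TEST: (r - 1)(c - 1) when both r and c exceed 1, otherwise the number of cells minus 1. The function names are hypothetical, and the nested-list layout is an assumption made for illustration.

```python
def chi_squared_statistic(actual, expected):
    """Sum (A_ij - E_ij)^2 / E_ij over every cell of two equally shaped tables."""
    same_shape = len(actual) == len(expected) and all(
        len(row_a) == len(row_e) for row_a, row_e in zip(actual, expected)
    )
    if not same_shape:
        # Excel reports #N/A when the number of data points differs.
        raise ValueError("actual and expected must have the same shape")
    return sum(
        (a - e) ** 2 / e
        for row_a, row_e in zip(actual, expected)
        for a, e in zip(row_a, row_e)
    )


def degrees_of_freedom(r, c):
    """Degrees of freedom for a table with r rows and c columns."""
    if r > 1 and c > 1:
        return (r - 1) * (c - 1)
    if r == 1 and c > 1:
        return c - 1
    if r > 1 and c == 1:
        return r - 1
    raise ValueError("a single cell has no degrees of freedom")


# Example 2 laid out as a table with five rows and one column.
actual = [[18], [25], [12], [30], [19]]
expected = [[22], [24], [20], [28], [10]]

statistic = chi_squared_statistic(actual, expected)
df = degrees_of_freedom(len(actual), len(actual[0]))
print(f"chi-squared = {statistic:.4f} with {df} degrees of freedom")
```

A single column of five values gives 4 degrees of freedom, while a table with 3 rows and 2 columns would give (3 - 1)(2 - 1) = 2.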
Functions in 'Statistical' category
The CHISQ.TEST function is one of 73 functions in the 'Statistical' category.