How to use the VAR.P function
What is the VAR.P function?
The VAR.P function returns the variance based on a population. The function ignores logical and text values.
Table of contents
1. Introduction
What is variance in statistics?
Variance is a measure in statistics that shows how much a set of numbers are spread out from their average value. Variance is an important measure in statistics and science.
One limitation of the variance is that its units are different from the units of the original random variable. The standard deviation retains the same units as the random variable making it a more useful measure of spread or dispersion.
How is the variance calculated?
It depends on which function you use, the VAR.P function or the VAR.S function. The VAR.P function is calculated by taking the average of squared deviations from the mean.
VAR.P function = Σ(x - x̄)2/n
The VAR.S function calculates the variance based on a sample of the population.
How is the variance and standard deviation related?
The standard deviation is the square root of the variance. The following formula shows how the standard deviation is calculated.
STDEV.P function = √(Σ(x - x̄)2/n)
What is the difference between the VAR.P function and the VAR.S function?
VAR.S function = Σ(x - x̄)2/(n-1)
VAR.P function = Σ(x - x̄)2/n
x is each value
x̄ is the mean of all values
n is the total number of observations
2. VAR.P Function Syntax
VAR.P(number1,[number2],...)
3. VAR.P Function Arguments
number1 | Required. A cell reference to the population. |
number2 | Optional. Up to 254 additional arguments. |
4. VAR.P Function Example
A study is conducted to analyze the weight of a specific fish between 12 and 15 months old. If the weights of every individual in that population are recorded what is the variance of the weights for the entire population? Assume the weights follow a normal distribution.
The data points are in cell range B16:B25, here they are:
Weight |
163 |
157 |
170 |
167 |
178 |
173 |
198 |
163 |
208 |
161 |
The argument in this example is:
- number1 = B16:B25
There are 10 data points in this example.
Formula in cell E15:
Cell E15 returns 251.36 which represents the variance. The standard deviation is the square root of the variance. √251.36 equals 15.85 which is the same value that the STDEV.P function returns.
The image above shows a chart containing a blue line that represents the normal distribution based on a mean of 174 and a standard deviation of 15.85. The chart also shows the different standard deviations 1σ, 2σ, 3σ, -1a, -2σ, and -3σ which represents:
- 68% of the data falls between μ ± 1σ
- 95% of the data falls between μ ± 2σ
- 99.7% of the data falls between μ ± 3σ
4.1 Explaining the math formula for calculating the variance
This example demonstrates how to calculate the variance (VAR.P) of a population using these numbers: 10, 30, 25, 50, and 35.
The equation for VAR.P is:
x ̅ is the sample mean AVERAGE(number1,number2,…)
n is the sample size.
Step 1 - Calculating the average
To calculate an average you need to add up all the values and then divide by the number of values.
10 + 30 + 25 + 50 + 35 equals 150.
The number of values is five. 150 divided by 5 equals 30
Step 2 - Subtract average value from the set of numbers
10-30= -20
30-30= 0
25-30= -5
50-30= 20
35-30 = 5
Step 3 - Square the difference
(-20)^2 = 400
(0)^2 = 0
(-5)^2 = 25
(20)^2 = 400
(5)^2 = 25
Step 4 - Calculate a total
400 + 0 + 25 + 400 + 25 = 850
Step 4 - Divide total by the number of values
850 / 5 = 170
5. Why is the VAR.P function returning an error?
If the VAR.P function returns an error, make sure that the source data has no error values, as the function cannot handle them.
The image above shows an error in cell B5. The VAR.P function returns an error because the source data in B3:B7 contains an error.
The IFERROR function lets you ignore error values.
How to find errors in a worksheet
6. Sort rows by variance based on a population
This example demonstrates a formula in cell B8 that calculates the variance based on a population per row. It sorts rows from cell range B3:N6 by the variance of a population from large to small.
Cell ranges P3:P6 contains the variances from the source data and P7:P11 contains the variances based on the sorted rows.
This kind of calculation was very hard to perform in earlier Excel version. Excel 365 has a bunch of new functions that are powerful and easy to understand.
Some of these new functions return an array of values, however, you simply enter the formulas as regular formulas. They spill values automatically to cells below and to the right as far as needed. A #SPILL error tells you that at least one of the destination cells are not empty.
Explaining formula
Step 1 - Calculate the variance of a population
VAR.P(a)
Step 2 - Build the LAMBDA function
The LAMBDA function build custom functions without VBA, macros or javascript.
Function syntax: LAMBDA([parameter1, parameter2, …,] calculation)
LAMBDA(a,VAR.P(a))
Step 3 - Calculate the variance of a sample by row
The BYROW function puts values from an array into a LAMBDA function row-wise.
Function syntax: BYROW(array, lambda(array, calculation))
BYROW(C3:N6,LAMBDA(a,VAR.P(a)))
returns
{47221.2222222222;81130.9722222222;87499.7222222222;41286.1388888889}
Step 4 - Sort rows based on the variance of a sample
The SORTBY function sorts a cell range or array based on values in a corresponding range or array.
Function syntax: SORTBY(array, by_array1, [sort_order1], [by_array2, sort_order2],…)
SORTBY(B3:N6,BYROW(C3:N6,LAMBDA(a,VAR.P(a))),-1)
becomes
SORTBY(B3:N6,{47221.2222222222;81130.9722222222;87499.7222222222;41286.1388888889},-1)
and returns
7. When to use the VAR.P function and when to use the VAR.S function?
The VAR.P function and the VAR.S function are both used to calculate the variance, however, they differ in how they calculate the variance. The VAR.P function assumes that the dataset is the entire population, while the VAR.S function assumes that the dataset is a sample of the population.
The difference between calculating the population and sample variance is that population variance divides by the number of observations in the population, while the sample variance divides by the number of observations minus one. This makes the sample variance larger than the population variance, because it tries to account for the uncertainty of estimating the population variance from a sample.
You should use the VAR.P function when you have data for an entire population, and use the VAR.S function when you have data for a sample of the population. It is sometimes not practical to calculate the variance for millions of observations, a "Sample size calculator" is often useful as it also meets a given the confidence level and a margin of error.
Functions in 'Statistical' category
The VAR.P function function is one of 73 functions in the 'Statistical' category.
How to comment
How to add a formula to your comment
<code>Insert your formula here.</code>
Convert less than and larger than signs
Use html character entities instead of less than and larger than signs.
< becomes < and > becomes >
How to add VBA code to your comment
[vb 1="vbnet" language=","]
Put your VBA code here.
[/vb]
How to add a picture to your comment:
Upload picture to postimage.org or imgur
Paste image link to your comment.
Contact Oscar
You can contact me through this contact form