How to use the VAR.P function
What is the VAR.P function?
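The VAR.P function returns the variance of an entire population. Logical values and text inside a cell reference or array are ignored; logical values and text representations of numbers typed directly into the argument list are counted.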
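As a rough analogy outside Excel, the sketch below mimics in Python how text, logical values, and empty cells in a referenced range are skipped before the variance is calculated. The helper name numeric_only and the cell contents are hypothetical; pvariance comes from Python's standard statistics module.

```python
from statistics import pvariance

def numeric_only(cells):
    """Keep real numbers only; drop text, booleans, and empty cells (None)."""
    return [
        c for c in cells
        if isinstance(c, (int, float)) and not isinstance(c, bool)
    ]

# Hypothetical contents of a cell range: numbers mixed with text,
# a logical value, and an empty cell.
cells = [4, "n/a", True, 8, None, 6]

cleaned = numeric_only(cells)   # only 4, 8 and 6 remain
result = pvariance(cleaned)     # population variance of the numeric values

print(cleaned)
print(result)
```

The explicit bool check is needed because Python treats True and False as integers, which would otherwise be counted as 1 and 0.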
1. Introduction
What is variance in statistics?
Variance is a measure in statistics that shows how far a set of numbers is spread out from its average value. A small variance means the values lie close to the mean; a large variance means they are widely scattered.
One limitation of the variance is that it is expressed in squared units of the original variable, for example grams squared if the values are in grams. The standard deviation has the same units as the original values, which often makes it an easier measure of spread or dispersion to interpret.
How is the variance calculated?
It depends on which function you use, the VAR.P function or the VAR.S function. The VAR.P function is calculated by taking the average of squared deviations from the mean.
VAR.P function = Σ(x - x̄)²/n
The VAR.S function calculates the variance based on a sample of the population.
How are the variance and standard deviation related?
The standard deviation is the square root of the variance. The following formula shows how the standard deviation is calculated.
STDEV.P function = √(Σ(x - x̄)²/n)
What is the difference between the VAR.P function and the VAR.S function?
VAR.S function = Σ(x - x̄)²/(n-1)
VAR.P function = Σ(x - x̄)²/n
x is each value
x̄ is the mean of all values
n is the total number of observations
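To see the two divisors side by side, here is a small sketch in Python rather than Excel. It assumes Python's standard statistics module, whose pvariance and variance functions divide by n and n-1 respectively; the data values are made up for illustration.

```python
from statistics import pvariance, variance

# Hypothetical data, not taken from the article.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# The mean is 5 and the squared deviations sum to 32.
population_var = pvariance(data)  # 32 / 8, divides by n like VAR.P
sample_var = variance(data)       # 32 / 7, divides by n - 1 like VAR.S

print(population_var)
print(sample_var)
```

For the same data the sample variance is always the larger of the two, because the same sum of squared deviations is divided by a smaller number.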
2. Syntax
VAR.P(number1,[number2],...)
3. Arguments
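number1 | Required. The first number, cell reference, or cell range containing population values. |
number2 | Optional. Additional numbers, cell references, or cell ranges. Microsoft documents number arguments 2 to 254. |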
4. Example
A study is conducted to analyze the weight of a specific fish species between 12 and 15 months old. If the weight of every individual in that population is recorded, what is the variance of the weights for the entire population? Assume the weights follow a normal distribution.
The data points are in cell range B16:B25, here they are:
Weight |
163 |
157 |
170 |
167 |
178 |
173 |
198 |
163 |
208 |
161 |
The argument in this example is:
- number1 = B16:B25
There are 10 data points in this example.
Formula in cell E15:
=VAR.P(B16:B25)
Cell E15 returns 251.36 which represents the variance. The standard deviation is the square root of the variance. √251.36 equals 15.85 which is the same value that the STDEV.P function returns.
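The result can be cross-checked outside Excel. The following is a minimal sketch in Python, assuming the standard statistics module, where pvariance and pstdev play the roles of VAR.P and STDEV.P for the ten weights listed above.

```python
from statistics import pvariance, pstdev

# The ten fish weights from cell range B16:B25.
weights = [163, 157, 170, 167, 178, 173, 198, 163, 208, 161]

variance = pvariance(weights)  # population variance, like VAR.P
std_dev = pstdev(weights)      # population standard deviation, like STDEV.P

print(round(variance, 2))  # 251.36
print(round(std_dev, 2))   # 15.85
```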
The image above shows a chart containing a blue line that represents the normal distribution based on a mean of approximately 174 (173.8 before rounding) and a standard deviation of 15.85. The chart also marks 1σ, 2σ, 3σ, -1σ, -2σ, and -3σ from the mean. For a normal distribution, approximately:
- 68% of the data falls between μ ± 1σ
- 95% of the data falls between μ ± 2σ
- 99.7% of the data falls between μ ± 3σ
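These percentages follow from the normal distribution itself and do not depend on the mean or standard deviation. A short Python sketch, assuming the NormalDist class from the standard statistics module, reproduces them; the helper name share_within is only illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} sigma: {share:.1%}")
```

The printed shares are roughly 68.3%, 95.4%, and 99.7%, which the list above rounds to 68%, 95%, and 99.7%.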
4.1 Explaining the math formula for calculating the variance
This example demonstrates how to calculate the variance (VAR.P) of a population using these numbers: 10, 30, 25, 50, and 35.
The equation for VAR.P is:
VAR.P = Σ(x - x̄)²/n
x̄ is the population mean, AVERAGE(number1,number2,…)
n is the population size, meaning the number of values.
Step 1 - Calculating the average
To calculate an average you need to add up all the values and then divide by the number of values.
10 + 30 + 25 + 50 + 35 equals 150.
The number of values is five. 150 divided by 5 equals 30
Step 2 - Subtract average value from the set of numbers
10 - 30 = -20
30 - 30 = 0
25 - 30 = -5
50 - 30 = 20
35 - 30 = 5
Step 3 - Square the difference
(-20)^2 = 400
(0)^2 = 0
(-5)^2 = 25
(20)^2 = 400
(5)^2 = 25
Step 4 - Calculate a total
400 + 0 + 25 + 400 + 25 = 850
Step 5 - Divide total by the number of values
850 / 5 = 170
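The steps above can be sketched in Python to confirm the arithmetic. This is an illustration rather than how Excel computes the value internally; the final comparison assumes pvariance from Python's standard statistics module as a stand-in for VAR.P.

```python
from statistics import pvariance

values = [10, 30, 25, 50, 35]

# Step 1 - Calculate the average
mean = sum(values) / len(values)          # 150 / 5 = 30.0

# Step 2 - Subtract the average from each value
deviations = [x - mean for x in values]   # [-20.0, 0.0, -5.0, 20.0, 5.0]

# Step 3 - Square each difference
squared = [d ** 2 for d in deviations]    # [400.0, 0.0, 25.0, 400.0, 25.0]

# Step 4 - Calculate a total
total = sum(squared)                      # 850.0

# Step 5 - Divide the total by the number of values
variance = total / len(values)            # 170.0

print(variance)
print(variance == pvariance(values))      # True
```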
5. Why is the function returning an error?
If the VAR.P function returns an error, make sure that the source data has no error values, as the function cannot handle them.
The image above shows an error in cell B5. The VAR.P function returns an error because the source data in B3:B7 contains an error.
The IFERROR function lets you ignore error values. How to find errors in a worksheet
The VAR.P function returns an error in these cases:
- A #NAME? error is returned if you misspell the function name.
- Errors in the input propagate: if the input contains an error (e.g., #VALUE!, #REF!), the function returns the same error.
5.1 Troubleshooting the error value
When you encounter an error value in a cell, a warning symbol appears, as displayed in the image above. Press with mouse on it to see a pop-up menu that lets you get more information about the error.
- The first line describes the error if you press with left mouse button on it.
- The second line opens a pane that explains the error in greater detail.
- The third line takes you to the "Evaluate Formula" tool. A dialog box appears, allowing you to examine the formula in greater detail.
- The fourth line lets you ignore the error value, meaning the warning icon disappears. However, the error is still in the cell.
- The fifth line lets you edit the formula in the Formula bar.
- The sixth line opens the Excel settings so you can adjust the Error Checking Options.
Here are a few of the most common Excel errors you may encounter.
#NULL! error - This error occurs when a formula attempts to calculate the intersection of two ranges that do not actually intersect. Excel interprets a space character between two ranges as the intersection operator, so the error most often appears when a space is typed by mistake where another operator was intended. To fix this error, double-check that the ranges combined with the intersection operator actually have cells in common.
#SPILL! error - The #SPILL! error occurs in Excel versions that support dynamic arrays, such as Excel 365. It is returned when a dynamic array formula cannot expand into the cells it needs because cells below and/or to the right are not empty.
#DIV/0! error - This error happens if you try to divide a number by 0 (zero) or by a value that equates to zero, which is not possible mathematically.
#VALUE! error - The #VALUE! error occurs when a formula has a value of the wrong data type, such as text where a number is expected or dates that are evaluated as text.
#REF! error - The #REF! error happens when a cell reference is invalid. This can happen if a cell that is referenced by a formula is deleted.
#NAME? error - The #NAME? error happens if you misspell a function or a named range.
#NUM! error - The #NUM! error shows up when you try to use invalid numeric values in formulas, like the square root of a negative number.
#N/A error - The #N/A error happens when a value is not available for a formula or is not found in a given cell range, for example in the VLOOKUP or MATCH functions.
#GETTING_DATA error - The #GETTING_DATA error shows while external sources are loading. This can indicate a delay in fetching the data or that the external source is unavailable right now.
5.2 The formula returns an unexpected value
To understand why a formula returns an unexpected value we need to examine the calculation steps in detail. Luckily, Excel has a tool that is really handy in these situations. Here is how to troubleshoot a formula:
- Select the cell containing the formula you want to examine in detail.
- Go to tab “Formulas” on the ribbon.
- Press with left mouse button on "Evaluate Formula" button. A dialog box appears.
The formula appears in a white field inside the dialog box. Underlined expressions are calculations being processed in the next step. The italicized expression is the most recent result. The buttons at the bottom of the dialog box allow you to evaluate the formula in smaller calculations which you control.
- Press with left mouse button on the "Evaluate" button located at the bottom of the dialog box to process the underlined expression.
- Repeat pressing the "Evaluate" button until you have seen all calculations step by step. This allows you to examine the formula in greater detail and hopefully find the culprit.
- Press "Close" button to dismiss the dialog box.
There is also another way to debug formulas using the function key F9. F9 is especially useful if you have a feeling that a specific part of the formula is the issue. This makes it faster than the "Evaluate Formula" tool, since you don't need to go through all calculations to find the issue.
- Enter Edit mode: Double-press with left mouse button on the cell or press F2 to enter Edit mode for the formula.
- Select part of the formula: Highlight the specific part of the formula you want to evaluate. You can select and evaluate any part of the formula that could work as a standalone formula.
- Press F9: This will calculate and display the result of just that selected portion.
- Evaluate step-by-step: You can select and evaluate different parts of the formula to see intermediate results.
- Check for errors: This allows you to pinpoint which part of a complex formula may be causing an error.
The image above shows cell reference B3:B7 converted to hard-coded values using the F9 key. The VAR.P function requires non-error values, which is not the case in this example. We have found what is wrong with the formula.
Tips!
- View actual values: Selecting a cell reference and pressing F9 will show the actual values in those cells.
- Exit safely: Press Esc to exit Edit mode without changing the formula. Don't press Enter, as that would replace the formula part with the calculated value.
- Full recalculation: Pressing F9 outside of Edit mode will recalculate all formulas in the workbook.
Remember to be careful not to accidentally overwrite parts of your formula when using F9. Always exit with Esc rather than Enter to preserve the original formula. However, if you do overwrite the formula by mistake, it is not the end of the world. You can undo the action by pressing the keyboard shortcut CTRL + Z or the “Undo” button.
5.3 Other errors
Floating-point arithmetic may give inaccurate results in Excel - Article
Floating-point errors are usually very small, often beyond the 15th decimal place, and in most cases don't affect calculations significantly.
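The effect is not specific to Excel. Python floats use the same IEEE 754 double-precision format, so a short Python sketch can illustrate the kind of tiny discrepancy involved and a tolerance-based comparison that sidesteps it.

```python
import math

total = 0.1 + 0.2

# The sum is not exactly 0.3 because 0.1 and 0.2 have no exact binary representation.
exactly_equal = total == 0.3             # False
close_enough = math.isclose(total, 0.3)  # True, compares within a small tolerance

print(exactly_equal)
print(close_enough)
print(abs(total - 0.3))  # a difference far below any practical precision
```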
6. Sort rows by variance based on a population
This example demonstrates a formula in cell B8 that calculates the population variance of each row and sorts the rows in cell range B3:N6 by that variance, from large to small.
Formula in cell B8:
=SORTBY(B3:N6,BYROW(C3:N6,LAMBDA(a,VAR.P(a))),-1)
Cell range P3:P6 contains the variances of the source data, and P8:P11 contains the variances of the sorted rows.
This kind of calculation was very hard to perform in earlier Excel versions. Excel 365 has several new functions that are powerful and easy to understand.
Some of these new functions return an array of values; however, you simply enter the formulas as regular formulas. They spill values automatically to cells below and to the right as far as needed. A #SPILL! error tells you that at least one of the destination cells is not empty.
Explaining formula
Step 1 - Calculate the variance of a population
VAR.P(a)
Step 2 - Build the LAMBDA function
The LAMBDA function builds custom functions without VBA, macros, or JavaScript.
Function syntax: LAMBDA([parameter1, parameter2, …,] calculation)
LAMBDA(a,VAR.P(a))
Step 3 - Calculate the variance of a population by row
The BYROW function puts values from an array into a LAMBDA function row-wise.
Function syntax: BYROW(array, lambda(array, calculation))
BYROW(C3:N6,LAMBDA(a,VAR.P(a)))
returns
{47221.2222222222;81130.9722222222;87499.7222222222;41286.1388888889}
Step 4 - Sort rows based on the variance of a population
The SORTBY function sorts a cell range or array based on values in a corresponding range or array.
Function syntax: SORTBY(array, by_array1, [sort_order1], [by_array2, sort_order2],…)
SORTBY(B3:N6,BYROW(C3:N6,LAMBDA(a,VAR.P(a))),-1)
becomes
SORTBY(B3:N6,{47221.2222222222;81130.9722222222;87499.7222222222;41286.1388888889},-1)
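and returns the rows of B3:N6 sorted from the largest variance to the smallest: the third row (87499.72), the second row (81130.97), the first row (47221.22), and the fourth row (41286.14).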
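The same idea, computing one variance per row and sorting the rows by it in descending order, can be sketched in Python. The row labels and values below are hypothetical, since the worksheet data is not reproduced here, and pvariance from the standard statistics module stands in for VAR.P.

```python
from statistics import pvariance

# Hypothetical rows: a label followed by its values (not the worksheet data).
rows = [
    ("A", [10, 12, 11, 13]),
    ("B", [5, 50, 20, 90]),
    ("C", [30, 35, 25, 40]),
]

# One population variance per row, similar to BYROW(..., LAMBDA(a, VAR.P(a))).
variances = [pvariance(values) for _, values in rows]

# Sort rows by variance from large to small, similar to SORTBY(..., -1).
sorted_rows = sorted(rows, key=lambda row: pvariance(row[1]), reverse=True)
labels = [label for label, _ in sorted_rows]

print(variances)
print(labels)  # ['B', 'C', 'A']
```

Here the key function plays the role of the LAMBDA, and reverse=True corresponds to the -1 sort order argument.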
7. When to use the VAR.P function and when to use the VAR.S function?
The VAR.P function and the VAR.S function are both used to calculate the variance, however, they differ in how they calculate the variance. The VAR.P function assumes that the dataset is the entire population, while the VAR.S function assumes that the dataset is a sample of the population.
The difference between calculating the population and sample variance is that population variance divides by the number of observations in the population, while the sample variance divides by the number of observations minus one. This makes the sample variance larger than the population variance, because it tries to account for the uncertainty of estimating the population variance from a sample.
You should use the VAR.P function when you have data for an entire population, and use the VAR.S function when you have data for a sample of the population. It is sometimes not practical to collect millions of observations. In that case a sample size calculator is often useful, as it estimates how many observations are needed to meet a given confidence level and margin of error.