We consider a random variablexand a data setS ={x1, x2, , xn}of size n which contains possible values ofx. The data set can represent either the population being studied or a sample drawn from the population. The meanis the statistic used most often to characterize the center of the data in S. We now consider the following commonly used measures of variability of the data around the mean, namely the In addition, we also explore three other measures of variability that are not linked to the mean, namely the Of these statistics, the variance and standard deviation are most commonly employed.
WhereSrepresents a sample the The reason the expression for the population variance involves division by n while that of the sample variance involves division by n 1 isexplained in Property 3 ofEstimators, where division by n1is required to obtained an unbiased estimator of the population variance.
This is calculated as follows. First, the mean =(2+5-1+3+4+5+0+2)/8 = 2.5, and so the squared deviationSS = (22.5)2+ (52.5)2+(-12.5)2+(32.5)2+(42.5)2+(52.5)2 +(02.5)2+(22.5)2 = 34. Thus the variance = SS/n = 34/8 = 4.25 If instead S represents a sample, then the mean is still 2.5, but the variance = SS/(n1) = 34/7 = 4.86. These can be calculated in Excel by the formulas VARP(B3;B10) and VAR(B3:B10), as shown in Figure 2.
STDEV. The population standard deviation is calculated in Excel using the functionSTDEVP.In Excel 2010/2013 the alternative forms of these functions areSTDEV.SandSTDEV.P.
IfSis a sample, thenthe sample standard deviation = square root of the sample variance = = 2.20 These are the results of the formulas STDEVP(B3:B10) and STDEV(B3:B10), as shown in Figure 2.
The formula =VARCOL(J4:L11) produces the first result (in range J15:L15), while the formula =STDEVCOL(J4:L11) produces the second result (in range J16:L16). Remember that after entering either of these formulas you must press
Thus if µx= µy the combined population variance would be
Thus if x̄ = ȳ, the combined sample variance would be
The data in the two samples is given in the range B3:C7 of Figure 4. From these, the mean, variance and standard deviation are calculated for each of the two samples (ranges B12:B15 and C12:C15). Using Property 4, we can calculate the mean and variance of the combined sample (D13 and D14). If we simply combine the two samples we obtain the data in the range F3:F10, from which we can calculate the mean, variance and standard deviation in the normal way (range D12:D18). As we can see the results are the same.
We have three samples whose total sample size is 58 (cell B7), calculated via =SUM(B4:B6). The sum of the elements in each sample can be calculated from the mean as shown in range F4:F6. E.g. the sum of all the data elements in sample 1 is 276 (cell F4), calculated via the formula =B4*C4. Thus the sum of all the elements in all three samples is 786 (cell F7), calculated via the formula =SUM(F4:F6). The mean of the combined sample is therefore 13.5517 (cell C7), calculated via the formula =F7/B7. The calculation of the combined variance is similar. The key is to first find the sum of the squares of all the elements in each sample. These are given in range I4:I6. E.g. the sum of the squares of all the elements in sample 1 is 5512 (cell I4), calculated by =G4+H4 (using Property 1), where G4 contains the formula =B4*C4^2 and H4 contains =(B4-1)*D4. Thus the sum of squares of all the elements in the combined sample is 19,832 (cell I7), calculated by =SUM(I4:I6). Finally, the variance for the combined sample is 161.059 (cell D7), calculated by =(I7-B7*C7^2)/(B7-1), based on Property 1. The standard deviation is therefore 12.6909.
DEVSQ.
AVEDEV.
Median {|xi | : xi in S} where = median of the data elements in S.
=MEDIAN(ABS(R-MEDIAN(R))) Even though the value is presented in a single cell it is essential that you press Alternatively, you can use the function
You can achieve the same result using the formula =MAD(E3:E10) as shown in Figure 2.
=MAX(R) MIN(R) Alternatively, you can use the function
=QUARTILE(R, 3) QUARTILE(R, 1) Starting with Excel 2010 there is another version of the quartile function called QUARTILE.EXC. An alternative version of IQR is therefore =QUARTILE.EXC(R, 3) QUARTILE.EXC(R, 1) SeeRanking Functions in Excel for further information about the QUARTILE and QUARTILE.EXC functions. Alternatively, you can calculate the inter-quartile range via the function
V = s/x̄ Since s and x̄have the same units of measurement, Vhas no units of measurement. This statistic only makes sense for ratio scale data. The higher the value of V the more Clearly, the coefficient of variation is only defined when the mean is not zero.
=STDEV.S(R)/AVERAGE(R) The population version of Vis σ/μ which can becalculated in Excel by the formula =STDEV.P(R)/AVERAGE(R)
V = s/x̄ = 2.203892/2.5 = 88.16%
Since VA = .09/.12 = .75 and VB = .05/.08 = .625, stock B is considered to be the better investment since its relative risk (equal to its coefficient of variation) is lower. ## Video |