# Under what circumstances is it necessary to use the coefficient of variation to compare relative variability between two or more distributions?


## COEFFICIENT OF VARIATION

Name:
COEFFICIENT OF VARIATION (LET)
Type:
Let Subcommand
Purpose:
Compute the coefficient of variation of a variable.
Description:
The sample coefficient of variation (CV) is defined as the ratio of the standard deviation to the mean:
$$\mbox{cv} = \frac{s}{\bar{x}}$$

where s is the sample standard deviation and $$\bar{x}$$ is the sample mean.

That is, it shows the variability, as defined by the standard deviation, relative to the mean.
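The definition above can be sketched in a few lines of Python. This is an illustration only, not Dataplot code: the function name `coefficient_of_variation` and the sample data are hypothetical, and the sketch assumes the sample standard deviation uses the n - 1 denominator.

```python
import statistics


def coefficient_of_variation(values):
    """Sample CV: sample standard deviation (n - 1 denominator) over the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("the CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical data, chosen only to make the arithmetic easy to follow.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
cv = coefficient_of_variation(data)
print(f"mean = {statistics.mean(data)}, cv = {cv:.6f}")
```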

The coefficient of variation should typically only be used for data measured on a ratio scale. That is, the data should be continuous and have a meaningful zero. Measurement data in the physical sciences and engineering are often on a ratio scale. As an example, temperatures measured on a Kelvin scale are on a ratio scale, while temperatures measured on a Celsius or Fahrenheit scale are on interval scales rather than ratio scales. Given a set of temperature measurements, the coefficient of variation on the Celsius scale will be different from the coefficient of variation on the Fahrenheit scale.
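The temperature example can be checked numerically. The following Python sketch (not Dataplot code) uses made-up readings and a hypothetical helper `cv`. The Rankine scale is not mentioned in the text above and is included here only as a second ratio scale for comparison.

```python
import statistics


def cv(values):
    """Sample coefficient of variation (n - 1 standard deviation over the mean)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical temperature readings, expressed on four scales.
kelvin = [293.15, 295.15, 298.15, 300.15, 303.15]
celsius = [k - 273.15 for k in kelvin]          # interval scale
fahrenheit = [c * 9 / 5 + 32 for c in celsius]  # interval scale
rankine = [k * 9 / 5 for k in kelvin]           # ratio scale (added for comparison)

cv_k, cv_c, cv_f, cv_r = cv(kelvin), cv(celsius), cv(fahrenheit), cv(rankine)

# The two ratio scales agree with each other; the two interval scales do not.
print(f"Kelvin {cv_k:.4f}  Rankine {cv_r:.4f}")
print(f"Celsius {cv_c:.4f}  Fahrenheit {cv_f:.4f}")
```

The shift by 273.15 or 32 changes the mean without changing the spread, which is why the Celsius and Fahrenheit values disagree with each other and with the Kelvin value.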

The coefficient of variation is sometimes preferred to the standard deviation because its value is independent of the unit of measurement (as long as the data are on a ratio scale). When comparing variability between data sets with different measurement scales or very different mean values, the coefficient of variation can be a useful alternative or complement to the standard deviation.
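As a small illustration of unit independence, the Python sketch below (not Dataplot code; the data and the helper `cv` are hypothetical) expresses the same lengths in meters and in centimeters. The standard deviation changes by the conversion factor, while the coefficient of variation does not.

```python
import statistics


def cv(values):
    """Sample coefficient of variation (n - 1 standard deviation over the mean)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical lengths, first in meters and then converted to centimeters.
lengths_m = [1.52, 1.60, 1.68, 1.75, 1.83]
lengths_cm = [100 * x for x in lengths_m]

# The standard deviation scales with the unit; the CV is unchanged.
sd_ratio = statistics.stdev(lengths_cm) / statistics.stdev(lengths_m)
cv_m, cv_cm = cv(lengths_m), cv(lengths_cm)
print(f"sd ratio = {sd_ratio:.3f}, cv (m) = {cv_m:.6f}, cv (cm) = {cv_cm:.6f}")
```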

However, the coefficient of variation should not be used for data that are not on a ratio scale. In addition, when the mean is near zero, the coefficient of variation is sensitive to small changes in the mean. Finally, the coefficient of variation cannot be used to compute confidence intervals for the mean.
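The sensitivity near a zero mean can be seen in a short Python sketch (not Dataplot code; the data and the helper `cv` are hypothetical). Two data sets with identical spread and means of about 0.01 and 0.02 yield coefficients of variation that differ by a factor of about two.

```python
import statistics


def cv(values):
    """Sample coefficient of variation (n - 1 standard deviation over the mean)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical data centered at zero, then shifted by two small amounts.
base = [-1.0, -0.5, 0.0, 0.5, 1.0]
shifted_a = [x + 0.01 for x in base]  # mean of about 0.01
shifted_b = [x + 0.02 for x in base]  # mean of about 0.02

# Same spread, nearly the same location, very different CV values.
cv_a, cv_b = cv(shifted_a), cv(shifted_b)
print(f"cv_a = {cv_a:.2f}, cv_b = {cv_b:.2f}")
```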

Syntax 1:
LET <par> = COEFFICIENT OF VARIATION <y>
<SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
<par> is a parameter where the coefficient of variation value is saved;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Syntax 2:
LET <par> = UNBIASED COEFFICIENT OF VARIATION <y>
<SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
<par> is a parameter where the unbiased coefficient of variation value is saved;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

For normally distributed data, an unbiased estimate of the coefficient of variation is

$$\mbox{cv*} = (1 + \frac{1}{4n}) \mbox{cv}$$

where n is the sample size and cv is $$s/\bar{x}$$.
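The correction factor can be sketched in Python as follows. This is an illustration of the formula above rather than Dataplot's implementation: the function name `unbiased_cv` and the data are hypothetical, and the sketch assumes the n - 1 sample standard deviation.

```python
import statistics


def unbiased_cv(values):
    """Apply the (1 + 1/(4n)) correction to the sample CV."""
    n = len(values)
    cv = statistics.stdev(values) / statistics.mean(values)
    return (1 + 1 / (4 * n)) * cv


# Hypothetical data with n = 8, so the correction factor is 1 + 1/32.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
plain_cv = statistics.stdev(data) / statistics.mean(data)
corrected_cv = unbiased_cv(data)
print(f"cv = {plain_cv:.6f}, cv* = {corrected_cv:.6f}")
```

The correction shrinks toward 1 as n grows, so the two estimates are nearly identical for large samples.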

Syntax 3:
LET <par> = LOGNORMAL COEFFICIENT OF VARIATION <y>
<SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
<par> is a parameter where the lognormal coefficient of variation value is saved;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

For lognormally distributed data, a more accurate estimate for the coefficient of variation (based on the population mean and standard deviation of the lognormal distribution) is

$$\mbox{cv}_{\mbox{ln}} = \sqrt{\exp(s_{\mbox{ln}}^2) - 1}$$

where $$s_{\mbox{ln}}^2$$ is the variance of the log of the data.
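A minimal Python sketch of this formula follows. It is not Dataplot's implementation: the function name `lognormal_cv` is hypothetical, and the sketch assumes natural logarithms and the n - 1 sample variance of the logged data.

```python
import math
import statistics


def lognormal_cv(values):
    """CV estimate sqrt(exp(s_ln^2) - 1), with s_ln^2 the variance of log(values)."""
    log_values = [math.log(v) for v in values]  # requires strictly positive data
    s_ln_sq = statistics.variance(log_values)
    return math.sqrt(math.exp(s_ln_sq) - 1)


# Hypothetical data whose natural logs are 0, 1, and 2, so s_ln^2 is 1.
data = [1.0, math.e, math.e ** 2]
cv_ln = lognormal_cv(data)
print(f"lognormal cv = {cv_ln:.6f}")
```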

Examples:
LET CV = COEFFICIENT OF VARIATION Y1
LET CV = COEFFICIENT OF VARIATION Y1 SUBSET TAG > 2
LET CV = UNBIASED COEFFICIENT OF VARIATION Y1
LET CV = LOGNORMAL COEFFICIENT OF VARIATION Y1
Note:
Versions prior to 1994/11 treated this command as a synonym for RELATIVE STANDARD DEVIATION. The relative standard deviation is:
$$\mbox{relsd} = 100 \frac{s}{|\bar{x}|}$$

That is, the relative standard deviation is the absolute value of the coefficient of variation expressed in percentage units.
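The relationship between the two statistics can be illustrated in Python (not Dataplot code; the function name `relative_sd` and the data are hypothetical). A negative mean is used so that the effect of the absolute value is visible.

```python
import statistics


def relative_sd(values):
    """Relative standard deviation: 100 * s / |mean|, in percentage units."""
    return 100 * statistics.stdev(values) / abs(statistics.mean(values))


# Hypothetical data with a negative mean: the CV is negative, the relsd is not.
data = [-2.0, -4.0, -4.0, -4.0, -5.0, -5.0, -7.0, -9.0]
cv = statistics.stdev(data) / statistics.mean(data)
relsd = relative_sd(data)
print(f"cv = {cv:.6f}, relsd = {relsd:.4f}")
```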

Note:
Dataplot statistics can be used in a number of commands. For details, enter
HELP STATISTICS
Default:
None
Synonyms:
COEFFICIENT VARIATION
Related Commands:
COEFFICIENT OF VARIATION CONFIDENCE LIMIT = Compute confidence limits for the coefficient of variation.
COEFFICIENT OF DISPERSION = Compute the coefficient of dispersion of a variable.
QUARTILE COEFFICIENT OF DISPERSION = Compute the quartile coefficient of dispersion of a variable.
RELATIVE STANDARD DEVIATION = Compute the relative standard deviation of a variable.
MEAN = Compute the mean of a variable.
STANDARD DEVIATION = Compute the standard deviation of a variable.
Applications:
Data Analysis
Implementation Date:
1994/11 (earlier versions use a different definition)
2017/01 Added the UNBIASED COEFFICIENT OF VARIATION
2017/01 Added the LOGNORMAL COEFFICIENT OF VARIATION
Program 1:
LET Y1 = NORMAL RANDOM NUMBERS FOR I = 1 1 100
LET CV = COEFFICIENT OF VARIATION Y1
Program 2:
. Step 1: Create the data
.
skip 25
read gear.dat y x
skip 0
set write decimals 6
.
. Step 2: Define plot control
.
title case asis
title offset 2
label case asis
.
y1label Coefficient of Variation
x1label Group
title Coefficient of Variation for GEAR.DAT
let ngroup = unique x
xlimits 1 ngroup
major x1tic mark number ngroup
minor x1tic mark number 0
tic mark offset units data
x1tic mark offset 0.5 0.5
y1tic mark label decimals 3
.
character X
line blank
.
set statistic plot reference line average
.
coefficient of variation plot y x