What Is P-Value?

In statistics, the p-value is the probability of obtaining results at least as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct. The p-value is used as an alternative to rejection points to provide the smallest level of significance at which the null hypothesis would be rejected. A smaller p-value means that there is stronger evidence in favor of the alternative hypothesis.

Key Takeaways

  • A p-value measures the probability of observing a difference at least as large as the one actually seen if only random chance were at work (that is, if the null hypothesis were true).
  • The lower the p-value, the greater the statistical significance of the observed difference.
  • A p-value can be used as an alternative to, or in addition to, preselected confidence levels for hypothesis testing.

How Is P-Value Calculated?

P-values are usually found using p-value tables or spreadsheets/statistical software. These calculations are based on the assumed or known probability distribution of the specific statistic being tested. P-values are calculated from the deviation between the observed value and a chosen reference value, given the probability distribution of the statistic, with a greater difference between the two values corresponding to a lower p-value.

Mathematically, the p-value is calculated using integral calculus from the area under the probability distribution curve for all values of the statistic that are at least as far from the reference value as the observed value is, relative to the total area under the probability distribution curve. In a nutshell, the greater the difference between the observed value and the reference value, the less likely it is that the difference is due to simple random chance, and this is reflected by a lower p-value.
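The tail-area calculation described above can be sketched in a few lines of Python. This is only an illustration under a stated assumption: the test statistic is taken to follow a normal distribution, which the article does not specify. The function name and the input values are hypothetical.

```python
from statistics import NormalDist

def two_tailed_p_value(observed, reference, std_error):
    """Two-tailed p-value, ASSUMING a normally distributed test statistic."""
    # Distance between observed and reference value, in standard errors.
    z = (observed - reference) / std_error
    # Area under the standard normal curve in both tails beyond |z|.
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Hypothetical inputs: same reference value, two different observed values.
p_small_gap = two_tailed_p_value(observed=0.5, reference=0.0, std_error=1.0)
p_large_gap = two_tailed_p_value(observed=2.5, reference=0.0, std_error=1.0)

print(f"small gap: p = {p_small_gap:.4f}")
print(f"large gap: p = {p_large_gap:.4f}")
```

The observed value that lies farther from the reference value yields the smaller p-value, which is the relationship the paragraph above describes.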

P-Value Approach to Hypothesis Testing

The p-value approach to hypothesis testing uses the calculated probability to determine whether there is evidence to reject the null hypothesis. The null hypothesis, also known as the conjecture, is the initial claim about a population (or data generating process). The alternative hypothesis states whether the population parameter differs from the value of the population parameter stated in the conjecture.

In practice, the significance level is stated in advance to determine how small the p-value must be in order to reject the null hypothesis. Because different researchers use different levels of significance when examining a question, a reader may sometimes have difficulty comparing results from two different tests. P-values provide a solution to this problem.

For example, suppose a study comparing returns from two particular assets was undertaken by different researchers who used the same data but different significance levels. The researchers might come to opposite conclusions regarding whether the assets differ. Suppose one researcher used a confidence level of 90%, the other required a confidence level of 95% to reject the null hypothesis, and the p-value of the observed difference between the two returns was 0.08 (corresponding to a confidence level of 92%). The first researcher would then find that the two assets have a difference that is statistically significant, while the second would find no statistically significant difference between the returns.
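The two researchers' decisions can be reproduced with a short sketch. The helper name `rejects_null` is hypothetical; the p-value and confidence levels are the ones from the example above.

```python
def rejects_null(p_value, confidence_level):
    """Reject the null hypothesis when the p-value is below alpha."""
    alpha = 1.0 - confidence_level  # significance level
    return p_value < alpha

p_value = 0.08  # p-value of the observed difference in returns

first_researcher = rejects_null(p_value, confidence_level=0.90)   # alpha = 0.10
second_researcher = rejects_null(p_value, confidence_level=0.95)  # alpha = 0.05

print("Researcher 1 finds a significant difference:", first_researcher)
print("Researcher 2 finds a significant difference:", second_researcher)
```

Both researchers see the same p-value; only the threshold each chose in advance differs.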

To avoid this problem, the researchers could report the p-value of the hypothesis test and allow readers to interpret the statistical significance themselves. This is called a p-value approach to hypothesis testing. Independent observers could note the p-value and decide for themselves whether that represents a statistically significant difference or not.

Example of P-Value

An investor claims that their investment portfolio's performance is equivalent to that of the Standard & Poor's (S&P) 500 Index. To determine this, the investor conducts a two-tailed test. The null hypothesis states that the portfolio's returns are equivalent to the S&P 500's returns over a specified period, while the alternative hypothesis states that the portfolio's returns and the S&P 500's returns are not equivalent. If the investor conducted a one-tailed test, the alternative hypothesis would state that the portfolio's returns are either less than or greater than the S&P 500's returns, depending on the direction being tested.
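A rough sketch of how such a test might be run is shown below. Everything specific here is an assumption made for illustration: the monthly return figures are invented, the function name is hypothetical, and a normal approximation is used on the paired return differences in place of whatever test the investor would actually choose (with a sample this small, a t-distribution would usually be preferred).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# HYPOTHETICAL monthly returns in percent, invented for illustration only.
portfolio = [1.2, -0.4, 2.1, 0.8, -1.0, 1.5, 0.3, 2.4, -0.2, 1.1, 0.9, 1.7]
sp500     = [1.0, -0.6, 1.8, 0.9, -1.3, 1.2, 0.1, 2.0, -0.5, 1.0, 0.6, 1.4]

def paired_z_test(a, b):
    """Return (z, two-tailed p, one-tailed p for 'a greater than b').

    ASSUMES the mean paired difference is approximately normal.
    """
    diffs = [x - y for x, y in zip(a, b)]
    std_error = stdev(diffs) / sqrt(len(diffs))
    z = mean(diffs) / std_error  # null hypothesis: mean difference is 0
    upper_tail = 1.0 - NormalDist().cdf(z)
    two_tailed = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, two_tailed, upper_tail

z, p_two_tailed, p_one_tailed = paired_z_test(portfolio, sp500)
print(f"z = {z:.2f}")
print(f"two-tailed p-value: {p_two_tailed:.2e}")
print(f"one-tailed p-value (portfolio > index): {p_one_tailed:.2e}")
```

The two-tailed p-value counts deviations in both directions, while the one-tailed version counts only the direction named in the alternative hypothesis.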

The p-value hypothesis test does not necessarily make use of a preselected confidence level at which the investor should reject the null hypothesis that the returns are equivalent. Instead, it provides a measure of how much evidence there is to reject the null hypothesis. The smaller the p-value, the greater the evidence against the null hypothesis. Thus, if the investor finds that the p-value is 0.001, there is strong evidence against the null hypothesis, and the investor can confidently conclude that the portfolio's returns and the S&P 500's returns are not equivalent.

Although this does not provide an exact threshold as to when the investor should accept or reject the null hypothesis, it does have another very practical advantage. P-value hypothesis testing offers a direct way to compare the relative confidence that the investor can have when choosing among multiple different types of investments or portfolios relative to a benchmark such as the S&P 500.

For example, for two portfolios, A and B, whose performance differs from the S&P 500 with p-values of 0.10 and 0.01, respectively, the investor can be much more confident that portfolio B, with a lower p-value, will actually show consistently different results.
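One way to make this comparison concrete is to convert each p-value back into the size of deviation it implies. The sketch below assumes two-tailed tests with a normally distributed statistic, which the article does not state; the helper name `implied_abs_z` is hypothetical.

```python
from statistics import NormalDist

def implied_abs_z(p_value):
    """|z| whose two-tailed p-value equals p_value, ASSUMING normality."""
    return NormalDist().inv_cdf(1.0 - p_value / 2.0)

# P-values from the example: each portfolio tested against the S&P 500.
p_values = {"A": 0.10, "B": 0.01}

z_scores = {name: implied_abs_z(p) for name, p in p_values.items()}
strongest = min(p_values, key=p_values.get)  # lowest p-value

for name, p in p_values.items():
    print(f"Portfolio {name}: p = {p:.2f}, implied |z| = {z_scores[name]:.2f}")
print("Strongest evidence of a difference: portfolio", strongest)
```

Under these assumptions, the lower p-value corresponds to an observed difference that sits more standard errors away from the benchmark.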