P-Value Calculator

P-Value from Z-Score

P-Value

0.0500

Significance at α = 0.05

Statistically Significant

The p-value (0.0500) is ≤ 0.05. Reject the null hypothesis.

How P-Values Work

A p-value is the probability of observing a test statistic as extreme as (or more extreme than) the one calculated from your sample data, assuming the null hypothesis is true. It is the cornerstone of null hypothesis significance testing (NHST), the most widely used framework for statistical inference in science, medicine, social science, and business analytics.

The concept was formalized by statistician Ronald Fisher in the 1920s. According to the American Statistical Association (ASA), a p-value does not measure the probability that the studied hypothesis is true, nor does it measure the probability that the data were produced by random chance alone. Rather, it quantifies the compatibility of the observed data with a specified statistical model (the null hypothesis).

P-values are used across virtually every quantitative discipline. Medical researchers use them to evaluate drug efficacy in clinical trials. Social scientists use them to assess the significance of survey results. Business analysts use them in A/B testing to determine whether website changes affect conversion rates. Our Standard Deviation Calculator and Chi-Square Calculator are companion tools for the underlying statistical computations.

How P-Values Are Calculated

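P-values are calculated from the cumulative distribution function (CDF) of the probability distribution that matches the test being used: the standard normal distribution for a z-test, Student's t-distribution for a t-test, and the chi-square distribution for a chi-square test.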

Worked example (z-score): You test whether a sample mean differs from the population mean and get z = 2.33. Two-tailed p-value = 2 × P(Z > 2.33) = 2 × 0.0099 = 0.0198. Since 0.0198 < 0.05, the result is statistically significant at the 5% level.
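The worked example above can be reproduced with Python's standard library. This is a minimal sketch, assuming the test statistic follows a standard normal distribution under the null hypothesis; the function name `p_value_from_z` and its `tails` parameter are illustrative and are not taken from the calculator itself.

```python
import math

def p_value_from_z(z: float, tails: int = 2) -> float:
    """P-value for a z-score under the standard normal distribution."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Upper-tail probability P(Z > |z|), via the complementary error function.
    # For tails=1 this assumes the tail lies in the direction of the observed z.
    upper_tail = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return tails * upper_tail

p = p_value_from_z(2.33)
print(f"two-tailed p = {p:.4f}")  # prints: two-tailed p = 0.0198
```

Using `math.erfc` rather than `1 - CDF` keeps precision in the far tail, where subtracting from 1 would lose significant digits.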

Key Terms You Should Know

Null hypothesis (H0): The default assumption that there is no effect or no difference. The p-value measures evidence against this hypothesis.

Alternative hypothesis (H1 or Ha): The hypothesis that there is a real effect or difference. It is what you hope to support with evidence.

Significance level (α): The predetermined threshold for rejecting the null hypothesis. Commonly 0.05 (5%), 0.01 (1%), or 0.001 (0.1%). If the p-value is at or below α, the result is declared statistically significant.

Type I error (false positive): Rejecting the null hypothesis when it is actually true. When the test's assumptions hold, the probability of this equals α. At α = 0.05, a true null hypothesis is wrongly rejected 5% of the time.

Type II error (false negative): Failing to reject the null hypothesis when it is actually false. The probability is denoted β. Statistical power = 1 − β.
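The relationship between α, β, and power can be made concrete for the simplest case. The sketch below assumes a two-sided z-test and a known standardized effect; the function `z_test_power` and its `effect` parameter (the true mean shift divided by the standard error) are hypothetical names chosen for illustration.

```python
from statistics import NormalDist

def z_test_power(effect: float, alpha: float = 0.05) -> float:
    """Power of a two-sided z-test for a given standardized effect."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # rejection cutoff for |Z|
    # Under the alternative, Z ~ Normal(effect, 1); add both rejection tails.
    return std_normal.cdf(effect - z_crit) + std_normal.cdf(-effect - z_crit)

power = z_test_power(2.8)
beta = 1 - power  # Type II error rate
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

With `effect = 0` the function returns α itself, which matches the definition of the Type I error rate: when there is no real effect, every rejection is a false positive.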

Effect size: A measure of the magnitude of the difference or relationship, independent of sample size. Common measures include Cohen's d, Pearson's r, and odds ratios. The ASA recommends always reporting effect sizes alongside p-values.
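As an illustration of one effect size measure, the sketch below computes Cohen's d for two independent groups using the pooled standard deviation. The function name `cohens_d` and the sample data are made up for this example.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b) -> float:
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

treatment = [2, 4, 6]  # illustrative data only
control = [1, 3, 5]
print(f"Cohen's d = {cohens_d(treatment, control):.2f}")  # prints: Cohen's d = 0.50
```

Because d is expressed in standard deviation units, it does not grow with sample size the way a test statistic does.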

Common Significance Thresholds by Field

Different scientific disciplines use different significance thresholds. A 2019 Nature commentary endorsed by more than 800 researchers argued that the arbitrary p < 0.05 threshold leads to both false confidence and false despair. The table below shows conventions across fields, based on published standards from each discipline.

Field | Typical α | P-Value Threshold | Notes
Social Sciences | 0.05 | p < 0.05 | Standard since Fisher; replication crisis debate ongoing
Medical Research | 0.05 or 0.01 | p < 0.05 | FDA typically requires p < 0.05 in Phase III trials
Particle Physics | 0.0000003 | 5σ (p < 3 × 10⁻⁷) | Used for discovery claims (e.g., Higgs boson)
Genomics (GWAS) | 5 × 10⁻⁸ | p < 5 × 10⁻⁸ | Bonferroni correction for ~1 million SNP tests
A/B Testing (Tech) | 0.05–0.10 | p < 0.05 or 0.10 | Often combined with minimum detectable effect (MDE)
Proposed New Standard | 0.005 | p < 0.005 | 2017 Nature Human Behaviour proposal by 72 statisticians
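Two of the thresholds in the table follow from short calculations. The sketch below assumes the particle physics convention of a one-sided tail for the 5σ criterion, and a plain Bonferroni division for the genomics threshold; the helper names `sigma_to_p` and `bonferroni_alpha` are illustrative.

```python
import math

def sigma_to_p(n_sigma: float, two_sided: bool = False) -> float:
    """Tail probability of a standard normal beyond n_sigma standard deviations."""
    tail = 0.5 * math.erfc(n_sigma / math.sqrt(2))
    return 2 * tail if two_sided else tail

def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_tests

print(f"5 sigma, one-sided: p = {sigma_to_p(5):.2e}")
print(f"GWAS threshold: {bonferroni_alpha(0.05, 1_000_000):.0e}")
```

The first value comes out just under 3 × 10⁻⁷ and the second is 0.05 divided by one million, matching the table rows.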

Practical Examples

Example 1 (medical trial): A clinical trial tests a new drug against a placebo. The t-test produces t = 2.89 with 48 degrees of freedom, giving a two-tailed p-value of 0.0058. Since p < 0.05, the drug shows a statistically significant effect. The researchers also report Cohen's d = 0.42, a small-to-medium effect by Cohen's conventions (0.2 small, 0.5 medium, 0.8 large), which gives readers the context needed to judge whether the effect is clinically meaningful.
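The p-value in the medical trial example can be checked without a statistics package. This is a rough sketch that integrates the Student's t density numerically with Simpson's rule; a library routine such as SciPy's t-distribution functions would normally be preferred. The function names and the `steps` parameter are assumptions made for illustration.

```python
import math

def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p_from_t(t: float, df: float, steps: int = 2000) -> float:
    """Two-tailed p-value; steps must be even for Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3  # P(-|t| < T < |t|) by symmetry
    return 1.0 - central_mass

print(f"p = {two_tailed_p_from_t(2.89, 48):.4f}")
```

The result should land close to the 0.0058 quoted above. Because the tail is obtained by subtracting from 1, this approach loses precision for very large |t|.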

Example 2 (A/B test): A website runs an experiment comparing two homepage designs. After 10,000 visitors per group, the conversion rate is 3.2% (control) vs 3.8% (treatment). A pooled two-proportion z-test on these figures produces z ≈ 2.31, giving a two-tailed p ≈ 0.021. The result is statistically significant at the 5% level, and the 0.6 percentage point lift translates to meaningful revenue impact. Use our Percentage Increase Calculator to quantify the improvement.
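A two-proportion z-test like the one in the A/B example can be sketched as follows. This version assumes the pooled standard error, which is one common choice; the counts 320 and 380 are what the stated rates imply if they are exact, and rounded rates or a different variance estimate would shift z somewhat. The function name `two_proportion_z_test` is hypothetical.

```python
import math

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-tailed p) for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Under H0 both groups share one rate, so pool the conversions.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * P(Z > |z|)
    return z, p_value

# Control: 3.2% of 10,000; treatment: 3.8% of 10,000 (counts assumed exact).
z, p = two_proportion_z_test(320, 10_000, 380, 10_000)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```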

Example 3 (chi-square test): A survey asks 200 people their preferred social media platform across 4 age groups. The chi-square test produces χ² = 15.4 with 9 degrees of freedom, giving a p-value of about 0.081. Since p > 0.05, we fail to reject the null hypothesis; the data do not provide sufficient evidence that platform preference differs by age group.
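The chi-square p-value is the upper tail of the chi-square distribution, which can be written in terms of the regularized incomplete gamma function. The sketch below evaluates it with a power series using only the standard library; it is adequate for moderate statistics like the one in this example, but a library routine would be more robust for very large values. The name `chi_square_p_value` and the `max_terms` cap are illustrative.

```python
import math

def chi_square_p_value(chi2: float, df: int, max_terms: int = 500) -> float:
    """Upper-tail probability P(X >= chi2) for a chi-square variable with df degrees of freedom."""
    if chi2 <= 0:
        return 1.0
    a, x = df / 2, chi2 / 2
    # Series for the regularized lower incomplete gamma function P(a, x).
    term = 1.0 / a
    series = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        series += term
        if term < series * 1e-15:
            break
    lower = series * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return 1.0 - lower

print(f"p = {chi_square_p_value(15.4, 9):.3f}")
```

For the survey example this gives a value close to the one quoted above, which is greater than 0.05.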

The Replication Crisis and P-Values

The "replication crisis" in science has brought p-values under intense scrutiny. A landmark 2015 study by the Open Science Collaboration attempted to replicate 100 published psychology experiments and found that only 36% produced statistically significant results on replication, compared to 97% in the original publications. This has led to calls for pre-registration of studies, transparent reporting, and greater emphasis on effect sizes and confidence intervals alongside p-values.

In 2016, the American Statistical Association issued its first formal statement on p-values, emphasizing six principles including that p-values do not measure the importance of a result and that scientific conclusions should not be based solely on whether a p-value passes a specific threshold. This statement has been cited over 7,000 times and has influenced reporting standards across multiple journals.
