Compute p-values for Z, T, Chi-square, or F test statistics.
Free online p-value calculator for Z-test, T-test, Chi-square, and F-test. One-tailed and two-tailed tests with significance classification, interactive distribution curve, and Python scipy export.
A p-value is the probability of observing a test statistic at least as extreme as the one computed from the sample data, assuming the null hypothesis (H₀) is true. It is a fundamental concept in hypothesis testing and helps researchers decide whether to reject H₀.
The p-value quantifies how likely the observed data, or data more extreme, would be if H₀ were true. It ranges from 0 to 1.
A smaller p-value provides stronger evidence against the null hypothesis. It does NOT measure the probability that H₀ is true.
Compare the p-value to the significance level α. If p ≤ α, reject H₀. Common thresholds are 0.05, 0.01, and 0.10.
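The decision rule can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the calculator's own code; the function names and the value z = 2.1 are hypothetical, and the test is assumed to be a two-tailed Z-test.

```python
from statistics import NormalDist


def two_tailed_z_p_value(z: float) -> float:
    """Two-tailed p-value for a Z statistic under the standard normal."""
    # Probability mass in both tails beyond |z|.
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def reject_null(p_value: float, alpha: float = 0.05) -> bool:
    """Apply the rule from the text: reject H0 when p <= alpha."""
    return p_value <= alpha


z = 2.1  # hypothetical test statistic, for illustration only
p = two_tailed_z_p_value(z)
print(f"z = {z}, p = {p:.4f}")
print("reject H0 at alpha = 0.05:", reject_null(p, alpha=0.05))
print("reject H0 at alpha = 0.01:", reject_null(p, alpha=0.01))
```

The same statistic can lead to different decisions at different α levels, which is why α should be fixed before the test is run.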
The significance of a p-value depends on the chosen α level. Below are commonly used thresholds and their interpretations.
Highly Significant: Very strong evidence against the null hypothesis. Often denoted with *** in research papers.
Significant: Sufficient evidence to reject H₀ at the standard 5% level. The most common threshold in science.
Marginal: Weak evidence against H₀. Sometimes called “trending toward significance.” Use with caution.
Not Significant: Insufficient evidence to reject H₀. This does not prove H₀ is true, only that the data are consistent with it.
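A classification along these lines might look like the sketch below. The cutoffs are assumptions: the text names 0.01, 0.05, and 0.10 as common thresholds but does not say which cutoff belongs to which label, so they are exposed as parameters. The function name is hypothetical.

```python
def classify_p_value(
    p: float,
    highly: float = 0.01,       # assumed cutoff for "Highly Significant"
    significant: float = 0.05,  # the standard 5% level named in the text
    marginal: float = 0.10,     # assumed cutoff for "Marginal"
) -> str:
    """Map a p-value to one of the four labels described above."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p <= highly:
        return "Highly Significant"
    if p <= significant:
        return "Significant"
    if p <= marginal:
        return "Marginal"
    return "Not Significant"


for example_p in (0.003, 0.03, 0.07, 0.5):
    print(example_p, "->", classify_p_value(example_p))
```

Some fields reserve *** for p ≤ 0.001; passing `highly=0.001` adapts the sketch to that convention.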
| Distribution | Parameters | Common Use |
|---|---|---|
| Z (Normal) | None (standard) | Large samples, known σ, proportions |
| t (Student's) | df = n − 1 | Small samples, unknown σ |
| χ² (Chi-square) | df = k − 1 | Categorical data, goodness-of-fit |
| F (Fisher) | df₁, df₂ | ANOVA, variance comparison |
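As a rough illustration of how the four distributions in the table map onto SciPy, the sketch below computes one p-value from each using the survival function `sf` (which is 1 − CDF). The test statistics and degrees of freedom are made-up values, and the choice of tails (two-tailed for Z and t, upper tail for χ² and F) is an assumption about typical usage rather than a description of this calculator's output.

```python
from scipy import stats

# Z: standard normal, no parameters. Two-tailed.
z_p = 2 * stats.norm.sf(abs(1.96))

# t: Student's t with df = n - 1 (here n = 15, so df = 14). Two-tailed.
t_p = 2 * stats.t.sf(abs(2.5), df=14)

# Chi-square: df = k - 1 (here k = 4 categories, so df = 3). Upper tail.
chi2_p = stats.chi2.sf(7.81, df=3)

# F: numerator and denominator degrees of freedom (df1 = 2, df2 = 27). Upper tail.
f_p = stats.f.sf(4.0, dfn=2, dfd=27)

for name, p in [("Z", z_p), ("t", t_p), ("chi-square", chi2_p), ("F", f_p)]:
    print(f"{name:>10}: p = {p:.4f}")
```

Using `sf` rather than `1 - cdf` keeps more precision when the p-value is very small.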
A one-tailed test checks for an effect in a specific direction (e.g., the mean is greater than, or less than, a value). All of α is concentrated in one tail, giving more power in that direction.
A two-tailed test checks for any difference in either direction. The α is split between both tails (α/2 each). It is more conservative, but it is the appropriate choice when the direction is not specified in advance.
Important: The choice between one-tailed and two-tailed tests must be made before looking at the data. Choosing after seeing results inflates the false positive rate.
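The effect of the tail choice can be sketched for a Z statistic as follows. This uses only the standard library; the function name, the `tail` labels, and the value z = 1.8 are hypothetical and chosen so that the two kinds of test disagree at α = 0.05.

```python
from statistics import NormalDist


def z_p_value(z: float, tail: str = "two") -> float:
    """p-value for a Z statistic; tail is 'left', 'right', or 'two'."""
    cdf = NormalDist().cdf(z)
    if tail == "left":    # H1: parameter is less than the null value
        return cdf
    if tail == "right":   # H1: parameter is greater than the null value
        return 1.0 - cdf
    if tail == "two":     # H1: parameter differs in either direction
        return 2.0 * min(cdf, 1.0 - cdf)
    raise ValueError("tail must be 'left', 'right', or 'two'")


z = 1.8  # hypothetical test statistic
alpha = 0.05
for tail in ("right", "two", "left"):
    p = z_p_value(z, tail)
    print(f"{tail:>5}-tailed: p = {p:.4f}, reject H0: {p <= alpha}")
```

With this statistic the right-tailed test rejects H₀ at α = 0.05 while the two-tailed test does not, which shows why picking the tail after seeing the data inflates the false positive rate.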
It is incorrect to read the p-value as the probability that H₀ is true. The p-value is the probability of the data (or more extreme data) given that H₀ is true. It is P(data | H₀), not P(H₀ | data). Bayesian methods are needed for the latter.
It is also incorrect to treat 1 − p as the probability that the alternative hypothesis is true. A p-value of 0.03 does not mean there is a 97% chance the alternative is true. The p-value tells you about the data under H₀, not about the probability of any hypothesis.
A non-significant result does not prove H₀ either. Failure to reject H₀ may simply mean the sample size was too small to detect the effect. “Absence of evidence is not evidence of absence.”
The correct reading is: “Assuming H₀ is true, the probability of observing a test statistic as extreme as or more extreme than the one observed is p.” Always interpret the p-value in context.
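The correct interpretation can be checked numerically with a small simulation: draw many test statistics from a world in which H₀ holds and count how often they are at least as extreme as the observed one. The sketch below assumes a two-tailed Z-test; the observed value, the number of draws, and the seed are arbitrary illustrative choices.

```python
import random
from statistics import NormalDist


def simulated_two_tailed_p(z_observed: float, n_draws: int = 200_000, seed: int = 0) -> float:
    """Fraction of Z statistics simulated under H0 that are at least as extreme as z_observed."""
    rng = random.Random(seed)
    threshold = abs(z_observed)
    # Under H0 the Z statistic follows a standard normal distribution.
    hits = sum(1 for _ in range(n_draws) if abs(rng.gauss(0.0, 1.0)) >= threshold)
    return hits / n_draws


z_observed = 2.1  # hypothetical observed statistic
analytic = 2.0 * (1.0 - NormalDist().cdf(abs(z_observed)))
simulated = simulated_two_tailed_p(z_observed)
print(f"analytic p = {analytic:.4f}, simulated p = {simulated:.4f}")
```

The two numbers should agree closely. Nothing in the simulation says how probable H₀ itself is; H₀ is simply assumed throughout.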