metrics

p95 and p99 latency

p95 and p99 latency show the slow tail of response times: 95% or 99% of requests were at or below that value.

Also known as: percentile latency, tail latency

Definition

p95 latency means 95 percent of requests completed at or below that duration. p99 latency does the same for 99 percent. These percentiles show the slow tail better than a mean.

In Maxoperf

Run results should be reviewed by label and percentile. A checkout p99 spike can matter more than a small change in whole-test average latency.

Common pitfalls

Reading percentiles without enough samples.
Comparing p99 from a tiny smoke test with p99 from a heavy release-candidate run.

FAQ

Why not use average latency?

Averages hide the tail. A small group of very slow requests can harm users while the average still looks acceptable.

Should release gates use p95 or p99?

Use the percentile that matches the service promise. p95 is common for broad user experience; p99 is useful for stricter paths and latency-sensitive APIs.