DataReportive dashboard interface showing database report generation

How to Calculate Percentiles in PostgreSQL

Percentiles are crucial in data analysis to understand the distribution of values in a dataset. PostgreSQL provides several built-in ways to calculate percentiles efficiently.

What is a Percentile?

A percentile indicates the value below which a given percentage of observations fall. For example, the 90th percentile means 90% of the data is below this value.

Using percentile_cont and percentile_disc

PostgreSQL offers two primary functions:

  • percentile_cont: Returns a continuous percentile, interpolating between values if necessary.
  • percentile_disc: Returns a discrete percentile, selecting the closest actual value.

Example: Continuous Percentile

SELECT percentile_cont(0.9) WITHIN GROUP (ORDER BY salary) AS p90 FROM employees;

This returns the 90th percentile salary, interpolating between salaries if needed.

Example: Discrete Percentile

SELECT percentile_disc(0.9) WITHIN GROUP (ORDER BY salary) AS p90 FROM employees;

This returns the first salary value that reaches the 90th percentile.

Calculating Multiple Percentiles at Once

SELECT percentile_cont(ARRAY[0.25, 0.5, 0.75]) WITHIN GROUP (ORDER BY salary) AS quartiles FROM employees;

This returns the 25th, 50th (median), and 75th percentiles in a single query.

Using Percentiles in Groups

SELECT department, 
       percentile_cont(0.9) WITHIN GROUP (ORDER BY salary) AS p90
FROM employees
GROUP BY department;

This calculates the 90th percentile salary per department.

Things to Keep in Mind

  • Percentile values range from 0 to 1 (not 0 to 100).
  • For performance, ensure you have indexes on the ORDER BY columns when working with large tables.
  • If you need a simpler estimate, you can approximate percentiles using NTILE or window functions.

Conclusion

PostgreSQL’s percentile functions are powerful tools for analyzing data distributions. With percentile_cont and percentile_disc, you can perform advanced statistical queries with ease. Incorporate these into your analytics workflows to gain deeper insights from your data.

Automate PostgreSQL reporting with DataReportive

Turn your PostgreSQL queries into scheduled reports delivered to your team or customers.