How to Create Histogram in PostgreSQL

Histograms help you visualize frequency distribution in data analysis and reporting, and get insights quickly. Here’s how to create histogram in PostgreSQL. We will look at a couple of different ways to get frequency distribution in PostgreSQL.

Table of Contents

How to Create Histogram in PostgreSQL

Here are the steps to create histogram in PostgreSQL.

Let’s say you have the following table sales(order_date, sale)

postgres=# create table sales(order_date date,sale int);

postgres=# insert into sales values('2020-04-01',210),
           ('2020-04-02',125),('2020-04-03',150),('2020-04-04',230),
           ('2020-04-05',200),('2020-04-10',220),('2020-04-06',250),
           ('2020-04-07',215),('2020-04-08',300),('2020-04-09',250);

postgres=# select * from sales;
 order_date | sale
------------+------
 2020-04-01 |  210
 2020-04-02 |  125
 2020-04-03 |  150
 2020-04-04 |  230
 2020-04-05 |  200
 2020-04-10 |  220
 2020-04-06 |  250
 2020-04-07 |  215
 2020-04-08 |  300
 2020-04-09 |  250

Bonus Read : Create User in PostgreSQL

Create Histogram in PostgreSQL

There are multiple PostgreSQL Histogram queries to create histogram in PostgreSQL. We will look at each of them.

Simplest way to create Histogram in PostgreSQL

If you want PostgreSQL Histogram bins to have equal intervals (10-20, 20-30, 30-40, etc.) then here’s the SQL query to create frequency distribution in PostgreSQL.

postgres=# select floor(sale/10)*10 as bin_floor, count(*)
     from sales
     group by 1
     order by 1;
+-----------+----------+
| bin_floor | count(*) |
+-----------+----------+
|       120 |        1 |
|       150 |        1 |
|       200 |        1 |
|       210 |        2 |
|       220 |        1 |
|       230 |        1 |
|       250 |        2 |
|       300 |        1 |
+-----------+----------+

In the above query, we calculate the bin number by dividing each sale value with histogram bin size, that is, 10. If you want to change the bin size to 25, change floor(sale/10)*10 to floor(sale/25)*25

Bonus Read : PostgreSQL DROP VIEW

postgres=# select floor(sale/25)*25 as bin_floor, count(*)
     from sales
     group by 1
     order by 1;
+-----------+----------+
| bin_floor | count(*) |
+-----------+----------+
|       125 |        1 |
|       150 |        1 |
|       200 |        4 |
|       225 |        1 |
|       250 |        2 |
|       300 |        1 |
+-----------+----------+

The advantage of the above method is that it automatically changes the bins based on addition/deletion/modification of data. It also creates new bins for new data and/or removes bins in case their count becomes 0. So it allows you to dynamically create histogram in PostgreSQL.

Bonus Read : PostgreSQL Create Index

Most Flexible Way to Create Histrogram in PostgreSQL

If you already know the histogram bins to be created or if your histogram bucket sizes are different, then you can create frequency distribution using CASE statement. Here’s the Histogram query for PostgreSQL

postgresql=# select '100-150' as TotalRange,count(sale) as Count from sales
        where sale between 100 and 150
     union (
     select '150-200' as TotalRange,count(sale) as Count from sales
        where sale between 150 and 200)
     union (
     select '200-225' as TotalRange,count(sale) as Count from sales
        where sale between 200 and 225)
     union (
     select '225-250' as TotalRange,count(sale) as Count from sales
        where sale between 225 and 250);
+------------+-------+
| TotalRange | Count |
+------------+-------+
| 100-150    |     2 |
| 150-200    |     2 |
| 200-225    |     4 |
| 225-250    |     3 |
+------------+-------+

In this query, we assign bucket for each sale value using case statement. You can change the histogram buckets to any range you want, they don’t need to have uniform intervals.

This query has 2 advantages:

You can change the bucket name to anything you want. For example, instead of displaying “100-150” you can show it as “between 100 and 150”
You can have non uniform bucket size. For example, you can have a bucket as 100-150 with a bucket size of 50, and another bucket as 150-250 with a bucket size of 100.

Depending on your requirement, you can choose any of the above methods to create histogram in PostgreSQL.

Ubiq makes it easy to visualize data in minutes, and monitor in real-time dashboards. Try it Today!

How to Calculate Retention Rate in SQL?

How To Calculate Month over Month Growth in PostgreSQL

How to Create Read Only User in PostgreSQL

Sreeram Sreenivasan

Sreeram Sreenivasan is the Founder of Ubiq. He has helped many Fortune 500 companies in the areas of BI & software development.

How to Create Histogram in PostgreSQL

Create Histogram in PostgreSQL

Simplest way to create Histogram in PostgreSQL

Most Flexible Way to Create Histrogram in PostgreSQL

Related posts:

How to Calculate Retention Rate in SQL?

How To Calculate Month over Month Growth in PostgreSQL

How to Create Read Only User in PostgreSQL

Share this: