Histograms help you visualize frequency distribution in data analysis and reporting, and get insights quickly. Here’s how to create histogram in PostgreSQL. We will look at a couple of different ways to get frequency distribution in PostgreSQL.
How to Create Histogram in PostgreSQL
Here are the steps to create histogram in PostgreSQL.
Let’s say you have the following table sales(order_date, sale)
postgres=# create table sales(order_date date,sale int); postgres=# insert into sales values('2020-04-01',210), ('2020-04-02',125),('2020-04-03',150),('2020-04-04',230), ('2020-04-05',200),('2020-04-10',220),('2020-04-06',250), ('2020-04-07',215),('2020-04-08',300),('2020-04-09',250); postgres=# select * from sales; order_date | sale ------------+------ 2020-04-01 | 210 2020-04-02 | 125 2020-04-03 | 150 2020-04-04 | 230 2020-04-05 | 200 2020-04-10 | 220 2020-04-06 | 250 2020-04-07 | 215 2020-04-08 | 300 2020-04-09 | 250
Bonus Read : Create User in PostgreSQL
Create Histogram in PostgreSQL
There are multiple PostgreSQL Histogram queries to create histogram in PostgreSQL. We will look at each of them.
Simplest way to create Histogram in PostgreSQL
If you want PostgreSQL Histogram bins to have equal intervals (10-20, 20-30, 30-40, etc.) then here’s the SQL query to create frequency distribution in PostgreSQL.
postgres=# select floor(sale/10)*10 as bin_floor, count(*) from sales group by 1 order by 1; +-----------+----------+ | bin_floor | count(*) | +-----------+----------+ | 120 | 1 | | 150 | 1 | | 200 | 1 | | 210 | 2 | | 220 | 1 | | 230 | 1 | | 250 | 2 | | 300 | 1 | +-----------+----------+
In the above query, we calculate the bin number by dividing each sale value with histogram bin size, that is, 10. If you want to change the bin size to 25, change floor(sale/10)*10 to floor(sale/25)*25
Bonus Read : PostgreSQL DROP VIEW
postgres=# select floor(sale/25)*25 as bin_floor, count(*) from sales group by 1 order by 1; +-----------+----------+ | bin_floor | count(*) | +-----------+----------+ | 125 | 1 | | 150 | 1 | | 200 | 4 | | 225 | 1 | | 250 | 2 | | 300 | 1 | +-----------+----------+
The advantage of the above method is that it automatically changes the bins based on addition/deletion/modification of data. It also creates new bins for new data and/or removes bins in case their count becomes 0. So it allows you to dynamically create histogram in PostgreSQL.
Bonus Read : PostgreSQL Create Index
Most Flexible Way to Create Histrogram in PostgreSQL
If you already know the histogram bins to be created or if your histogram bucket sizes are different, then you can create frequency distribution using CASE statement. Here’s the Histogram query for PostgreSQL
postgresql=# select '100-150' as TotalRange,count(sale) as Count from sales where sale between 100 and 150 union ( select '150-200' as TotalRange,count(sale) as Count from sales where sale between 150 and 200) union ( select '200-225' as TotalRange,count(sale) as Count from sales where sale between 200 and 225) union ( select '225-250' as TotalRange,count(sale) as Count from sales where sale between 225 and 250); +------------+-------+ | TotalRange | Count | +------------+-------+ | 100-150 | 2 | | 150-200 | 2 | | 200-225 | 4 | | 225-250 | 3 | +------------+-------+
In this query, we assign bucket for each sale value using case statement. You can change the histogram buckets to any range you want, they don’t need to have uniform intervals.
This query has 2 advantages:
- You can change the bucket name to anything you want. For example, instead of displaying “100-150” you can show it as “between 100 and 150”
- You can have non uniform bucket size. For example, you can have a bucket as 100-150 with a bucket size of 50, and another bucket as 150-250 with a bucket size of 100.
Depending on your requirement, you can choose any of the above methods to create histogram in PostgreSQL.
Ubiq makes it easy to visualize data in minutes, and monitor in real-time dashboards. Try it Today!
Sreeram Sreenivasan is the Founder of Ubiq. He has helped many Fortune 500 companies in the areas of BI & software development.