Pivot tables provide a great way to summarize and analyze data during data processing. It is very useful way to work with large datasets. Earlier it was available only in Microsoft Excel but its utility makes it required in every data processing software. Sometimes Python developers need to create pivot tables using pandas dataframe. In this article, we will learn how to create pivot tables in Python pandas.
How to Create Pivot Tables in Python Pandas
Here are the steps to create pivot tables in Python pandas. You can do this using pivot_table() function available for all dataframes.
1. Create Dataframe
Here are the commands to create a dataframe in Python Pandas.
import pandas as pd
data = {'Category': ['A', 'A', 'B', 'B', 'A', 'B'],
'Product': ['X', 'Y', 'X', 'Y', 'X', 'X'],
'Amount': [10, 12, 15, 18, 20, 14]}
df = pd.DataFrame(data)
print(df)
## output
Category Product Amount
0 A X 10
1 A Y 12
2 B X 15
3 B Y 18
4 A X 20
5 B X 14
2. Create Pivot Table
Next, we create pivot table using pivot_table() function. It takes three main arguments – index, values and aggfunc. Index means the columns to use as new row index. Values means the columns to aggregate. And aggfunc means the aggregation function such as sum, count, etc. Here is its complete syntax.
DataFrame.pivot_table( values=None, index=None, columns=None, aggfunc=’mean’, fill_value=None, margins=False, dropna=True)
Here is an example to pivot table by aggregating Amount column for each Product column value, using sum function.
pivot = df.pivot_table(index=['Product'],values=['Amount'],aggfunc='sum')
print(pivot)
## output
Product Amount
X 59
Y 30
Pivot Table Using Multiple Indexes
You can also aggregate data by both category and product, by simply mentioning those column names in indexes argument as shown.
pivot = df.pivot_table(index=['Category','Product'],
values=['Amount'],
aggfunc='sum')
print(pivot)
## output
Category Product
A X 30
Y 12
B X 29
Y 18
Pivot Table using multiple aggregation
You can also use multiple aggregation functions to aggregate your data, by mentioning the function names one after the other in a comma-separated manner, in aggfunc argument.
pivot = df.pivot_table(index=['Category','Product'],
values=['Amount'],
aggfunc={'median', 'mean', 'min'})
print(pivot)
## output
mean median min
Category Product
A X 15.0 15.0 10
Y 12.0 12.0 12
B X 14.5 14.5 14
Y 18.0 18.0 18
Pivot Table on filtered data
You can also filter the dataframe before creating a pivot table out of it. Here is an example where we filter rows where amount>10 and then use the result to create pivot table.
pivot = df[df['Amount'] > 10].pivot_table(index=['Category','Product'],
values=['Amount'],
aggfunc={'median', 'mean', 'min'})
print(pivot)
## output
mean median min
Category Product
A X 20.0 20.0 20
Y 12.0 12.0 12
B X 14.5 14.5 14
Y 18.0 18.0 18
Conclusion
In this article, we have learnt how to create Pivot table in Pandas. We have also learnt several use cases for creating pivot tables. You can customize it as per your requirement. Pivot tables are super useful for statistical analysis, data processing and reporting. They allow you to easily summarize data, understand trends, identify patterns and outliers. Being able to create pivot tables in Python pandas is a very helpful as it strengthens data processing capabilities of your Python script.
Also read:
How to Connect Pandas to Database
Top Python Learning Libraries in Python
Machine Learning LifeCycle

Sreeram Sreenivasan is the Founder of Ubiq. He has helped many Fortune 500 companies in the areas of BI & software development.