How to Change Order of Dataframe Columns in Python

Pandas dataframes store data in tabular form and come with many functions to analyze and manipulate it. Python developers often need to change the order of columns in a dataframe, and there are several simple ways to do this. In this article, we will learn how to change the order of dataframe columns in Python using the pandas library.


Here are the different ways to change the order of columns in a pandas dataframe. Let us say you have the following dataframe.

import pandas as pd

data = {'id': [1, 2, 3, 4],
        'name': ['Jim', 'Jane', 'John', 'Tim'],
        'marks': [45, 85, 69, 75]}
df = pd.DataFrame(data)
print(df)

## output

id name marks
0 1 Jim 45
1 2 Jane 85
2 3 John 69
3 4 Tim 75

1. Using Column List

This is one of the simplest ways to alter column order for dataframes. You index the dataframe with a list of column names in the new order, as shown below.

df = df[list_with_new_column_order]

Here is an example to interchange the columns name and marks so that the new order of dataframe columns is id, marks, name. We pass a list containing new column order.

df=df[['id','marks','name']]
print(df)

## output

id marks name
0 1 45 Jim
1 2 85 Jane
2 3 69 John
3 4 75 Tim

If your dataframe has many columns, or you do not remember all their names, it is tedious to list every column. If you only want to change the position of one or a few columns, you can combine the names you care about with a list comprehension that collects the rest. Here is an example that makes marks the first column and leaves the order of the remaining columns unchanged.

df=df[['marks'] + [ col for col in df.columns if col != 'marks' ]]
print(df)

## output

marks id name
0 45 1 Jim
1 85 2 Jane
2 69 3 John
3 75 4 Tim
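
The same idea can be wrapped in a small helper when you need it in more than one place. The sketch below is only an illustration: the function name move_to_front is hypothetical and not part of pandas.

```python
import pandas as pd

def move_to_front(df, first_cols):
    """Return a new DataFrame with first_cols leading and the
    remaining columns in their original order (hypothetical helper)."""
    first_cols = list(first_cols)
    rest = [col for col in df.columns if col not in first_cols]
    return df[first_cols + rest]

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'name': ['Jim', 'Jane', 'John', 'Tim'],
    'marks': [45, 85, 69, 75],
})

reordered = move_to_front(df, ['marks', 'name'])
print(list(reordered.columns))  # ['marks', 'name', 'id']
```

Because the helper returns a new dataframe, the original df keeps its column order unless you assign the result back to it.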

2. Using loc

Every dataframe has a loc indexer, which selects rows and columns by their labels. Note that loc is used with square brackets, not called like a function. It is generally used to extract selected rows and columns from a dataframe, but if you pass the column names in a new order, it also re-orders the columns. Here is its syntax.

dataframe.loc[start_row:end_row,start_column:end_column]
OR
dataframe.loc[start_row:end_row,list_of_column_names]

Here is an example to re-arrange dataframe columns using loc.

df=df.loc[:,['id','marks','name']]
print(df)

## output

id marks name
0 1 45 Jim
1 2 85 Jane
2 3 69 John
3 4 75 Tim

This method looks similar to the previous one, and for re-ordering columns it behaves the same way. When you pass a list of column names, both df[...] and df.loc[:, ...] return a new dataframe with its own data, not a view of the original. Since we assign the result back to df, the re-ordered dataframe replaces the original. If you assign it to a different variable instead, the original dataframe stays unchanged. The practical difference is that loc can also select rows by label in the same step.
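
If you want to check the copy behavior for yourself, one simple experiment is to change the original dataframe after re-ordering and see whether the re-ordered results change too. The sketch below assumes list-based column selection, as used in this article; the variable names by_brackets and by_loc are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'name': ['Jim', 'Jane', 'John', 'Tim'],
    'marks': [45, 85, 69, 75],
})

cols = ['id', 'marks', 'name']
by_brackets = df[cols]    # plain column-list selection
by_loc = df.loc[:, cols]  # same selection through loc

# Change a value in the original after both results exist.
df.loc[0, 'marks'] = 0

print(df.loc[0, 'marks'])           # 0
print(by_brackets.loc[0, 'marks'])  # 45
print(by_loc.loc[0, 'marks'])       # 45
```

Both re-ordered dataframes keep the old value, which shows that neither of them shares data with the original here.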

3. Using iloc

Every dataframe also has an iloc indexer, which selects data by integer position instead of by name. Positions start at 0. This is useful if you want to re-order columns using their positions instead of their names. Here is its syntax.

dataframe.iloc[start_row:end_row,start_column_index:end_column_index]
OR
dataframe.iloc[start_row:end_row,list_of_column_indexes]

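Here is an example to re-arrange columns using their indexes. It starts from the original dataframe, whose columns are id, name, marks, and swaps the columns at positions 1 and 2.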

df=df.iloc[:,[0,2,1]]
print(df)

## output

id marks name
0 1 45 Jim
1 2 85 Jane
2 3 69 John
3 4 75 Tim
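
When the dataframe has many columns, you can build the list of positions in code instead of typing it out. Here is a minimal sketch that swaps two columns by position; swap_columns is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

def swap_columns(df, i, j):
    """Return a new DataFrame with the columns at positions i and j
    exchanged (hypothetical helper)."""
    positions = list(range(df.shape[1]))  # [0, 1, 2, ...]
    positions[i], positions[j] = positions[j], positions[i]
    return df.iloc[:, positions]

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'name': ['Jim', 'Jane', 'John', 'Tim'],
    'marks': [45, 85, 69, 75],
})

swapped = swap_columns(df, 1, 2)
print(list(swapped.columns))  # ['id', 'marks', 'name']
```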

4. Using insert

In this approach, we remove the required column using the pop() method and insert it back into the dataframe, at the desired position, using the insert() method. Both methods modify the dataframe in place. Here is the syntax of insert().

dataframe.insert(location, column_name, column_value, allow_duplicates=False)

In the above command, we first specify the position at which the column is inserted, starting from 0. This is followed by the column name and then the column values. The last argument, allow_duplicates, is optional. It controls whether you may insert a column whose name already exists in the dataframe.

col = df.pop("marks")
df.insert(0, col.name, col)
print(df)

## output

marks id name
0 45 1 Jim
1 85 2 Jane
2 69 3 John
3 75 4 Tim

In the above code, pop() removes the marks column from the dataframe and returns it, with both its name and its values, which we store in the col variable. At this point the dataframe contains only 2 columns. We then call insert() with position 0, the column name col.name and the column values col, which puts the marks column back into df as the first column.

This method is useful if you want to change position of just one column in a dataframe with many columns, without listing names or indexes of all columns.
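
The two steps can be combined into a small helper. This is only a sketch: move_column is a hypothetical name, and the position is counted after the column has been removed.

```python
import pandas as pd

def move_column(df, name, position):
    """Move column `name` to integer `position`, modifying df in place
    (hypothetical helper). Position refers to the dataframe after the
    column has been popped."""
    col = df.pop(name)              # removes the column and returns it as a Series
    df.insert(position, name, col)  # puts it back at the requested position

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'name': ['Jim', 'Jane', 'John', 'Tim'],
    'marks': [45, 85, 69, 75],
})

move_column(df, 'marks', 1)
print(list(df.columns))  # ['id', 'marks', 'name']
```

Unlike the earlier methods, this changes df itself, so there is no result to assign back.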

5. Using reindex

You can also call the reindex method on a dataframe to re-order its columns. You just need to pass the columns argument, which contains a list of column names in the new order. Here is an example to re-order dataframe columns as id, marks, name.

df=df.reindex(columns=['id','marks','name'])
print(df)

## output

id marks name
0 1 45 Jim
1 2 85 Jane
2 3 69 John
3 4 75 Tim

You can also call Python list functions on the list of column names. Here is an example to re-order dataframe columns such that their column names are in alphabetical order.

df=df.reindex(columns=sorted(['id','marks','name']))
print(df)
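
If you do not want to type the column names at all, you can sort df.columns directly. Here is a minimal sketch, assuming a dataframe whose columns start out unsorted.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Jim', 'Jane', 'John', 'Tim'],
    'marks': [45, 85, 69, 75],
    'id': [1, 2, 3, 4],
})

# sorted() returns a new list of the column names in alphabetical order.
df = df.reindex(columns=sorted(df.columns))
print(list(df.columns))  # ['id', 'marks', 'name']
```

Keep in mind that sorted() compares strings case-sensitively, so column names starting with uppercase letters come before lowercase ones.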

You can also use list comprehensions. Here is an example to set the name column as the first column, followed by the rest of the columns in their current order.

df=df.reindex(columns=['name'] + [a for a in df.columns if a != 'name'])
print(df)

## output

name id marks
0 Jim 1 45
1 Jane 2 85
2 John 3 69
3 Tim 4 75

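Please note, reindex does not modify the dataframe in place. It returns a new dataframe, which is why we assign the result back to df. The method has also accepted a copy argument, but recent pandas versions are phasing it out, so it is usually simplest to omit it.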
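
One difference from the column list method is worth knowing. If the list passed to reindex contains a name that does not exist in the dataframe, reindex does not raise an error. It adds that column filled with missing values. The sketch below uses a deliberately misspelled column name, mark, to show this.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'name': ['Jim', 'Jane', 'John', 'Tim'],
    'marks': [45, 85, 69, 75],
})

# 'mark' is a deliberate typo for 'marks'.
result = df.reindex(columns=['id', 'mark', 'name'])
print(list(result.columns))         # ['id', 'mark', 'name']
print(result['mark'].isna().all())  # True

# The column list method raises KeyError for the same typo.
raised = False
try:
    df[['id', 'mark', 'name']]
except KeyError:
    raised = True
print(raised)  # True
```

So if you want typos to be caught early, the column list method or loc may be a safer choice than reindex.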

6. Using reverse

Sometimes, you may want to reverse the order of columns. Listing the columns manually in reverse order is tedious when there are many of them. In such cases, you can call the reverse() method on a list of the column names. Note that reverse() reverses the list in place and returns None, so call it on its own line instead of assigning its result. The example below starts from the original dataframe with columns id, name, marks.

cols = list(df.columns)
cols.reverse()
df=df[cols]
print(df)

## output

marks name id
0 45 Jim 1
1 85 Jane 2
2 69 John 3
3 75 Tim 4

In the above code, we first get the list of column names using df.columns and the list() function. Then we call reverse() on this list to reverse it in place. Finally, we pass the reversed list to df[…], which returns the dataframe with its columns in reverse order.
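
As an alternative to reverse(), you can use slicing with a step of -1. This is a different technique from the one above, shown here only as a shorter option: the slice can be applied either to the column names or to the column positions through iloc.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'name': ['Jim', 'Jane', 'John', 'Tim'],
    'marks': [45, 85, 69, 75],
})

# Reverse the column names, then select them in that order.
reversed_by_name = df[df.columns[::-1]]

# Reverse the column positions directly with iloc.
reversed_by_pos = df.iloc[:, ::-1]

print(list(reversed_by_name.columns))  # ['marks', 'name', 'id']
print(list(reversed_by_pos.columns))   # ['marks', 'name', 'id']
```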

Conclusion

In this article, we have learnt several ways to change the order of columns in a pandas dataframe in Python. The most basic way is to provide a list of column names in the new order. You can also use loc or iloc for this purpose, by column name or by column position respectively. When you pass a list of columns, all three return a new dataframe, so you need to assign the result back to keep the new order. These approaches, along with reindex, are useful for re-ordering multiple columns of your dataframe. If you want to move just one column, you can use pop() and insert(), which modify the dataframe in place. If you want to reverse the order of all columns, you can call reverse() on the list of column names. You can use any of these methods as per your requirement.

