Sometimes you may need to remove duplicate records or delete duplicate rows in MySQL. There are mainly 3 ways to delete duplicate records in MySQL. Here’s how to remove duplicate records in MySQL.
MySQL Remove Duplicate Records
Here are the steps to remove duplicate records in MySQL. Let’s say you have the following table dup_orders(id, amount) with duplicate records.
mysql> create table dup_orders(id int, amount int); mysql> insert into dup_orders(id,amount) values(1, 100),(1,250),(2,350),(2,350); mysql> select * from dup_orders; +------+--------+ | id | amount | +------+--------+ | 1 | 100 | | 1 | 250 | | 2 | 350 | | 2 | 350 | +------+--------+
Bonus Read : How to Get Duplicate Records in MySQL
How to Delete Duplicate Rows in MySQL
Let us look at each of the 3 ways to delete duplicate records in MySQL.
1. Remove duplicate records using intermediate table
This method involves 3 steps. First, select rows without duplicates, from dup_orders table and insert them into another table. Here’s the syntax for it.
CREATE TABLE [copy_of_source] SELECT DISTINCT [columns] FROM [source_table];
Here’s our query to select distinct rows from dup_orders into another table.
mysql> CREATE TABLE dup_orders_copy SELECT DISTINCT id,amount FROM dup_orders; mysql> select * from dup_orders_copy; +------+--------+ | id | amount | +------+--------+ | 1 | 100 | | 1 | 250 | | 2 | 350 | +------+--------+
The above query removes rows that complete duplicate, that is duplicate values for all columns.
If you want to remove rows with duplicate values for just one or more but not all columns, you can use a GROUP BY clause in the above query. For example, if you only want to rows with duplicate ID column values in your table, use the following query.
mysql> CREATE TABLE dup_orders_copy SELECT id,amount FROM dup_orders group by id; mysql> select * from dup_orders_copy; +------+--------+ | id | amount | +------+--------+ | 1 | 100 | | 2 | 350 | +------+--------+
Once you have created the intermediate table, drop the original table.
mysql> drop table dup_orders;
Finally, rename intermediate table to original table.
mysql> alter table dup_orders_copy rename to dup_orders; mysql> select * from dup_orders; +------+--------+ | id | amount | +------+--------+ | 1 | 100 | | 2 | 350 | +------+--------+
Bonus Read : MySQL Add Unique Constaint
2. Remove duplicate rows using INNER JOIN
You can also remove duplicate rows using combination of DELETE and INNER JOIN statements. However, in this case, your table needs to have at least one unique column (e.g primary key). Let’s say you have the following dup_orders table with duplicate amount values but unique id values.
mysql> insert into dup_orders(id,amount) values(1, 100),(2,250),(3,350),(4,350); mysql> select * from dup_orders; +------+--------+ | id | amount | +------+--------+ | 1 | 100 | | 2 | 250 | | 3 | 350 | | 4 | 350 | +------+--------+
You can join the above table with itself using an INNER JOIN and delete duplicate rows using the following query.
mysql> DELETE t1 FROM dup_orders t1 INNER JOIN dup_orders t2 WHERE t1.id < t2.id AND t1.amount = t2.amount; +------+--------+ | id | amount | +------+--------+ | 1 | 100 | | 2 | 250 | | 4 | 350 | +------+--------+
The above query will retain the row highest id for each duplicate row. If you want to retain the row with lowest id, use the following query.
mysql> DELETE t1 FROM dup_orders t1 INNER JOIN dup_orders t2 WHERE t1.id >> t2.id AND t1.amount = t2.amount;
Bonus Read : MySQL Rename Column
3. Remove Duplicate Records using ROW_NUMBER()
You can also remove duplicate records from table using ROW_NUMBER() function, available since 8.0.2. Here’s the SQL query to get row numbers of duplicate rows
SELECT *. ROW_NUMBER () Over (PARTITION BY [column] ORDER BY [column]) as [row_number_name];
Here’s the query to get row number for dup_orders table
SELECT *. ROW_NUMBER () Over (PARTITION BY id ORDER BY id) as row_number from dup_orders;
We will use this as a sub query in our SQL query to delete duplicate rows, as shown below
DELETE FROM dup_orders WHERE id IN ( SELECT id FROM ( SELECT id, ROW_NUMBER () Over (PARTITION BY id ORDER BY id) as row_number from dup_orders ) t WHERE row_number > 1 )
Hopefully, now you can easily remove duplicate records in MySQL.
Ubiq makes it easy to visualize data in minutes, and monitor in real-time dashboards. Try it Today!
Sreeram Sreenivasan is the Founder of Ubiq. He has helped many Fortune 500 companies in the areas of BI & software development.