MySQL Remove Duplicate Records

Sometimes you may need to remove duplicate records or delete duplicate rows in MySQL. There are mainly 3 ways to delete duplicate records in MySQL. Here’s how to remove duplicate records in MySQL.

 

MySQL Remove Duplicate Records

Here are the steps to remove duplicate records in MySQL. Let’s say you have the following table dup_orders(id, amount) with duplicate records.

mysql> create table dup_orders(id int, amount int);

mysql> insert into dup_orders(id,amount) values(1, 100),(1,250),(2,350),(2,350);

mysql> select * from dup_orders;
+------+--------+
| id   | amount |
+------+--------+
|    1 |    100 |
|    1 |    250 |
|    2 |    350 |
|    2 |    350 |
+------+--------+

 

Bonus Read : How to Get Duplicate Records in MySQL

 

How to Delete Duplicate Rows in MySQL

Let us look at each of the 3 ways to delete duplicate records in MySQL.

 

1. Remove duplicate records using intermediate table

This method involves 3 steps. First, select rows without duplicates, from dup_orders table and insert them into another table. Here’s the syntax for it.

CREATE TABLE [copy_of_source] SELECT DISTINCT [columns] FROM [source_table];

Here’s our query to select distinct rows from dup_orders into another table.

mysql> CREATE TABLE dup_orders_copy SELECT DISTINCT id,amount FROM dup_orders;

mysql> select * from dup_orders_copy;
+------+--------+
| id   | amount |
+------+--------+
|    1 |    100 |
|    1 |    250 |
|    2 |    350 |
+------+--------+

The above query removes rows that complete duplicate, that is duplicate values for all columns.

 

If you want to remove rows with duplicate values for just one or more but not all columns, you can use a GROUP BY clause in the above query. For example, if you only want to rows with duplicate ID column values in your table, use the following query.

mysql> CREATE TABLE dup_orders_copy SELECT id,amount FROM dup_orders group by id;

mysql> select * from dup_orders_copy;
+------+--------+
| id   | amount |
+------+--------+
|    1 |    100 |
|    2 |    350 |
+------+--------+

Once you have created the intermediate table, drop the original table.

mysql> drop table dup_orders;

 

Finally, rename intermediate table to original table.

mysql> alter table dup_orders_copy rename to dup_orders;

mysql> select * from dup_orders;
+------+--------+
| id   | amount |
+------+--------+
|    1 |    100 |
|    2 |    350 |
+------+--------+

 

Bonus Read : MySQL Add Unique Constaint

 

2. Remove duplicate rows using INNER JOIN

You can also remove duplicate rows using combination of DELETE and INNER JOIN statements. However, in this case, your table needs to have at least one unique column (e.g primary key). Let’s say you have the following dup_orders table with duplicate amount values but unique id values.

mysql> insert into dup_orders(id,amount) values(1, 100),(2,250),(3,350),(4,350);

mysql> select * from dup_orders;
+------+--------+
| id   | amount |
+------+--------+
|    1 |    100 |
|    2 |    250 |
|    3 |    350 |
|    4 |    350 |
+------+--------+

You can join the above table with itself using an INNER JOIN and delete duplicate rows using the following query.

mysql> DELETE t1 FROM dup_orders t1
       INNER JOIN dup_orders t2
       WHERE
           t1.id < t2.id 
           AND t1.amount = t2.amount;
+------+--------+
| id   | amount |
+------+--------+
|    1 |    100 |
|    2 |    250 |
|    4 |    350 |
+------+--------+

The above query will retain the row highest id for each duplicate row. If you want to retain the row with lowest id, use the following query.

mysql> DELETE t1 FROM dup_orders t1
       INNER JOIN dup_orders t2
       WHERE
           t1.id >> t2.id 
           AND t1.amount = t2.amount;

 

Bonus Read : MySQL Rename Column

 

3. Remove Duplicate Records using ROW_NUMBER()

You can also remove duplicate records from table using ROW_NUMBER() function, available since 8.0.2. Here’s the SQL query to get row numbers of duplicate rows

SELECT *. ROW_NUMBER () Over (PARTITION BY [column] 
ORDER BY [column]) as [row_number_name];

Here’s the query to get row number for dup_orders table

SELECT *. ROW_NUMBER () Over (PARTITION BY id ORDER BY id) 
as row_number from dup_orders;

 

We will use this as a sub query in our SQL query to delete duplicate rows, as shown below

DELETE FROM dup_orders
WHERE id IN (
    SELECT id
    FROM (
        SELECT 
            id, 
            ROW_NUMBER () Over (PARTITION BY id ORDER BY id) as row_number 
           from dup_orders
    ) t
    WHERE row_number > 1
)

Hopefully, now you can easily remove duplicate records in MySQL.

 

Ubiq makes it easy to visualize data in minutes, and monitor in real-time dashboards. Try it Today!