Learn how to find, filter, and remove duplicates in Excel using quick and creative methods. If you are working with large data tables, it’s helpful to find unique values or remove duplicate values. Our goal is to show you how to use Excel’s built-in data cleansing tools to find unique values in single or multiple columns.
Here we go; let us see how it works.
Table of contents:
- How to remove duplicates in Excel for a single column
- Find and remove duplicates in Excel for multiple columns
- Filtering for unique values and removing duplicate values
- Find and remove duplicate values using Conditional Formatting
- Remove duplicate data using Power Query
- Working with Pivot Tables
- Find and remove duplicates using Formulas
- Build a macro to remove duplicate values
- Use DataXL free add-in
What is a duplicate value? Things to consider…
A comparison of duplicate values depends on what appears in the cell. So it’s not about the original value stored in the cell. Check the example below!
Let’s say that our range contains the same date value in multiples cells with various formatting.
If we apply the ‘Remove duplicates’ command, we will get only unique values without duplicates. Why? We are talking about unique values, but the cell format is different!
So, we recommend you format the data before you start using the function.
Tip: The remove duplicates tool delete your duplicates permanently.
How to remove duplicates in Excel for a single column
In the first example, we’ll show you how to check for duplicates from a single column (list). As first, select the range which contains duplicated values.
1. Locate the ribbon, and click on the Data Tab.
2. A new window will appear by clicking on ‘Remove Duplicates.’
3. Finally, click the OK button. A small message box will appear. It contains useful information about the numbers of removed duplicates and unique values. Click ‘OK’ to close the window.
Important: Now, decide your data contains a header or not. If your range has a header, make sure the status of the ‘My list has headers’ box is checked. If you don’t do that, you’ll get a false result.
Find and remove duplicates in Excel for multiple columns
If you want to remove duplicates in more than one column, use the following method. At first glance, the ’Remove Duplicates’ is a swiss knife in Excel. But we kindly ask you, use it carefully. We’ll show you why. If you are using the tool on a range that contains multiple columns, the results can be different.
In the examples below, there are three sample outputs for the ‘Rep’, ‘Regions’, and ‘OrderDate’ columns. As we mentioned before, click on the ribbon and select the ‘Remove Duplicates’ command.
The first image displays all the duplicates based only on the Sales rep. After removing duplicates, 9 unique values remain.
The second image shows all the duplicates based on the Rep and Region columns. After clicking OK, based on two columns, 4 duplicate values will be removed, and 10 unique values remain.
The third image shows all the duplicates based on all columns in the table. In this case, 12 unique values remain.
Don’t forget: Before we start, let’s take a quick comparison of removed items to check how the different filter works. You can check for duplicates in many different ways, and it depends on your task.
Filtering for unique values and removing duplicate values
In this part of the tutorial, we will explain what are the differences between filtering (hiding) and remove duplicate values. At first glance, there are two similar tasks: their objective is to create a list of unique values.
It’s a smart decision to apply filters before removing duplicates. Try to switch the advanced filter on or use simple conditional formatting to avoid unexpected results.
To filter unique values, follow these steps below.
1. Select the range or a table. Tip: It’s good enough to select a single cell in a range.
2. Click the Data Tab and select the ‘Advanced Filter‘ under the ‘Sort & Filter’ group.
3. Now, the ‘Advanced Filter’ box appears. You have two choices to apply the filter.
In the first example, we want to temporarily hide the duplicate values and keep the list in place. To do that, select ‘Filter the list, in-place’, check the ‘Unique records only’ box. Click OK to close the window.
Check the result! If you are using an advanced filter for unique values, duplicates are only hidden temporarily. Take a closer look at the picture below! This is a filtered list that contains all records.
In the second example, we’ll show you how to copy the unique records to another location and keep the original list in place. To do that, select the ‘Copy to another location’ checkbox, select the unique records only, and click OK. Select the target location to copy the result to another worksheet.
Find and remove duplicate values using Conditional Formatting
Conditional formatting is a versatile tool; we love it! Let’s see how to highlight duplicate values in a single column:
As first, we have to talk about an important step. It’s important to select all cells in a range! It is a limitation of formatting rules.
Go to the Home tab and locate the Style group.
From the drop-down list, click the small arrow for Conditional Formatting, and then click Highlight Cells Rules. Finally, select Duplicate Values.
The method is perfect if we want to remove duplicates in a single column.
How to find unique or duplicate values using multiple columns?
Because the conditional formatting is not able the work with records across rows, we will apply a small trick. Create a helper column and use the CONCAT function to create a single string without spaces.
We’ll use this combined column to check for duplicates in more than one column.
The first three steps are the same as the single-column example:
Click Conditional Formatting > Highlight Cells Rules > Duplicate values.
If you want to highlight the duplicate values for three columns, select the ‘Duplicate’ option from the drop-down list. Apply a built-in or a custom formatting style. The result is below:
Tip: If you need to take a quick overview of duplicates, use the Quick Analysis Tool.
Select the range which contains duplicates. A small icon will appear at the end of the range. Choose the ‘Formatting’ tab and select ‘Duplicates’ from the list. You can use the ‘Ctrl’ + ‘Q’ shortcut too.
Find And Remove Duplicate Data using Power Query
Select the range that you want to add to the Power Query Editor. Select the Data Tab, in the ‘Get and Transform’ section, choose the ‘From Table / Range’ option.
Now the Power Query window appears. Select the first column, in this case, ‘Rep’. Right-click and apply the ‘Remove Duplicates’ command.
Working with multiple columns
The Power Query based method works on single or multiple ranges.
Example 1: If you need to find duplicate rows based on the entire table, hold the ‘Control’ key and select the column by clicking headers.
Example 2: Quick tip to keep distinct values based on the entire range: Check the table icon on the header of the first column. Right-click, then click remove duplicates. It’s easy!
In this example, we’ll show you how to remove duplicates using Pivot Tables.
- Select data.
- Click the Home Tab and Insert a Pivot Table.
- Make sure to drag all three fields into the Rows section.
Go to the Design Tab and transform the Pivot table using these steps below.
Click on the Pivot Table area, then
- Select ‘Subtotals’, and click ‘Do Not Show Subtotals‘
- Switch off showing ‘Grand Totals’ using the drop-down list
- Under the ‘Report Layout’ section click ‘Show data in Tabular Form’ and ‘Repeat All Item Labels’
Good to know that the Pivot table lists unique values only. If you create a proper report layout, the Pivot table removes duplicate rows.
Find and remove duplicates using Formulas
As we stated at the beginning of the article, A comparison of duplicate values depends on what appears in the cell. So, it’s useful to use formulas and functions to remove duplicates properly.
Let’s see the example below!
We will create a helper column and join the data, which are in three columns. Add a name to the new column, in this case, ‘Joined records.’
Combine the records using the Excel CONCAT function:
=CONCAT(Rep, Region, OrderDate)
With the help of the COUNTIFS function, we’ll take a quick overview of the numbers of duplicate values. Create a new column, ‘Count Duplicates.’
Enter the formula, then evaluate:
Copy the formula down until cell E15. The COUNTIFS function will show the duplicates quickly.
Take a closer look at the output:
- We are talking about distinct values if the result = 1
- If it’s greater than 1, the value appears in the list more than once
I’ve just highlighted the duplicate values using orange fill.
To remove duplicates apply a filter using the Ctrl + Shift + L keyboard shortcut and select the duplicate rows using the drop-down list.
Build a macro to remove duplicate values
Okay, if VBA is your friend, we’ll show some small snippets!
- If the range does not have a header, then use Header:=‘xlNo’
- If the range has header, then use Header:=‘xlYes’
Example 1: Our range is A1:C15, the range has a header and we want to remove duplicates from the first column. The code is the following:
Sub Example1() Range("A1:C15").RemoveDuplicates Columns:=1, Header:=xlYes End Sub
Example 2: The range is the same as above, and we want to remove duplicates from the first and the third columns:
Sub Example2() Range("A1:C13").RemoveDuplicates Columns:=Array(1, 3), Header:=xlYes End Sub
Use DataXL free add-in
DataXL is our answer to data cleansing challenges! Coming soon!