Extract a list of duplicates from three columns combined
This webpage demonstrates formulas that merge three non-adjacent cell ranges.
What's on this webpage
1. Extract a list of duplicates from two or more columns combined - Excel 365
This example demonstrates how to list duplicates from three cell ranges combined, the first cell range is B3:B21, the second cell range is D3:D8 and the third is F3:F9.
Note, the cell ranges are not adjacent and are in different sizes. The result is an array shown in cell H3 and cells below as far as needed. This is called spilling in Excel 365.
Excel 365 dynamic array formula in cell H3:
You are not limited to three cell ranges, you can use up to 254 cell ranges.
Explaining the formula in cell H3
Step 1 - Combine cell ranges
The TOCOL function rearranges values in 2D cell ranges to a single column.
Function syntax: TOCOL(array, [ignore], [scan_by_col])
The TOCOL function lets you combine multiple cell ranges by adding parentheses and commas as delimiting characters.
TOCOL((B3:B21,D3:D8,F3:F9))
returns
{"Federer, Roger "; "Djokovic, Novak "; "Murray, Andy "; "Davydenko, Nikolay "; "Roddick, Andy "; "Del Potro, Juan Martin "; "Federer, Roger "; "Davydenko, Nikolay "; "Verdasco, Fernando "; "Gonzalez, Fernando "; "Wawrinka, Stanislas "; "Gonzalez, Fernando "; "Blake, James "; "Nalbandian, David "; "Robredo, Tommy "; "Wawrinka, Stanislas "; "Cilic, Marin "; "Stepanek, Radek "; "Almagro, Nicolas "; "Djokovic, Novak "; "May Smith"; "Davydenko, Nikolay "; "Julia Ford"; "Del Potro, Juan Martin "; "Julia Ford"; "Wawrinka, Stanislas "; "Murray, Andy "; "Davydenko, Nikolay "; "Ruth, Arvid"; "Julia Ford"; "Ruth, Arvid"; "Penelope Swanson"}.
Step 2 - Match values to the same values
The MATCH function returns the relative position of an item in an array that matches a specified value in a specific order.
Function syntax: MATCH(lookup_value, lookup_array, [match_type])
MATCH(TOCOL((B3:B21,D3:D8,F3:F9)),TOCOL((B3:B21,D3:D8,F3:F9)),0)
becomes
MATCH({"Federer, Roger "; "Djokovic, Novak "; ... ; "Penelope Swanson"}, {"Federer, Roger "; "Djokovic, Novak "; ... ; "Penelope Swanson"}, 0)
and returns
{1; 2; 3; 4; 5; 6; 1; 4; 9; 10; 11; 10; 13; 14; 15; 11; 17; 18; 19; 2; 21; 4; 23; 6; 23; 11; 3; 4; 29; 23; 29; 32}.
Step 3 - Count rows in the array
The ROWS function calculate the number of rows in a cell range.
Function syntax: ROWS(array)
ROWS(TOCOL((B3:B21,D3:D8,F3:F9)))
becomes
ROWS({"Federer, Roger "; "Djokovic, Novak "; ... ; "Penelope Swanson"})
and returns
32. There are 32 rows (values) in the array.
Step 4 - Create a sequence from 1 to n
The SEQUENCE function creates a list of sequential numbers.
Function syntax: SEQUENCE(rows, [columns], [start], [step])
SEQUENCE(ROWS(TOCOL((B3:B21,D3:D8,F3:F9))))
becomes
SEQUENCE(32)
and returns
{1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; 16; 17; 18; 19; 20; 21; 22; 23; 24; 25; 26; 27; 28; 29; 30; 31; 32}.
Step 5 - Check if the numbers are not equal to the sequence
The less than and larger than sign combined lets you perform "not equal to" in an Excel formula. The result is a boolean value TRUE or FALSE, however, in this case, the result is an array containing boolean values.
MATCH(TOCOL((B3:B21,D3:D8,F3:F9)),TOCOL((B3:B21,D3:D8,F3:F9)),0)<>SEQUENCE(ROWS(TOCOL((B3:B21,D3:D8,F3:F9))))
becomes
{1; 2; 3; 4; 5; 6; 1; 4; 9; 10; 11; 10; 13; 14; 15; 11; 17; 18; 19; 2; 21; 4; 23; 6; 23; 11; 3; 4; 29; 23; 29; 32}<>{1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; 16; 17; 18; 19; 20; 21; 22; 23; 24; 25; 26; 27; 28; 29; 30; 31; 32}
and returns
{FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; TRUE; TRUE; FALSE; FALSE; FALSE; TRUE; FALSE; FALSE; FALSE; TRUE; FALSE; FALSE; FALSE; TRUE; FALSE; TRUE; FALSE; TRUE; TRUE; TRUE; TRUE; TRUE; FALSE; TRUE; TRUE; FALSE}.
This array lets you identify duplicate values, their positions correspond to the position in cell ranges B3:B21, D3:D8, and F3:F9.
Step 6 - Filter values based on the boolean array
The FILTER function extracts values/rows based on a condition or criteria.
Function syntax: FILTER(array, include, [if_empty])
FILTER(TOCOL((B3:B21,D3:D8,F3:F9)),MATCH(TOCOL((B3:B21,D3:D8,F3:F9)),TOCOL((B3:B21,D3:D8,F3:F9)),0)<>SEQUENCE(ROWS(TOCOL((B3:B21,D3:D8,F3:F9)))))
becomes
FILTER({"Federer, Roger "; "Djokovic, Novak "; ... ; "Penelope Swanson"},{FALSE; FALSE; FALSE; ... ; FALSE})
and returns
{"Federer, Roger "; "Davydenko, Nikolay "; "Gonzalez, Fernando "; "Wawrinka, Stanislas "; "Djokovic, Novak "; "Davydenko, Nikolay "; "Del Potro, Juan Martin "; "Julia Ford"; "Wawrinka, Stanislas "; "Murray, Andy "; "Davydenko, Nikolay "; "Julia Ford"; "Ruth, Arvid"}.
Step 7 - Extract unique distinct values
The UNIQUE function returns a unique or unique distinct list.
Function syntax: UNIQUE(array,[by_col],[exactly_once])
UNIQUE(FILTER(TOCOL((B3:B21,D3:D8,F3:F9)),MATCH(TOCOL((B3:B21,D3:D8,F3:F9)),TOCOL((B3:B21,D3:D8,F3:F9)),0)<>SEQUENCE(ROWS(TOCOL((B3:B21,D3:D8,F3:F9))))))
becomes
UNIQUE({"Federer, Roger "; "Davydenko, Nikolay "; "Gonzalez, Fernando "; "... ; "Ruth, Arvid"})
and returns
{"Federer, Roger "; "Davydenko, Nikolay "; "Gonzalez, Fernando "; "Wawrinka, Stanislas "; "Djokovic, Novak "; "Del Potro, Juan Martin "; "Julia Ford"; "Murray, Andy "; "Ruth, Arvid"}
Step 8 - Shorten the formula
The LET function lets you name intermediate calculation results which can shorten formulas considerably and improve performance.
Function syntax: LET(name1, name_value1, calculation_or_name2, [name_value2, calculation_or_name3...])
UNIQUE(FILTER(TOCOL((B3:B21,D3:D8,F3:F9)),MATCH(TOCOL((B3:B21,D3:D8,F3:F9)),TOCOL((B3:B21,D3:D8,F3:F9)),0)<>SEQUENCE(ROWS(TOCOL((B3:B21,D3:D8,F3:F9))))))
x - TOCOL((B3:B21,D3:D8,F3:F9))
LET(x,TOCOL((B3:B21,D3:D8,F3:F9)),UNIQUE(FILTER(x,MATCH(x,x,0)<>SEQUENCE(ROWS(x)))))
2. Extract a list of duplicates from three columns combined
The following regular formula extracts duplicate values from column B (List1), D (List2) and F (List3) combined, the result is displayed in cell H3 and cells below.
Formula in cell H3:
Explaining formula in cell F3
This formula consists of three similar parts, one returns values from List1 and the second returns values from List2 and the third returns duplicates from List3.
IFERROR(IFERROR(formula1, formula2), formula3)
Step 1 - Prevent duplicate values in output
The COUNTIF function counts values based on a condition, in this case, I am counting values in cells above. This makes sure that duplicates are not returned.
COUNTIF($H$2:H2,$B$3:$B$21)=0
becomes
COUNTIF("Duplicates",$B$3:$B$21)=0
becomes
COUNTIF("Duplicates",{"Federer, Roger ";"Djokovic, Novak ";"Murray, Andy ";"Davydenko, Nikolay ";"Roddick, Andy ";"Del Potro, Juan Martin ";"Federer, Roger ";"Davydenko, Nikolay ";"Verdasco, Fernando ";"Gonzalez, Fernando ";"Wawrinka, Stanislas ";"Gonzalez, Fernando ";"Blake, James ";"Nalbandian, David ";"Robredo, Tommy ";"Wawrinka, Stanislas ";"Cilic, Marin ";"Stepanek, Radek ";"Almagro, Nicolas "})=0
{0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0}=0
and returns
{TRUE;TRUE;TRUE; TRUE;TRUE;TRUE;TRUE; TRUE;TRUE;TRUE;TRUE; TRUE;TRUE;TRUE;TRUE; TRUE;TRUE;TRUE;TRUE}
Step 2 - Count values in List1
We want to know where the duplicates are in List1.
COUNTIF($B$3:$B$21,$B$3:$B$21)>1
becomes
COUNTIF({"Federer, Roger ";"Djokovic, Novak ";"Murray, Andy ";"Davydenko, Nikolay ";"Roddick, Andy ";"Del Potro, Juan Martin ";"Federer, Roger ";"Davydenko, Nikolay ";"Verdasco, Fernando ";"Gonzalez, Fernando ";"Wawrinka, Stanislas ";"Gonzalez, Fernando ";"Blake, James ";"Nalbandian, David ";"Robredo, Tommy ";"Wawrinka, Stanislas ";"Cilic, Marin ";"Stepanek, Radek ";"Almagro, Nicolas "},{"Federer, Roger ";"Djokovic, Novak ";"Murray, Andy ";"Davydenko, Nikolay ";"Roddick, Andy ";"Del Potro, Juan Martin ";"Federer, Roger ";"Davydenko, Nikolay ";"Verdasco, Fernando ";"Gonzalez, Fernando ";"Wawrinka, Stanislas ";"Gonzalez, Fernando ";"Blake, James ";"Nalbandian, David ";"Robredo, Tommy ";"Wawrinka, Stanislas ";"Cilic, Marin ";"Stepanek, Radek ";"Almagro, Nicolas "})>1
becomes
{2;1;1;2;1;1;2;2;1;2;2;2;1;1;1;2;1;1;1}>1
{TRUE;FALSE; FALSE;TRUE; FALSE;FALSE; TRUE;TRUE; FALSE;TRUE; TRUE;TRUE; FALSE;FALSE; FALSE;TRUE; FALSE;FALSE; FALSE}
Step 3 - Multiply arrays
(COUNTIF($H$2:H2,$B$3:$B$21)=0)*(COUNTIF($B$3:$B$21,$B$3:$B$21)>1)
becomes
{TRUE;TRUE;TRUE; TRUE;TRUE;TRUE;TRUE; TRUE;TRUE;TRUE;TRUE; TRUE;TRUE;TRUE;TRUE; TRUE;TRUE;TRUE;TRUE} * {TRUE;FALSE; FALSE;TRUE; FALSE;FALSE; TRUE;TRUE; FALSE;TRUE; TRUE;TRUE; FALSE;FALSE; FALSE;TRUE; FALSE;FALSE; FALSE}
and returns
{1;0;0;1;0;0;1;1;0;1;1;1;0;0;0;1;0;0;0}
Step 4 - Divide 1 with array
The LOOKUP function ignores error and if we divide 1 with 0 an error occurs. 1/0 = #DIV/0!
1/((COUNTIF($H$2:H2,$B$3:$B$21)=0)*(COUNTIF($B$3:$B$21,$B$3:$B$21)>1))
becomes
1/{1;0;0;1;0;0;1;1;0;1;1;1;0;0;0;1;0;0;0}
and returns
{1;#DIV/0!;#DIV/0!;1;#DIV/0!;#DIV/0!;1;1;#DIV/0!;1;1;1;#DIV/0!;#DIV/0!;#DIV/0!;1;#DIV/0!;#DIV/0!;#DIV/0!}
Step 5 - Return value based on array
LOOKUP(2,1/((COUNTIF($H$2:H2,$B$3:$B$21)=0)*(COUNTIF($B$3:$B$21,$B$3:$B$21)>1)),$B$3:$B$21)
becomes
LOOKUP(2, 1/{1;0;0;1;0;0;1;1;0;1;1;1;0;0;0;1;0;0;0}, $B$3:$B$21)
becomes
LOOKUP(2, 1/{1;0;0;1;0;0;1;1;0;1;1;1;0;0;0;1;0;0;0}, {"Federer, Roger ";"Djokovic, Novak ";"Murray, Andy ";"Davydenko, Nikolay ";"Roddick, Andy ";"Del Potro, Juan Martin ";"Federer, Roger ";"Davydenko, Nikolay ";"Verdasco, Fernando ";"Gonzalez, Fernando ";"Wawrinka, Stanislas ";"Gonzalez, Fernando ";"Blake, James ";"Nalbandian, David ";"Robredo, Tommy ";"Wawrinka, Stanislas ";"Cilic, Marin ";"Stepanek, Radek ";"Almagro, Nicolas "})
and returns Wawrinka, Stanislas in cell H3.
Step 4 - Return values from List2
When values run out from List1 formula1 returns errors, the IFERROR function then moves to formula2.
IFERROR(formula1, formula2)
formula2 is just like formula1 except that it returns values from List2 and duplicates found between List1 and List2.
Another IFERROR function is used to handle errors from List2, the formula then returns values from List.
IFERROR(IFERROR(formula1, formula2), formula3)
Get Excel *.xlsx file
how-to-extract-a-list-of-duplicates-from-three-columns-in-excelv2.xlsx
Duplicate values category
The array formula in cell C2 extracts duplicate values from column A. Only one duplicate of each value is displayed […]
This article explains how to extract values that exist in three different columns, they must occur in each of the […]
This article demonstrates formulas that extract duplicate values from a column, however, column D contains values that you don't want […]
Excel categories
8 Responses to “Extract a list of duplicates from three columns combined”
Leave a Reply
How to comment
How to add a formula to your comment
<code>Insert your formula here.</code>
Convert less than and larger than signs
Use html character entities instead of less than and larger than signs.
< becomes < and > becomes >
How to add VBA code to your comment
[vb 1="vbnet" language=","]
Put your VBA code here.
[/vb]
How to add a picture to your comment:
Upload picture to postimage.org or imgur
Paste image link to your comment.
Contact Oscar
You can contact me through this contact form
This is close to what I need. I have three lists of email addresses. If an email address appears in all three (not two out of three) lists then place it in the duplicate column. Also I need all three named ranges to be dynamic. Can you help?
Peter Voss,
See this post: Filter values that exists in all three lists
[...] in Compare, Excel, Search/Lookup on Sep.27, 2012. Email This article to a Friend Peter Voss asks:This is close to what I need. I have three lists of email addresses. If an email address appears in [...]
Hi there,
I am using Excel 2011 on a Mac and keep getting errrors...
Vanessa,
I don´t have a mac, perhaps it is possible to disable error checking rules:
https://www.addictivetips.com/microsoft-office/show-error-on-formula-referring-to-an-empty-cell-in-excel-2010/
I said before that you're a genius
But I want the previous version (office 2003)
I said before that you're a genius
But I want the previous formula (office 2003)
Great work! I've been searching for a couple of hours, and your formula, while intimidating and I don't really understand the logic, works.
I wanted to match just by 1 row in 3 columns B,C,D, so I changed it to the below formula. Just need to figure out how to handle div\0 errors when there is NO duplicate and it would be perfect for me.
=IFERROR(IFERROR(LOOKUP(2, 1/((COUNTIF($E$2:E2, B3)=0)*(COUNTIF(B3, B3)>1)),B3), LOOKUP(2, 1/((COUNTIF($E$2:E2, C3)=0)*((COUNTIF(C3,C3)>1)+(COUNTIF(B3,C3)>0))),C3)),LOOKUP(2, 1/((COUNTIF($E$2:E2,D3)=0)*((COUNTIF(D3, D3)>1)+(COUNTIF(B3, D3)>0)+(COUNTIF(C3, D3)>0))), D3))