Extract unique distinct values if the value contains the given string
This article demonstrates formulas that list unique distinct values if they contain a specified substring.
1. Extract unique distinct values if the value contains a given string - Excel 365
The following formula lists unique distinct values from cell range B3:B21 if they contain the substring specified in cell D3. Unique distinct values are all values with duplicates merged into one distinct value.
The image above shows names in cell range B3:B21, the substring is specified in cell D3, and the result is displayed in cells F3:F13. The result is a dynamic array that spills to adjacent cells below automatically.
Excel 365 formula in cell F3:
=UNIQUE(FILTER(B3:B21,ISNUMBER(SEARCH(D3,B3:B21))))
The result is an array containing the names from source range B3:B21 that contain "r".
Okay, let's go through the formula step by step, starting with the innermost function:
- SEARCH(D3, B3:B21): The SEARCH function searches for the value in cell D3 within each cell in the range B3:B21. It returns the starting position of the first occurrence of the value in D3 within the corresponding cell in B3:B21. If the value in D3 is not found in a cell, the SEARCH function returns the error value #VALUE!.
- ISNUMBER(SEARCH(D3, B3:B21)): The ISNUMBER function checks if the result of the SEARCH function is a number. If the SEARCH function returns a number (i.e., the value in D3 was found in the corresponding cell in B3:B21), the ISNUMBER function returns TRUE. If the SEARCH function returns a #VALUE! error (i.e., the value in D3 was not found in the corresponding cell in B3:B21), the ISNUMBER function returns FALSE.
- FILTER(B3:B21, ISNUMBER(SEARCH(D3, B3:B21))): The FILTER function takes two arguments: the range to filter (B3:B21) and the condition to apply (ISNUMBER(SEARCH(D3, B3:B21))). The FILTER function creates a new array that contains only the cells from B3:B21 where the corresponding condition (the result of ISNUMBER(SEARCH(D3, B3:B21))) is TRUE. In other words, the FILTER function creates a new array that contains only the cells from B3:B21 where the value in D3 was found.
- UNIQUE(FILTER(B3:B21, ISNUMBER(SEARCH(D3, B3:B21)))): The UNIQUE function takes the array returned by the FILTER function and removes any duplicate values, leaving only the unique values. The final result of this formula is an array containing the unique values from the range B3:B21 where the value in D3 was found.
In summary, the formula first searches for the value in D3 within each cell in the range B3:B21, then filters the range to include only the cells where the value in D3 was found, and finally, it removes any duplicate values from the filtered array, leaving only the unique values.
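The same search, filter, and de-duplicate logic can be sketched outside Excel. The Python below is only an illustration of the idea, not part of the workbook: the function name `unique_containing` and the sample names are assumptions, duplicates are compared ignoring case, and the wildcard support of SEARCH is not emulated.

```python
def unique_containing(values, substring):
    """Roughly emulate UNIQUE(FILTER(values, ISNUMBER(SEARCH(substring, values))))."""
    needle = substring.lower()  # SEARCH is not case-sensitive
    seen = set()
    result = []
    for value in values:
        key = value.lower()  # assumption: duplicates are compared ignoring case
        if needle in key and key not in seen:
            seen.add(key)
            result.append(value)  # keep the first occurrence, in source order
    return result


# Illustrative sample data, not the actual contents of B3:B21
names = [
    "Federer, Roger",
    "Djokovic, Novak",
    "Murray, Andy",
    "Federer, Roger",
    "Almagro, Nicolas",
]
print(unique_containing(names, "r"))
```

Unlike this sketch, which returns an empty list when nothing matches, the FILTER function returns an error unless its optional if_empty argument is supplied.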
Explaining formula
Step 1 - Search for a substring in the array
The SEARCH function returns the position of the character at which a specific character or text string is first found, reading left to right (not case-sensitive).
Function syntax: SEARCH(find_text,within_text, [start_num])
SEARCH(D3, B3:B21)
returns {5; #VALUE!; ... ; 6}.
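To make the positions in the array concrete, here is a minimal Python sketch of how SEARCH behaves for plain text. The helper name `search` and the middle sample name are assumptions for illustration; `None` stands in for the #VALUE! error, and wildcards are not handled.

```python
def search(find_text, within_text):
    """Return the 1-based position of find_text in within_text, ignoring case.

    Returns None where Excel's SEARCH would return #VALUE!.
    """
    index = within_text.lower().find(find_text.lower())
    return index + 1 if index >= 0 else None


# The middle name is a hypothetical value that does not contain "r"
sample = ["Federer, Roger ", "Djokovic, Novak ", "Almagro, Nicolas "]
positions = [search("r", name) for name in sample]
print(positions)  # [5, None, 6]
```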
Step 2 - Check if the value is a number
The ISNUMBER function checks if a value is a number, returns TRUE or FALSE.
Function syntax: ISNUMBER(value)
ISNUMBER(SEARCH(D3,B3:B21))
returns {TRUE; FALSE; ... ; TRUE}.
Step 3 - Filter values if the value in the array is a number
The FILTER function extracts values/rows based on a condition or criteria.
Function syntax: FILTER(array, include, [if_empty])
FILTER(B3:B21,ISNUMBER(SEARCH(D3,B3:B21)))
returns {"Federer, Roger "; "Murray, Andy "; ... ; "Almagro, Nicolas "}.
Step 4 - List unique distinct values
The UNIQUE function returns a unique or unique distinct list.
Function syntax: UNIQUE(array,[by_col],[exactly_once])
UNIQUE(FILTER(B3:B21,ISNUMBER(SEARCH(D3,B3:B21))))
returns {"Federer, Roger "; "Murray, Andy "; ... ; "Almagro, Nicolas "}
2. Extract unique distinct values if the value contains a string in earlier Excel versions
The image above demonstrates a formula in cell F3 that extracts unique distinct values from column B if they contain the value in cell D3. This formula works in all Excel versions; however, you need to enter the formula as an array formula and copy cell F3 to the cells below in order to ignore duplicate values.
Formula in cell F3:
=LOOKUP(2,1/((COUNTIF($F$2:F2,$B$3:$B$21)=0)*SEARCH($D$3,$B$3:$B$21)),$B$3:$B$21)
Important! The use of absolute cell references in this formula is crucial because it ensures that the formula will work correctly when copied to cells below. If the references were relative the formula would not be able to correctly identify the appropriate ranges and cell references, leading to incorrect results.
This formula differs from the Excel 365 formula above in that it calculates a single value in each cell, whereas the Excel 365 formula calculates an array and spills the values to the cells below. This spilling behavior is new to Excel 365 and doesn't work in earlier Excel versions.
Let's break down this formula step by step:
- COUNTIF($F$2:F2, $B$3:$B$21)=0: The COUNTIF function counts the number of times the value in each cell of the range $B$3:$B$21 appears in the range $F$2:F2. Note that $F$2:F2 is dependent on where you enter the formula, and it expands when the formula is copied to cells below. The result of this function is an array of counts, where 1 or more indicates that the value in the corresponding cell of $B$3:$B$21 is found in $F$2:F2, and 0 indicates that the value is not found. The =0 part of the formula checks if the result of COUNTIF is 0, which means the value in the corresponding cell of $B$3:$B$21 is not found in $F$2:F2.
- SEARCH($D$3, $B$3:$B$21): The SEARCH function searches for the value in cell $D$3 within each cell in the range $B$3:$B$21. It returns the starting position of the first occurrence of the value in $D$3 within the corresponding cell in $B$3:$B$21. If the value in $D$3 is not found in a cell, the SEARCH function returns an error value #VALUE!.
- (COUNTIF($F$2:F2, $B$3:$B$21)=0)*SEARCH($D$3, $B$3:$B$21): This part of the formula multiplies the array of TRUE and FALSE values (from the COUNTIF comparison, treated as 1 and 0) with the array of search results (from the SEARCH function). The result is an array where each value is 0 (if the value in the corresponding cell of $B$3:$B$21 is found in $F$2:F2), the search result (if the value is not found in $F$2:F2 and contains the substring), or a #VALUE! error (if the value does not contain the substring).
- 1/((COUNTIF($F$2:F2, $B$3:$B$21)=0)*SEARCH($D$3, $B$3:$B$21)): This part of the formula divides 1 by the array created in the previous step. The result is an array where each value is a #DIV/0! error (if the value in the corresponding cell of $B$3:$B$21 is found in $F$2:F2), a #VALUE! error (if the substring is not found), or 1 divided by the search result (otherwise). The LOOKUP function ignores both error types in the next step.
- LOOKUP(2, 1/((COUNTIF($F$2:F2, $B$3:$B$21)=0)*SEARCH($D$3, $B$3:$B$21)), $B$3:$B$21): The LOOKUP function takes three arguments: the value to search for (in this case, 2), the array to search in (the array created in the previous step), and the array to return the corresponding values from (the range $B$3:$B$21). The LOOKUP function looks for the largest value in the array that is less than or equal to 2, and then returns the corresponding value from the range $B$3:$B$21. Because every numeric value in the array is at most 1, LOOKUP matches the last numeric value in the array, so the formula returns the last qualifying value that has not been listed yet. The LOOKUP function ignores errors automatically.
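The effect of copying this formula down a column can be sketched in Python. This is a rough emulation under stated assumptions, not the formula itself: the function name `fill_down` and the sample names are hypothetical, values are compared exactly (COUNTIF ignores case), and "#N/A" marks cells where no candidate is left.

```python
def fill_down(values, substring, rows):
    """Emulate entering the LOOKUP formula in one cell and copying it down `rows` cells."""
    needle = substring.lower()  # SEARCH is not case-sensitive
    shown = []  # plays the role of the expanding range $F$2:F2
    for _ in range(rows):
        # Candidates: not listed above yet (COUNTIF(...)=0) and containing the substring
        candidates = [v for v in values if v not in shown and needle in v.lower()]
        if candidates:
            shown.append(candidates[-1])  # LOOKUP(2, ...) lands on the last match
        else:
            shown.append("#N/A")  # nothing left to return
    return shown


# Illustrative sample data, not the actual contents of B3:B21
names = [
    "Federer, Roger",
    "Djokovic, Novak",
    "Murray, Andy",
    "Federer, Roger",
    "Almagro, Nicolas",
]
print(fill_down(names, "r", 4))
```

With this sample data the values come out in reverse source order, followed by an error once every match has been listed.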
Explaining formula in cell F3
These steps show the results in greater detail.
Step 1 - Prevent duplicates
The COUNTIF function counts cells in a cell range based on a condition or criteria. If the count for a value is equal to 0, the value has not been displayed yet.
COUNTIF($F$2:F2, $B$3:$B$21)=0
returns {TRUE; TRUE; ... ; TRUE}
Step 2 - Check if values contain string
The SEARCH function returns a number that represents the position of the search string if found. The function returns an error if not found which is alright in this case.
SEARCH($D$3,$B$3:$B$21)
returns {5; #VALUE!; ... ; 6}.
Step 3 - Multiply arrays
Multiplying the arrays works like a logical AND: the product is a nonzero number only if the value has not been displayed yet AND the value contains the string. In the multiplication, TRUE is treated as 1 and FALSE as 0 (zero), so an already displayed value produces 0 (zero).
(COUNTIF($F$2:F2, $B$3:$B$21)=0)*SEARCH($D$3, $B$3:$B$21)
returns {5; #VALUE!; ... ; 6}.
Step 4 - Divide 1 by the array
Dividing 1 by 0 (zero) returns a #DIV/0! error, which the LOOKUP function ignores. It also ignores #VALUE! errors.
1/((COUNTIF($F$2:F2, $B$3:$B$21)=0)*SEARCH($D$3, $B$3:$B$21))
returns {0.2; #VALUE!;... ; 0.166666666666667}.
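Why the lookup value 2 works can be shown with a small Python sketch. The names `lookup_last_numeric`, `positions`, and `labels` are assumptions for illustration; `None` stands in for an Excel error, and 0 marks a value that has already been listed.

```python
def lookup_last_numeric(lookup_vector, result_vector):
    """Return the result paired with the last non-error entry.

    This mirrors where LOOKUP(2, ...) ends up when every numeric entry is at most 1.
    """
    for value, result in zip(reversed(lookup_vector), reversed(result_vector)):
        if value is not None:
            return result
    return None  # every entry is an error


positions = [5, None, 0, 6]  # hypothetical output of the multiplication step
reciprocals = [1 / p if p else None for p in positions]  # 1/0 becomes an error too
labels = ["first", "second", "third", "fourth"]

print(reciprocals)
print(lookup_last_numeric(reciprocals, labels))  # fourth
```

Every surviving reciprocal is between 0 and 1 because search positions start at 1, so the lookup value 2 is always larger than any of them.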
Step 5 - Return value
LOOKUP(2,1/((COUNTIF($F$2:F2,$B$3:$B$21)=0)*SEARCH($D$3,$B$3:$B$21)),$B$3:$B$21)
returns "Almagro, Nicolas " in cell F3.
Get Excel *.xlsx file
Filter unique distinct values containing string.xlsx
6 Responses to “Extract unique distinct values if the value contains the given string”
I have a data in same way only difference is that they are in number format i.e. (330-1541) in this way so is there any way i can use the filter for such type of data. Please help me.
Ashok,
did you create an array formula?
Do you have examples for using advanced filter for "Not Contains"?
I have a data of this sort...
Probe ID Call ID
USBE1 130226200131-1
USBE1 130226200131-2
USBE1 130226200131-3
USBE1 130226200131-4
USBE1 130226200521-1
USBE1R1 130227143154-1
USBE1R1 130227143154-10
USBE1R1 130227143154-11
USBE1R1 130227143154-12
USBE1R1 130227143154-13
USBE1R1 130227143154-14
USBE1R1 130227143154-15
..The "" condition works on the first column, but not on the second column..I am treating both as text columns. But the advanced filter for "Not" condition on these are not working..any help?
dashil103,
Sorry, I don't understand. What is the desired output?
I would like to do something vaguely similar to this. I would like to filter a list with two columns. In the first column is an ID ("List 1", "List 2", "List 3", and so on), in the second one is a value. Now I would like to list all the values where the first row contains "List 1". Could you help me with this?