ThreeWave Listing Distinct Items From A Range

This page describes code that you can use to get a list of the distinct elements in an range.


Most every application written for Excel makes use of list in one way or another, and Excel provides a number of functions (VLOOKUP, HLOOKUP, and the D data functions. However, there is no formula function that allows you to get a list of the distinct elements. That is, a list with all duplicates removed. Excel's Advanced Filter can do this, but it has two significant shortcomings: First, the list of distinct values must be placed on the same worksheet as the original data. Second, the list created by Advanced Filter is static. If you add new elements to the list, the filter list of distinct values is no updated. You have to apply the filter manually.

The code on this page describes a VBA procedure called GetDistinct that returns an array containing the distinct values from a range. The function declaration is:

Public Function GetDistinct(RR As Range) As Variant()

where RR is the range from which the distinct values are to be extracted. There result in an array of Variants. This will return the values the values oriented as a single row spanning several columns. To get the result oriented as a column, one column spanning several rows, use the TRANPOSE function to transpose the array: =TRANSPOSE(GetDistict(A1:A10)).

This is an array formula, so you must first select the range cell cells into which the distinct values are to be written, enter the formula, and press CTRL SHIFT ENTER rather than just ENTER when you first enter the formula and whenever you edit it later. If you do this properly, Excel will display the formula in the formula bar enclosed in curly braces { }.

The code can optionally resize the array to the size of the range from which it was called. Normally, if you enter an array function into more cells than the size of the returned array, Excel files the empty cell with #N/A errors. As written, the code will automatically resize itself so that it fills the entire range with data, so no #N/A errors will appear. The values of the array that lie outside the distinct elements are filled with empty strings so they will appear blank. See the nodes in the comments of the code marked with '<<<<< OPTIONAL for details about this part of the code.


The GetDistinct Code

download You can download a workbook with all the example code and examples on this page.

The code for the GetDistinct procedure is shown below:

Option Explicit
Option Compare Text

Public Function GetDistinct(RR As Range) As Variant()
' GetDistinct
' By Chip Pearson,,, 1-August-2010
' This creates an array containing all the distinct items
' from the range RR. The result array is row oriented (one row spanning
' several columns) so use TRANSPOSE if you want the values column
' oriented (one column spanning several rows). The array returned
' by this function is suitable for use within other array formulas.
' Array enter (CTRL SHIFT ENTER) the formula into as many rows
' or columns as there will be distinct elements. You can use
' Redim Arr(1 to RR.Cells.Count) to ensure you have enough room,
' and the array will be Redim Preserved to the actual use size
' before it is returned.
' You should take into consideration the size of the array. Using a
' range RR = A1:A5000 containing 100 distinct items, the code took
' approximately 3 seconds. Multiple uses of GetDistinct with very
' large ranges may have unacceptable calculation times.
' This procedures has two advantages of the Excel Advanced Filter. First,
' Filter requires that the list of distinct values be created on the
' same worksheet as the original data. GetDistinct has no such limitation.
' Second, Filter is not automatically updated. You must manually
' cause Filter to refresh the list of distinct items. GetDistinct
' will update its results in real time.
    Dim Arr() As Variant
    Dim R As Range
    Dim N As Long
    Dim Ndx As Long
    Dim ArraySize As Long
    ' If the range RR is completely empty, get out.
    If Application.WorksheetFunction.CountA(RR) = 0 Then
        ReDim Arr(1 To Application.Caller.Cells.Count)
        For N = 1 To Application.Caller.Cells.Count
            Arr(N) = vbNullString
        Next N
        GetDistinct = Arr
        Exit Function
    End If
    ' Get the number of distinct values in range RR. Resize the array
    ' to that many elements. This evaluates the following formula where
    ' $B$2:$B$11 are replaced by RR.Address
    '    =SUM(1/IF($B$2:$B$11="",1,(COUNTIF($B$2:$B$11,$B$2:$B$11))))-COUNTBLANK($B$2:$B$11)
    ArraySize = Application.Evaluate("=SUM(1/IF(" & RR.Address & "="""",1,(COUNTIF(" & _
        RR.Address & "," & RR.Address & "))))-COUNTBLANK(" & RR.Address & ")")
    ReDim Arr(1 To ArraySize)
    ' loop only in the region of RR that intersects with the UsedRange
    ' of the sheet. This can cut processing time dramatically.
    For Each R In Application.Intersect(RR.Worksheet.UsedRange, RR).Cells
        ' ignore empty cells
        If R.Text <> vbNullString Then
            On Error Resume Next
            N = 0
            ' see if R.Value already exists in the Arr. If so,
            ' ignore it. If not present, add it.
            N = Application.WorksheetFunction.Match(R.Value, Arr, 0)
            If Err.Number <> 0 Then
                Ndx = Ndx + 1
                Arr(Ndx) = R.Value
            End If
        End If
    Next R
    '<<<<< OPTIONAL
    ' The following code is optional. It resizes the array to the
    ' number of cells from with it was called, so that if there
    ' are fewer elements than Application.Caller.Cells.Count
    ' (a condition that is always true unless all cells of
    ' RR have the same value. This prevents the arrray of cells
    ' on the worksheet being padded out with values = 0. If
    ' you don't want this resizing, get rid of the entire
    ' If/Then/Else/EndIf block of code and replace it with
    '       ReDim Preserve Arr(1 To Ndx)
    If Application.Caller.Cells.Count > Ndx Then
        ReDim Preserve Arr(1 To Application.Caller.Cells.Count)
        For N = Ndx + 1 To Application.Caller.Cells.Count
            Arr(N) = vbNullString
        Next N
        ReDim Preserve Arr(1 To Ndx)
    End If
    GetDistinct = Arr
End Function
LastUpdate This page last updated: 9-August-2010.