[IncludeBorders/top.htm]

    Recursion And The FileSystemObject

  This page introduces the concept of recursive procedures and uses the Scripting FileSystemObject as an example of when recursion can be a powerful programming technique.

Recursion is a programming technique that can simplify complex tasks or accomplish tasks that cannot be accomplished by any other means. Simply put, a procedure is said to be "recursive" if it calls itself, almost always passing a parameter to itself.  It may seem odd that a procedure can call itself, but there is nothing in the language to prevent this.  Fundamentally, there is nothing different between a procedure calling another procedure and a procedure calling itself.

Care must be taken with the logic of a recursive procedure. There must be some condition that, when satisfied, stops the recursion. If you don't control the recursion properly, you'll end up in an infinite loop. In its most simple form, a recursive procedure is something like the following:

    Sub Proc()
        Proc
    End Sub

The flaw here is obvious. Proc call itself, which calls itself, which calls itself, forever. There is no escape mechanism. Ultimately this procedure will exhaust the resources available to VBA at run time, and VBA will terminate the procedure with an Out Of Stack Space error. In a quick example, VBA terminated the recursion after approximately 6800 loops.  This illustrates in the simplest terms what happens without proper escape code. The procedure will continue to call itself before the VBA runtime kicks in and terminates the process. Typically, a recursive procedure is initiated by another procedure that provides a starting value for the recursive procedure.  The recursive procedure then modifies that parameter and calls itself. When some escape condition is satisfied, the recursive procedure stop calling itself, winds back through the procedure "stack" and ultimately returns the initializing procedure.

The "stack" is a data structure that in our context is used to keep track of what procedure is calling what procedure, and where execution should resume when a procedure reaches its conclusion. A "stack" is a last-in, first-out structure, and takes is name from a stack of dinner plates. You always add plates to the top of the stack, and you always remove plates from the top of the stack. You never insert or remove items from the middle or the bottom of the stack.  When used to keep track of procedures, the "top" of the stack is always the currently executing procedure. When this procedure terminates, it is removed or "popped" from the stack, and then the top of the stack is the procedure which called that procedure. As procedures are called, they are "pushed" on to the top of the stack, and when a procedure terminates, it is "popped" from the top of the stack and the procedure that called that procedure is now at the top of the stack. In the recursive procedure shown above, the "Proc" procedure is "pushed" on to the top of the stack but is never removed or "popped". The procedure stack in VBA has a limited capacity. As Proc calls itself, the stack grows because there is no escape from the loop. At approximately 6800 stack entries, the stack is filled to capacity and VBA terminates everything with an Out Of Stack Space run time error.

A properly constructed recursive procedure is shown below.

    Dim Total As Long
    
    Sub Start()
        Total = 0
        SumRecurse 10
        Debug.Print "DONE: " & CStr(Total)
    End Sub
    
    Sub SumRecurse(N As Long)
        Total = Total + N
        If N > 0 Then
            ''''''''''''''''''''''''
            ' Here, SumRecurse
            ' calls itself, passing
            ' N-1 as the parameter.
            ''''''''''''''''''''''''
            SumRecurse N - 1
        Else
            ''''''''''''''''''''''''''''
            ' Here, N =0, so SumRecurse
            ' doesn't call itself.
            ''''''''''''''''''''''''''''
        End If
    End Sub

This code simply sums the numbers from 5 to  0, starting from the Start procedure with a value of 5 and decrementing that value by 1 each time SumRecurse  calls itself.   In this code, there is an escape mechanism to prevent the loop from running forever. The Start procedure begins the recursive process by calling SumRecurse passing it a value of 5. The SumRecurse  procedure calls itself, passing itself the value N-1. Each time  calls itself, it passes N-1, so the value N, as passed in each iteration of the recursive loop, is decremented, or reduced by 1. When N reaches 0, the recursion ends and the process stops calling itself and displays the  result.

In reality, you would never use the code above -- it could be done more simply with a For loop or even a single line of code. However, it does introduce the concept of recursion.

Recursion can be used when you don't know at design time how far a routine must go. As an example, we will use the FileSystemObject to list all subdirectories of a given directory. You don't know at run time how many subfolders there may be under a given folder, nor do you know how deeply the subfolders are nested. A folder may contain a subfolder that in turn contains subfolders which contain subfolders and so on. The code must accommodate whatever   folder/subfolder hierarchy exists an run time, which may well not be the same as that which exists when the code is written. Using recursion, we can write code that can handle any number of subfolders nested to any level.

The first step is to establish a reference to the Scripting runtime library. In VBA, go to the Tools menu, choose References, and scroll to and check the entry for "Microsoft Scripting Runtime". Next, declare a module-level variable named FSO of type Scripting.FileSystemObject.

    Dim FSO As Scripting.FileSystemObject

Then, create the starting procedure for the code. This procedure sets the initial folder, the folder whose subfolder we want to list.

    Sub StartListing()
    
    ''''''''''''''''''''''''''''''''''
    ' This is the name of the
    ' folder whose subfolders
    ' we want to list.
    ''''''''''''''''''''''''''''''''''
    Dim TopFolderName As String
    ''''''''''''''''''''''''''''''''''
    ' This is a Scripting.Folder
    ' object representing the
    ' folder named in TopFolderName.
    ''''''''''''''''''''''''''''''''''
    Dim TopFolderObj As Scripting.Folder
    ''''''''''''''''''''''''''''''''''
    ' This is the cell at which the
    ' listing will begin.
    ''''''''''''''''''''''''''''''''''
    Dim DestinationRange As Range
    
    TopFolderName = "C:\Test" '<<< CHANGE TO YOUR FOLDER
    Set DestinationRange = Worksheets(1).Range("A1")
    
    ''''''''''''''''''''''''''''''''''
    ' If FSO isn't initialized,
    ' create a new instance of FSO.
    ''''''''''''''''''''''''''''''''''
    If FSO Is Nothing Then
        Set FSO = New Scripting.FileSystemObject
    End If
    ''''''''''''''''''''''''''''''''''
    ' Set our TopFolderObj variable to
    ' a Folder object the represents
    ' the disk folder whose subfolders
    ' we want to list.
    ''''''''''''''''''''''''''''''''''
    Set TopFolderObj = FSO.GetFolder(TopFolderName)
    
    ListSubFolders OfFolder:=TopFolderObj, _
        DestinationRange:=DestinationRange
    
    End Sub

So far, this procedure is quite simple. It creates a new instance of the FileSystemObject if necessary, creates a starting cell for the listing, and creates a folder object representing the disk folder whose subfolders we want to list.  The last step of this procedure is to call the procedure named ListSubFolders.  We haven't yet written this procedure. The ListSubFolders procedure will do all the work of listing the subfolders. Since we don't know at design-time what our starting folder will be, nor do we know what the subfolder hierarchy of that folder is, we have to write the  procedure to handle any number of subfolders nested to any depth. This makes it an ideal candidate for a recursive solution. The procedure can call itself as many times as need to list all the subfolders.

We begin with the declaration of the ListSubFolders procedure. This is shown below:

    Sub ListSubFolders(OfFolder As Scripting.Folder, _
        DestinationRange As Range)

This declares ListSubFolders  to accept a Scripting Folder object and a destination range object. The OfFolder object is an object that represents the disk folder whose subfolders we are going to list. The DestinationRange object references the cell on the worksheet where the name of OfFolder is going to be written. The complete  procedure is shown below:

    Sub ListSubFolders(OfFolder As Scripting.Folder, _
        DestinationRange As Range)
    '''''''''''''''''''''''''''''''''''''''''''''''''''''
    ' ListSubFolders
    ' This lists all subfolders of the OfFolder object.
    ' It places the full name of OfFolder in the range
    ' referenced by DestinationRange.
    '''''''''''''''''''''''''''''''''''''''''''''''''''''
    
    ''''''''''''''''''''''''''''''''''
    ' This represents one subfolder
    ' within the OfFolder folder.
    ' We use this to loop through
    ' all the subfolders with a For
    ' Each loop.
    ''''''''''''''''''''''''''''''''''
    Dim SubFolder As Scripting.Folder
    
    '''''''''''''''''''''''''''''''''''''''
    ' Put the name of OfFolder in to
    ' DestinationRange.
    '''''''''''''''''''''''''''''''''''''''
    DestinationRange.Value = OfFolder.Path
    '''''''''''''''''''''''''''''''''''''''
    ' Point DestinationRange one row down
    ' and one column to the right. This
    ' will give a nice, indented view of
    ' the folder hierarchy.
    '''''''''''''''''''''''''''''''''''''''
    Set DestinationRange = DestinationRange.Offset(1, 1)
    '''''''''''''''''''''''''''''''''''''''
    ' Here's the recursive part. For each
    ' subfolder of OfFolder, the procedure
    ' calls itself passing in SubFolder and
    ' DestinationRange. The first time it
    ' calls itselef, it passes the first
    ' subfolder. If THAT folder has any
    ' subfolders, those subfolders will be
    ' picked up in the loop below, and the
    ' procedure will call itself for each
    ' subfolder. This process continues for
    ' as deep as subfolder are nested.
    ''''''''''''''''''''''''''''''''''''''
    For Each SubFolder In OfFolder.SubFolders
        ListSubFolders OfFolder:=SubFolder, _
            DestinationRange:=DestinationRange
    Next SubFolder
    '''''''''''''''''''''''''''''''''''''''''''
    ' Once we've listed the subfolder in
    ' OfFolder, move the DestinationRange
    ' down one row and back one column to
    ' the left to reflect that we are done
    ' with this depth of the folder/subfolder
    ' hierarchy.
    '''''''''''''''''''''''''''''''''''''''''''
    Set DestinationRange = DestinationRange(1, 0)
    
    End Sub
    

This procedure first declares a variable named SubFolder of type Scripting.Folder. This will represent each subfolder of the OfFolder folder object that is passed in to the procedure. Later, we will use this variable in a For Each loop to get each subfolder of OfFolder.  Next, we place the full name of OfFolder in the cell referenced by DestinationRange. Following this, we change DestinationRange to refer one row down and one column to the right of its original location. This will accomplish two things: (1) it will ensure that the next time a value is written to the cell referenced by DestinationRange, it will be one row below the previous entry, and (2) it creates a nicely indented representation of the subfolder hierarchy. Each subfolder is indented one column to the right and one row down from its parent folder.

The next step is the recursive piece of the code.

    For Each SubFolder In OfFolder.SubFolders
        ListSubFolders OfFolder:=SubFolder, _
            DestinationRange:=DestinationRange
    Next SubFolder

This uses a For Each loop to obtain a reference to each subfolder within OfFolder. For each subfolder found within OfFolder, the procedure calls itself to process that folder (which in this example means writing the name of the subfolder to DestinationRange) and then loops through that folder's subfolders, calling itself yet again for each subfolder.  If you follow the logic path carefully, you will see that the procedure ListSubFolders is called for every subfolder of the original TopFolderName that we started with in the StartListing procedure, no matter how deeply nested those subfolders may be.

Finally, we change the reference of DestinationRange down one row and back one column to the left, indicating we're done with that level of the subfolder hierarchy. If another subfolder is found, DestinationRange will be moved to the right to place the subfolder name in the proper cell.

Ultimately, we will end up with a listing in the worksheet that looks something like the following:

The code illustrated above is a very simple version of the Build Directory Tree add-in described here. That add-in uses essentially the same recursive structure as shown on this page.

Hopefully, this page will have given you a basic understanding of recursive programming. Recursion can be a powerful tool and should be in the repertoire of any serious developer.

 

 
     
   
     
[IncludeBorders/bottom.htm]