How do I prevent duplicates being created from an import of directory listing?

Sampoline

Member
Local time
Tomorrow, 06:06
Joined
Oct 19, 2020
Messages
161
I found this very helpful bit of code to 'import' the filename and filepath of files into an access table:

Code:
Option Compare Database
Option Explicit

'list files to tables
'http://allenbrowne.com/ser-59alt.html

Dim gCount As Long ' added by Crystal

Sub runListFiles()
    'Usage example.
    Dim strPath As String _
    , strFileSpec As String _
    , booIncludeSubfolders As Boolean
    
    strPath = "E:\"
    strFileSpec = "*.*"
    booIncludeSubfolders = True
    
    ListFilesToTable strPath, strFileSpec, booIncludeSubfolders
End Sub

'crystal modified parameter specification for strFileSpec by adding default value
Public Function ListFilesToTable(strPath As String _
    , Optional strFileSpec As String = "*.*" _
    , Optional bIncludeSubfolders As Boolean _
    )
On Error GoTo Err_Handler
    'Purpose:   List the files in the path.
    'Arguments: strPath = the path to search.
    '           strFileSpec = "*.*" unless you specify differently.
    '           bIncludeSubfolders: If True, returns results from subdirectories of strPath as well.
    'Method:    FilDir() adds items to a collection, calling itself recursively for subfolders.
    
    Dim colDirList As New Collection
    Dim varitem As Variant
    Dim rst As DAO.Recordset
    
   Dim mStartTime As Date _
      , mSeconds As Long _
      , mMin As Long _
      , mMsg As String
      
   mStartTime = Now()
   '--------
    
    Call FillDirToTable(colDirList, strPath, strFileSpec, bIncludeSubfolders)
      
   mSeconds = DateDiff("s", mStartTime, Now())
  
   mMin = mSeconds \ 60
   If mMin > 0 Then
      mMsg = mMin & " min "
      mSeconds = mSeconds - (mMin * 60)
   Else
      mMsg = ""
   End If
  
   mMsg = mMsg & mSeconds & " seconds"
  
   MsgBox "Done adding " & format(gCount, "#,##0") & " files from " & strPath _
      & IIf(Len(Trim(strFileSpec)) > 0, " for file specification --> " & strFileSpec, "") _
      & vbCrLf & vbCrLf & mMsg, , "Done"
 
Exit_Handler:
   SysCmd acSysCmdClearStatus
   '--------
    
    Exit Function

Err_Handler:
    MsgBox "Error " & Err.Number & ": " & Err.Description, , "ERROR"
    
    'remove next line after debugged -- added by Crystal
    Stop: Resume 'added by Crystal
    
    Resume Exit_Handler
End Function

Private Function FillDirToTable(colDirList As Collection _
    , ByVal strFolder As String _
    , strFileSpec As String _
    , bIncludeSubfolders As Boolean)
  
    'Build up a list of files, and then add add to this list, any additional folders
    On Error GoTo Err_Handler
    
    Dim strTemp As String
    Dim colFolders As New Collection
    Dim vFolderName As Variant
    Dim strSQL As String

    'Add the files to the folder.
    strFolder = TrailingSlash(strFolder)
    strTemp = Dir(strFolder & strFileSpec)
    Do While strTemp <> vbNullString
         gCount = gCount + 1
         SysCmd acSysCmdSetStatus, gCount
         strSQL = "INSERT INTO Files " _
          & " (FName, FPath) " _
          & " SELECT """ & strTemp & """" _
          & ", """ & strFolder & """;"
         CurrentDb.Execute strSQL
        colDirList.Add strFolder & strTemp
        strTemp = Dir
    Loop

    If bIncludeSubfolders Then
        'Build collection of additional subfolders.
        strTemp = Dir(strFolder, vbDirectory)
        Do While strTemp <> vbNullString
            If (strTemp <> ".") And (strTemp <> "..") Then
                If (GetAttr(strFolder & strTemp) And vbDirectory) <> 0& Then
                    colFolders.Add strTemp
                End If
            End If
            strTemp = Dir
        Loop
        'Call function recursively for each subfolder.
        For Each vFolderName In colFolders
            Call FillDirToTable(colDirList, strFolder & TrailingSlash(vFolderName), strFileSpec, True)
        Next vFolderName
    End If

Exit_Handler:
    
    Exit Function

Err_Handler:
    strSQL = "INSERT INTO Files " _
    & " (FName, FPath) " _
    & " SELECT ""  ~~~ ERROR ~~~""" _
    & ", """ & strFolder & """;"
    CurrentDb.Execute strSQL
    
    Resume Exit_Handler
End Function

Public Function TrailingSlash(varIn As Variant) As String
    If Len(varIn) > 0& Then
        If Right(varIn, 1&) = "\" Then
            TrailingSlash = varIn
        Else
            TrailingSlash = varIn & "\"
        End If
    End If
End Function

Source: https://www.everythingaccess.com/tutorials.asp?ID=List-files-to-a-table

The only thing I wanted to change in it, was a way to prevent duplicate records being created. What I did was add the code to a function and then call it using a button.

The button does import the directory list into the table, but every time I click the button it will import the list entirely again. I only want to be able to import new items to the table.

So say for example there are 100 images in a folder. When I click the button, it should import those 100 images' filename and filepath into the table I specified. Which the code does successfully. But what I don't want (which is happening at the moment) is each time I click the button, duplicate records of those 100 images metadata are created. I only want the code to add new records. So if another 15 images get updated tomorrow, I should be able to append those to the table but not the records pre-existing in the table already.

Thanks and wishing everyone a belated happy new year.
 
One thought that came to mind was to create a Dictionary of the existing files and then loop through the folder and add those files that do not exist in the dictionary. It's almost 3:30 am so I just mocked up a test which seems to work. I have to get some sleep so i'll try to revisit it in the morning and see about adding recursive code for subfolders.
 

Attachments

Create a unique index on the table. Then trap the "duplicate" error when you try to insert the record.
 
One thought that came to mind was to create a Dictionary of the existing files and then loop through the folder and add those files that do not exist in the dictionary. It's almost 3:30 am so I just mocked up a test which seems to work. I have to get some sleep so i'll try to revisit it in the morning and see about adding recursive code for subfolders.
Thanks moke.

I've tried out your database. Is there a way to search subfolders so that I can just choose one main folder that has subfolders with files/images in them all in one go?

Because I have to import a directory list of this one main folder location as opposed to select each individual folder which will be more tedious.

Thankyou.
 
Create a unique index on the table. Then trap the "duplicate" error when you try to insert the record.
Hi Pat,

Do you mean I have to create a separate index or is the default primary key indexed I have at the moment fine for this?

I'm not very clear about trapping the duplicate error. What exactly does that involve?
 
you could perhaps use the dcount function - e.g.

Code:
Do While strTemp <> vbNullString
         if dcount("*", "Files","FName='" & strTemp & "' AND FPath='" & strFolder & "'") =0 Then 'data not in Files table, so add it in
                gCount = gCount + 1
                SysCmd acSysCmdSetStatus, gCount
                strSQL = "INSERT INTO Files " _
                 & " (FName, FPath) " _
                 & " SELECT """ & strTemp & """" _
                 & ", """ & strFolder & """;"
                CurrentDb.Execute strSQL
               colDirList.Add strFolder & strTemp
           end if  
           strTemp = Dir
    Loop
 
you could perhaps use the dcount function - e.g.

Code:
Do While strTemp <> vbNullString
         if dcount("*", "Files","FName='" & strTemp & "' AND FPath='" & strFolder & "'") =0 Then 'data not in Files table, so add it in
                gCount = gCount + 1
                SysCmd acSysCmdSetStatus, gCount
                strSQL = "INSERT INTO Files " _
                 & " (FName, FPath) " _
                 & " SELECT """ & strTemp & """" _
                 & ", """ & strFolder & """;"
                CurrentDb.Execute strSQL
               colDirList.Add strFolder & strTemp
           end if
           strTemp = Dir
    Loop
Hi CJ,

When I did this, the import didn't work. The table will have no entries except the error handler message in it.

1611015108759.png

the error
 
you could perhaps use the dcount function - e.g.
I thought about that but being sleep deprived I was thinking that the dcount might be slower than using a dictionary if making a 100 calls or so, but who knows.

I still haven't slept but put together this example which seems to work ok. I haven't tested extensively.
I use a dictionary for the existing table records.
I use a recursive procedure to get a collection of folders and subfolders and then run a procedure to add
the records to the table if they don't exist in the dictionary.

Edit: Just an afterthought.
I added a line of code to include the Top Folder in the collection of folders. I did this in case there are files in the top folder you wanted included in the file list. If you dont want or have files in the top folder this line can be commented out.
Code:
colF.Add StrDir, StrDir  'add the top folder to collection
 

Attachments

Last edited:
Hi moke, this works exactly how I wanted it to. Thanks a lot.

Also if I'm not being too picky here, is there a way to avoid picking up "thumbs.db" files from this process. Maybe something I can add in the code level as opposed to query?
 
change this line
Code:
If Dict.Exists(myFile.Path) = False  Then
to
Code:
 If Dict.Exists(myFile.Path) = False And InStr(myFile.Path, "thumbs.db") = 0 Then
 
change this line
Code:
If Dict.Exists(myFile.Path) = False  Then
to
Code:
 If Dict.Exists(myFile.Path) = False And InStr(myFile.Path, "thumbs.db") = 0 Then
Thanks so much moke, appreciate your help!
 
Your welcome.
Good luck with your project.
Hi Moke thanks for the good wishes,

Sorry to bring this thread up again. Just wondering if there was a way to control the recursive procedure to only find a specific file format in a folder. For example only ".tif" or only ".jpg" files, as opposed to simply an entire list of everything in the folder.

Thanks.
 
Yes there is.

I added to the code using the FileSystemObject.GetExtension function which returns the file extension with out the Dot.
Add the file types in a string like "jpg, TIFF, png, gif" It's an optional argument so if it's omitted it returns all files.

the call would look like
Code:
sFolderList StrF, True, "jpg,png"

see the attached updated file
 

Attachments

Yes there is.

I added to the code using the FileSystemObject.GetExtension function which returns the file extension with out the Dot.
Add the file types in a string like "jpg, TIFF, png, gif" It's an optional argument so if it's omitted it returns all files.

the call would look like
Code:
sFolderList StrF, True, "jpg,png"

see the attached updated file
Awesome. This is exactly it!

Thanks moke <3
 
Hi moke, sorry to keep bringing this thread up again. I just have been playing with your code since writing last and was very keen on the idea of retrieving file metadata from the directory listing.

So we are already getting:

rs!FPath = myfile.Path
rs!FName = myfile.Name

But can we also get stuff like:

rs![FOwner] = myfile.Owner
rs![FSize] = myfile.Size
rs![FDateCreated] = myfile.DateCreated
rs![FDateLastModified] = myfile.DateLastModified
rs![FDateLastAccessed] = myfile.DateLastAccessed
rs![FType] = myfile.Type
rs![FAttributes] = myfile.Attributes

Tried to incorporate them in with 'Sub ListFiles', but it didn't do anything. Would I have to see use myObject.GetFile instead of myObject.GetFolder?
 
try

Code:
 rs![FDateCreated] =   myObject.GetFile(myFile.Path).DateCreated

Bear in mind that once the data is in the table it wont update last modified or last accessed.
 
try

Code:
 rs![FDateCreated] =   myObject.GetFile(myFile.Path).DateCreated

Bear in mind that once the data is in the table it wont update last modified or last accessed.
Hi Moke,

Thanks for that. Nearly all works. Except doesn't seem to return back the File Owner information. Need that one and the image horizontal resolution and compression and I will be golden.

I think it's because those other information is in the details tab of a file, as opposed to the others which were in the general metadata tab?
 
Last edited:

Users who are viewing this thread

Back
Top Bottom