merge pdf files till 5mb size

megatronixs

Registered User.
Local time
Today, 02:26
Joined
Aug 17, 2012
Messages
719
Hi all,

I have some code to merge pdf files that are in a folder and including the subfolder. I use Acrobat Pro and works pretty fine. What now turns out, there is a limitation on the size of the pdf files. They will be uploaded to an intranet site and the max size is 5mb.
I would need to list all the pf files first in a temp table with the file size.

How can I loop trough it and when it reaches several pdf file size of 5mb, to merge them and then continue with the rest of the file again till 5mb size. etc
I would need to adjust the data so I will get the file size in the temp table and then the loop to get 5mb size.

this is the current code I use, it is kind of messy :(

Code:
Option Compare Database
 
Public Sub btn_process_Click()
'---button to process the cases one by one from the folder and subfolder
'---will get all the PDF files from that case, list them in a datasheet,
'---and then merge to one master PDF file
    Dim elcamino            As String
    Dim racf_uploader       As String
    Dim userEmail           As String
    Dim userRACF            As String
    Dim racf_user_uploading As String
    Dim time_date_uploaded  As String
    Dim uploaded            As String
    Dim dateAdded           As String
    Dim strSQL              As String
    Dim rs                  As DAO.Recordset
    Dim myArray()           As String
    Dim strPDFs             As String
    'the below is added from the below macro
    Dim rstTableName        As DAO.Recordset   'Your table
    Dim intArraySize        As Integer         'The size of your array
    Dim iCounter            As Integer             'Index of the array
    
    Const cPath As String = "C:\TempPDF\\Tools\data\"
 
    
        
    strSQL = "SELECT * FROM tbl_main_data"   'define the SQL result that you want to loop
    Set rs = CurrentDb.OpenRecordset(strSQL)
    
    If Not rs.BOF And Not rs.EOF Then
        rs.MoveFirst
        While (Not rs.EOF)
            'Debug.Print rs.Fields("full_path") 'define the field you want to return data
            elcamino = rs.Fields("full_path")
       
Call ListFiles(elcamino, "*.pdf", True)
         
         
'---------------------------------------------------------------------------------------------------
  'Open your table
Set rstTableName = CurrentDb.OpenRecordset("tbl_temp_pdf_file_names")
If Not rstTableName.EOF Then
    rstTableName.MoveFirst   'Ensure we begin on the first row
    'The size of the array should be equal to the number of rows in the table
    intArraySize = rstTableName.RecordCount - 1
    iCounter = 0
    ReDim myArray(intArraySize) 'Need to size the array
    Do Until rstTableName.EOF
        myArray(iCounter) = rstTableName.Fields("temp_pdf_files")
        'Debug.Print "Item: "; iCounter & " " & myArray(iCounter)
        iCounter = iCounter + 1
        rstTableName.MoveNext
    Loop
End If
'---------------------------------------------------------------------------------------------------
 
   If Len(Dir(elcamino & "\" & " MasterPDF", vbDirectory)) = 0 Then
   MkDir elcamino & "\" & "MasterPDF"
End If
    updfConcatenate myArray, elcamino & "\MasterPDF\" & " MasterPDF.pdf"
        
DoCmd.SetWarnings False
DoCmd.RunSQL "DELETE * FROM tbl_temp_pdf_file_names"
DoCmd.SetWarnings True
'---------------------------------------------------------------------------------------------------
'---move the uploaded case path to the audit trail--------------------------------------------------
'---get the fields from the table
elcamino = rs.Fields("full_path")
userEmail = rs.Fields("user_email")
userRACF = rs.Fields("user_racf")
dateAdded = rs.Fields("date_time_added")
login_name = getUserName
CurrentDb.Execute "INSERT INTO tbl_audit_trail(full_path, email_user, user_racf, date_time_added, racf_user_uploading, time_date_uploaded, uploaded) " & _
"VALUES ('" & elcamino & "', '" & userEmail & "', '" & userRACF & "', '" & dateAdded & "', '" & login_name & "', '" & Now() & "', '" & "yes" & "' );"
rs.Delete

elcamino = ""
 
'---the macro goes to the next folder path
rs.MoveNext
                        
Wend
        
        
End If
rs.Close
Set rs = Nothing
'this is end of the main loop
MsgBox "files uploaded!"
End Sub

Public Sub loadIntoArray()
Dim rstTableName As DAO.Recordset   'Your table
Dim myArray() As String             'Your dynamic array
Dim intArraySize As Integer         'The size of your array
Dim iCounter As Integer             'Index of the array
'Open your table
Set rstTableName = CurrentDb.OpenRecordset("tbl_temp_pdf_file_names")
If Not rstTableName.EOF Then
    rstTableName.MoveFirst   'Ensure we begin on the first row
    'The size of the array should be equal to the number of rows in the table
    intArraySize = rstTableName.RecordCount - 1
    iCounter = 0
    ReDim myArray(intArraySize) 'Need to size the array
    Do Until rstTableName.EOF
        myArray(iCounter) = rstTableName.Fields("temp_pdf_files")
        Debug.Print "Item: "; iCounter & " " & myArray(iCounter)
        iCounter = iCounter + 1
        rstTableName.MoveNext
    Loop
End If
'If IsObject(rstTableName) Then Set rstTableName = Nothing
End Sub
 
Private Sub btn_upload_Click()
' button to upload the ready cases to the tool for later processing
Dim login_name As String
login_name = getUserName
CurrentDb.Execute "INSERT INTO tbl_main_data(full_path, user_email, user_racf, date_time_added) VALUES ('" & Me.txt_full_path & "', '" & Me.txt_user_email & "', '" & login_name & "', '" & Now() & "');"
Me.txt_full_path = ""
Me.txt_user_email = ""
End Sub
 
'Private mlngBkmkCounter     As Long
Public Sub updfConcatenate(pvarFromPaths As Variant, _
                           pstrToPath As String)
    Dim origPdfDoc      As Acrobat.CAcroPDDoc
    Dim newPdfDoc       As Acrobat.CAcroPDDoc
    Dim lngNewPageCount As Long
    Dim lngInsertPage   As Long
    Dim i               As Integer
    Dim fso             As New FileSystemObject
    Dim fileName        As String
    Dim adoRS           As ADODB.Recordset
    Dim daoRS           As DAO.Recordset
    
    Set origPdfDoc = CreateObject("AcroExch.PDDoc")
    Set newPdfDoc = CreateObject("AcroExch.PDDoc")
    mlngBkmkCounter = 0
    'set the first file in the array as the "new"'
    If newPdfDoc.Open(pvarFromPaths(LBound(pvarFromPaths))) = True Then
        fileName = fso.GetFileName(pvarFromPaths(LBound(pvarFromPaths)))
        updfInsertBookmark fileName, lngInsertPage, , newPdfDoc
        mlngBkmkCounter = 1
        For i = LBound(pvarFromPaths) + 1 To UBound(pvarFromPaths)
            'Application.StatusBar = "Merging " & pvarFromPaths(i) & "..."
            If origPdfDoc.Open(pvarFromPaths(i)) = True Then
                lngInsertPage = newPdfDoc.GetNumPages
                newPdfDoc.InsertPages lngInsertPage - 1, origPdfDoc, 0, origPdfDoc.GetNumPages, False
                fileName = fso.GetFileName(pvarFromPaths(i))
                updfInsertBookmark fileName, lngInsertPage, , newPdfDoc
                origPdfDoc.Close
                mlngBkmkCounter = mlngBkmkCounter + 1
            End If
        Next i
        newPdfDoc.Save PDSaveFull, pstrToPath
    End If
ExitHere:
    Set origPdfDoc = Nothing
    Set newPdfDoc = Nothing
    'Application.StatusBar = False
    Exit Sub
End Sub
 
 
'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
'   functions
'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Public Function getUserName() As String
 getUserName = Environ("USERNAME")
End Function
 
Public Function ListFiles(strPath As String, Optional strFileSpec As String, _
    Optional bIncludeSubfolders As Boolean, Optional lst As ListBox)
On Error GoTo Err_Handler
    'Purpose:   List the files in the path.
    'Arguments: strPath = the path to search.
    '           strFileSpec = "*.*" unless you specify differently.
    '           bIncludeSubfolders: If True, returns results from subdirectories of strPath as well.
    '           lst: if you pass in a list box, items are added to it. If not, files are listed to immediate window.
    '               The list box must have its Row Source Type property set to Value List.
    'Method:    FilDir() adds items to a collection, calling itself recursively for subfolders.
    Dim colDirList As New Collection
    Dim varItem As Variant
    Dim daspath As String

daspath = elcamino
Set db = CurrentDb()
    
    Call FillDir(colDirList, strPath, strFileSpec, bIncludeSubfolders)
    
    'Add the files to a list box if one was passed in. Otherwise list to the Immediate Window.
    If lst Is Nothing Then
        For Each varItem In colDirList
        
            sSQL = "INSERT INTO [tbl_temp_pdf_file_names] (temp_pdf_files) VALUES(""" & varItem & """)"
            db.Execute sSQL, dbFailOnError
        
            'Debug.Print varItem
        Next
    Else
        For Each varItem In colDirList
        lst.AddItem varItem
        Next
    End If
Exit_Handler:
    Exit Function
Err_Handler:
    MsgBox "Error " & Err.Number & ": " & Err.Description
    Resume Exit_Handler
End Function
Private Function FillDir(colDirList As Collection, ByVal strFolder As String, strFileSpec As String, _
    bIncludeSubfolders As Boolean)
    'Build up a list of files, and then add add to this list, any additional folders
    Dim strTemp As String
    Dim colFolders As New Collection
    Dim vFolderName As Variant
    'Add the files to the folder.
    strFolder = TrailingSlash(strFolder)
    strTemp = Dir(strFolder & strFileSpec)
    Do While strTemp <> vbNullString
        colDirList.Add strFolder & strTemp
        strTemp = Dir
    Loop
    If bIncludeSubfolders Then
        'Build collection of additional subfolders.
        strTemp = Dir(strFolder, vbDirectory)
        Do While strTemp <> vbNullString
            If (strTemp <> ".") And (strTemp <> "..") Then
                If (GetAttr(strFolder & strTemp) And vbDirectory) <> 0& Then
                    colFolders.Add strTemp
                End If
            End If
            strTemp = Dir
        Loop
        'Call function recursively for each subfolder.
        For Each vFolderName In colFolders
            Call FillDir(colDirList, strFolder & TrailingSlash(vFolderName), strFileSpec, True)
        Next vFolderName
    End If
End Function
Public Function TrailingSlash(varIn As Variant) As String
    If Len(varIn) > 0& Then
        If Right(varIn, 1&) = "\" Then
            TrailingSlash = varIn
        Else
            TrailingSlash = varIn & "\"
        End If
    End If
End Function
Public Sub updfInsertBookmark(pstrCaption As String, _
    plngPage As Long, _
    Optional pstrPath As String, _
    Optional pMyPDDoc As Acrobat.CAcroPDDoc, _
    Optional plngIndex As Long = -1, _
    Optional plngParentIndex As Long = -1)
    Dim MyPDDoc         As Acrobat.CAcroPDDoc
    Dim jso             As Object
    Dim BMR             As Object
    Dim arrParents      As Variant
    Dim bkmChildsParent As Object
    Dim bleContinue     As Boolean
    Dim bleSave         As Boolean
    Dim lngIndex        As Long
    If pMyPDDoc Is Nothing Then
        Set MyPDDoc = CreateObject("AcroExch.PDDoc")
        bleContinue = MyPDDoc.Open(pstrPath)
        bleSave = True
    Else
        Set MyPDDoc = pMyPDDoc
        bleContinue = True
    End If
    If plngIndex > -1 Then
        lngIndex = plngIndex
    Else
        lngIndex = mlngBkmkCounter
    End If
    If bleContinue = True Then
        Set jso = MyPDDoc.GetJSObject
        Set BMR = jso.BookMarkRoot
        If plngParentIndex > -1 Then
            arrParents = jso.BookMarkRoot.Children
            Set bkmChildsParent = arrParents(plngParentIndex)
            bkmChildsParent.createchild pstrCaption, "this.pageNum= " & plngPage, lngIndex
        Else
            BMR.createchild pstrCaption, "this.pageNum= " & plngPage, lngIndex
        End If
        MyPDDoc.SetPageMode 3 '3 — display using bookmarks'
        If bleSave = True Then
            MyPDDoc.Save PDSaveIncremental, pstrPath
            MyPDDoc.Close
         End If
    End If
ExitHere:
    Set jso = Nothing
    Set BMR = Nothing
    Set arrParents = Nothing
    Set bkmChildsParent = Nothing
    Set MyPDDoc = Nothing
End Sub
Private Function MergePDFs(arrFiles() As String, strSaveAs As String) As Boolean
'---------------------------------------------------------------------------------------------------
'---FUNCTION: MergePDFs-----------------------------------------------------------------------------
'---DEVELOPER: Ryan Wells (wellsr.com)--------------------------------------------------------------
'---DATE: 09/2017-----------------------------------------------------------------------------------
'---DESCRIPTION: This function uses Adobe Acrobat (won't work with just the Reader!) to-------------
'---             combine PDFs into one PDF and save the new PDF with its own file name.-------------
'---INPUT: The function requires two arguments.-----------------------------------------------------
'---       1) arrFiles is an array of strings containing the full path to each PDF you want to------
'---          combine in the order you want them combined.------------------------------------------
'---       2) strSaveAs is a string containing the full path you want to save the new PDF as.-------
'---REQUIREMENTS: 1) Must add a reference to "Adobe Acrobat X.0 Type Library" or "Acrobat"----------
'---                 under Tools > References. This has been tested with Acrobat 6.0 and 10.0.------
'---CAUTION: This function won't work unless you have the full Adobe Acrobat. In other words,-------
'            Adobe Reader will not work.------------------------------------------------------------
'---------------------------------------------------------------------------------------------------
 
Dim objCAcroPDDocDestination As Acrobat.CAcroPDDoc
Dim objCAcroPDDocSource As Acrobat.CAcroPDDoc
Dim i As Integer
Dim iFailed As Integer
 
On Error GoTo NoAcrobat:
'Initialize the Acrobat objects
Set objCAcroPDDocDestination = CreateObject("AcroExch.PDDoc")
Set objCAcroPDDocSource = CreateObject("AcroExch.PDDoc")
 
'Open Destination, all other documents will be added to this and saved with
'a new filename
objCAcroPDDocDestination.Open (arrFiles(LBound(arrFiles))) 'open the first file
 
'Open each subsequent PDF that you want to add to the original
  'Open the source document that will be added to the destination
    For i = LBound(arrFiles) + 1 To UBound(arrFiles)
        objCAcroPDDocSource.Open (arrFiles(i))
        If objCAcroPDDocDestination.InsertPages(objCAcroPDDocDestination.GetNumPages - 1, objCAcroPDDocSource, 0, objCAcroPDDocSource.GetNumPages, 0) Then
          MergePDFs = True
        Else
          'failed to merge one of the PDFs
          iFailed = iFailed + 1
        End If
        objCAcroPDDocSource.Close
    Next i
objCAcroPDDocDestination.Save 1, strSaveAs 'Save it as a new name
objCAcroPDDocDestination.Close
Set objCAcroPDDocSource = Nothing
Set objCAcroPDDocDestination = Nothing
 
NoAcrobat:
If iFailed <> 0 Then
    MergePDFs = False
End If
On Error GoTo 0
End Function

Greetings.
 
Is your question about how to get the file size or how to set up the loop to reset when it is near or at 5mb?

If the former, here is one example from Stack Overflow to get the file size:
Code:
Function GetDirOrFileSize(strFolder As String, Optional strFile As Variant) As Long
'Call Sequence: GetDirOrFileSize("drive\path"[,"filename.ext"])
'https://stackoverflow.com/questions/15883237/vba-excel-function-for-returning-file-size-in-byte
   Dim lngFSize As Long, lngDSize As Long
   Dim oFO As Object
   Dim oFD As Object
   Dim OFS As Object

   lngFSize = 0
   Set OFS = CreateObject("Scripting.FileSystemObject")

   If strFolder = "" Then strFolder = ActiveWorkbook.path
   If Right(strFolder, 1) <> "\" Then strFolder = strFolder & "\"
   'Thanks to Jean-Francois Corbett, you can use also OFS.BuildPath(strFolder, strFile)

   If OFS.FolderExists(strFolder) Then
     If Not IsMissing(strFile) Then

       If OFS.FileExists(strFolder & strFile) Then
         Set oFO = OFS.GetFile(strFolder & strFile)
         GetDirOrFileSize = oFO.Size
       End If

       Else
        Set oFD = OFS.GetFolder(strFolder)
        GetDirOrFileSize = oFD.Size
       End If

   End If

End Function   '*** GetDirOrFileSize ***
 
Hi sxshech,

It is more about the loop reset when reaching the 5mb size.
I will use the code to get the file size before I can start looping till 5mb file size.

Greetings.
 
Well the code you supplied has nothing to do with that.?

I think you would do something like

Initialise filesize varaiable (pdfSize)
Get first file from table

Get filesize using your function (fSize)
Test if pdfSize plus fSize exceed 5Mb
If it does
write output pdf file
reset pdfsize
add current file to pdf output
add fsize to pdfsize
else
add file to output
add filesize to pdfsize
end if
get next file

Loop until end of table and write what remains as output.

HTH
 

Users who are viewing this thread

Back
Top Bottom