VBA and corrupt PDFs (1 Viewer)

CedarTree

Registered User.
Local time
Today, 10:57
Joined
Mar 2, 2018
Messages
404
Hello - I have a process that manipulates PDFs in Access VBA using the Adobe reference, etc. The problem is that when a PDF file is corrupt, it doesn't "crash" normally... rather Adobe will prompt me regarding the bad file and then only when I manually close the file / Adobe, Access will THEN take me to the VBA error where I could do some error trapping. But I want to remove the manual aspect to the process... any suggestions please? THANKS!!!
 

Uncle Gizmo

Nifty Access Guy
Staff member
Local time
Today, 15:57
Joined
Jul 9, 2003
Messages
16,282
I don't know! But I thought, this looks like a question to ask chat GPT! I asked and this was the answer:- ( mind you, it is not guaranteed to be correct)

ChatGPT Said:-
Automating a third-party software like Adobe from within Access VBA presents challenges due to the inability to directly trap and handle errors generated by the external application. When dealing with file manipulation and potential issues such as corruption, the best approach is often to build in error-checking steps prior to or after critical operations.

Since you cannot directly catch Adobe's errors within VBA, you can introduce a pre-processing step in your VBA code to validate the PDF file before you open it with Adobe. You may use a third-party tool or library capable of checking the validity of a PDF file, such as Apache PDFBox or PyPDF2.

Unfortunately, these tools are not directly available to VBA. You might need to set up a small Python, Java, or .NET Core application to use these libraries. Then from VBA, you could call the application, passing the PDF file as an argument, and handle the output of that application to check if the PDF is corrupt or not.

Here's an example with Python and PyPDF2:

Python ==============

Code:
import sys
from PyPDF2 import PdfFileReader

def is_corrupt(pdf_path):
    try:
        reader = PdfFileReader(open(pdf_path, 'rb'))
        return False
    except:
        return True

if __name__ == "__main__":
    print(is_corrupt(sys.argv[1]))
Python ==============

You could run the Python script from VBA like this:

VBA ====================
Code:
Function IsPdfCorrupt(pdfPath As String) As Boolean
    Dim scriptPath As String
    scriptPath = "C:\path\to\your\python_script.py"  'Replace with the path of your script
    
    Dim command As String
    command = "python " & scriptPath & " " & pdfPath
    
    Dim wsh As Object
    Set wsh = VBA.CreateObject("WScript.Shell")
    Dim exec As Object
    Set exec = wsh.Exec(command)
    
    'Wait until the command has finished executing
    While exec.Status = 0
        DoEvents
    Wend
 
    'The StdOut property contains the command's output
    IsPdfCorrupt = CBool(exec.StdOut.ReadAll)
End Function

This function will return True if the PDF file at pdfPath is corrupt, and False otherwise. You could call it before manipulating the PDF file with Adobe, and handle the error if the function returns True.

The solution above involves working with Python, which might be complex if you're only familiar with VBA. Another alternative is to use a tool that is available in a language you're familiar with.
 

CedarTree

Registered User.
Local time
Today, 10:57
Joined
Mar 2, 2018
Messages
404
Thanks. Any non-native things (like I’m guessing Python might be) would be hard to get approval for @ my company.
 
Last edited:

CedarTree

Registered User.
Local time
Today, 10:57
Joined
Mar 2, 2018
Messages
404
Any other suggestions very welcome please! Thanks!
 

ebs17

Well-known member
Local time
Today, 16:57
Joined
Feb 7, 2020
Messages
1,946
when a PDF file is corrupt
Where does that come from, who makes something like this?
Is that you with your manipulation?
 

CedarTree

Registered User.
Local time
Today, 10:57
Joined
Mar 2, 2018
Messages
404
Where does that come from, who makes something like this?
Is that you with your manipulation?
I am first converting regular scanned pdfs to searchable PDFs (using 3rd party software) and then searching for certain words. During the conversion process every once in a while, it will create a corrupted PDF. Out of my control, but I do want to be able to test for it before searching for the words.
 

Uncle Gizmo

Nifty Access Guy
Staff member
Local time
Today, 15:57
Joined
Jul 9, 2003
Messages
16,282
Unfortunately, these tools are not directly available to VBA. You might need to set up a small Python, Java, or .NET Core application to use these libraries. Then from VBA, you could call the a

Find out if your company allows to use of .Net.
 

CedarTree

Registered User.
Local time
Today, 10:57
Joined
Mar 2, 2018
Messages
404
It seems I can download and run .net framework on a server that IT gave us (and has Access/VBA installed)... now what? Thanks!
 

CedarTree

Registered User.
Local time
Today, 10:57
Joined
Mar 2, 2018
Messages
404
I ended up doing much simpler coding, based on the END of the file (that may have %PDF in the header), but a corrupt PDF would much less likely have %PDF in the final line...

Code:
Function fnPDFCorrupt(pFileNameFull As String) As Boolean
  
    Dim sFileFooter As String
  
    sFileFooter = fnFileFooter(pFileNameFull)
    If Left(sFileFooter, 4) <> "%PDF" Then
        fnPDFCorrupt = True
    End If
  
End Function
Function fnFileFooter(pFileNameFull As String) As String

    Dim iFileNum As Long, vLine
  
    iFileNum = FreeFile()
    Open pFileNameFull For Input As #iFileNum
  
    Do While Not EOF(iFileNum) 'do a loop to get to last line
        Line Input #iFileNum, vLine
    Loop
  
    Close iFileNum
  
    fnFileFooter = CStr(vLine)
  
End Function
 

Users who are viewing this thread

Top Bottom