Solved File Read error (2 Viewers)

prasadgov

Member
Local time
Today, 11:58
Joined
Oct 12, 2021
Messages
62
If there are any ASCII characters in the range 0 to 31, they might interfere with the way the "string" is interpreted. Do you have a layout description? If not, how will you analyse the file contents?
I tried manually pasting the data into my table and use this sql which formats the data correctly

SELECT Mid(message,1,2) AS [Message Type], Mid(message,5,6) AS [Timestamp], Mid(message,95,1) AS [TYPE-OF-08-RESPONSE], Mid(message,97,8) AS [DO-OUT-DLV-PART-NUM], Mid(message,107,9) AS [DO-OUT-CUSIP], Mid(message,144,8) AS [DO-OUT-RCV-PART-NUM], Mid([message],154,4) AS [DO-OUT-ORIG-INPUT-SRCE], Mid(message,169,13) AS [DO-OUT-MONEY-VALUE], Mid(message,182,1) AS [DO-OUT-DELV-RECV-IND], Mid(message,242,6) AS [DO-OUT-SETTLE-DATE], Mid(message,248,3) AS [DO-OUT-REASON-CODE], Mid(message,261,6) AS [DO-OUT-DATE-STAMP], Mid(message,267,6) AS [DO-OUT-TIME-STAMP], Mid(message,273,20) AS [DO-OUT-CUSIP-DESC], Mid(message,520,9) AS [DO-OUT-SHARE-QTY-NEW], Mid(message,749,16) AS IMS_TID, Format((Date()),"mm/dd/yyyy") AS inserteddate
FROM tbltcc;

Not sure how it reads and skips the nulls when I paste manually.
I just wanted to automate this.
 

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 16:58
Joined
Sep 12, 2006
Messages
15,805
So do you have a fixed width file layout to tell you where the data is you want to manage?

What you could do is, in code. (Sorry, I can't select the lines below on my phone). Anyway, if there is just a single line, you will end up with one "message".

Open the file
While not end of file
Read the next line into a string variable (that will give you your message)
Wend
Close the file
 

prasadgov

Member
Local time
Today, 11:58
Joined
Oct 12, 2021
Messages
62
So do you have a fixed width file layout to tell you where the data is you want to manage?

What you could do is, in code. (Sorry, I can't select the lines below on my phone). Anyway, if there is just a single line, you will end up with one "message".

Open the file
While not end of file
Read the next line into a string variable (that will give you your message)
Wend
Close the file
I was able to follow your suggestion and was able to read the complete contents of one text file into the table

Code:
Dim strFilename As String: strFilename = "C:\PG\BC2\TCC\20240626060103.txt"
Dim strTextLine As String
Dim iFile As Integer: iFile = FreeFile

Open strFilename For Input As #iFile
Do Until EOF(1)
Line Input #1, strTextLine
CurrentDb.Execute "insert into tbldtcc (message) select '" & strTextLine & "';"
Loop
Close #iFile

End Sub

Now, how to modify the code to read all the files in the folder inside the loop?

Thanks everyone for your kind help and patience!
 

Gasman

Enthusiastic Amateur
Local time
Today, 16:58
Joined
Sep 21, 2011
Messages
14,743
You do not need the Select surely?, just use the strTextLine?
Use Dir() to get each file in the folder in turn.
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 10:58
Joined
Feb 28, 2001
Messages
27,677
I ran the two sample files from post #18 through a string parser test module that I had for another project. It reports no control characters of any kind, and that parser specifically WILL catch null characters. It even helped me analyze a UTF-8 file.

The analyzer does, however, report two long sequences of space characters near the end of the lines. One is about 80 bytes, the other about 190 bytes. Exact number not really important in this context.
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 23:58
Joined
May 7, 2009
Messages
19,293
can you convert Chr$(0) to a Space char?
Code:
Private Sub Command0_Click()
Dim FSO As Object, objFolder As Object, objFile As Object
Dim content As String
Dim i As Integer
Dim db As DAO.Database
Dim rs As DAO.Recordset
Set db = CurrentDb
Set rs = db.OpenRecordset("tbltcc", dbOpenDynaset)
Set FSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = FSO.GetFolder("C:\Prasad\BC2\TC")
i = 1
For Each objFile In objFolder.Files
    If Right(objFile.Name, 3) = "txt" Then
        content = Trim$(Replace$(RawContent(objFolder & "\" & objFile.Name), Chr$(0), " "))
        'Debug.Print content
        With rs
            .AddNew
            ![message] = content
            .Update
        End With
    End If
    i = i + 1
Next objFile

Set objFile = Nothing
Set objFolder = Nothing
Set FSO = Nothing
rs.Close
Set rs = Nothing
Set db = Nothing
End Sub


' https://bytes.com/topic/access/864406-read-file-byte-byte
Public Function RawContent(ByVal strFileName As String) As String
Dim intFileNumber As Integer
Dim lngFileSize As Long
Dim strBuffer As String
Dim lngCharNumber As Long
Dim strCharacter As String * 1
'Dim strFileName As String

'strFileName = "C:\Windows\System.ini"

'Get the next available File Number
intFileNumber = FreeFile

DoCmd.Hourglass True

Open strFileName For Binary Access Read Shared As #intFileNumber

lngFileSize = LOF(intFileNumber)    'How large is the File in Bytes?
strBuffer = Space$(lngFileSize)     'Set Buffer Size to File Length

Get #intFileNumber, , strBuffer     'Grab a Chunk of Data from the File
Close #intFileNumber

RawContent = strBuffer
DoCmd.Hourglass False
End Function
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 10:58
Joined
Feb 28, 2001
Messages
27,677
can you convert Chr$(0) to a Space char?

CHR$(0) is the ASCII "NUL" character, which occupies one byte. I don't know why you couldn't convert it - but in the sample files offered to us earlier, there were no NUL characters. My scan would have detected them. Lots of spaces. No NULs.
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 23:58
Joined
May 7, 2009
Messages
19,293
both text files have Nulls on them (view them on Notepad++ (View->Show Control Chars)
nulls.jpg
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 10:58
Joined
Feb 28, 2001
Messages
27,677
Interesting, because the method I used WOULD detect NUL and other non-printing characters - but didn't. Which makes me wonder if the method used for opening a file somehow pre-filters those NUL characters. I can modify my test and try again, but it's getting late in my time zone.
 

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 16:58
Joined
Sep 12, 2006
Messages
15,805
Interesting, because the method I used WOULD detect NUL and other non-printing characters - but didn't. Which makes me wonder if the method used for opening a file somehow pre-filters those NUL characters. I can modify my test and try again, but it's getting late in my time zone.
That implies that maybe "line input" might read a line to a CR/EOF character and pre process it to a genuine string. As it returns a string it sounds plausible.
 

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 16:58
Joined
Sep 12, 2006
Messages
15,805
I was able to follow your suggestion and was able to read the complete contents of one text file into the table

Code:
Dim strFilename As String: strFilename = "C:\PG\BC2\TCC\20240626060103.txt"
Dim strTextLine As String
Dim iFile As Integer: iFile = FreeFile

Open strFilename For Input As #iFile
Do Until EOF(1)
Line Input #1, strTextLine
CurrentDb.Execute "insert into tbldtcc (message) select '" & strTextLine & "';"
Loop
Close #iFile

End Sub

Now, how to modify the code to read all the files in the folder inside the loop?

Thanks everyone for your kind help and patience!

So generally speaking, put the processing function into a separate sub/function, with an argument of the single file name.

Then have another sub to iterate the files in the folder. You can do this with a dir() loop, but unfortunately because dir() is not recursive you can't get it to check any subfolders that it finds. If you only want to use a single folder you can use dir.

You might need a folder picker to find the folder you want to use, rather than hard coding it, so that's another independent function. You could save the last folder you used in a table, or variable somewhere so the next time you do this process, the folder picker starts at the last folder you used, rather than being really irritating and starting a C:/ each time.

Breaking an application into smaller units like this is really easy and useful.

You end up with 2 or 3 functions this way. One to select a folder, one to check all the files in the folder. The third to process a file in a particular way. Each of these functions can often be used/reused independently of the others. It also makes it easier to debug your programs, and even to design them. You can build the whole structure in logical blocks without having the code.

After you process a file, you might want to rename the processed file, or move it to a processed folder, or write the file details to a log file somewhere. It's easy to see the correct logical place to put each step. .

In this instance you can process a file folder, and make sure it finds the right number of files, before you even bother with the processing code for a single file. Alternatively you can develop the processing code for a test file, knowing it will work correctly with every file you give it.

I tend to try to stick to intrinsic VBA commands mostly. I only really use fso if I've picked up a function somewhere that already uses it, for instance.
 
Last edited:

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 16:58
Joined
Sep 12, 2006
Messages
15,805
Interesting, because the method I used WOULD detect NUL and other non-printing characters - but didn't. Which makes me wonder if the method used for opening a file somehow pre-filters those NUL characters. I can modify my test and try again, but it's getting late in my time zone.
.just thinking about this again reminds me a different issue, which I won't repeat here.

But at some point doesn't VBA convert bytes (eg in a text file) into 2 byte Unicode characters? That might change chr(0) into a unicode character that doesn't cause an error. Or am I missing something about the nature of a VBA string.

In the issue I had before, I wanted to treat an array of bytes as a string, and it just couldn't get it to work. After I stringified the array of bytes, I believe it was no longer the same byte sequence. (This was a while ago)
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 10:58
Joined
Feb 28, 2001
Messages
27,677
I re-ran my test and confirmed my suspicions.

By experiment I found that using a sequence to Open filename For Input As #1 followed by Line Input #1 converts the input to an ASCII string because my scanner didn't even hiccup once after reading the file using that sequence. And it is capable of detecting non-printing characters. I verified it was still working correctly because in test #2, when I used the FileSystemObject to perform an OpenTextFile followed by .ReadAll then I got a (VERY) different result. Technically, it appears that the FSO OpenTextFile call creates a UNIX-like text stream which does not guarantee any type of conversion at all. So what I got was quite raw.

The file, despite being of type .TXT, is NOT an ASCII text file. When I checked with Notepad, it is verified as a UTF-8 file, which means it contains extended characters. It also contains a LOT of literal ASCII NULs in the string. UTF-8 is NOT the same as UNICODE.

The original problem of encountering read errors is simple... Access doesn't like UTF-8 files unless you explain to it how to handle them first. BUT if you use the Open / Line Input sequence, it WILL convert that for you, losing most of the special characters along the way.
 

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 16:58
Joined
Sep 12, 2006
Messages
15,805
I re-ran my test and confirmed my suspicions.

By experiment I found that using a sequence to Open filename For Input As #1 followed by Line Input #1 converts the input to an ASCII string because my scanner didn't even hiccup once after reading the file using that sequence. And it is capable of detecting non-printing characters. I verified it was still working correctly because in test #2, when I used the FileSystemObject to perform an OpenTextFile followed by .ReadAll then I got a (VERY) different result. Technically, it appears that the FSO OpenTextFile call creates a UNIX-like text stream which does not guarantee any type of conversion at all. So what I got was quite raw.

The file, despite being of type .TXT, is NOT an ASCII text file. When I checked with Notepad, it is verified as a UTF-8 file, which means it contains extended characters. It also contains a LOT of literal ASCII NULs in the string. UTF-8 is NOT the same as UNICODE.

The original problem of encountering read errors is simple... Access doesn't like UTF-8 files unless you explain to it how to handle them first. BUT if you use the Open / Line Input sequence, it WILL convert that for you, losing most of the special characters along the way.
That's interesting, as it sounds like using fso to manage my byte stream might well have solved the problem I had with the unicode characters.

And I appreciate your observations that unicode and utf-8 are different things. :D
 

Users who are viewing this thread

Top Bottom