Solved File Read error

prasadgov · Jul 1, 2024

gemma-the-husky said:
If there are any ASCII characters in the range 0 to 31, they might interfere with the way the "string" is interpreted. Do you have a layout description? If not, how will you analyse the file contents?

I tried manually pasting the data into my table and use this sql which formats the data correctly

SELECT Mid(message,1,2) AS [Message Type], Mid(message,5,6) AS [Timestamp], Mid(message,95,1) AS [TYPE-OF-08-RESPONSE], Mid(message,97,8) AS [DO-OUT-DLV-PART-NUM], Mid(message,107,9) AS [DO-OUT-CUSIP], Mid(message,144,8) AS [DO-OUT-RCV-PART-NUM], Mid([message],154,4) AS [DO-OUT-ORIG-INPUT-SRCE], Mid(message,169,13) AS [DO-OUT-MONEY-VALUE], Mid(message,182,1) AS [DO-OUT-DELV-RECV-IND], Mid(message,242,6) AS [DO-OUT-SETTLE-DATE], Mid(message,248,3) AS [DO-OUT-REASON-CODE], Mid(message,261,6) AS [DO-OUT-DATE-STAMP], Mid(message,267,6) AS [DO-OUT-TIME-STAMP], Mid(message,273,20) AS [DO-OUT-CUSIP-DESC], Mid(message,520,9) AS [DO-OUT-SHARE-QTY-NEW], Mid(message,749,16) AS IMS_TID, Format((Date()),"mm/dd/yyyy") AS inserteddate
FROM tbltcc;

Not sure how it reads and skips the nulls when I paste manually.
I just wanted to automate this.

gemma-the-husky · Jul 1, 2024

So do you have a fixed width file layout to tell you where the data is you want to manage?

What you could do is, in code. (Sorry, I can't select the lines below on my phone). Anyway, if there is just a single line, you will end up with one "message".

Open the file
While not end of file
Read the next line into a string variable (that will give you your message)
Wend
Close the file

prasadgov · Jul 1, 2024

gemma-the-husky said:
So do you have a fixed width file layout to tell you where the data is you want to manage?

What you could do is, in code. (Sorry, I can't select the lines below on my phone). Anyway, if there is just a single line, you will end up with one "message".

Open the file
While not end of file
Read the next line into a string variable (that will give you your message)
Wend
Close the file

I was able to follow your suggestion and was able to read the complete contents of one text file into the table

Code:

Dim strFilename As String: strFilename = "C:\PG\BC2\TCC\20240626060103.txt"
Dim strTextLine As String
Dim iFile As Integer: iFile = FreeFile

Open strFilename For Input As #iFile
Do Until EOF(1)
Line Input #1, strTextLine
CurrentDb.Execute "insert into tbldtcc (message) select '" & strTextLine & "';"
Loop
Close #iFile

End Sub

Now, how to modify the code to read all the files in the folder inside the loop?

Thanks everyone for your kind help and patience!

Gasman · Jul 1, 2024

You do not need the Select surely?, just use the strTextLine?
Use Dir() to get each file in the folder in turn.

The_Doc_Man · Jul 1, 2024

I ran the two sample files from post #18 through a string parser test module that I had for another project. It reports no control characters of any kind, and that parser specifically WILL catch null characters. It even helped me analyze a UTF-8 file.

The analyzer does, however, report two long sequences of space characters near the end of the lines. One is about 80 bytes, the other about 190 bytes. Exact number not really important in this context.

arnelgp · Jul 2, 2024

can you convert Chr$(0) to a Space char?

Code:

Private Sub Command0_Click()
Dim FSO As Object, objFolder As Object, objFile As Object
Dim content As String
Dim i As Integer
Dim db As DAO.Database
Dim rs As DAO.Recordset
Set db = CurrentDb
Set rs = db.OpenRecordset("tbltcc", dbOpenDynaset)
Set FSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = FSO.GetFolder("C:\Prasad\BC2\TC")
i = 1
For Each objFile In objFolder.Files
    If Right(objFile.Name, 3) = "txt" Then
        content = Trim$(Replace$(RawContent(objFolder & "\" & objFile.Name), Chr$(0), " "))
        'Debug.Print content
        With rs
            .AddNew
            ![message] = content
            .Update
        End With
    End If
    i = i + 1
Next objFile

Set objFile = Nothing
Set objFolder = Nothing
Set FSO = Nothing
rs.Close
Set rs = Nothing
Set db = Nothing
End Sub


' https://bytes.com/topic/access/864406-read-file-byte-byte
Public Function RawContent(ByVal strFileName As String) As String
Dim intFileNumber As Integer
Dim lngFileSize As Long
Dim strBuffer As String
Dim lngCharNumber As Long
Dim strCharacter As String * 1
'Dim strFileName As String

'strFileName = "C:\Windows\System.ini"

'Get the next available File Number
intFileNumber = FreeFile

DoCmd.Hourglass True

Open strFileName For Binary Access Read Shared As #intFileNumber

lngFileSize = LOF(intFileNumber)    'How large is the File in Bytes?
strBuffer = Space$(lngFileSize)     'Set Buffer Size to File Length

Get #intFileNumber, , strBuffer     'Grab a Chunk of Data from the File
Close #intFileNumber

RawContent = strBuffer
DoCmd.Hourglass False
End Function

The_Doc_Man · Jul 2, 2024

arnelgp said:
can you convert Chr$(0) to a Space char?

CHR$(0) is the ASCII "NUL" character, which occupies one byte. I don't know why you couldn't convert it - but in the sample files offered to us earlier, there were no NUL characters. My scan would have detected them. Lots of spaces. No NULs.

arnelgp · Jul 2, 2024

both text files have Nulls on them (view them on Notepad++ (View->Show Control Chars)

The_Doc_Man · Jul 2, 2024

Interesting, because the method I used WOULD detect NUL and other non-printing characters - but didn't. Which makes me wonder if the method used for opening a file somehow pre-filters those NUL characters. I can modify my test and try again, but it's getting late in my time zone.

gemma-the-husky · Jul 2, 2024

The_Doc_Man said:
Interesting, because the method I used WOULD detect NUL and other non-printing characters - but didn't. Which makes me wonder if the method used for opening a file somehow pre-filters those NUL characters. I can modify my test and try again, but it's getting late in my time zone.

That implies that maybe "line input" might read a line to a CR/EOF character and pre process it to a genuine string. As it returns a string it sounds plausible.

gemma-the-husky · Jul 2, 2024

prasadgov said:
I was able to follow your suggestion and was able to read the complete contents of one text file into the table

Code:

Dim strFilename As String: strFilename = "C:\PG\BC2\TCC\20240626060103.txt" Dim strTextLine As String Dim iFile As Integer: iFile = FreeFile Open strFilename For Input As #iFile Do Until EOF(1) Line Input #1, strTextLine CurrentDb.Execute "insert into tbldtcc (message) select '" & strTextLine & "';" Loop Close #iFile End Sub

Now, how to modify the code to read all the files in the folder inside the loop?

Thanks everyone for your kind help and patience!

So generally speaking, put the processing function into a separate sub/function, with an argument of the single file name.

Then have another sub to iterate the files in the folder. You can do this with a dir() loop, but unfortunately because dir() is not recursive you can't get it to check any subfolders that it finds. If you only want to use a single folder you can use dir.

You might need a folder picker to find the folder you want to use, rather than hard coding it, so that's another independent function. You could save the last folder you used in a table, or variable somewhere so the next time you do this process, the folder picker starts at the last folder you used, rather than being really irritating and starting a C:/ each time.

Breaking an application into smaller units like this is really easy and useful.

You end up with 2 or 3 functions this way. One to select a folder, one to check all the files in the folder. The third to process a file in a particular way. Each of these functions can often be used/reused independently of the others. It also makes it easier to debug your programs, and even to design them. You can build the whole structure in logical blocks without having the code.

After you process a file, you might want to rename the processed file, or move it to a processed folder, or write the file details to a log file somewhere. It's easy to see the correct logical place to put each step. .

In this instance you can process a file folder, and make sure it finds the right number of files, before you even bother with the processing code for a single file. Alternatively you can develop the processing code for a test file, knowing it will work correctly with every file you give it.

I tend to try to stick to intrinsic VBA commands mostly. I only really use fso if I've picked up a function somewhere that already uses it, for instance.

gemma-the-husky · Jul 2, 2024

The_Doc_Man said:
Interesting, because the method I used WOULD detect NUL and other non-printing characters - but didn't. Which makes me wonder if the method used for opening a file somehow pre-filters those NUL characters. I can modify my test and try again, but it's getting late in my time zone.

.just thinking about this again reminds me a different issue, which I won't repeat here.

But at some point doesn't VBA convert bytes (eg in a text file) into 2 byte Unicode characters? That might change chr(0) into a unicode character that doesn't cause an error. Or am I missing something about the nature of a VBA string.

In the issue I had before, I wanted to treat an array of bytes as a string, and it just couldn't get it to work. After I stringified the array of bytes, I believe it was no longer the same byte sequence. (This was a while ago)

The_Doc_Man · Jul 2, 2024

I re-ran my test and confirmed my suspicions.

By experiment I found that using a sequence to Open filename For Input As #1 followed by Line Input #1 converts the input to an ASCII string because my scanner didn't even hiccup once after reading the file using that sequence. And it is capable of detecting non-printing characters. I verified it was still working correctly because in test #2, when I used the FileSystemObject to perform an OpenTextFile followed by .ReadAll then I got a (VERY) different result. Technically, it appears that the FSO OpenTextFile call creates a UNIX-like text stream which does not guarantee any type of conversion at all. So what I got was quite raw.

The file, despite being of type .TXT, is NOT an ASCII text file. When I checked with Notepad, it is verified as a UTF-8 file, which means it contains extended characters. It also contains a LOT of literal ASCII NULs in the string. UTF-8 is NOT the same as UNICODE.

The original problem of encountering read errors is simple... Access doesn't like UTF-8 files unless you explain to it how to handle them first. BUT if you use the Open / Line Input sequence, it WILL convert that for you, losing most of the special characters along the way.

gemma-the-husky · Jul 2, 2024

The_Doc_Man said:
I re-ran my test and confirmed my suspicions.

By experiment I found that using a sequence to Open filename For Input As #1 followed by Line Input #1 converts the input to an ASCII string because my scanner didn't even hiccup once after reading the file using that sequence. And it is capable of detecting non-printing characters. I verified it was still working correctly because in test #2, when I used the FileSystemObject to perform an OpenTextFile followed by .ReadAll then I got a (VERY) different result. Technically, it appears that the FSO OpenTextFile call creates a UNIX-like text stream which does not guarantee any type of conversion at all. So what I got was quite raw.

The file, despite being of type .TXT, is NOT an ASCII text file. When I checked with Notepad, it is verified as a UTF-8 file, which means it contains extended characters. It also contains a LOT of literal ASCII NULs in the string. UTF-8 is NOT the same as UNICODE.

The original problem of encountering read errors is simple... Access doesn't like UTF-8 files unless you explain to it how to handle them first. BUT if you use the Open / Line Input sequence, it WILL convert that for you, losing most of the special characters along the way.

That's interesting, as it sounds like using fso to manage my byte stream might well have solved the problem I had with the unicode characters.

And I appreciate your observations that unicode and utf-8 are different things.

Solved File Read error

prasadgov

Member

gemma-the-husky

Super Moderator

prasadgov

Member

Gasman

Enthusiastic Amateur

The_Doc_Man

Immoderate Moderator

arnelgp

..forever waiting... waiting for jellybean!

The_Doc_Man

Immoderate Moderator

arnelgp

..forever waiting... waiting for jellybean!

The_Doc_Man

Immoderate Moderator

gemma-the-husky

Super Moderator

gemma-the-husky

Super Moderator

gemma-the-husky

Super Moderator

The_Doc_Man

Immoderate Moderator

gemma-the-husky

Super Moderator

Similar threads

Users who are viewing this thread