"Run-time error '-2147024882 (8007000e)': Out of memory." while building ADO recordset

Mike_10

Hello, I have spent hours combing the web trying to solve this problem myself but still can't find anything to help. I am not experienced in any type of coding and am trying to teach myself some of the Access VBA coding basics.

I have a large text file (590,352 KB) that consists of about 1.1 million records, all in one field. Each record contains 534 characters. I am trying to break this into an organized recordset of 41 fields, each of varying string lengths. For testing purposes I created 4 smaller sample files (15 KB, 263 KB, 3,657 KB, and 49,044 KB) from the original, so I can check for code functionality and performance. I have successfully built the "organized" recordset with all sample files (and the rest of the code runs great), but am receiving the out-of-memory error when I ramp all the way up to the full burden of the original file.

Below is the relevant portion of the code. The debugger highlights ".AddNew" inside the With MasterRS statement. How should I build this recordset?

Code:
Public PRecordCount As Long

    Dim strSQL As String
    Dim db As DAO.Database
    Dim ImportedDataRS As DAO.Recordset
    Dim MasterRS As ADODB.Recordset
    Dim MasterField(50) As String
    Dim FieldCount As Integer
    Dim i As Long

    strSQL = "SELECT * FROM [" & NewFileName & "]"
    Set db = OpenDatabase(NewPathName, False, True, "Text; HDR=No")
    Set ImportedDataRS = db.OpenRecordset(strSQL)

    MasterField(1) = "ACCOUNT"
    MasterField(2) = "YEAR"
    MasterField(3) = "JURISDICTION"
    MasterField(4) = "TAX UNIT ACCT"
    MasterField(5) = "LEVY"
    MasterField(6) = "HOMESTEAD"
    MasterField(7) = "OVER 65"
    MasterField(8) = "VETERAN"
    MasterField(9) = "DISABLED"
    MasterField(10) = "AG"
    MasterField(11) = "DATE PAID"
    MasterField(12) = "DUE DATE"
    MasterField(13) = "OMIT"
    MasterField(14) = "LEVY BALANCE"
    MasterField(15) = "SUIT"
    MasterField(16) = "CAUSE NO"
    MasterField(17) = "BANK CODE"
    MasterField(18) = "BANKRUPT NO"
    MasterField(19) = "ATTORNEY"
    MasterField(20) = "COURT COST"
    MasterField(21) = "ABSTRACT FEE"
    MasterField(22) = "DEFERRAL"
    MasterField(23) = "BILL SUPP"
    MasterField(24) = "SPLIT PMT"
    MasterField(25) = "CATEGORY"
    MasterField(26) = "OWNER 1"
    MasterField(27) = "OWNER 2"
    MasterField(28) = "MAILING ADDRESS 1"
    MasterField(29) = "MAILING ADDRESS 2"
    MasterField(30) = "MAILING CITY"
    MasterField(31) = "MAILING STATE"
    MasterField(32) = "MAILING ZIP"
    MasterField(33) = "ROLL CODE"
    MasterField(34) = "PARCEL NO"
    MasterField(35) = "PARCEL NAME"
    MasterField(36) = "PAYMENT AGREEMENT"
    MasterField(37) = "AMT DUE AS OF EOM"
    MasterField(38) = "AMT DUE +30 DAYS"
    MasterField(39) = "AMT DUE +60 DAYS"
    MasterField(40) = "AMT DUE +90 DAYS"
    MasterField(41) = "AMOUNT INDICATOR"
    FieldCount = 41

    'Set up an in-memory recordset that will assign fields to all the imported data
    Set MasterRS = New ADODB.Recordset
    For i = 1 To FieldCount
        MasterRS.Fields.Append MasterField(i), adVarChar, 40, adFldMayBeNull
    Next i

    'Set up the in-memory recordset with assigned field names to separate out and organize all single field raw data
    i = 1
    ImportedDataRS.MoveFirst
    MasterRS.Open
    Do While Not ImportedDataRS.EOF
        With MasterRS
            .AddNew
            .Fields("ACCOUNT") = Mid(ImportedDataRS.Fields(0), 1, 34)
            .Fields("YEAR") = Mid(ImportedDataRS.Fields(0), 35, 4)
            .Fields("JURISDICTION") = Mid(ImportedDataRS.Fields(0), 39, 4)
            .Fields("TAX UNIT ACCT") = Mid(ImportedDataRS.Fields(0), 43, 34)
            .Fields("LEVY") = Mid(ImportedDataRS.Fields(0), 77, 11)
            .Fields("HOMESTEAD") = Mid(ImportedDataRS.Fields(0), 88, 1)
            .Fields("OVER 65") = Mid(ImportedDataRS.Fields(0), 89, 1)
            .Fields("VETERAN") = Mid(ImportedDataRS.Fields(0), 90, 1)
            .Fields("DISABLED") = Mid(ImportedDataRS.Fields(0), 91, 1)
            .Fields("AG") = Mid(ImportedDataRS.Fields(0), 92, 1)
            .Fields("DATE PAID") = Mid(ImportedDataRS.Fields(0), 93, 8)
            .Fields("DUE DATE") = Mid(ImportedDataRS.Fields(0), 101, 8)
            .Fields("OMIT") = Mid(ImportedDataRS.Fields(0), 109, 2)
            .Fields("LEVY BALANCE") = Mid(ImportedDataRS.Fields(0), 111, 11)
            .Fields("SUIT") = Mid(ImportedDataRS.Fields(0), 122, 1)
            .Fields("CAUSE NO") = Mid(ImportedDataRS.Fields(0), 123, 40)
            .Fields("BANK CODE") = Mid(ImportedDataRS.Fields(0), 163, 1)
            .Fields("BANKRUPT NO") = Mid(ImportedDataRS.Fields(0), 164, 40)
            .Fields("ATTORNEY") = Mid(ImportedDataRS.Fields(0), 204, 1)
            .Fields("COURT COST") = Mid(ImportedDataRS.Fields(0), 205, 7)
            .Fields("ABSTRACT FEE") = Mid(ImportedDataRS.Fields(0), 212, 7)
            .Fields("DEFERRAL") = Mid(ImportedDataRS.Fields(0), 219, 1)
            .Fields("BILL SUPP") = Mid(ImportedDataRS.Fields(0), 220, 1)
            .Fields("SPLIT PMT") = Mid(ImportedDataRS.Fields(0), 221, 1)
            .Fields("CATEGORY") = Mid(ImportedDataRS.Fields(0), 222, 4)
            .Fields("OWNER 1") = Mid(ImportedDataRS.Fields(0), 226, 40)
            .Fields("OWNER 2") = Mid(ImportedDataRS.Fields(0), 266, 40)
            .Fields("MAILING ADDRESS 1") = Mid(ImportedDataRS.Fields(0), 306, 40)
            .Fields("MAILING ADDRESS 2") = Mid(ImportedDataRS.Fields(0), 346, 40)
            .Fields("MAILING CITY") = Mid(ImportedDataRS.Fields(0), 386, 40)
            .Fields("MAILING STATE") = Mid(ImportedDataRS.Fields(0), 426, 2)
            .Fields("MAILING ZIP") = Mid(ImportedDataRS.Fields(0), 428, 12)
            .Fields("ROLL CODE") = Mid(ImportedDataRS.Fields(0), 440, 1)
            .Fields("PARCEL NO") = Mid(ImportedDataRS.Fields(0), 441, 8)
            .Fields("PARCEL NAME") = Mid(ImportedDataRS.Fields(0), 449, 40)
            .Fields("PAYMENT AGREEMENT") = Mid(ImportedDataRS.Fields(0), 489, 1)
            .Fields("AMT DUE AS OF EOM") = Mid(ImportedDataRS.Fields(0), 490, 11)
            .Fields("AMT DUE +30 DAYS") = Mid(ImportedDataRS.Fields(0), 501, 11)
            .Fields("AMT DUE +60 DAYS") = Mid(ImportedDataRS.Fields(0), 512, 11)
            .Fields("AMT DUE +90 DAYS") = Mid(ImportedDataRS.Fields(0), 523, 11)
            .Fields("AMOUNT INDICATOR") = Mid(ImportedDataRS.Fields(0), 534, 1)
            .Update
        End With
        ImportedDataRS.MoveNext
        i = i + 1
    Loop
    PRecordCount = i - 1

    ImportedDataRS.Close
    MasterRS.Close
    Set ImportedDataRS = Nothing
    Set MasterRS = Nothing
    Set db = Nothing

EDIT by Isladogs - Added code tags
 
Wow; I guess you're not an ordinary accdb programmer!
> 41 fields is a really big table
If you have created fixed tables, try splitting your database so that your data sits behind a link connection. That separates the size of your front end (forms, queries, reports) from the size of your source tables, so you end up with two accdb files: one for the (forms, queries, reports) and one for the (source tables).
> Your text-field error may also mean a field exceeded its size limits
(it could be possible to add one Long Text field for that field's overflow)
> You could also use MSSQL and link the table(s) to your accdb, but it would be a big job, and this method can also produce connection errors once some computer/file/server setting changes.
> Try this: "Error -2147024882 (8007000e) runtime error and a &H80004005 (-2147467259) System Error during installation. This error is reported as a Windows out of memory error; however, the actual cause of the error is most likely that the Windows Event Log is full. Open the Windows Event Log and archive some of the events."
 
Hi. Welcome to AWF!

If you're getting that error because your machine doesn't have enough memory, then maybe either add more or use your disk to store the data.
 
I would like you to set up an experiment for me (and you) if you can. You didn't say which version of Windows you are on, but if it is at least 7 (10 would be better), set up your app to "do its thing" but DON'T trigger it quite yet. "Normalize" the window so you have room for something else on the screen and start the Task Manager. Find the Processes tab and focus on MEMORY. Click that column so that it sorts "most memory" on top. You might have to click twice, because clicking flips the sort order for that column. Remember: biggest memory usage goes on top of the list, even if the MSACCESS.EXE process isn't at the top when you start.

Switch to the Access window. Trigger your run and switch back immediately to the Task Manager. Watch what happens to the numbers. If I am correct, the image that is running MSACCESS.EXE will balloon in memory size until it approaches 2 GB at which point you will see that error message.

I will avoid getting into the details right now, but I can tell you that there are TWO errors "Out of Memory" - 800x0008 and 800x000E. The 08 error would point to the swap/page file (a.k.a. the "Virtual" file). Error 0E points to a more severe and difficult problem. The test I am suggesting should not be hard to set up. When you do the test, tell me if it did what I expected. IF I am right, it would have taken at least several seconds to reach that point, maybe more than seconds.

There are other tests that involve Resource Monitor but they can be a bit more "dodgy" to track. But if you perform that test and it behaves about like I suggest, I think I know your problem and it will be a bear to fix.

Or maybe not, if you are willing to think outside of the box.

As a side note, tell us about the machine and Office configuration. How much RAM, what version of Windows, what version of Access, what bitness of Office, ... things like that.
 
One piece of advice is that it would probably be much simpler to make both recordsets DAO rather than (for some reason) having one DAO & one ADO.

That's my starting comment ... will look more at it tomorrow. But to start with, using both types of recordsets doubles the amount of things you have to study up on (to perfection) to run this code, which makes no sense.
 
Sorry all, not sure why my code indentations collapsed; that made it harder to read. I promise I put them in.

One more detail that might be of interest: when the error occurs and I check the debugger, "i" is at approximately 190,000, just to give you guys an idea of how far into the loop it's getting. So to me that means it's about 15-20% through.

vhung - Thank you for this suggestion. I actually don't have to split this database because I will be the only one running it and it will stay on my machine only. It will remain a simple VBA routine, and I won't ever need tables, forms, reports, or any other functionality, because all I'm trying to do is take the source text file and convert it into a usable, organized Excel file of specific imports and calculated outputs that others can use.

theDBguy - Many thanks for the warm welcome. I'll admit, I've read lots of your responses on this site along my quest to find answers to the various problems I've run into trying to learn this stuff over the past several months. It's almost like seeing a friendly face! This was the one problem that finally made me give in and admit I could not figure it out myself, so I had to finally join and post a question. I hate admitting defeat, but I'm a little relieved now. To your point about adding memory, I would think my machine has enough: 16 GB RAM. I'm not real computer knowledgeable, but from what I understand that's a decent amount. Agreed?

The_Doc_Man - Also have read many of your responses on this site before, thanks for the contributions, I'm sure I've learned things from you already. I'm going to run this test as you've instructed as soon as I get a little more time, hopefully tomorrow, and I'll let you know what happens. So a little more about my hardware and OS:
- Device: HP ZBook 17 (For a laptop it's big and clunky, and I've seen newer, slimmer ZBooks. The warranty expired 6-4-17 so it's several years old. It's a company-issued machine.)
- Processor: Intel(R) Core(TM) i5-4330M CPU @ 2.80GHz, 2 Cores, 4 Logical Processors
- Installed RAM: 16.0 GB (15.8 GB usable)
- System type: 64-bit OS, x64-based processor
- OS Name: Microsoft Windows 10 Enterprise
- OS Version: 10.0.18362 Build 18362
- Access Version: 16.0.4993.1001 (32-bit)
You really made me work to find all that info LOL. Who would have ever thought Office had a "bitness", and, as I just learned, that it could be different from the Windows version? Anyways, hope that helps.

Isaac - I did actually consider the suggestion you are making but don't quite understand enough yet about the differences between the two. I thought I was doing right by utilizing ADO with such a large data set, and I suspect swapping over to DAO probably won't make a difference. But I can certainly try.
 
You should have enough physical memory at 16 GB on a 64-bit Windows 10. System RAM almost certainly isn't your problem. You have 32-bit Access and what appears to be a two-file split case, so you will NEVER need more than 4 GB total EVER.

I'll give you the preview. I think your problem is partly "garbage" collection and partly per-process addressing limitations. If your test provides the profile I predicted, slamming into its memory limit of 2 GB within a short time but NOT instantly, then here is what I think is going on.

The first part I mentioned is "garbage collection" because you are repeatedly making implied sub-strings via the MID function. These strings have to be allocated from the private process heap, which can expand up to certain limits. But the second part is the ADODB recordsets, which make records in memory. You are in essence straining the hell out of memory. Each one of those MID calls makes a string fragment of that length, which gets allocated but which cannot be dynamically deallocated. That is because the ADODB recordset is ALSO allocating memory dynamically. These allocations are intertwined and the result is that you cannot reclaim big blocks because you can't deallocate the ADODB strings. But you have a MID extraction for every field in the record, and then an ADODB record assembled from what you just imported. Because you can't garbage collect, you are storing twice the size of your database file, not to mention that at least a part of that same file is present with the Import recordset. And it is the nature of Access that if you have the original 534 bytes in the same database, it is ALSO occupying virtual address space because that is how Access maps records. That process is consuming your address space until you reach the size limit and try to allocate one more chunk.

There are run-time errors such as "STACK OVERFLOWS HEAP" and similar messages, but that didn't happen here. You report that the .AddNew is what failed, and I think that it was because there was not another 534 bytes to allocate for that new record.

You MIGHT, however, try a totally different approach although this would be incredibly tedious to set up. Since the code you wrote says that the file has a fixed-column format, don't read it as a recordset. Read it as a text file using an IMPORT SPECIFICATION that allows you to select which columns are part of each field. For a constant format text file, that works swell and MIGHT have less of a problem with the dynamic string creation because it is not clear to me that TEXT IMPORT uses the same string creation paradigm as that piecemeal extraction does. It may be a bit more wholesale. You want to avoid that bunch of explicit MID functions, I think.

If that doesn't work, you may have a very serious problem because, though Access can hold that much data, it is too memory-intensive to process it the way you are doing it.
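
For what it's worth, a minimal sketch of that import-specification route, assuming you have already created and saved a fixed-width import specification through the Text Import Wizard (Advanced... > Save As). The spec name "MasterSpec", table name "tblMaster", and file path below are all placeholders, not names from your setup:

Code:
Sub ImportFixedWidth()
    'Let the compiled text-import engine slice the columns instead of
    'calling Mid() 41 times per record in interpreted VBA.
    'A saved fixed-width specification is required for acImportFixed.
    DoCmd.TransferText acImportFixed, "MasterSpec", "tblMaster", _
        "C:\Data\MasterFile.txt", False    'False = no header row
End Sub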
 
@Mike_10
Please use code tags when posting lengthy code. I've now added them to post #1.
I'll leave you to investigate Doc's suggestions.
 
Append it to a temporary table rather than using an in-memory ADO recordset, then read from the temporary table.
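
A rough sketch of that temp-table idea in DAO, assuming a pre-built local table "tblTemp" that already holds the 41 text fields and the text-file recordset opened as in post #1 (the table and field names here are placeholders):

Code:
Sub ParseToTempTable(ImportedDataRS As DAO.Recordset)
    Dim rsOut As DAO.Recordset
    Dim strLine As String

    'Write each sliced record into a real (temporary) table instead of
    'a fabricated in-memory ADO recordset.
    Set rsOut = CurrentDb.OpenRecordset("tblTemp", dbOpenDynaset)
    Do While Not ImportedDataRS.EOF
        strLine = ImportedDataRS.Fields(0)   'grab the 534-char line once
        rsOut.AddNew
        rsOut!ACCOUNT = Mid$(strLine, 1, 34)
        rsOut![YEAR] = Mid$(strLine, 35, 4)
        '... remaining 39 fields sliced exactly as in post #1 ...
        rsOut.Update
        ImportedDataRS.MoveNext
    Loop
    rsOut.Close
    Set rsOut = Nothing
End Sub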
 
Rather than opening a recordset, try using the ReadLine method?

something like
Code:
Sub CopyFromTextFile()
    'Requires a reference to Microsoft Scripting Runtime
    Dim fso As New FileSystemObject
    Dim ts As TextStream
    Dim CurrentLine As String

    'Stream the file line by line instead of loading it as a recordset
    Set ts = fso.OpenTextFile("C:\path\info.txt", ForReading)
    Do Until ts.AtEndOfStream
        CurrentLine = ts.ReadLine
        'append fields and save
    Loop

    ts.Close

End Sub

You could run this code without the append to check that it doesn't hit the memory error.
 
As Doc says, you are using very memory-intensive techniques. Have you considered simply parsing the data permanently into tables? It would make the data a lot more accessible.

I use ADODB recordsets for forms that display parsed data, but not on this much data. The overwhelming load of the parsing makes the choice of what you hold the results in academic anyway.

Moreover, the code is specific to that data, so it must be edited when applied to another structure.

Firstly, generalise the code and make it portable by putting your parsing information into a table that holds the field name and the parameters fed to Mid(). Put the entire destination field definition in a table, including the data type. Records will have two keys: one that identifies the type of document being parsed and another for the field number. You can loop through this information to define the ADO fields, or those of a destination table (see the sketch below).
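
A minimal sketch of that generalisation, assuming a hypothetical specification table "tblParseSpec" with columns DocType, FieldNo, FieldName, StartPos and FieldLen:

Code:
Sub BuildFieldsFromSpec(MasterRS As ADODB.Recordset, lngDocType As Long)
    'Define the ADO fields from the parsing-specification table
    'instead of hard-coding 41 field names in VBA.
    Dim rsSpec As DAO.Recordset
    Set rsSpec = CurrentDb.OpenRecordset( _
        "SELECT FieldName, FieldLen FROM tblParseSpec " & _
        "WHERE DocType = " & lngDocType & " ORDER BY FieldNo")
    Do While Not rsSpec.EOF
        MasterRS.Fields.Append rsSpec!FieldName, adVarChar, _
            rsSpec!FieldLen, adFldMayBeNull
        rsSpec.MoveNext
    Loop
    rsSpec.Close
End Sub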

However, there is a much more powerful solution. Your parsing technique is a type of what is known as RBAR (pronounced "rebar"): Row By Agonising Row. Anything that sequentially processes line by line falls into that category and is inevitably very slow. Another technique leverages the power of the database engine and is unbelievably fast.

Import the text files to a table without parsing them so that there is one record per line. Two fields: one is a LineId and the other is the whole line of text.

With the parsing information in a table, it can be Cartesian-joined into a query that selects the text records, making both the text and the parsing information available to a single Mid() expression that outputs parsed text as a record for each combination of field specification and input text record. Essentially the results are super-normalised in three fields: LineId, FieldId, ParsedData.

These results are then pivoted on the FieldId back into a table format with multiple fields.
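
To illustrate the Cartesian-join step with the same hypothetical tblParseSpec and a raw-line table "tblRawLines" (LineId, LineText), the core query might look something like this, here wrapped in VBA:

Code:
Sub ParseViaCartesianJoin()
    'One Mid() expression evaluated by the database engine for every
    'combination of raw line and field specification - no VBA loop.
    Dim strSQL As String
    strSQL = "SELECT r.LineId, s.FieldNo, " & _
             "Mid(r.LineText, s.StartPos, s.FieldLen) AS ParsedData " & _
             "INTO tblParsed " & _
             "FROM tblRawLines AS r, tblParseSpec AS s " & _
             "WHERE s.DocType = 1"
    CurrentDb.Execute strSQL, dbFailOnError
End Sub

The rows in tblParsed can then be pivoted on FieldNo (a crosstab query in Access) back into the multi-field layout.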

I use this technique on SQL Server to return recordsets from text stored in the database. In one application, my query parses a dozen or so fields from 132-column text at over 8,000 lines per second while simultaneously discarding unwanted rows, creating calculated columns, and joining into historical data used to classify and enhance the results.

My data has a whole text file stored as a single record in a binary column. I use SQL Server's STRING_SPLIT() function to split out the individual lines on the line feed character and return a table variable with the lines numbered ready to be joined to the field specification records.

SQL Server is so much better equipped than Access for this kind of work.
 
I defer to Galaxiom on the SQL Server comments as I have not used that before. (I'm from an ORACLE shop that added SQL Server later.) However, he and I agree that what you are doing with ADODB is eating your memory alive.

G, if you look at the extraction code described in post #1, and in particular look carefully at the starting position and size of field N versus the starting position of field N+1, there is no apparent delimiter, so STRING_SPLIT() might not work here. Otherwise I might have suggested the SPLIT() function from the VBA environment.

@Mike_10, ArnelGP's suggestion also has considerable merit. If you DON'T use an ADODB recordset but rather just make a temporary table in memory, you are still chewing up the implied virtual memory that is an Access file, but you are not so rapidly chewing up working-set allotment due to the ADODB recordset. And in fact, if you combine his temp-table suggestion with my ImportText suggestion, it might fly fairly quickly because that code is compiled. Your "pick it apart" code is pseudo-compiled and thus is EMULATED, not truly executed. This adds a layer of delay. No promises, but I think you would have a better shot at it with TempTable + ImportText.
 
rather than opening a recordset, try using the ReadLine method? ...
CJ, this assumes the text file has line delimiters. If it is a fixed-length record text stream, you would want to use random-access file I/O instead.
FWIW, I think the problem is with the ADO recordset not releasing allocated memory. I would try using a DAO recordset and, if necessary, do a DBEngine.Idle dbForceOSFlush after every 10,000 (or so) record reads.
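
For the fixed-length case, a minimal sketch of random-access reading in plain VBA. The path is a placeholder, and this assumes exactly 534 bytes per record with no delimiters (if the file has CRLF line endings, the record length would be 536 with the trailing two bytes trimmed):

Code:
Sub ReadFixedRecords()
    Dim strRecord As String * 534   'fixed-length record buffer
    Dim lngRec As Long
    Dim intFile As Integer

    'Read the file record by record with native VBA file IO -
    'no recordset involved at all.
    intFile = FreeFile
    Open "C:\Data\MasterFile.txt" For Random Access Read As #intFile Len = 534
    For lngRec = 1 To LOF(intFile) \ 534
        Get #intFile, lngRec, strRecord
        'slice strRecord with Mid$() and append to a table here
    Next lngRec
    Close #intFile
End Sub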

Best,
Jiri
 
Jiri, I agree that you could use some type of File I/O. I agree that the ADO recordset is not releasing memory. But the "Force O/S Flush" will not release dynamic allocated memory. It will force the flush of disk buffers that are waiting their turn for I/O WriteBehind operations. Also, normally you only use dbForceOSFlush in the context of Begin/Commit transactions.
 
Jiri, I agree that you could use some type of File I/O. I agree that the ADO recordset is not releasing memory. ...
Doc,
I am aware that dbForceOSFlush is normally used in transactions. Now, whether it has an effect on dynamic buffering is something I don't know, but it wouldn't hurt to try. At any rate, you could easily flush the allocated ADO space by closing and reopening the ADODB connection in the loop after x number of iterations.

Best,
Jiri
 
True, you could close the recordset, but don't forget to save that recordset as a file before closing it. Otherwise, it becomes "discarded memory" and you lose it, or at least that is how I read it.

And my point about dbForceOSFlush is that it targets disk-buffer memory which is file-handle related. But until you act to save the recordset, there is no disk file. Further, closing and then opening a recordset doesn't release anything from the program HEAP area. Only exiting the process (quitting the app) does that.
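
As an aside, persisting a fabricated ADO recordset before closing it is a one-liner. A sketch, with a placeholder file name:

Code:
    'Save the in-memory recordset to disk in ADTG format before closing,
    'so its contents do not simply become "discarded memory"...
    MasterRS.Save "C:\Data\Master.adtg", adPersistADTG

    '...and reopen it later without rebuilding it:
    Dim rs As New ADODB.Recordset
    rs.Open "C:\Data\Master.adtg", "Provider=MSPersist"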
 
... closing and then opening a recordset doesn't release anything from the program HEAP area. Only exiting the process (quitting the app) does that.

Closing a recordset will release whatever system resources it holds. Only the object itself will remain.

Jiri
 
this assumes the text file has line delimiters
Seems a reasonable assumption, since the OP's code uses MoveNext through the text-file recordset after each 'add' - so my reading of the file is that there are 1.1 million rows of 534 characters.
 
We are not yet communicating, Jiri. Releasing a recordset absolutely does release system resources. Any system structures involved get linked back into one or more of the system's "Lookaside" lists. They get reused easily enough. But allocated string memory is not a system structure, it is part of a private structure. It cannot be so easily released because it requires a type of garbage collection that Windows does not do, CANNOT do, in the user's private context. Unfortunately, that is where a lot of the string resources are used and where they stay until the process RunDown reclaims ALL user memory and dumps it into the Free Page list.

My experiment that I suggested for Mike_10 would have shown a massive growth in the process working set. Growth in a system structure would not be shown that way because system structures are NOT part of a process working set. Which is why I'm betting, but not yet swearing, that I understand what is going on. If his private working set goes BANG ZOOM, then it isn't system resources that he lacks.
 
I've been following this thread and am just chiming in to say that if I were you, @Mike_10, I would try either what Galaxiom or CJ_London suggested. Theoretically, importing it to a table and then working with it might be fastest (although I'm not really sure, especially if you're importing it to an Access table and it is very large data). Or, as CJ suggested, reading a textstream operates very fast, IMO, obviously depending on the overall data set, but I don't think there are memory issues - certainly not like the ones you are running into.

In all my dealings with incoming files (from vendors, government, different branches of an organization), I think I've seen a text file that was actually designed to have no concept of separate lines whatsoever maybe once or twice. Probably in healthcare, as they have some extremely nonsensical data layouts that require custom scripts to parse. Delimited vs. fixed length doesn't matter.

If you have trouble importing the text file into a table with no errors in one "go", then reading lines of text using FSO might be a great option.
 
