Rogue Column in a CSV file

gemma-the-husky

Super Moderator
Staff member

Hi.

I have an issue somewhat similar to the one in this thread:

https://www.access-programmers.co.uk/forums/showthread.php?p=1547252#post1547252


Anyway, this is my problem:

I am trying to import a csv file, without a file spec, into a new table. All appears fine: I delete the existing table, and the import creates a new table and imports the data.

However, the next process I need to run fails because of a type mismatch. It turns out that one of the columns, which needs to be text, is actually coming in as a number. Previously the column has always come in as text, and there has been no problem. This time, I think, too many of the rows near the top are numeric, and Access has decided the csv column was numeric. The import therefore produces type conversion errors, and some rows end up with blank values where the non-numeric text could not be converted.

So I tried this fix:
1. delete all the rows to leave an empty table - done
2. change the data type of the offending column to text - done

Now I am trying to reimport the csv into the amended table, this time without creating a new table.
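(Something like this minimal sketch - the table name and file path are placeholders:)

Code:
' Reimport into the existing, retyped table - no spec, no new table
DoCmd.TransferText TransferType:=acImportDelim, _
    TableName:="tblData", _
    FileName:="C:\Imports\data.csv", _
    HasFieldNames:=True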

Using A2013, the reimport succeeds, but although the data type in the table is now Text (Short Text), the data conversion still fails and the text values are not imported.

I normally use A2003, but in A2003 the reimport appears to hang, with Task Manager showing Access running at 30-50% CPU. I will try to resolve why it is hanging as a separate issue.

Anyway, even though the process completes in A2013, the data is still rejected with "type conversion error", and the text values are still not set.

I am reluctant to use a file specification for various reasons, but I will try that next to see if that fixes the issue.

[Update:
I just tried a file spec.
A2013 imported correctly, with all the values in the suspect column.
A2003 hung again - so I can only presume A2003 has some sort of internal issue reconciling what it thinks is a numeric value with an application requirement for a text value.

The reason I am reluctant to use a file spec is that the column order of the csv just changed, although the column names are all the same. I think the import will come in with the wrong order if I use a file spec.]
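(For comparison, the spec-based call differs only in the SpecificationName argument - the spec name here is a placeholder:)

Code:
' With a saved import spec the column types are pinned,
' so the rogue column is forced to come in as text
DoCmd.TransferText TransferType:=acImportDelim, _
    SpecificationName:="specDataImport", _
    TableName:="tblData", _
    FileName:="C:\Imports\data.csv", _
    HasFieldNames:=True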


Finally, I decided to try linking to the csv, and then copying the linked rows into the table. Exactly the same problem: the linked data shows #Num! errors for the text values in the numeric column.
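(The linking attempt amounts to something like this sketch - the link name, table name and path are placeholders:)

Code:
' Link to the csv rather than importing it...
DoCmd.TransferText TransferType:=acLinkDelim, _
    TableName:="lnkData", _
    FileName:="C:\Imports\data.csv", _
    HasFieldNames:=True
' ...then append the linked rows to the real table
CurrentDb.Execute "INSERT INTO tblData SELECT * FROM lnkData", dbFailOnError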

So has anyone else had this type of problem, and how did you fix it?
 

RuralGuy

AWF VIP
Dave, do you get the same results if you import to Excel? Just curious.
 

isladogs

MVP / VIP
Hi Dave
Yes, I've had the same issue, and for the same reasons: imports that worked fine for months or even years failing because Access interpreted the data 'helpfully'.
My solution is to use a two-stage import with a buffer table - see post 9 of that other thread.
If you want further info, let me know.
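For illustration, a minimal sketch of the two-stage idea, with hypothetical spec, table and field names (the actual routine is in that post):

Code:
' Stage 1: land the csv in an all-text buffer table, so there is
' nothing for Access to guess about (specAllText is a saved spec
' declaring every column as text)
CurrentDb.Execute "DELETE FROM tblBuffer", dbFailOnError
DoCmd.TransferText acImportDelim, "specAllText", "tblBuffer", _
    "C:\Imports\data.csv", True

' Stage 2: append to the properly typed table, converting each
' column explicitly
CurrentDb.Execute _
    "INSERT INTO tblData (ID, Amount, AccountRef) " & _
    "SELECT CLng(ID), CCur(Amount), AccountRef FROM tblBuffer", _
    dbFailOnError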
 

gemma-the-husky

Super Moderator
Staff member
Thanks for the tips. I just couldn't get it working in A2003 whatever I did, but the process is working correctly in A2013.

It's not my csv, so I don't know if I can get the sender to put text qualifiers around the suspect column (e.g. sending "00123" rather than 00123). That would fix it.
 

Galaxiom

Super Moderator
Staff member
Edit the file by adding a text row at the top, or as the second line, depending on whether headers are present.

It could easily be done in code.
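A minimal sketch, assuming a comma-delimited file with a header row - the path and dummy values are placeholders, and the dummy row would be deleted from the table after the import:

Code:
Dim f As Integer, txt As String, p As Long

' Read the whole file into a string
f = FreeFile
Open "C:\Imports\data.csv" For Input As #f
txt = Input(LOF(f), #f)
Close #f

' Insert a dummy all-text row directly below the header line
p = InStr(txt, vbCrLf)
txt = Left$(txt, p + 1) & "XXX,XXX,XXX" & vbCrLf & Mid$(txt, p + 2)

' Write it back out
f = FreeFile
Open "C:\Imports\data.csv" For Output As #f
Print #f, txt;
Close #f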
 

gemma-the-husky

Super Moderator
Staff member
I'll try that idea about modifying the file. I didn't really want to change the file itself, but it would work.
 

gemma-the-husky

Super Moderator
Staff member
Back again.

I actually have 4 tables to import. I fixed the import for the first table using the idea I originally outlined. Only one table was exhibiting this funny behaviour, but now the other tables are doing the same thing. It was a cumbersome fix, and I don't really want to have to do the same thing 3 more times.

So I decided to do what Galaxiom suggested: I preprocessed the csv file to add a new second row with a definitive text value in the rogue column. It had no effect. The column is still being imported as numeric.

I was trying to avoid creating file specs (in case the csv columns change), but I have done that now, and with an import spec there is no problem. So I can only assume there is something in the internals of DoCmd.TransferText that behaves differently when processing imports with and without an import spec.
 

gemma-the-husky

Super Moderator
Staff member
@ridders

I could do it that way, but I have 4 different tables, and that means 4 different conversions. The reason I would prefer not to have an import spec at all (and using a buffer table is tantamount to the same thing) is in case the sender changes the csv file, either by adding more columns or changing the order. Anyway, it's working properly with import specs now.
 

isladogs

MVP / VIP
Glad you got it working.

However, if you were to use a buffer table, it makes no difference what order the columns are in. If a column is added or removed, the recordset code is easily modified.
 

The_Doc_Man

Immoderate Moderator
Staff member
Dave, it might be a royal pain in the toches, but you COULD do the import one other way. Granted, it's absolutely ugly. But...

Open the .CSV as an Excel App object. Read the raw text of the cells, because in that COM context you are ALWAYS reading text. Identify each column as to its purpose (I'm assuming you can do so). Then open a recordset on the target table. One row of the spreadsheet at a time, copy the data from the spreadsheet column to the recordset field. If necessary, you can test the data type of the field and use a nested loop plus a Select Case to plotz each element. Here is some pseudocode:

Code:
For i = 1 To spreadsheet.Rows.Count
    recordset.AddNew
    For j = 1 To spreadsheet.Columns.Count
        Select Case recordset.Fields(j - 1).Type
            Case dbBoolean
                recordset.Fields(j - 1) = CBool(spreadsheet.Cells(i, j))
            Case dbInteger
                recordset.Fields(j - 1) = CInt(spreadsheet.Cells(i, j))
            Case dbLong
                recordset.Fields(j - 1) = CLng(spreadsheet.Cells(i, j))
            Case dbText
                recordset.Fields(j - 1) = spreadsheet.Cells(i, j)  ' just copy the text
            ' etc. for the remaining field types
        End Select
    Next j
    recordset.Update
Next i

I know this would be tedious, but you would only have to write it once, and you could pass in things like the name of the table to use. The trick, of course, is to ensure that the table layout matches the spreadsheet input layout.
 

Cronk

Registered User.
FWIW

I've also encountered problems importing text-based files. Mostly it has been invalid data formats, so I always import into a table with all-text columns and test before importing into the destination tables.

Where rogue delimiters are included in the data, I check each record of the text file and count the number of delimiters. In some situations it's possible to develop an algorithm to find the rogue and remove/replace it with VBA. If not, I abort the import and message the operator to fix the data on line X, also displaying the problem string.
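A minimal sketch of that check, assuming a comma delimiter and a placeholder path (a quoted field that legitimately contains commas would need extra handling):

Code:
Dim f As Integer, lineText As String, lineNum As Long
Dim expected As Long, found As Long

f = FreeFile
Open "C:\Imports\data.csv" For Input As #f
Do While Not EOF(f)
    Line Input #f, lineText
    lineNum = lineNum + 1
    ' count delimiters by stripping them and comparing lengths
    found = Len(lineText) - Len(Replace(lineText, ",", ""))
    If lineNum = 1 Then
        expected = found        ' the header row sets the benchmark
    ElseIf found <> expected Then
        MsgBox "Rogue delimiter count on line " & lineNum & ":" & vbCrLf & lineText
        Exit Do
    End If
Loop
Close #f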

My worst example of this was a job for a very large government department, which had sent Excel survey forms to all work areas and then wanted me to import the several hundred workbooks into Access. The trouble was that the spreadsheets had no cell/range locking whatsoever, so there was text in numeric columns (e.g. "100 to 200") and text in date columns (e.g. "between March and June"). Columns had been inserted, etc. So a day's work turned into some months (not continuous for me) as we went through several iterations of me programmatically finding issues for other staff to cleanse the data.
 

gemma-the-husky

Super Moderator
Staff member
@ridders

I haven't checked your code. I presume that if I use a buffer table, I bring every column in as text. I then need both some code to convert the columns to the formats I really want, and an append query to add the data to the correctly typed table. It's all just more trouble than it's worth.

@The_Doc_Man
Same point, really. I don't want to have to do all that jumping through hoops.


The issue seems to be that:
1. using the import wizard directly correctly identifies a mixed numeric/alpha column as TEXT
2. using DoCmd.TransferText doesn't process that particular column correctly. Maybe there's something about this particular dataset, because it still gets it wrong even when I add some text in the second row to try to coerce it.

***. I have 4 tables, and this is only an occasional operation. Initially all the tables worked correctly. A couple of months ago, one of the tables started misbehaving with this problem. This time, all the tables misbehaved.


Anyway, it's working correctly using an import spec now, so I won't spend any more time on it.
 

isladogs

MVP / VIP

Fair enough.

In my case, I had the same issue with a nightly import of 30 CSV files, which occasionally threw up data type errors in different files and different fields. This made the whole import process erratic.

As the daily data update was vital for the system to function correctly, I had to make it reliable. It was easy enough to set up, and once completed, the import routine became 100% reliable (except for any client network interruptions), so no further code changes have been needed in over 10 years.
 
