Combing tables without common columns (1 Viewer)

pa1227

New member
Local time
Today, 09:50
Joined
Jul 26, 2017
Messages
8
Hi Guys,

I'm new to MS Access. I need help with combing two tables (with no common columns)

I want to combine two table based on matching keywords.
Ex: "Agricultural Business and Management." with "Agricultural Workers" (because "Agricultural" is common word in both) or "Agricultural Business and Management." with "Other Management Occupations" (because "Management" is common word in both)

I have tired the LIKE function but i feel it works more like a CONTAINS function so it doesn't return any value as both the strings are different with only 1 or 2 common words. I don't want to specify the actual keywords (there are about 2000 records), I want to return records just where 1/2 words match, even if it is partially. Please help!

Thanks in advance.
 

plog

Banishment Pending
Local time
Today, 08:50
Joined
May 11, 2011
Messages
11,643
Re: MS Access Combing tables without common columns

First, this is going to take a while to run. 2000 records equates to 4 million comparisons. Add to that the inefficient matching criteria you have and this is going to churn.

Second, define 'combine'. You have 2 unique tables now, how many tables and what fields do you want to end up with exactly?

Third, this is going to require you write a custom VBA function to determine "matches". Actually, there might be some sort of fuzzy match function on this site, so you might search. Essentially that function is going to take 2 strings (one from each table) its going to carve those strings up and then compare them according to your criteria to determine if they are "matches". If so, the function returns True, if not False. That is how you would determine if 2 strings "match".

From there your answer to my second point determines how you build the query to construct this combination. Perhaps if you provided the 2 tables along with some sample data that could help us give you more specific advice. Show us your 2 tables (include table and field names) and enough data to cover all cases. Then show us what you hope to end up with after running this yet to be written process.
 

pa1227

New member
Local time
Today, 09:50
Joined
Jul 26, 2017
Messages
8
Re: MS Access Combing tables without common columns

@Plog
Hi
Thanks for the reply.

By combine I mean to say I want to JOIN them. I want to join the two tables into a single one. I can't do this by a simply INNER JOIN as I do not have primary/foreign keys in either of the tables.

Currently Table 1 has "Study Code" and "Study Title" and Table 2 has "Occupation Code" and "Occupation Title". They are in random order. I want a single table with all four columns but the records are matching (Study Title matches the Occupation Title based on common words). I do not want it to return a True/False value.
 

plog

Banishment Pending
Local time
Today, 08:50
Joined
May 11, 2011
Messages
11,643
Re: MS Access Combing tables without common columns

You cannot JOIN them via an SQL JOIN. You can achieve the same results via the function I described that returns True/False to designate if two records match. The signature line of that function would like like so:

Function IsMatch(in_String1, in_String2) As Boolean

From that we can tell it takes 2 strings and returns a True/False to designate if those 2 strings "match" according to your criteria--whatever it may be. Then, the SQL to generate the data you want would be this:

Code:
SELECT [Study Code], [Study Title], [Occupation Code], [Occupation Title]
FROM Table1, Table2
WHERE IsMatch([Study Title], [Occupation Title])
 

isladogs

MVP / VIP
Local time
Today, 14:50
Joined
Jan 14, 2017
Messages
18,211
Re: MS Access Combing tables without common columns

By combine I mean to say I want to JOIN them. I want to join the two tables into a single one. I can't do this by a simply INNER JOIN as I do not have primary/foreign keys in either of the tables.

Currently Table 1 has "Study Code" and "Study Title" and Table 2 has "Occupation Code" and "Occupation Title". They are in random order. I want a single table with all four columns but the records are matching (Study Title matches the Occupation Title based on common words). I do not want it to return a True/False value.

You've not actually given a reason WHY you want to do this.
It will be potentially difficult to setup and time consuming to implement.
Why is it so important to you to do this?
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 08:50
Joined
Feb 28, 2001
Messages
27,158
By combine I mean to say I want to JOIN them. I want to join the two tables into a single one. I can't do this by a simply INNER JOIN as I do not have primary/foreign keys in either of the tables.

Because their match criteria are fuzzy, you have the potential problem of matching single records in one table with multiple records in the other table. You have the further potential problem that if there are spelling differences or you pick a "threshold formula" that says yea or nay to a match, you could pick a bad threshold and end up missing matches that you would actually want to have.

Please don't take this as anything other than trying to give you good advice on how to proceed. I don't think you are ready to do this. You have not analyzed the problem well enough to see how to manipulate the tables you have to make more reliable matches. Why do I say that? Because you are asking US to help you with criteria about which you are still not feeling "warm and fuzzy." But YOU are the subject matter expert. If you don't feel good about this problem and if you can't give us more information about the whys and wherefores, how do you expect us to do any better? We are looking at your problem through the wrong end of the telescope.
 

jdraw

Super Moderator
Staff member
Local time
Today, 09:50
Joined
Jan 23, 2006
Messages
15,378
pa1227,

You have to provide some answers to the comments and questions you have been asked if you want more assistance/advice.
Can you tell us what each of your tables is about -that is the subject matter/purpose? To the readers, they are lists of words/word combinations????
 

Galaxiom

Super Moderator
Staff member
Local time
Today, 23:50
Joined
Jan 20, 2009
Messages
12,851
I want to return records just where 1/2 words match, even if it is partially.

Begin by converting this vague description into a solid specification. One or two words match is good but "partially" match? One word is contained in the other perhaps? Or sharing common substrings of a certain length? Or Damerau-Levenshtein Distance below a specified threshold?

You could write a parsing function that steps through the words in the fields applying whatever criteria you choose, then joining on that function. However the Cartesian (AKA Cross) join involved would involve applying this substantial function millions of times. It would be painfully slow.

Instead you need to separate out the all the individual words in the fields from every record and collect them into a table as single records for each word. This allows a simple Like join and deduplicates the effort by reducing to one instance of each word in the join comparison.

The "one or two" exact match or "contained in" matches would work pretty well because the engine can apply its set processing. A D-L or other fuzzy match function would be slower but probably manageable because the data may still be processed as a set.

If you are quite well versed in databases you might be interested in understanding the different ways of extracting substrings from a field. This article applies to SQL Server but many of the principles are general to databases. However it will likely "do your head in" rather than help unless you already have considerable database experience.
 

Galaxiom

Super Moderator
Staff member
Local time
Today, 23:50
Joined
Jan 20, 2009
Messages
12,851
Of course the preferred solution would be to start by storing all those words as individual records in a table related to the main table records. It would be a whole lot easier to work with.
 

jdraw

Super Moderator
Staff member
Local time
Today, 09:50
Joined
Jan 23, 2006
Messages
15,378
pa1227,

Galaxiom has given you great advice -- start with a solid description so readers (and you) can understand your needs/issue.

Here is a link related to fuzzy matching.

Good luck/
 

pa1227

New member
Local time
Today, 09:50
Joined
Jul 26, 2017
Messages
8
Hi All,
Thank you for this overwhelming response. I wasn't expecting it to be frank. And also thank you moderator or administrator whoever moved my thread to this appropriate header.
 

pa1227

New member
Local time
Today, 09:50
Joined
Jul 26, 2017
Messages
8
Re: MS Access Combing tables without common columns

You cannot JOIN them via an SQL JOIN. You can achieve the same results via the function I described that returns True/False to designate if two records match. The signature line of that function would like like so:

Function IsMatch(in_String1, in_String2) As Boolean

From that we can tell it takes 2 strings and returns a True/False to designate if those 2 strings "match" according to your criteria--whatever it may be. Then, the SQL to generate the data you want would be this:

Code:
SELECT [Study Code], [Study Title], [Occupation Code], [Occupation Title]
FROM Table1, Table2
WHERE IsMatch([Study Title], [Occupation Title])

@Plog: Thanks a ton for the help. Will try it and let you know how it works.
 

pa1227

New member
Local time
Today, 09:50
Joined
Jul 26, 2017
Messages
8
Re: MS Access Combing tables without common columns

You've not actually given a reason WHY you want to do this.

@ridders: I work at an educational startup company where we are trying to form a crosswalk between the various occupations and various instructional programs. It’s an effort to consolidate the interests to applicable areas and also to find out the gaps the education system.
 

pa1227

New member
Local time
Today, 09:50
Joined
Jul 26, 2017
Messages
8
Begin by converting this vague description into a solid specification. One or two words match is good but "partially" match? One word is contained in the other perhaps? Or sharing common substrings of a certain length?
@ Galaxiom: So this is the case the two records share a common substrings (one or two words) the length cannot be fixed. One word is not contained in the other as in it is two different combination of words which might have one or two common words.

Instead you need to separate out the all the individual words in the fields from every record and collect them into a table as single records for each word. This allows a simple Like join and deduplicates the effort by reducing to one instance of each word in the join comparison.
@ Galaxiom: Sorry I'm not experienced in Database. Can you please dumb it down for me. What do I need to do after a split into separate individual words.
 

pa1227

New member
Local time
Today, 09:50
Joined
Jul 26, 2017
Messages
8
Please don't take this as anything other than trying to give you good advice on how to proceed. I don't think you are ready to do this. You have not analyzed the problem well enough to see how to manipulate the tables you have to make more reliable matches. Why do I say that? Because you are asking US to help you with criteria about which you are still not feeling "warm and fuzzy." But YOU are the subject matter expert. If you don't feel good about this problem and if you can't give us more information about the whys and wherefores, how do you expect us to do any better? We are looking at your problem through the wrong end of the telescope.

@The_Doc_Man: Thank you for your advice. But I'm still a beginner in this field and want to learn more. Thus I'm seeking the help from you guys.
 

isladogs

MVP / VIP
Local time
Today, 14:50
Joined
Jan 14, 2017
Messages
18,211
Re: MS Access Combing tables without common columns

@ridders: I work at an educational startup company where we are trying to form a crosswalk between the various occupations and various instructional programs. It’s an effort to consolidate the interests to applicable areas and also to find out the gaps the education system.

This is probably not going to be what you want to hear but ....

I create software for schools and used to be a teacher.

At one time I wanted to import data about our students held by a national statistical organisation. Although this was allowed the problem was that there were no common fields. The imported data had unique IDs but these had nothing in common with ours. The names were stored in a different way and weren't always consistent with what we had given them. In short it was a mess.

Because like you I didn't want to be defeated, I added additional id fields to the imported data and displayed it side by side with our own data, sorting both tables in different ways so I could populate the additional id fields manually. It took several hours because there were around 2000 students and each had around 30 imported records. Some of it was easier to manipulate in Excel than in Access.
BUT I did it in the end and I'm glad to say the data was worth the time & effort involved in that case.

Now why am I telling you all of this?
Because IF its important enough to do, I actually think it will in the end be quicker to match the data manually as I did.

Good luck ....and make backups repeatedly as you go....

BTW we managed to get the national agency to change their ways so that subsequent imports could be easily matched...
I expect you'll want to ensure this is a one off process as well.
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 08:50
Joined
Feb 28, 2001
Messages
27,158
Thus I'm seeking the help from you guys.

Absolutely YES. But our answers are perhaps not being seen as the type of help you were hoping to get. Therein lies the issue that you must understand. Fuzzy logic and partial matches work only when there is a reasonable rationale for declaring a match. Saying that one or two words in a grouping of words are a match might be good, might be bad, might be worth bupkis.

The problem is in assigning values or weights to these words, or in assigning accepted equivalences. From your earlier posts,

I want to combine two table based on matching keywords.
Ex: "Agricultural Business and Management." with "Agricultural Workers" (because "Agricultural" is common word in both) or "Agricultural Business and Management." with "Other Management Occupations" (because "Management" is common word in both)

Currently Table 1 has "Study Code" and "Study Title" and Table 2 has "Occupation Code" and "Occupation Title".

OK, are each of those individual columns? Or are they all values in the same columns? When you say you want to match, WHAT ARE YOU MATCHING? Then you tell us you want four columns as though you plopped your records side-by-side in parallel.

Our problem seems to be in communication, so let me make a suggestion.

Sample or synthesize two tables similar to the kind you want to join together. Include four or five records in each. THEN show us what you hoped to get out of it. Because like the poet, I see it through a glass, darkly.
 

jdraw

Super Moderator
Staff member
Local time
Today, 09:50
Joined
Jan 23, 2006
Messages
15,378
As others have said/suggested/requested
-show us a list of a few "records" from each "table"
Then, show us an example of the "matching" you would expect from the data you have shown us.

It would be helpful if you could tell us a little more of your ultimate use of this "matching". A detailed example would help.

Good luck.
 

pa1227

New member
Local time
Today, 09:50
Joined
Jul 26, 2017
Messages
8
Hi
I have attached a file with sample data (i have included only few records)
Please find the details below:

Sheet 1 (Study) --> Table 1
Sheet 2 (Occupation) --> Table 2
Sheet 3 (Results) --> The end result I'm expecting. I have colored the matching words in red i.e. the records from the 2 tables have been matched together because of 1 or 2 common (marked in red).

I have done it manually. I have not included the Study Code & Occupation Code columns in the respective tables as it does not matter.

Sorry for the delayed reply.
 

Attachments

  • Sample.xlsx
    16.6 KB · Views: 52

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 08:50
Joined
Feb 28, 2001
Messages
27,158
OK. Having looked at your example, the thing I see is that you need to take those multi-word items apart in order to get any performance. The words you want to match would force you to do some kind of "LIKE" match with "*word*" - which would have totally lousy performance. But if you split the candidate fields into individual then you can do exact word matches.
 

Users who are viewing this thread

Top Bottom