Finding what's not there. (1 Viewer)

John Sh

Member
Local time
Tomorrow, 03:36
Joined
Feb 8, 2021
Messages
410
I have two lists, A and B. List A is a field in a table and contains some items that do not appear in list B.
List B has been compiled using approved data sources and is the "Standard" by which list A will ultimately be corrected.
I need a select query that returns only those items in list A that are NOT in list B. I.e. those items that require correction.
Why do I need such a thing?
The wanted items are either typos or otherwise incorrectly entered data that need to be corrected in list A.


Before we get into normalisation etc, this is an inherited database where EVERY field was text and there were no checks and balances against what was being entered.
This resulted in a horrendous mess that has been largely repaired and is now fit for purpose. There is, however, a small number of errors that are proving difficult to find, hence the two lists.
Normalisation and setting up relationships is in the grand plan but is still some ways off. Getting the existing data corrected has to be my first priority as the database is in daily use and "normalising" faulty data would only exacerbate the problems I already face.

I have tried using various combinations of "is null" and "is not null" on list B. These return nothing or the full list B.
Putting the criteria "is null [listB]!{data] against list A results in an vague error message about operators without operands.

Any suggestions greatly appreciated
 

theDBguy

I’m here to help
Staff member
Local time
Today, 10:36
Joined
Oct 29, 2018
Messages
21,474
Hi. Have you tried using the "find unmatched query" wizard?
 

plog

Banishment Pending
Local time
Today, 12:36
Joined
May 11, 2011
Messages
11,646
I am going to help you, but first, although you tried to preempt it, I have to tell you that you need to normalize. And I say that because I want you to do less work, not more. Normalizing is the easy path to resolve this.

After I give you instructions you will have clean data for a day, then tomorrow users will start putting in more bad input because your structure/process allows it. So, whatever prompted you want to fix the data today will occur again and you go through this process again. Then before you know it, its a new regular task you have to perform that need not exist at all. After this I am done preaching--Normalization is the path of least resistance. Do it.

Let's rename your tables APPROVED and INUSE and the field is TheField. Here's how you find and update those in INUSE not in APPROVED:

Code:
SELECT INUSE.TheField
FROM INUSE LEFT JOIN APPROVED ON INUSE.TheField = APPROVED.TheField
WHERE APPROVED.TheField IS NULL

That identifies all bad records in use that are not on the approved list. From there, you should turn it into a Make Table query and name the new table FIXES. Once built, add a new text field to FIXES called [GoodValue]. Then you for every record in FIXES you manually input the correct value in the GoodValue field for every value in TheField field.

Once FIXES is completely populated you can use it in an UPDATE query to change every bad value in INUSE to its corrected version. Two notes:

1. Make a backup of everything before you run an ACTION query (Make Table or UPDATE).

2. Normalize. Or the above process will be part of your life.
 

John Sh

Member
Local time
Tomorrow, 03:36
Joined
Feb 8, 2021
Messages
410
I am going to help you, but first, although you tried to preempt it, I have to tell you that you need to normalize. And I say that because I want you to do less work, not more. Normalizing is the easy path to resolve this.
Hi Plog.
Thank you for your input. I will see how that works out.
Part of the repair process has been to install combo boxes wherever possible and draw the contents from known good data sources. This means that past mistakes cannot be repeated as the possible entries must come from the lists provided.
Obviously no system is foolproof but then we do not have fools working for us so there is a reasonable chance of things being kept kosher.
A large part of the problem is that I am working with plant taxonomy and that science is full of anomalies. Trying to get the "right" data is an ongoing battle. Add to that my background as an electronics technician as another hindrance to making sense of the myriad of information available.
John
 

John Sh

Member
Local time
Tomorrow, 03:36
Joined
Feb 8, 2021
Messages
410
Let's rename your tables APPROVED and INUSE and the field is TheField. Here's how you find and update those in INUSE not in APPROVED:
Not being familiar with SQL, I created my original query in the designer. Here, both my original query and yours look identical. It's only when I view the SQL that I can see the differences.

Your SQL
Code:
SELECT main.genus
FROM main LEFT JOIN genus ON main.genus = genus.genus
GROUP BY main.genus, genus.genus
HAVING (((genus.genus) Is Null))
ORDER BY main.genus;

My original query.
Code:
SELECT Main.Genus, Genus.Genus
FROM Genus INNER JOIN Main ON Genus.Genus = Main.Genus
GROUP BY Main.Genus, Genus.Genus
HAVING (((Genus.Genus) Is Null))
ORDER BY Main.Genus;

Looks like I'll have to put some time into learning SQL.
Thanks again.
 

plog

Banishment Pending
Local time
Today, 12:36
Joined
May 11, 2011
Messages
11,646
The one you posted is logically equivalent to mine, it should return the same results as my original SQL. It's just that the Access Query Builder does some things that lead to less than best practice SQL when you use it.

W3C has a good tutorial for SQL basics:

 

John Sh

Member
Local time
Tomorrow, 03:36
Joined
Feb 8, 2021
Messages
410
The one you posted is logically equivalent to mine, it should return the same results as my original SQL. It's just that the Access Query Builder does some things that lead to less than best practice SQL when you use it.

W3C has a good tutorial for SQL basics:

Thanks. It's bookmarked.
 

Users who are viewing this thread

Top Bottom