Solved Exclude Duplicate Records from a Query

DISTINCT/DISTINCTROW won't eliminate records where there are duplicates.

GROUP BY/HAVING COUNT(*) =1 will
 
The initial post says duplicate records and this is what DISTINCT ROW does. A record is a row: if they are duplicates only one will be seen.
But that's not what the OP meant. He was trying to select entrants from races that matched certain criteria which he didn't describe. Each race should have yielded one entrant, but a few (genuinely) yielded 2, and he was trying to ignore the races that had 2 results. He wasn't saying the data was duplicated, just that he didn't want to include the races with 2 outcomes.
 
DISTINCT/DISTINCTROW won't eliminate records where there are duplicates.

GROUP BY/HAVING COUNT(*) =1 will
If that's what he meant then you are right - however, what he said was duplicate records and that is exactly what DISTINCT ROW does. Mind you as I said in my first post "Have I missed something ' and I obviously had.
 
Summing up, the problem is there is only supposed to be 1 horse per race with trait X. I have a set of 233 races but there are six races where there are 2 horses with X. They are no good to my analysis. I only want to analyse the 233 - 12 = 221 races. To do that I need to exclude these instances where there are 2 horses with X. That's it in a big nutshell.
From Post #12
 
I'm with DickyP on this one. SELECT DISTINCT seems to be what is needed if all you are doing is looking for 1 record per horse or per race/number combo or whatever is the X factor.

However, there is another approach, "divide and conquer."

Add a yes/no field to the table, maybe call it "DQX", to help you with the selection and exclusion.
Before you do your analysis, run an UPDATE query: UPDATE mytable SET DQX = FALSE ;
Then when you do your "find duplicates" query, mark the ones that have duplicates as DQX = TRUE.
Then when you do your analysis, select your records but include in the WHERE clause "... AND DQX=FALSE ..." so that you won't consider either of the horses in the races where duplication occurred.
Thanks for the replies with lots of great ideas.
I certainly have enough to solve my dilemna now.
Thanks all

This will also help with another query I'm yet to do. This will have many more duplicates and possibly triplicates. These horses have Trait Y.
So I could make some smaller tables using DISTINCT or The_Doc_Man's suggestion. Having smaller specialised tables will simplify the whole thing.

I'm pleased to report that all has gone well and I've created a table where the duplicates are excluded and the new table is ready for analysis. I have certainly learnt a lot in this exercise and I'm grateful to all who contributed, thanks.
 
Last edited:
Did you at least try the simple suggestion in Post #18, before adding unnecessary fields to tables and creating other superfluous tables?

Really, @DickyP's and @The_Doc_Man's advice might not be the best for your specific situation.

This will also help with another query I'm yet to do. This will have many more duplicates and possibly triplicates. These horses have Trait Y.
So I could make some smaller tables using DISTINCT or The_Doc_Man's suggestion. Having smaller specialised tables will simplify the whole thing.

The solution I suggest you test will probably also handle triplicates as easily. Having smaller specialised tables will mean more moving parts that require maintenance and keeping in sync (ie more chance of getting it wrong).

You should only pursue this direction if the simpler methods do not do what you need them to do.
 
Last edited:
Back in post #12, the OP stated

Summing up, the problem is there is only supposed to be 1 horse per race with trait X. I have a set of 233 races but there are six races where there are 2 horses with X. They are no good to my analysis. I only want to analyse the 233 - 12 = 221 races. To do that I need to exclude these instances where there are 2 horses with X. That's it in a big nutshell.

Despite this, most of the responses appear to be dealing with eliminating the duplicate horses. That's not what the OP is asking for. He wants to exclude all races where there are duplicate records

In post #13, @gemma-the-husky wrote:
if you want to just ignore the "problem races", then first have a query to find the races with more than one Trait X. Then use an unmatched query to select the other races, and extract your Trait X Data from the unmatched query.

I agree with that approach
 
Posted on behalf of O/P from https://www.access-programmers.co.uk/forums/threads/how-to-find-an-administrator.331120/


Thanks for your reply,
Yes I ran your SQL and it produced a recordset with the duplicates in it. I made a clarification later in the thread which said that the "duplicates" were in the Race Number and what confused the discussion was the horses' names weren't duplicated, it was the Race Number that was duplicated.
The two horses appeared in the same race. So I moved on to the other posts which gave me the answers I could understand and apply.
Once I created the queries I could isolate the duplicates and create the table I needed.
Thanks for your contribution, I just went with what worked for me at the time.



The Message I Received
=======================
Your content can not be submitted. This is likely because your content is spam-like or contains inappropriate elements. Please change your content or try again later. If you still have problems, please contact an administrator.
 
Let's say the OP is looking to find horses with a form history of 32. They came third the first time they ran and second the second time, with the intention to bet them to win the third time.

In some races there may be more than one horse with a race record of 32, and we want to ignore those races.

That's what I imagine we are doing, not necessarily with such a simple Trait as form of 32.

It's not duplicates as such. All the race records are accurate. It's just how to determine which records to ignore.

As it happens, a horse's form history is quite tricky. is that stored, or built up from it's previous races. If the former, then you need to be sure to enter all the results in chronological order. If the latter, it's a tricky thing to do.

Is this anywhere near to what you are doing @davegoodo
 
Let's say the OP is looking to find horses with a form history of 32. They came third the first time they ran and second the second time, with the intention to bet them to win the third time.

In some races there may be more than one horse with a race record of 32, and we want to ignore those races.

That's what I imagine we are doing, not necessarily with such a simple Trait as form of 32.

It's not duplicates as such. All the race records are accurate. It's just how to determine which records to ignore.

As it happens, a horse's form history is quite tricky. is that stored, or built up from it's previous races. If the former, then you need to be sure to enter all the results in chronological order. If the latter, it's a tricky thing to do.

Is this anywhere near to what you are doing @davegoodo
That is a great summary of what I'm doing. Thanks Dave.

BTW I have a copy of "DAO Object Model" by Helen Feddema, I noticed a photo of your copy in a thread from last year. I'm coming back to Access after a long (8 year) absence and have started reading it again. That is why I'm very rusty with Access, I've been out of touch with it for all that time. But its good to know there are so many enthusiasts in this forum it is encouraging.
 
Last edited:
Since you want to exclude any and all records that occur more than once (and not just remove duplicates as many respondents have assumed) you first need a query that lists those 'more than once' records - let's call it 'qDup' - and I think you've already constructed that query?

Then make a new query on the data you do want to analyze and use a LEFT join of that query - call it 'qGood' - to the qDup query.

This LEFT join uses an IS NULL critieria:

Select qGood.* FROM qGood LEFT JOIN qDup on qGood.[field/s] = qDup.[fields/s] WHERE qDup.[any field will do] IS NULL

ie. get every record from the left side (qGood) where it does not exist in the other side (qDup)

GL
 
( BTW - I struggle to understand why some feel it necessary to escalate a poster's problem to a discussion about database design or the lack thereof. If the question was 'how do I remove duplicates?', then yes, let's all wade in and let rip on the fundamentals, on EF Codd, on CJ Date, on PK, FK and functional dependence, on fan traps and chasm traps, and non/clustered indexes. But if the poster has a simple problem easily solved, maybe we should just hand over some sugar, then move on. )
 
( BTW - I struggle to understand why some feel it necessary to escalate a poster's problem to a discussion about database design or the lack thereof. If the question was 'how do I remove duplicates?', then yes, let's all wade in and let rip on the fundamentals, on EF Codd, on CJ Date, on PK, FK and functional dependence, on fan traps and chasm traps, and non/clustered indexes. But if the poster has a simple problem easily solved, maybe we should just hand over some sugar, then move on. )
I appreciate your comments, the main thing is that I've got the problem sorted now, so I'm happy to leave it at that. Thanks for your support though.
 
Last edited:
( BTW - I struggle to understand why some feel it necessary to escalate a poster's problem to a discussion about database design or the lack thereof. If the question was 'how do I remove duplicates?', then yes, let's all wade in and let rip on the fundamentals, on EF Codd, on CJ Date, on PK, FK and functional dependence, on fan traps and chasm traps, and non/clustered indexes. But if the poster has a simple problem easily solved, maybe we should just hand over some sugar, then move on. )
Sometimes, giving a diabetic sugar would do more harm than good.

Sometimes, offering a work-around to avoid dealing with an underlying table design flaw can do more harm than good.
 
Sometimes, giving a diabetic sugar would do more harm than good.

Sometimes, offering a work-around to avoid dealing with an underlying table design flaw can do more harm than good.
Thanks for the allegory, I have made changes to the primary key to eliminate duplicates.
 
Sometimes, giving a diabetic sugar would do more harm than good.

Sometimes, offering a work-around to avoid dealing with an underlying table design flaw can do more harm than good.
Could not agree more, and I've often struggled to hold back on unsolicited commentary about what might be underlying cause of OP problem/s. Nonetheless, I think most forum users me included are usually overwhelmed by some annoyance that just needs to be solved 'now'. Our offering advice on the fundamental problem is probably a distraction.
 
Could not agree more, and I've often struggled to hold back on unsolicited commentary about what might be underlying cause of OP problem/s. Nonetheless, I think most forum users me included are usually overwhelmed by some annoyance that just needs to be solved 'now'. Our offering advice on the fundamental problem is probably a distraction.
Thanks again.
 

Users who are viewing this thread

Back
Top Bottom