cheekybuddha:
DISTINCT/DISTINCTROW won't eliminate records where there are duplicates.
GROUP BY/HAVING COUNT(*) = 1 will.
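For anyone following along, a minimal sketch of the GROUP BY/HAVING version. The OP never posted his table design, so Entries, RaceID and TraitX are invented names:

SELECT RaceID
FROM Entries
WHERE TraitX = True
GROUP BY RaceID
HAVING COUNT(*) = 1;

That returns only the races where exactly one horse has the trait, i.e. the set worth keeping.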
But that's not what the OP meant. He was trying to select entrants from races that matched certain criteria which he didn't describe. Each race should have yielded one entrant, but a few (genuinely) yielded two, and he was trying to ignore the races that had two results. He wasn't saying the data was duplicated, just that he didn't want to include the races with two outcomes.

The initial post says duplicate records, and that is what DISTINCTROW does. A record is a row: if they are duplicates, only one will be seen.
If that's what he meant then you are right - however, what he said was duplicate records, and that is exactly what DISTINCTROW does. Mind you, as I said in my first post, 'Have I missed something?', and I obviously had.
From Post #12:

Summing up, the problem is there is only supposed to be 1 horse per race with Trait X. I have a set of 233 races, but there are six races where there are 2 horses with X. They are no good to my analysis. I only want to analyse the 233 - 12 = 221 races. To do that I need to exclude these instances where there are 2 horses with X. That's it in a big nutshell.
Thanks for the replies with lots of great ideas.

I'm with DickyP on this one. SELECT DISTINCT seems to be what is needed if all you are doing is looking for one record per horse, or per race/number combo, or whatever the X factor is.
However, there is another approach, "divide and conquer."
Add a yes/no field to the table, maybe call it "DQX", to help you with the selection and exclusion.
Before you do your analysis, run an UPDATE query: UPDATE mytable SET DQX = FALSE;
Then when you do your "find duplicates" query, mark the ones that have duplicates as DQX = TRUE.
Then when you do your analysis, select your records but include in the WHERE clause "... AND DQX=FALSE ..." so that you won't consider either of the horses in the races where duplication occurred.
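A sketch of those three steps as Access queries, reusing the invented Entries/RaceID/TraitX names plus the new DQX field:

Step 1, reset the flag:

UPDATE Entries SET DQX = False;

Step 2, flag every row in a race that has more than one Trait-X runner:

UPDATE Entries SET DQX = True
WHERE RaceID IN
(SELECT RaceID FROM Entries
WHERE TraitX = True
GROUP BY RaceID
HAVING COUNT(*) > 1);

Step 3, analyse only the clean races:

SELECT *
FROM Entries
WHERE TraitX = True AND DQX = False;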
This will also help with another query I've yet to do, which will have many more duplicates and possibly triplicates. These horses have Trait Y.
So I could make some smaller tables using DISTINCT or The_Doc_Man's suggestion. Having smaller specialised tables will simplify the whole thing.
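If the smaller-table route appeals, an Access make-table query would do it; a sketch with the same invented names:

SELECT DISTINCT RaceID, HorseID INTO TraitX_Subset
FROM Entries
WHERE TraitX = True;

Each trait can get its own small table this way, which keeps the later analysis queries simple.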
If you want to just ignore the "problem races", then first have a query to find the races with more than one Trait X. Then use an unmatched query to select the other races, and extract your Trait X data from the unmatched query.
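Roughly, with invented names again, the first query (saved as ProblemRaces) would be:

SELECT RaceID
FROM Entries
WHERE TraitX = True
GROUP BY RaceID
HAVING COUNT(*) > 1;

and the unmatched query then pulls everything not in it:

SELECT e.*
FROM Entries AS e LEFT JOIN ProblemRaces AS p
ON e.RaceID = p.RaceID
WHERE p.RaceID IS NULL
AND e.TraitX = True;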
That is a great summary of what I'm doing. Thanks Dave.

Let's say the OP is looking to find horses with a form history of 32. They came third the first time they ran and second the second time, with the intention to bet them to win the third time.
In some races there may be more than one horse with a race record of 32, and we want to ignore those races.
That's what I imagine we are doing, not necessarily with such a simple Trait as form of 32.
It's not duplicates as such. All the race records are accurate. It's just how to determine which records to ignore.
As it happens, a horse's form history is quite tricky. Is that stored, or built up from its previous races? If the former, you need to be sure to enter all the results in chronological order. If the latter, it's a tricky thing to do.
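Building it up on the fly would be something like the query below; a sketch only, assuming a Results table with HorseID, RaceDate and Position fields that the thread never actually describes. It finds horses whose last two runs were a third then a second (form 32):

SELECT r.HorseID
FROM Results AS r
WHERE r.Position = 2
AND r.RaceDate =
(SELECT MAX(r1.RaceDate) FROM Results AS r1
WHERE r1.HorseID = r.HorseID)
AND 3 =
(SELECT TOP 1 r2.Position FROM Results AS r2
WHERE r2.HorseID = r.HorseID
AND r2.RaceDate < r.RaceDate
ORDER BY r2.RaceDate DESC);

Correlated subqueries like these crawl on big tables, which is part of why storing the form string (and entering results in chronological order) is the easier route.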
Is this anywhere near to what you are doing, @davegoodo?
I appreciate your comments; the main thing is that I've got the problem sorted now, so I'm happy to leave it at that. Thanks for your support though.

(BTW - I struggle to understand why some feel it necessary to escalate a poster's problem to a discussion about database design or the lack thereof. If the question was 'how do I remove duplicates?', then yes, let's all wade in and let rip on the fundamentals, on EF Codd, on CJ Date, on PK, FK and functional dependence, on fan traps and chasm traps, and non/clustered indexes. But if the poster has a simple problem easily solved, maybe we should just hand over some sugar, then move on.)

Sometimes, giving a diabetic sugar would do more harm than good.
Thanks for the allegory. I have made changes to the primary key to eliminate duplicates.
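He doesn't say what the key change was, but a composite primary key along these lines is the usual way to make true duplicates impossible at the table level (names invented, run as a DDL query):

ALTER TABLE Entries
ADD CONSTRAINT pkRaceHorse PRIMARY KEY (RaceID, HorseID);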
Sometimes, offering a work-around to avoid dealing with an underlying table design flaw can do more harm than good.
Could not agree more, and I've often struggled to hold back on unsolicited commentary about what might be the underlying cause of an OP's problems. Nonetheless, I think most forum users, me included, are usually overwhelmed by some annoyance that just needs to be solved 'now', and our offering advice on the fundamental problem is probably a distraction.

Thanks again.