How can I delete Duplicate Records from a table (1 Viewer)

sbaud2003 · May 10, 2024

Hello Sir/Madam

Please help me to deleting duplicate records records from a Table
I tried but it deletes all the recordes with the key, i am trying.

theDBguy · May 10, 2024

You can create a duplicate table structure and set a unique index on the duplicate field(s). Then, insert the data from the original table.

The_Doc_Man · May 10, 2024

The suggestion from theDBguy should work OK. The problem was that your key, whatever it was, was not unique. For future reference, the exact problem you just had is the reason why you need unique keys in tables. When you need selectivity, having only keys that can contain duplicates tends to limit your options.

ebs17 · May 10, 2024

If you only want to delete parts of them, you need an attribute that allows you to distinguish between the records and can therefore be used to determine which records you want to delete or which records you want to keep. A typical field would be an ID with a continuous number (autoincrement).

SQL:

DELETE
FROM
   TableX AS T
WHERE
   EXISTS
      (
         SELECT
            NULL
         FROM
            TableX AS X
         WHERE
            X.DuplicateKey = T.DuplicateKey
               AND
            X.ID < T.ID
      )

Here the highest ID per duplicate key should remain.

Pat Hartman · May 10, 2024

@theDBguy 's solution is the simplest. Recreate the table with the proper keys defined. Then run an append query. Only the first row with the "key" will be added. The others will be ignored.

ebs17 · Tuesday at 10:16 AM

Pat Hartman said:
@theDBguy 's solution is the simplest.

But you should also be able to do a delete query on a table, it's not a big challenge.
If you always need a new table to display results, that's not beneficial.

Pat Hartman · Tuesday at 6:45 PM

ebs17 said:
But you should also be able to do a delete query on a table, it's not a big challenge.

OK, you have a query that produces 5 duplicates. So you create a query that selects the duplicate with the lowest autonumber PK. Then your delete query deletes all records that match on the x number of fields used to define a duplicate that have PKs greater than the lowest value. Great. Not so easy if you don't already have a unique PK you can use to isolate ONE row.

Then, once all the dups are gone, you add the multi-field index that will prevent future duplicates. OR, you define the new table WITH the multi-field index and run a simple append of all the rows. Access will discard the dups for you.

ebs17 · Tuesday at 8:27 PM

Pat Hartman said:
OK, you have a query that produces 5 duplicates.

That's not OK. A query that produces unwanted duplicates should be reworded. This is about a table that contains duplicates, for whatever reason.

Pat Hartman said:
Not so easy if you don't already have a unique PK you can use to isolate ONE row.

The ID with autonumber was just one possible example to differentiate records. Maybe there are already existing usable fields such as a timestamp field, you just have to look at the entire table to see what there is when it is fully displayed.

If there is no suitable criterion for differentiation and therefore no primary key, you can simply create such a autonumber field in the table using DDL.

Of course, after cleaning up unwanted duplicates will be avoided by using a unique index. Please don't say SQL is difficult. In the end, someone still believes that. SQL is the native language in a database. If you consciously refuse to do this, you are moving in the wrong environment.

Pat Hartman · Tuesday at 10:36 PM

ebs17 said:
Please don't say SQL is difficult

SQL is good for a great many things but it is essentially set oriented. You have to identify a set. The "set" is the "5" duplicates. I think you missed the point of the question. The OP has duplicate data in a table that he needs to clean up. The duplicates are not caused by a join, which could be a logic problem or a simple misunderstanding of how 1-m relationships work. This is an actual, physical problem. He wants to get rid of 4 of the 5. How does he identify which 4 of the 5 "identical" records to delete? A TimeStamp won't help. That isn't data that is easily accessible or even understandable. To use it, you would need to know how each value relates to all the other values in the set of duplicates. All we know about a TimeStamp is that it is unique, at least within the table. An autonumber that is sequentially assigned is the most viable discriminator but the table may not have one or it may be randomly assigned which won't help. You suggested adding an autonumber. OK, but was that only because you have to disagree with my suggestion?

If we believe the problem statement, these are actual, physical duplicate rows. How did they get there? How do we prevent that in the future? How do we get rid of the existing dups?

The third question is where the OP is starting. First we get rid of the dups. My solution solves two of the three questions at once. You can tackle them all separately if you choose. You can modify the input table, you can create beautifully formatted, colorful, complex subselects 4 levels deep, or, you can use my suggestion which in one step fixes the existing data and prevents future duplicates from being added and uses a simple append query with no criteria. It takes advantage of how Access SQL append queries work. I guess my suggestion is just too simple.

As to "how did they get there", the OP will need to look at how the data is added to the table. If it is appended from an external source, my proposed solution keeps on working because the append query will discard the "duplicates" as long as the indexes were correctly defined. If the records are added one at a time using a form, proper validation will give the user an understandable message and let the user fix the data before it is saved.

ebs17 · Wednesday at 4:32 AM

Pat Hartman said:
I think you missed the point of the question.

This is what I call an exclusive opinion. I can only shake my head at what some people think.

sbaud2003 said:
help me to deleting duplicate records records from a Table

For those who can read: My suggestion corresponds to the wording of the question.
Or should I understand the counter-discussion to mean that it is a huge effort for those involved to write a delete query?
As @The_Doc_Man likes to write: You can have more tools than just a hammer.
None of us know the exact situation that exists. It makes little sense to interpret something appropriate yourself.

Pat Hartman · Wednesday at 2:58 PM

sbaud2003 said:
Please help me to deleting duplicate records records from a Table
I tried but it deletes all the recordes with the key, i am trying.

Repeated for your convenience.

ebs17 said:
None of us know the exact situation that exists.

We do know the exact situation. It is clearer than most problem statements we see.

How can I delete Duplicate Records from a table (1 Viewer)

sbaud2003

Member

theDBguy

I’m here to help

The_Doc_Man

Immoderate Moderator

ebs17

Well-known member

Pat Hartman

Super Moderator

ebs17

Well-known member

Pat Hartman

Super Moderator

ebs17

Well-known member

Pat Hartman

Super Moderator

ebs17

Well-known member

Pat Hartman

Super Moderator

Similar threads

Users who are viewing this thread