Multiple Primary Keys or Composite Unique Index (1 Viewer)

thomasknebel · Jun 1, 2016

Hi,

I am looking for advise on when to a Multiple Primary Key and when a Composite Unique Index shall be used.

I created a table with a many to many relationship which I could use Multiple Primary Key or a Composite Unique Index. This table is filled with additional data fields too and not only the many to may relationships.

Thank you,

Thomas

GinaWhipp · Jun 1, 2016

Hmm, well I would use an Autonumber as the Primary Key that the database uses and then have a secondary field set up for the Composite Key then you get the best of both worlds without the headache of a Composite Key for a Primary Key.

informer · Jun 24, 2016

hi,

About PK, I always make a différence between
PKT primary key technical for making a link with others tables
PKF primary key
functional by declaring a unique index composed of fields which single out a record.
For exemple considering a table client
pkt :idclient = autonumber
pkf (unique index with value mandatory for all fields) : First + last name... Enough not of course...
+ birthdate... Not enough
+ adress... Now it's enough to single out a client in your db

Apply this method for tables with multiple foreign PKs

Galaxiom · Jun 24, 2016

informer said:
About PK, I always make a différence between
PKT primary key technical for making a link with others tables
PKF primary key
functional by declaring a unique index composed of fields which single out a record.

Your terminology is inappropriate. By definition, there can be only be one Primary Key. Any others are just keys.

For exemple considering a table client
pkt :idclient = autonumber
pkf (unique index with value mandatory for all fields) : First + last name... Enough not of course...
+ birthdate... Not enough
+ adress... Now it's enough to single out a client in your db

An index like that would add overheads to the table with no tangible benefits. One tiny difference in the address formatting and what would essentially be a duplicate is allowed in.

Far more useful to offer the user potential close matches to existing records when entering a new record. It would find those who have changed address which is one of the most common things encountered in a person table.

informer · Jun 24, 2016

a comment about the 5 NF and conceptual data model should be more constructive

gemma-the-husky · Jun 24, 2016

your terminology is confusing, I think.

In a table you need a field or number of fields that designate a unique item. This is a unique index, which may or may not be nominated as a primary key.

To relate this table to other tables, the PK needs to be represented in the linked table as a foreign key. FK.

So if the uniqueness of your records is represented by a "compound index" then all fields in this index must be represented in the linked table.

because manipulating compound keys can be awkward, it often suits us to use an ADDITIONAL artificial autonumber PK in the main table. You only then need a single field FK in the sub table. Keep the "real world" or "natural" index also, because you still need that to manage your data.

informer · Jun 25, 2016

hi gemma,

my target is to clarify Pk concept. The only meaningful PK is the unique index which singles out each record on a table.
And the autonumber PK is only an artefact of the true PK

gemma-the-husky · Jun 25, 2016

informer said:
hi gemma,

my target is to clarify Pk concept. The only meaningful PK is the unique index which single out each record on a table.
And the autonumber PK is only for an artefact of the true PK

I meant the OP

I think your note regarding the distinction between the PKT and the PKF is useful. We have a lot of discussions about PKs, and you get the feeling that some posters think you can replace the PKF with a PKT (aoutonumber), rather than that you still need the PKF (as a unique index, rather than PK) as well as the PKT

The_Doc_Man · Jun 27, 2016

Informer:

There is a distinction between synthetic keys and natural keys, but we have to remember that sometimes what is a perfectly true "natural key" is not a viable key for database functionality simply because of what it implies about the contents required for child tables. Forcing a 4-field compound on every little table tends to obfuscate or even outright negate the benefits of normalization. Not to mention what it does if there are any cases where some tables only need 3 of the 4 fields as their PK/FK, whereas others need all 4, and some might get away with only 1 or 2 of the participating fields.

The "functional" keys, though valid, have this nagging little problem in the real world - they are too often mutable. The "technical" key, on the other hand, might add 4 bytes to your records but it lets you avoid the problem of cascading updates to multiple compounded keys when you update what you call the functional keys. For small databases, this is purely a philosophical discussion. For much larger databases, forcing four full-sized key fields on each record might cause instant bloating.

informer · Jun 30, 2016

Hi The_Doc_Man

Thanks for these clarifications and as you mentioned it

The "technical" key, on the other hand, might add 4 bytes to your records but it lets you avoid the problem of cascading updates to multiple compounded keys when you update what you call the functional keys

And for this reason, I systematize the use of PKT instead of PKF.

But as ever, there is no universal rules and at the end, the choice between PKT and PKF is at discretion of the database modeler

Galaxiom · Jul 1, 2016

informer said:
And for this reason, I systematize the use of PKT instead of PKF.

Not very "functional" then is it?

informer · Jul 1, 2016

Galaxiom said:
Not very "functional" then is it?

hi Galaxiom
Could you be more explivite?

The_Doc_Man · Jul 1, 2016

there is no universal rules and at the end, the choice between PKT and PKF is at discretion of the database modeler

True, and I partially concur. However, I think YOU would also agree that to get this question right, you should do your analysis up front to consider the side-effects of natural vs. synthetic keys, usually on a case-by-case basis for small natural keys.

For instance, if you have a USA database with a table for state-relevant information, there is no reason to use a synthetic key of 4 bytes (autonumber) when the two-letter postal abbreviation for each state is (a) shorter (b) unique (c) unlikely to change. But on the other hand, when something involves very long (and even worse, variable length) text names, there comes a point where a synthetic key is easier to manage.

The reason I partly concur as opposed to fully concurring is that if you are doing it right, there actually IS one really hard-and-fast rule. In your database, the dog wags the tail; the tail does not wag the dog. You ALWAYS must design your database to reflect the real data associated with your problem of interest. The data (structure and flow) drives the design.

Decisions certainly CAN be made to accommodate practical matters such as shortening key lengths to make a child table's records shorter. These ARE a reality in a database where there is an upper limit to the size of a single data file - even if that limit is fairly large. Not to mention the limit on the size of a single record. But in the end analysis, the data should drive the solution. If your design doesn't follow that rule, you ARE doomed to less than stellar results.

informer · Jul 1, 2016

I totally agree with your last comments, The Doc Man. Tanks for your precisions

Galaxiom · Jul 1, 2016

informer said:
The_Doc_Man said:

The "technical" key, on the other hand, might add 4 bytes to your records but it lets you avoid the problem of cascading updates to multiple compounded keys when you update what you call the functional keys.

Click to expand...

And for this reason, I systematize the use of PKT instead of PKF.

Functional
adjective
1. of or having a special activity, purpose, or task.
2. designed to be practical and useful, rather than attractive.

Maybe the meaning got lost in the translation, like the meaning of "primary" when you have two keys you refer to as "primary". I don't think your terms of PKT and PKF are helpful. By definition there is one Primary Key in a table. Your PKF is just a key, or "candidate key" if you like.

Earlier you gave an example of a PKF as FirstName, Lastname, DateOfBirth and Address. While it represents one reliable way to select an individual record, one would not treat it in any way different from the rest of the data in the record and one certainly would not use it as a "primary" way of referring to a record.

Nor would there be any functional benefit indexing it in an effort to ensure that two records did not refer to the same person especially given the overhead of maintaining such an index. A tiny difference in the way the address is entered (eg "Main St" instead of "Main Street") and the duplicate slips through.

So the PKF isn't "primary" and it isn't particularly "functional", leaving it as what I said, just a "key", which is what it would be referred to by the vast majority of database developers.

informer · Jul 2, 2016

Hi Galaxiom

On Wikipedia here, we can read :

A surrogate key may also be called a synthetic key, an entity identifier, a system-generated key, a database sequence number, a factless key, a technical key, or an arbitrary unique identifier. Some of these terms describe the way of generating new surrogate values rather than the nature of the surrogate concept.

And still in Wikipedia here
A primary key uniquely specifies a tuple within a table. In order for an attribute to be a good primary key it must not repeat. While natural attributes (attributes used to describe the data being entered) are sometimes good primary keys, surrogate keys are often used instead. A surrogate key is an artificial attribute assigned to an object which uniquely identifies it (for instance, in a table of information about students at a school they might all be assigned a student ID in order to differentiate them). The surrogate key has no intrinsic (inherent) meaning, but rather is useful through its ability to uniquely identify a tuple.

Galaxiom · Jul 2, 2016

informer said:
On Wikipedia here, we can read :

A surrogate key may also be called a synthetic key, an entity identifier, a system-generated key, a database sequence number, a factless key, a technical key, or an arbitrary unique identifier. Some of these terms describe the way of generating new surrogate values rather than the nature of the surrogate concept.

And still in Wikipedia here
A primary key uniquely specifies a tuple within a table. In order for an attribute to be a good primary key it must not repeat. While natural attributes (attributes used to describe the data being entered) are sometimes good primary keys, surrogate keys are often used instead. A surrogate key is an artificial attribute assigned to an object which uniquely identifies it (for instance, in a table of information about students at a school they might all be assigned a student ID in order to differentiate them). The surrogate key has no intrinsic (inherent) meaning, but rather is useful through its ability to uniquely identify a tuple.

Yes. We see references to a surrogate key being called a technical key. I have no problem with that at all. But I don't see any mention of a natural key being referred to as "Primary Key Functional" nor any variant of those terms.

My main issue is with your terminology. You have introduced a nonstandard term that conflicts with accepted usage and, more importantly, does not accurately describe it.

There is no need for the term because it is already described clearly as a "candidate key" in standard terminology. A candidate key is one that could potentially be used as a Primary Key. However you want to twist the semantics, a single table does not have two types of Primary Keys.

What you suggested in your first post is that a table should have a unique index on a field, or set of fields that form a unique key plus a technical or surrogate key if that index is too complex.

What I am saying is that I disagree with the notion that the unique index is necessary in addition to a surrogate key.

In particular, in the example you gave of using FirstName, LastName, DoB and Address I suggest that such a index would not reliably ensure that one person was not recorded more than once due to the potential vagaries of the entry of the address. Moreover, such an index would be a considerable overhead that, in my opinion, would not be justified.

In that situation we are better to rely on a simple search and offering the operator a list of potential close matches to existing records, including those with a match on any of the fields that would comprise your index, ordered by their similarity to the proposed new record.

We would want to consider any record that matched on FirstName, LastName and DOB. People do change their address. They change their first name, they change their last name. Names can even have their spelling altered. As such these attributes don't make very good keys at all.

In my own database work I make extensive use of Damerau-Levenshtein distance to find similar records.

informer · Jul 2, 2016

Hi Galaxiom

Of course, the perfection is not in our world but we try to approach on it. And I saw so many db system with dirty data (duplicated data) due to absence of a unique index with the use of a technical primary key only that I disagree with idea to not use an unique index.

And the natural / functional key is built also according to the probability to have a same primary key for different items on a entity. For example customer entity

What is this probability if we base a customer entity PKF on :
first name ? Too strong !
First + last name? Still too strong !
First + last name + date of birth ? Less strong but still too much
First+ last+ Date + zip code Place of birth? Probability to have different customers with these same properties is sufficiently low to create unique index on these fields

ButtonMoon · Jul 2, 2016

The idea of identifying customers by name and date of birth seems to me far too unrealistic to be worth considering. What kind of business would want to ask their customers for their date of birth? True there are cases where the date of birth might be required for customers, healthcare applications being one example, but those healthcare applications are never likely to use name and date of birth as a key.

Natural keys (composite keys included of course) are incredibly important, even essential to successful database design. Unfortunately the internet is littered with poor examples like yours. That's precisely why the issues are so often misunderstood and misrepresented.

Galaxiom · Jul 3, 2016

informer said:
And I saw so many db system with dirty data (duplicated data) due to absence of a unique index with the use of a technical primary key only that I disagree with idea to not use an unique index.

Those systems clearly didn't check for potential duplication during entry.

Relying on your index alone to avoid duplicate won't do much better. Bill Smith, Will Smith and William Smith all with the same DoB and address will waltz straight in. His wife Patricia could be there too as Patricia, Patti and Trish. Then all of them again at the address they were living at five years ago.

You really aught to take the opinions of your data modelling software as suggestions rather than as gospel and start thinking about nature of real world data.

Multiple Primary Keys or Composite Unique Index (1 Viewer)

New member

AWF VIP

Registered User.

Super Moderator

Registered User.

Super Moderator

Registered User.

Super Moderator

Immoderate Moderator

Registered User.

Super Moderator

Registered User.

Immoderate Moderator

Registered User.

Super Moderator

Registered User.

Super Moderator

Registered User.

Registered User.

Super Moderator

Similar threads

Users who are viewing this thread