Relationships Headache

Hey

Why oh why give such wrong advice.
Every table should have an Autonumber PK
Who told you to avoid using Autonumbers?
mike60 - 40 years of data modeling experience has proven (to me) that autonumbered primary keys are not necessary where an existing natural key will do the job.

In nearly every instance of the excessive use of autonumbering where a natural key already exists and would do the job perfectly well, I see either users who don't understand their business, or developers who've failed to elicit from users their true business case, or both - liberal use of autonumber primary keys is not a substitute for a well-understood business need.

So, why my assertion about natural keys in place of autonumbering primary keys...?

1. Natural keys are self documenting which greatly simplifies the design of the front end and especially the creation of reports.
2. Natural keys are familiar to users and this simplifies communication amongst users and amongst users and developers when exploring the business need.
3. Natural keys remove the need to (always remember to) create a unique index to ensure relational integrity exists in stored data.

Here's my example model to support my argument:

If a particular 'race' is called, for example, "Iditarod Trail Sled Dog Race" then that name will serve perfectly well as the primary key - there is only one such race.

Race: RaceName (PK), <other non-key attributes>

Now we need a table to hold the occasions when this race is run - ie the "Race Event"; this 'event' table inherits the Race primary key to which I'd add the date when the race us run to make a composite primary key - each running of the race is now unique:

RaceEvent: RaceName (PK, FK), DateHeld (PK), <other non-key attributes>

We could then create a table that shows all entrants in an instance of the race:

Competition: RaceName (PK, FK), DateHeld (PK, FK), TeamID (PK, FK), <non key attributes, like PlaceAchieved, TimeStarted, TimeEnded etc.

...no autonumber primary key required yet.

However, there are occasions when autonumber keys are necessary -- when no natural key exists. For instance, consider a database storing riverwater quality -- sites along a transect where samples are taken may not exhibit any special characteristic unique to each site - they're just sites 1, 2, 3 etc. We don't need to know anything specific to each site, but we do want to be able to assign the correct set of results to each site so we don't end up with duplicates, or missing data etc. I'd probably use autonumber primary key for each site in this scenario to ensure referential integrity and functional dependence between all site-related ata. (I note - from experience - that a site's spatial coordinates - latitude/longitude for example - are not suitable as primary keys because they're not reliably discreet numbers and not immutable).

--

I'm very interested in any evidence you have to support your claim that "Every table should have an Autonumber PK".

Cheers
Greg
 
Last edited:
I'm guessing that you don't use Access FE's much. A single field natural key won't be a problem but a multi-field natural key will prevent you from selecting rows using a combo or listbox which is why I always recommend autonumbers for PKs with Access. The combo and listbox require a SINGLE unique identifier for selection purposes. They are not capable of creating a multi-field FK.
Pat - this is true; so don't use complex primary keys in lookup tables; I don't. But if it's unavoidable, then I'd code the solution in a modal popup form and post the PK values in to the receiving form (table). Alternatively, I could use a combo/list box for each key field but that's a messy and inelegant solution.
 
Last edited:
If a particular 'race' is called, for example, "Iditarod Trail Sled Dog Race" then that name will serve perfectly well as the primary key - there is only one such race.
Your primary key occupies 29*2+10 = 68 bytes. A long value requires 4 bytes.

A key is used for comparative operations such as linking, filtering, grouping, sorting. When you think of mass data processing, there is a very big difference in terms of effort and performance whether you have to match 68 bytes or 4 bytes with your peers.
 
Your primary key occupies 29*2+10 = 68 bytes. A long value requires 4 bytes.

A key is used for comparative operations such as linking, filtering, grouping, sorting. When you think of mass data processing, there is a very big difference in terms of effort and performance whether you have to match 68 bytes or 4 bytes with your peers.
Good point. So let's use a unique surrogate key initialism like "ITSDR" - that would (probably) still satisfy my criteria. It'd require a little extra leg work to present users with the full race name in selections and on forms and reports, but it's almost self-documenting as users become familiar, and far better in my view than something like "Race=17" for example. But putting aside the space issue, surrogate keys that require educating users are probably better replaced with the actual value like "Iditarod Trail Sled Dog Race" - no user education required, no mental gymnastics and no extra steps when building queries.
 
I don't know how you program. Users get a graphical interface, they don't look at tables. This way they are never confronted with surrogate keys and never have to think about it. Natural keys don't change? I live in the east of good old Germany, our history goes back further. A lot is changing.
 
I don't know how you program. Users get a graphical interface, they don't look at tables. This way they are never confronted with surrogate keys and never have to think about it. Natural keys don't change? I live in the east of good old Germany, our history goes back further. A lot is changing.
Good point - which is why (where it has an impact) I'd never use a mutable concept like Country Name as a primary key... not only do the names change/merge/diverge/subsume; so do the boundaries (definitely calls for a spatial database solution).
 
"Race=17"
A user would NEVER, EVER see a surrogate key unless you want to use them as things like InvoiceNumber which most people don't do. Users always use the natural key or other search fields to find the records they want, hence the need for proper indexing. Unless you have millions of rows in your tables, you don't need to worry about the additional space required for two indexes on some tables. The time consuming thing is index maintenance and natural keys are unlikely to change or they wouldn't even be candidates.
 

Users who are viewing this thread

Back
Top Bottom