Text as Primary Key vs Auto Number

jk42

Registered User.
Local time
Today, 06:06
Joined
Apr 12, 2013
Messages
78
Hi All,

I really just wanted some opinions. When I first was making my database I used all auto numbers for PK's. But then I read an article or two saying that you really don't need to do that if you don't have to (actually, the article said "don't use autonumber if you want anyone to understand your database). I have a few tables (most employee experience type tables) where I just use the word as the PK and there is no auto number in the table at all. Since I'm learning as I go, does anyone see a major problem with that as I continue to build my database?

Just curious.
Thanks!
 
The role of the PK is to provide a unique ID or reference for each record. Autonumber is the simplest for this as Access takes care of creating the number. If you choose to go down the route of another field type for the PK, it would probably be best to determine a format this will comply with (to ensure conformity and - I think - make indexing easier). Say it is a people table, you might decide the format is Surname & first letter of Forename. With this though, you run the risk of encountering a situation where the PK value you go to enter is already in use so you have to use something else and violate your PK format.

Whilst Autonumber is not Essential, it does make life alot easier. You can always add your own ID field as well as an autonumber for aesthetics.
 
In general, the Primary key is for use by the database system to ensure each record/relation is unique within a table. If you have some field/construct that does this and is not an autonumber, so be it.

As Isskint has described, make sure whatever you choose satisfies your application - it must be unique for every relation/record in your table.. forever.

The autonumber (or sequence in other database systems) is a convenience to have a meaningless unique number. Since it is unique, it generally satisfies the requirement of PK.

Does that mean you have to use autonumbers? No. But they are convenient; are unique; and satisfy the database management system.

There are several opinions (pro and con) re autonumbers. Just google - surrogate keys.
 
The only fundamental requirement of a key is that it is a set of attributes which together are irreducibly unique across all the permitted populations of a table (no duplicates or null values). Beyond that basic requirement there are other common sense criteria that ought to guide the selection and design of keys. Three usually desirable criteria are: Simplicity; Familiarity; Stability.

Key generator functions such as "autonumbers" are most often used for surrogate keys and the main reason for creating a surrogate key is to provide a compact and stable identifier where there is no alternative. Auto-generated numbers can't and shouldn't be a universal solution for keys however because databases also have to implement business rules accurately and maintain the unique identifiers used in the business domain. As so often, the wisest answer is "it depends".
 
Thanks to all of you. This does help me. I think I'll be ok. There are only a few tables where I don't use auto number, as it's just easier to use the experience names. Thanks again!
 
Working with Access, you will find it easier if you use single-column primary keys. It doesn't matter if they are autonumbers or user-defined. When you decide to go with a surrogate key (autonumber) even though you have a candidate key (unique field or set of fields) in the table data, don't forget to create a unique index on the candidate key to enforce the business rules. All joins in queries and foreign keys will reference the autonumber surrogate key but searches will be done on the candidate key. So, for folks in the US, SSN is a typical candidate key. It is a violation of privacy to use SSN as a primary key so you will always use a surrogate key. But you still need a unique index on SSN to ensure there are no duplicates.
 
The other consideration is the ease of input, I have 151 types of Medium using Oil (Paint); 52 Acrylic so I use Short Keys because it is quciker than scrolling down all the variations. OOPN is Oil on Panel is is 104 in the list starting with O. OOPN involves four keystrokes.

Simon
 
So, for folks in the US, SSN is a typical candidate key. It is a violation of privacy to use SSN as a primary key so you will always use a surrogate key. But you still need a unique index on SSN to ensure there are no duplicates.

Hi Pat,

Privacy is unrelated to the choice of keys. Privacy and security are concerned with how data is managed and used. If you expose data to unauthorised people or fail to secure it properly then that is a potential violation of privacy irrespective of whether it is designated a primary key or not.

It might sometimes be that the technology presents certain obstacles to securing key attributes vs non-key attributes (replication for example), but in all cases it is the methods you use to secure and manage the data that addresses privacy, not the choice of keys.
 
When you get to be 65 and sign up for Medicare, you will be happy to find out that your SSN is your Medicare ID followed by a letter that indicates whether you are collecting SS or not yet. You are required to carry your Medicare card to show to any providers of medical service. Tell me that isn't a violation of privacy given the uprising of identity theft. In the past, many companies also used SSN as employeeID but for the most part they have switched to surrogate keys instead. Having your SSN stored in a computer file somewhere is necessary. Having it printed on a card you are supposed to carry is a violation of privacy. But the feds get away with it.

When you are developing a new system, there is absolutely no reason to use SSN as the primary key. Use a surrogate key instead and keep the SSN hidden away encrypted the way you would a password if possible.
 
Pat,

I entirely agree on the importance of privacy and that SSN shouldn't be exposed and put at risk of disclosure. That has nothing to do with whether it's a key or not, much less whether it happens to be a primary key. Just because you enforce the uniqueness of an SSN that doesn't mean you should make it any less secure - it just means you care about data integrity. Wouldn't you agree?
 
Pat,

I entirely agree on the importance of privacy and that SSN shouldn't be exposed and put at risk of disclosure. That has nothing to do with whether it's a key or not, much less whether it happens to be a primary key. Just because you enforce the uniqueness of an SSN that doesn't mean you should make it any less secure - it just means you care about data integrity. Wouldn't you agree?

No. A PK that requires encryption to meet regulations is a rather ugly concept.
 
A PK that requires encryption to meet regulations is a rather ugly concept.

Not at all. In very many cases entire databases are encrypted. Perhaps you haven't worked with such systems but I can promise you there is nothing ugly about them. I seriously doubt that there are regulations that apply only to data when used as a key in a table - the data values themselves are what matter to regulators. And if you want to ensure that encrypted SSNs are unique (as Pat suggested) then you'll need a key enforcement mechanism that works for encrypted data.

Encryption in any case has very little to do with the general concept of security in a database context because the gatekeeper to data is usually the DBMSs own authentication mechanism. Typically the same mechanism is used whether data is encrypted or not.

I think Pat's point is really about disclosure - unnecessarily exposing the SSN outside the database. However, it's extremely common to have keys that aren't exposed to all business users. Different users may use different identifiers for employees depending on the context. For example the login name, payroll number, badge number or surrogate key might all be alternative keys for employees. The question of which key might be designated "primary" by a database designer is uninteresting to business users and to regulators. What matters is what keys are used by whom and in what real world context.
 
Don't forget also that SSN can be changed. It isn't easy and it practically takes an act of God but it does happen due to identity theft. That in and of itself should disqualify it. Also, using SSN ASSUMES (and we know what that means) that all entries will be citizens who have a SSN. We will also be coming to the end of the useful life of the SSN. I don't remember when the SSN was introduced but I'd guess we've used more than half of the available numbers by now and with the "official" addition of millions of illegal immigrants we're going to immediately use another 1-3% of the capacity depending on who you believe regarding how many illegals will be given amnesty. I am not positive but I think EIN's take up some segment of the numbers also.

We're going to have to agree to disagree regarding the use of SSN as a primary key.
 
Pat,

That key values change is a fact of reality. If you think you have a key that will never need to change then you are probably wrong. Stability is a sensible basis on which to select a key but immutability is irrelevant, often unachievable and frequently impossible to verify.

Should we not enforce keys if we expect their values to change? Of course we should still enforce them. The fact that key values may change is at least as good a reason to implement a key as not because if the key constraint is not enforced then there may be a greater risk that a changing value could unintentionally violate uniqueness.

My main disagreement though was with the comment you never explained - that using SSN as a primary key is somehow a "violation of privacy". It is not in any sense that I'm aware of. I think that's a far more important point.
 
The example I gave you of the Medicare card is an example of a privacy violation. We are all warned against carrying our SS cards on our person and yet once people become eligible for medicare, they are told to always carry their Medicare cards.
 
The example I gave you of the Medicare card is an example of a privacy violation.

It is indeed but it has nothing to do with whether the SSN is a primary key or not; it's simply an example of careless use of personal data.

Suppose the designers of the Medicare database had followed your advice and designated something else as primary key but they kept SSN as an alternate key and still printed it on the Medicare cards. That would amount to the same violation of privacy. Now suppose instead that they made SSN a primary key but they printed some alternate key value on the cards - the SSN only being used inside their systems for internal administrative purposes. The violation of privacy you described would no longer exist. The choice of primary key is irrelevant as far as privacy is concerned. What matters is how the keys (potentially any number of different keys) are used in the outside world.

Generally speaking, where several alternative keys are in use the choice of which one should be "primary" is essentially arbitrary (that was E.F.Codd's opinion, not just mine) and a matter of perspective. In practice a "primary" key is often chosen for reasons that have nothing to do with what is the "preferred" or most widely used identifier.
 
I believe in following a Standard. Yours may be different to mine but whatever it is it should be followed 100% of the time if possible. We can all follow the standard of not using Reserved Words or not using spaces (And other non Alpha Numeric characters) but it is possible to use either. My preference is to select the one that can be used all the time. In these cases only not using Reserved Words and not using spaces (And other non Alpha Numeric characters) are the only two that can be enforced.

The same applies to AutoNumber V Natural Values. AutoNumber can always be used where Natural Values may not. In the case of Natural Values you may have to introduce a Second or Third or Fourth Field to obtain a unique value.

There is one other rule I apply with Primary Keys and that is that it is never seen by the end user.

These are the rules I have set myself and I stick by them.
 
Now for a quick question about your SSN. Is this privacy thing that I see so often, set down by law or is this someone's interpretation of some privacy law.

Then when we are deciding who should have access to that number how is that done. (Namely who gets Access)
 
I am gathering the opinion that there are no privacy laws pertaining to a SSN. Neither is there a Law to do with the privacy of your DOB, Driver's License, and Medicare etc.

There is strong advice that one should limit who you give this information to, but no Law. This advice originated from identity thief.

I would think that there is a Law preventing you from sharing my information with a third party.

I must add that I am not American and my knowledge of US Laws stems from watching American TV shows and Movies. However we do have similar problems here in OZ so it follows that we both have similar Laws to overcome these problems.

If I am correct then a reassessment of your stored data could be warranted.
 
Last edited:

Users who are viewing this thread

Back
Top Bottom