Normalization Denormalization of tables. Keep it denormalized when it is possible.

'I've been creating applications based on relational databases since before most of you were born.' ;)
-----------
'The only place where I have ever used unnormalized tables is in Data Warehouse applications.'

Come on now Pat... Every db you ever put into production has been in 5th normal form... :rolleyes:
-----------
'It is possible to go overboard with how you define a repeating group,'

Hobby1 could be their favorite and Hobby2 could be 2nd favorite. :o

Just trying to help...
ken
 
I think normalisation is like many other things in life in that you have a basic structure or guidelines to work on but to take things to the extreme (in either direction) often results in an inferior outcome.

Mike
 
KenHigg said:
Hobby1 could be their favorite and Hobby2 could be 2nd favorite. :o
ken

what if there is no second favorite...? what if there is no first favorite...? results in storing nulls again...

Kev
 
I think 25 fields is damn stupid. Normalise, normalise, normalise...


Even when it comes to querying: Access would rather query a couple of normalised fields than a 25 field wide table.
 
Yeah, just goofing off... Good point though...

Really didn't mean to get off on this wild goose chase - :)

(I'll try to keep my feeble opinions on normalization to myself - I think I only made matters worse for monkey. :rolleyes: )

ken
 
Mike375 said:
I think normalisation is like many other things in life in that you have a basic structure or guidelines to work on but to take things to the extreme (in either direction) often results in an inferior outcome.

Mike
yeah right! you could end up with hundreds of tables and thousands of macros! :rolleyes:
 
Rich said:
yeah right! you could end up with hundreds of tables and thousands of macros! :rolleyes:

wouldn't have to worry about modules or code though... not enough time to work on these! :D :rolleyes: :p ;)

sorry Mike - couldn't resist :p
 
(I hope I can do this without using the 'n' word)

Repeat: With out nailing down the bussiness rules, you'll have a hard time getting this to work very well. So here are some rules to start with:

1. All customers will have a contract.
2. Contracts are broken down into 25 components.
3. All 25 components will always be complied with from day one of the contract.
4. The 25 components that will make up a contract will never change.
5. Hence, all contracts are exactly the same for every.
5. When a contract is terminated, for whatever reason, the contract (and its components) will be archived for future reference.

Can you look at this and either confirm, clarify, or reject each rule or add new rules?

ken
 
Mile-O-Phile said:
I think 25 fields is damn stupid. Normalise, normalise, normalise...


Even when it comes to querying: Access would rather query a couple of normalised fields than a 25 field wide table.

Mile,

Here is a question for you in terms of would you put this data on one row.

Title, salutation, first name, middle name, surname, business name, level, street, suburb, state, postode (and by 3 for home, business and postal address), business phone, home phone, mobile, fax, email, date of birth, smoker/non smoker, height, weight, gender, employed/self employed, net income, assets, liabilities, sick leave, workers compensation, married/single, occupation, qualifications, Medical speciality, visiting medical officer/staff specialist, graduation date, dangerous past times, medical loading/no loading, bank, source of lead, place of birth, spouses names and dates of birth, smoker/non smoker, spouse work/not work, spouse date of birth.

Then some fields which determine what sort of mail outs he/she gets, policy owner, accountant, solicitor.

Mike
 
I suppose I am just the simple sort

In my databases

Tables have just the right number of attributes. (Not too many, not too few)
Tables have a primary Key
Relationships are 1 to Many with the very occassional 1 to 1 but I worry about them
Referential Integrity is enforced
I do not have multi valued attributes
I do not store calculated values except in archived records
Full dependancies to primary key
No transient dependancies
I have just enough forms to do what I want to do
I have just enough reports to report what I want
Forms and Reports are based on queries and I have just enough of these as well
I try to remember to comment my code
I find that these are just about as many guidlines as I need to allow me to concentrate on the specification that some dumb ass has written asking me to do things that they cannot explain in a logical fashion

Len B
 
Mike375 said:
Mile,

Here is a question for you in terms of would you put this data on one row.

Title, salutation, first name, middle name, surname, business name, level, street, suburb, state, postode (and by 3 for home, business and postal address), business phone, home phone, mobile, fax, email, date of birth, smoker/non smoker, height, weight, gender, employed/self employed, net income, assets, liabilities, sick leave, workers compensation, married/single, occupation, qualifications, Medical speciality, visiting medical officer/staff specialist, graduation date, dangerous past times, medical loading/no loading, bank, source of lead, place of birth, spouses names and dates of birth, smoker/non smoker, spouse work/not work, spouse date of birth.

Then some fields which determine what sort of mail outs he/she gets, policy owner, accountant, solicitor.

Mike

let me answer for him.... no.

Kev
 
Mike375 -

Even though I know you will not hear this I will tell you anyway:

the example you posted above violates the principles of normalization, specifically the rule of third normal form (3NF) stating that all fields must directly relate to the primary key. If they do not directly relate then they should be seperated into seprate tables which link to the table in question...

Kev
 
Hum...

How would you set it up Kev?

ken
 
KenHigg said:
3. All 25 components will always be complied with from day one of the contract.

We track when they come in. Otherwise you're spot on. If you look at the sample DB I posted earlier in this thread, you'll see that we track dates of when all these components come in. So, different parts might arrive on different days.

Is it possible, since we're tracking dates per component, that this isn't a completely unnormalized table? I'm sure it's not, like, normalized to the 5th form or anything, but maybe this isn't as precarious a table as I thought...?...well, you guys are the pros, you'd know better than I.

Mile-O-Phile said:
I think 25 fields is damn stupid. Normalise, normalise, normalise...

Even when it comes to querying: Access would rather query a couple of normalised fields than a 25 field wide table.

No need to call my practices "damn stupid", friend. I'm here to discuss and learn. Since this thread was about denormalization, and I have a related issue, I threw my example out there for discussion. I try to follow normalization practices as much as possible, and I troll these forums all day long learning new things about DB design EVERY day. On the rare occasions I can contribute an answer to someone's question, I do. Most of the time, I find myself asking.

My point was, and if you look back over the thread you'll see I got mixed answers, what's better IN THIS CASE? One table with up to 90+ companies (90+ records) and 25 columns (25 fields) for their contractual obligations, or, since all components are eventually required, a join table with three fields (company, component, date) and 2250+ records...many of which will be null until the companies send in their components (which, again, they MUST do)? And, how would I show that join table on the form I provided in my example? I have users who are REALLY newbie (lots of temps), and must have everything laid out like that.
 
Kevin_S said:
Mike375 -

Even though I know you will not hear this I will tell you anyway:

the example you posted above violates the principles of normalization, specifically the rule of third normal form (3NF) stating that all fields must directly relate to the primary key. If they do not directly relate then they should be seperated into seprate tables which link to the table in question...

Kev

At what point do you cease to have the fields.

For example, if his address and names are included then logically one would think his various other identifiers would be included.

If I split that table then at different points I will have to rejoin it with a query and I need all of those fields for data that insered into a Word.doc via Word Bookmarks. Now in my experience are query that joins a few of these will run slower than the table itself.

The other problem is that I will need subforms to display information and that limits where I can place the fields. Also, with only "fields" to deal with an OnCurrent macro takes care of what is visible and invisible and also does MoveSize etc. based on conditions.

Having said that, I feel that the data on the row would be as "normalised" as having his name and address.

For example, would include the gender an date of birth. If the answer is "Yes" then for the insurance purposes you would be also include the other fields.

If you include date of birth then you would include height, weight, occupation etc because all these details "identify" the person as much as DoB indentifies him or Smoker/Non Smoker.

Mike
 
monkeytunes said:
(company, component, date) and 2250+ records...many of which will be null until the companies send in their components

I told you in the post above that you will never have null fields in this case because you DO NOT ADD the records in until the data is available... this is why its normalized - if you talking about appending 25 blank fields then your not thinking normalized - basically it would be the same as the flattened example...

Kev
 
Mike375 said:
Mile,

Here is a question for you in terms of would you put this data on one row.

Title, salutation, first name, middle name, surname, business name, level, street, suburb, state, postode (and by 3 for home, business and postal address), business phone, home phone, mobile, fax, email, date of birth, smoker/non smoker, height, weight, gender, employed/self employed, net income, assets, liabilities, sick leave, workers compensation, married/single, occupation, qualifications, Medical speciality, visiting medical officer/staff specialist, graduation date, dangerous past times, medical loading/no loading, bank, source of lead, place of birth, spouses names and dates of birth, smoker/non smoker, spouse work/not work, spouse date of birth.

Then some fields which determine what sort of mail outs he/she gets, policy owner, accountant, solicitor.

Mike



Short answer: No

Long answer not sure if we are talking insurance claim or a new insurance poilicy. This should give you some ideas :D

Unlike some people i don't mind lookup tables when you have as many catergories as i've seen in the NHS datasets and they keep changing their minds as to what the items are called it makes life a lot easier IMHO. :)


TBLCLIENT

ClientID(PK) Buggered if i can work out a suitable natural primary key from this lot that is fool proof. Best one could be DOB NAME PLACE OF BIRTH CURRENT HOME ADDRESS. I personally would include national insurance number - or the equivalent - in TBLCLIENT as this should be unique and it's a good security check as well.

Title (could be 1 to 1 look up with TBLTITLE)
Salutation(Could be 1 to 1 look up with TBLSALUTATION)
first name
middle name
surname

Code:
Seperate out the address. Only because if you are doing a policy for a
 husband and then the wife or number of people who work for the same 
business it makes it alot easier to manage. Choosing key fields is difficult. 
Building number and postcode are usual in the UK. Not sure if you use 
Building number - i.e Number 10(k), downing street, London, W1(k). And not 
sure what level is?


TBLBUSINESSADDRESSES
BUSSINESSADRDRESSID, business name, level, street, suburb, state, postode 

TBLHOMEADDRESS
HOMEADDRESSID, level, street, suburb, state, postode

TBLPOSTALADDRESS
POSTALADDRESSID, level, street, suburb, state, postode

BUSSINESSADRDRESSIDFK,
HOMEADDRESSIDFK,
POSTALADDRESSIDFK,


smoker(yesno tickbox)
height
weight
gender

Code:
employed/self employed (assuming that they must have a job then
 employed self employed can be a yes/no checkbox. If not i.e you want to 
know their employment status field should be employmentstatus which can 
be a 1 to 1 look up to a table of employmentstatus, this is a business rule)

EmploymentStatus

net income

assets (assuming this is just a cash figure if not then you need another 2 tables see qualifications),

liabilities(assuming this is just a cash figure if not then you need another 2 tables see qualifications),

sick leave (what how many days this year? ever?)

workers compensation (Do they want some :D Have they had some? Are they being insured for some?)

married/single (Often seen commonlaw, cohabiting etc could have a 1 to 1 look up so that it becomes more flexible)

occupation (Again 1 to 1 lookup)

qualifications
Code:
 If this is a list of their qualifications then you need 2 more tables

TBLQUALIFICATIONSLINK
CLIENTIDFK
QUALIFICATIONIDFK


TBLQUALIFICATIONS
QUALIFICATIONID (Unique identifing autonumber)
QUALIFICATIONDESC (PK)

Some people would not use the field QUALIFICATIONID as the QUALIFICATIONDESC (PK) is unqiue i just prefer it that way so tough. :p

graduation date


dangerous past times (as for qualifications)
medical loading (yes/no)

bankidfk ( from tblbank with all banks contact details)

source of lead (1 to 1 lookup)

place of birth (1 to 1 lookup)

spouses names and dates of birth (as for qualifications, seeing as you have used the plural- i.e. divorced remarried. Even if you have only the current/last spouse you need TBLSPOUSE. Spouse work(yes/no) is part of TBLSPOUSE)

couple of others are repeats.


HTH

TS
 
Last edited:
What exactly is a component? Is it a paragraph or a clause that the customer agrees to comply with but which may take some time to fulfill after the contract has been signed? Say, like your company requires they prove insurance by suppling you with some type paperwork?

If this is so, What happens if your company is ready to do business but they are not in compliance with a clause? Do you need something that will let you know when a component is in compliance? When the entire contract is in compliance?

Finally - Wouldn't you guess that down the line some where, your legal department would:

a. Want to keep track of old, out of date contracts, and
b. Change the contents, revise the components of new contracts.


???
ken
 
Mike375 said:
If I split that table then at different points I will have to rejoin it with a query and I need all of those fields for data...

Mike - here is where your logic is flawed. You would not create this relationship in a query but in the ER Diagram you would create a relationship between the tables - again you do not make databases... you are creating macro activiated spreadsheets. The number of field would be mute as a proper normalized design would streamline the data entry where 90% of the data you posted could/would be captured in a suform/continuous form setup.

Ken -

Off the top of my head without know the exact use and structure of the data I would go with 4 tables:

tblCustomer
tblCustomerAttributeProfile
tblFamily
tblBenefits

tblCustomer:
CustID PK

tblBenefit:
BenefitID PK

tblFamily
FamilyMemberID PK
CustID FK (1 - many w/tblCustomer)

tblCustomerAttributetProfile:
CustProfileID PK
CustID FK (1-Many w/tblCustomer)
BenefitID FK (1-Many w/tblBenefit)

This again, is with no prior knowledge of the data and about 5 minutes of thought so its a little ruff but for this example will work. Here tblCustomer would contain data related to the customer him/her self - Name & contact info. tblCustomerProfile contains all profile infomation that is 1-1 with the customer: (age, height, weight, sex, etc...) this is split from the customer table because you may or may not have all the data for each of these fields... as the data becomes available it is added to the database without having to store blank/null data in the first place. This table also contains the benefit details for the holder (i.e. fire, accident, life, etc..) since these things are all dynamic you dont have a table with these pieces of information running across the top (a la spreadsheet) because not every one each of these policies - you only store data on the policies each individual has. Finally, tblFamily contains data on the different members in the policy holders family that relate to the customer...

Again, this is quick and dirty and could probably be streamlined some but I think you get the jist of it...

HTH,
Kev
 

Users who are viewing this thread

Back
Top Bottom