Composite index

Romio_1968

Member
Local time
Today, 21:08
Joined
Jan 11, 2023
Messages
126
I have few fields that ar not unique, but i need the ”sum” of them to be unique.
For example, a book title is not unique, the publisher is not unique, the author is not unique, but a book of a given author, edited by a certian publisher in a certain yera must be unique
I was thinking to a composite index, but it seems that such a composite accepts only unique indexed fields.
Any ideea how can i manage this?
 
You are incorrect regarding the idea that a composite index requires unique indexed fields. It is possible for the composite index to be unique even if you have no indexes on the component fields. You DO have a size limit of 255 characters on a given index, so long titles, long publisher names, and long author names might be a bit messy.

What might be MORE efficient, though, is that author and publisher would seem to be fields that would repeat a lot if you have a lot of books, so you might have a symbol or code number to represent authors and to represent publishers in individual tables where you would have a unique code for each author and a unique code for each publisher. (There is no reason that each such table would NOT use the same numbers over again, as might happen with an autonumber ID for each entry.) Then in your book-title table you would have the actual book title, author code, publisher code, and year - a four field compound - and declare a compound index with one text field and 3 numeric fields.

How often do book titles repeat? That would govern whether it made sense to have a book-title table analogous to the author and publisher tables. But if not that many book-title are duplicates, perhaps it isn't worth that level of effort to split out a names table. In either case, it still might make sense for the book name table to have a non-unique index on the name. If you do the split I just mentioned, though, you would not need an index on each field in the book name table because the two "translation" tables would have a usable index on those values when needed.

When you need to produce a REPORT showing all of those fields, a multi-table JOIN would probably be the starting point. To ENTER data, a form with a couple of combo-boxs tied to author and publisher would make data entry fairly simple.

This is how I would approach the problem. However, from your brief description, I admit to "shooting from the hip" so if there is a wrinkle here and there, you would have to explain a bit more in order to smooth out those wrinkles.
 
Set up a composite index in table design - Assign a name to the index and add the fields that compose that index on separate lines. Set that index to be unique. Then any combinations of those items must be unique. The Doc's suggestions about structuring the data make sense too.
 
You are incorrect regarding the idea that a composite index requires unique indexed fields. It is possible for the composite index to be unique even if you have no indexes on the component fields. You DO have a size limit of 255 characters on a given index, so long titles, long publisher names, and long author names might be a bit messy.

What might be MORE efficient, though, is that author and publisher would seem to be fields that would repeat a lot if you have a lot of books, so you might have a symbol or code number to represent authors and to represent publishers in individual tables where you would have a unique code for each author and a unique code for each publisher. (There is no reason that each such table would NOT use the same numbers over again, as might happen with an autonumber ID for each entry.) Then in your book-title table you would have the actual book title, author code, publisher code, and year - a four field compound - and declare a compound index with one text field and 3 numeric fields.

How often do book titles repeat? That would govern whether it made sense to have a book-title table analogous to the author and publisher tables. But if not that many book-title are duplicates, perhaps it isn't worth that level of effort to split out a names table. In either case, it still might make sense for the book name table to have a non-unique index on the name. If you do the split I just mentioned, though, you would not need an index on each field in the book name table because the two "translation" tables would have a usable index on those values when needed.

When you need to produce a REPORT showing all of those fields, a multi-table JOIN would probably be the starting point. To ENTER data, a form with a couple of combo-boxs tied to author and publisher would make data entry fairly simple.

This is how I would approach the problem. However, from your brief description, I admit to "shooting from the hip" so if there is a wrinkle here and there, you would have to explain a bit more in order to smooth out those wrinkles.
Yes, the title is „long text”
I do have title codes and author codes. I may try using that. The ideea is that for a new title i wold like to have some check for duplicates, but not only for the title itself but for all thos variables combined.I am not interested in how often a title can repeat (not that aften, but imagine vow many people wrote a book called Marketing or Algebra. Yet, the publisher never reissue a book without a small change in thitle,even if they are adding sommething like „2'nd edition" so the composite should do the job.
Now, the check must be done after the user enters the title detalis (title, Publisher, year, location) so other complex reports and queries are not really wellcome in the workflow, that,s why i was thinking on a composite index, where a data validation should do the job.
What do you think about a trimmed title field, to 255 length. It is a calculated field, but it should work.
Thank you for the information about composite index limitation.
 
Last edited:
What do you think about a trimmed title field, to 255 length. It is a calculated field, but it should work

Work in what context? If you need more than 255 characters for title alone, your composite index will not be useful. If you have year (literal, 2 bytes as Integer), author code (4 bytes as Long), and publisher code (2 bytes as Integer), the composite index to assure uniqueness would have to be 8 bytes for the combined year, author, and publisher, leaving 247 bytes for title. AND that assumes that you have no more than 65K unique publishers.
 
Most of the titles are much shorter then 255 characters. Yet, there are authors that are creating gigantesque titles. Well... their book, their title. Anyway, if I add a calculated field that cuts the first 100 characters, combined with publisher, year, and I set the composite to unique, the probability to have a collision is null. As I told you, the composite sollution is prefered because I can use index validation instead of queryes. But I have to keep the original title as well, being the book title.
 
For example, a book title is not unique, the publisher is not unique, the author is not unique, but a book of a given author, edited by a certian publisher in a certain yera must be unique
I don't fully understand the above discussions.

I think you need a Book Titles table, an Authors table, and a Publishers table. The keys of these tables plus the year are brought together in a junction table to form the desired combinations. The composite index would then be created using these four values.
 
I don't fully understand the above discussions.

I think you need a Book Titles table, an Authors table, and a Publishers table. The keys of these tables plus the year are brought together in a junction table to form the desired combinations. The composite index would then be created using these four values.
Yes, you don't understand, of course. Look for "Composite indexes".
 
Below is a picture of the indexes dialog which shows three inexes. The PK and two other indexes each of which is composite. To add a new index, go to the first blank like. Add the index name, pick the first field and set the Primary property to No, the Unique property to Yes, and the Ignore Nulls to yes or No. Then go to the next line to add the second field. Leave the name blank and just pick a field name. Move to the next blank line to enter a third field name, etc. You can have up to 10 columns in the composite index. SQL server allows more - maybe 32.
OtherIndex.PNG
 
With books there should always an ISBN available. That's the way. Use that as a unique PK, and that should solve all problems.

There might conceivably be the same author and title released as a reprint or reissue. I presume they are treated as different books, and have different ISBNs. That's why ISBN is the safest course, I would think, although books are a special case, because there's a clear way of defining a book.

With CDs I have ripped my own CDs to storage and have had the tracks and album identified as a different one to my CD. I'm pretty sure I noticed this with a John Denver 2 disc album, where one of the discs identified as a different album. I think I had another where some tracks where mis-identified and misnamed, obviously from a similar but different album.

The point you were querying is that you would expect to find duplicate values for part of a composite index. It can't be any other way.

A sales order will most likely have multiple items, and a repeated order number. The unique composite key is order number plus product code, something like that. Neither order number not product code on their own would be unique, although it might be worth including them as indexes.
 
Last edited:
With books there should always an ISBN available. That's the way. Use that as a unique PK, and that should solve all problems.

There might conceivably be the same author and title released as a reprint or reissue. I presume they are treated as different books, and have different ISBNs. That's why ISBN is the safest course, I would think, although books are a special case, because there's a clear way of defining a book.

With CDs I have ripped my own CDs to storage and have had the tracks and album identified as a different one to my CD. I'm pretty sure I noticed this with a John Denver 2 disc album, where one of the discs identified as a different album. I think I had another where some tracks where mis-identified and misnamed, obviously from a similar but different album.

The point you were querying is that you would expect to find duplicate values for part of a composite index. It can't be any other way.

A sales order will most likely have multiple items, and a repeated order number. The unique composite key is order number plus product code, something like that. Neither order number not product code on their own would be unique, although it might be worth including them as indexes.
It is true, but the users already have about 10k records without ISBN. But yes, on a new project, ISBN is the solution.
 
It is true, but the users already have about 10k records without ISBN. But yes, on a new project, ISBN is the solution.
Yes. What I really meant is that it is really tricky to define a unique book. It may not be as simple as title and author.
 
if you want to keep a long title, consider hashing the value perhaps concatenating the other fields as well - that could reduce 1000 characters to 7 or 8 - certainly less than 255.

I use the technique to hash all the fields of a record to create a unique value as a temporary pk when importing data to ensure i don’t import duplicates

on my phone at the moment but can find the code when back at my desk
 

Users who are viewing this thread

Back
Top Bottom