Solved Should (some) non-keys in tables be set to "indexed?" (1 Viewer)

HalloweenWeed · Aug 16, 2023

Recently I have heard from a forum member that one should not mark a table field as "indexed" before adding a query that selects records based on said FK field, claiming that Access will (sometimes?) create a duplicate index (if you create the relationship before marking the FK as indexed or adding a query to select records based on it), and was told that Access will automatically index said field. This question is somewhat related, as I am rebuilding my tables from scratch in a new immature database. I have a field that I have marked as "required" and do NOT "allow zero length," it may become hundreds of records in length, and this field will be used by the interface to select the record, but it is not the FK (nor the PK). Should I mark it "indexed: Yes (no duplicates)?" The field is type "short text." I am thinking it may enhance performance of the database when the user selects the record, when the table becomes many records in length. The user will probably be using a combobox to select the record, and it will be sorted by this field. Thank you for any input regarding this matter.

ebs17 · Aug 16, 2023

An index is a database object built into the definition of a table. Both are objects of the database in which the real table is actually located.
An index cannot be created by a "normal" query (Data Manipulation Language as a subset of SQL), but it can be created with DDL (Data Definition Language). Anyone who uses something like this knows what they are dealing with.

An index is comparable to the index in a book. It supports reading processes and thereby accelerates them. An index slows down writing processes because the index tree has to be maintained in addition to the data. In a typical database, however, read operations predominate, so that an index used is mostly useful.

An index is useful for fields or combinations of fields that are used for comparison, that is, where operations such as linking, filtering, sorting, grouping, and partial aggregation are applied.
An index is particularly important when linking tables via relationships, so when creating a relationship with referential integrity set, Access automatically indexes the foreign key field. However, this index is not visible in a superficial view (index window). Indexing the foreign key field separately would therefore be an unnecessary duplication.

In addition, there is a setting in the options in Access, where fields with specific designations (ID, key) are automatically indexed to support beginners and those who do not know. It is better to switch off this setting immediately in order to be able to implement your own index planning without disruptions.

To reiterate, select queries don't create indexes, but they can use them if they exist.

CJ_London · Aug 16, 2023

Sounds like the member is misinformed or you have misunderstood

access will automatically create indexes where they are specified- see File>Options>object designers

As a default they usually specify 'ID' and some others (can't remember what). So any field created with a name ending in ID (or whatever is in the list) will have an index automatically created

They will also automatically create an index for a field specified as the primary key - and this is where duplicates can occur. Access will of course create an index for any field ending in ID, including those that are designated a foreign key - but that is at the time of creation and you are unlikely to go into the index form and add another one. Perhaps a good reason to name your foreign keys with a different suffix.

You will see I have effectively turned autoindex off by leaving it blank but to illustrate how it works, I added ID then I created a table with fields called SID, PID and AIDT.

You can see from the index view (indexes button on the ribbon), there are two indexes automatically created. Not for AIDT becuase it doesn't end in ID

I then changed the PID field to primary key and a primary key index is created - note also the change in the allow duplicates and required properties

So now we have a duplicate index and you need to removes the ones you don't want.

So it is the primary key, not foreign key that potentially gets duplicated - because they have different properties.

My advice is to disable autoindex and this won't happen.

General rule: for indexing

use numbers, not text for primary and foreign keys (better performance) - but see below - a few hundred records such as US state abbreviations would be OK as text
Index all fields that are regularly joined on or used as criteria or sorted - with the exception of..
fields with a limited range of values and roughly evenly split between them (no real performance gain, but performance overhead in maintaining them

other considerations
Where fields are frequently null, set the ignore nulls value to yes
Tables with only a few hundred records probably won't benefit from indexing

CJ_London · Aug 16, 2023

ebs17 said:
However, this index is not visible in a superficial view (index window).

So where is it visible?

HalloweenWeed · Aug 16, 2023

CJ_London said:
...

You can see from the index view (indexes button on the ribbon), there are two indexes automatically created. Not for AIDT becuase it doesn't end in ID
View attachment 109462

Thank you @CJ_London . But I did not have such "indexes" button. So I looked in the ribbon commands, and added it under db "tools" tab, "custom" group. But still, I selected a table in a previous database that I know has multiple indexes and the button remains greyed out and non-selectable.

EDIT: I tried turning on show hidden & system objects in Objects ribbon, no joy.

HalloweenWeed · Aug 16, 2023

Otherwise, I believe I got my answer: there is no significant advantage in marking it indexed. TY all.

ebs17 · Aug 16, 2023

Show indexes via code.

Code:

Sub ShowIndexes(TableName As String)
    Dim db As DAO.Database
    Dim idx As DAO.Index
   
    Set db = CurrentDb
    With db.TableDefs(TableName)
       For Each idx In .Indexes
           Debug.Print idx.Name, idx.Fields
       Next idx
    End With
End Sub

ebs17 · Aug 16, 2023

HalloweenWeed said:
this field will be used by the interface to select the record ... The user will probably be using a combobox to select the record, and it will be sorted by this field.

Filtering and sorting are typical operations where an index will result in a performance improvement.

CJ_London · Aug 16, 2023

HalloweenWeed said:
But still, I selected a table in a previous database that I know has multiple indexes and the button remains greyed out and non-selectable.

It’s in the table design ribbon - and won’t work on linked tables

HalloweenWeed · Aug 16, 2023

CJ_London said:
It’s in the table design ribbon - and won’t work on linked tables

Ah, I see now. TY.

Pat Hartman · Aug 16, 2023

HalloweenWeed said:
Recently I have heard from a forum member that one should not mark a table field as "indexed" before adding a query that selects records based on said FK field, claiming that Access will (sometimes?) create a duplicate index (if you create the relationship before marking the FK as indexed or adding a query to select records based on it), and was told that Access will automatically index said field.

@HalloweenWeed Please do not summarize my posts when you do not understand them. Post a link so others can read what I said rather than your interpretation of my words. You missed the important part of my explanation. Read the following very carefully:

When you create a relationship between two tables using the Relationship window and check the Enforce RI box, Access creates a hidden index on the FK field. The Index is named using a GUID. This is how you can identify the hidden indexes. They NEVER show up in the Indexes dialog. Since Access automagically creates the index on the FK, you do not need to create that index manually. If you do create this index manually (or have suffixes in the autoindex which causes additional duplicate indexes), you end up with multiple identical indexes on the same column which is not good. I posted pictures that showed you that the hidden index exists but does not show in the Indexes dialog box so it is not easy to determine that you have created a duplicate index.

Access also automatically creates indexes if you have suffix values in the autoindex on create box as shown by CJ. BUT, those indexes are not hidden so you will see them in the Indexes dialog.

I did not tell you to not create indexes on non FK fields and non PK fields.

CJ_London said:
So it is the primary key, not foreign key that potentially gets duplicated - because they have different properties.

If there are suffixes in the autoindex box, BOTH the FK and PK can end up with duplicate indexes. I mentioned that as an aside in the post Halloween has misquoted but it wasn't the point of my response. I don't have a link to my long dissertation with pictures but I can repeat it if you can't find it. Since you know how to code, you can see ALL indexes if you care to by using VBA to examine the index property of the tableDefs collection. Partial code:

Code:

    For Each tblLoop In db.TableDefs
        If Left(tblLoop.Name, 4) = "MSys" Or Left(tblLoop.Name, 2) = "xx" Or Left(tblLoop.Name, 2) = "zz" Or Left(tblLoop.Name, 1) = "~" Or Left(tblLoop.Name, 2) = "f_" Then
        Else
            CountTables = CountTables + 1
            Forms!frmPrintDoc!txtTableCount = CountTables
            Forms!frmPrintDoc!txtTableName = tblLoop.Name
            rs.AddNew
            rs!SourceDBID = vSourceDBID
            rs!TableName = tblLoop.Name
            rs!RecordCount = tblLoop.RecordCount
            rs!ConnectString = IIf(tblLoop.Connect = "", Null, tblLoop.Connect)
''            rs!SourceTableName = IIf(tblLoop.SourceTableName = "", Null, tblLoop.SourceTableName)  '''' broke 3/28/23
            rs!Updateable = tblLoop.Updatable
''            On Error Resume Next ' the following property is only available when it is not null
            ''''''''''''''''rs!Description = tblLoop.Properties("Description")
            SaveTableID = rs!TableID
            rs.Update
            
            '' loop through Field names
            For Each fldLoop In tblLoop.Fields
                rsFields.AddNew
                rsFields!TableID = SaveTableID
                rsFields!FieldName = fldLoop.Name
                rsFields!OrdinalPosition = fldLoop.OrdinalPosition
                rsFields!AllowZeroLength = fldLoop.AllowZeroLength
                rsFields!DefaultValue = fldLoop.DefaultValue
                rsFields!Size = fldLoop.Size
                rsFields!Required = fldLoop.Required
                rsFields!Type = fldLoop.Type
                rsFields!ValidationRule = fldLoop.ValidationRule
                rsFields!Attributes = fldLoop.Attributes
                
                'some of the following properties are only available when it they not null so we used the ProprtyExists function
                If PropertyExists(fldLoop, "Format") = True Then
                    rsFields!Format = fldLoop.Properties("Format")
                End If
                If PropertyExists(fldLoop, "Description") = True Then
                    rsFields!Description = fldLoop.Properties("Description")
                End If
                rsFields!FieldType = GetType(fldLoop.Type)
                If PropertyExists(fldLoop, "Caption") = True Then
                    rsFields!Caption = fldLoop.Properties("Caption")
                End If
                If fldLoop.Attributes And dbAutoIncrField Then  'performs bitwise operation
                    rsFields!AutoNum = True
                    rsFields!Required = True
                Else
                    rsFields!AutoNum = False
                End If
                If PropertyExists(fldLoop, "DecimalPlaces") = True Then
                    rsFields!DecimalPlaces = fldLoop.Properties("DecimalPlaces")
                End If
                If PropertyExists(fldLoop, "InputMask") = True Then
                    rsFields!InputMask = fldLoop.Properties("InputMask")
                End If
                If PropertyExists(fldLoop, "Index") = True Then
                    rsFields!InputMask = fldLoop.Properties("Index")
                   ' Debug.Print fldLoop.P
                End If
                rsFields.Update
            Next fldLoop
            '' loop through indexes
            For Each idxLoop In tblLoop.Indexes
                CountIndexes = CountIndexes + 1
                Forms!frmPrintDoc!txtIndexCount = CountIndexes
                Forms!frmPrintDoc!txtIndexName = tblLoop.Name
                Forms!frmPrintDoc.Repaint
            
                If Left(tblLoop.Name, 4) = "MSys" Or Left(tblLoop.Name, 1) = "z" Or Left(tblLoop.Name, 1) = "x" Then
                Else
                    Position = 1
                    For Each fldLoop In idxLoop.Fields
                        rsIndexes.AddNew
                            rsIndexes!TableID = SaveTableID
                            rsIndexes!IndexName = idxLoop.Name
                            rsIndexes!TableName = tblLoop.Name
                            rsIndexes!Unique = idxLoop.Unique
                            rsIndexes!PrimaryKey = idxLoop.Primary
                            rsIndexes!OrdinalPosition = Position
                            rsIndexes!FieldName = fldLoop.Name
                        rsIndexes.Update
                        Position = Position + 1
                    Next fldLoop
                End If
            Next idxLoop
        End If
    Next tblLoop

The_Doc_Man · Aug 16, 2023

The original question includes whether certain indexes are of any value. Many reasons exist for adding an index. One way or another, all of those ways come back to looking for something - i.e. searching the table. Whether searching via a combo box, or filtering something in a report or form, or having a formal relationship back to some field to support a JOIN clause, you are searching for some specific value. Therefore, though many ways exist that use indexes, they are always present to support a search operation.

The answer to the index's usefulness can be estimated by computing the cardinality of the index. The word cardinality in database terms roughly means "how many records do I expect to be returned when searching for a given value of this indexed field?"

Cardinality can be ROUGHLY computed by counting the number of unique values in the field in question and divide that into the number of records in the table. (This is a ROUGH calculation... there are other ways, particularly if you have a predictably biased distribution of values.) The bigger this quotient, the less value there is in the index for most purposes. If you get back 50% of the table, the index would not do much good in searching. If your search returns 1 record, then your index was the PK and is the ideal search field. All indexes will have cardinality between a unique return and 50% return (unless you place an index on a field that is essentially constant.)

Why is a 50% cardinality not so good? Because if you are using the index, you have to go to the index, then find the next matching index entry, then follow the pointer to the record in question. But if you didn't use the index (instead doing what is called a "relation scan"), you just find the next record and if it matches, use it - but if not, find the next record in turn until you get a match. Flipping attention between the index and the table isn't awfully expensive but when dealing with a 50% return on a search, the two methods are not terribly far apart.

If you are dealing with composite indexes, the cardinality of any ONE of the indexes might be a moderate-to-high number. However, if the combination of indexes in this composite key has small cardinality, it might still be useful. Expressed as percentages, cardinalities multiply. So if index A has about 10% cardinality and index B also has 10% cardinality, as long as A and B are independent of each other, you might see a 1% net cardinality, which is a significant reduction of record returns.

The remaining question on utility is "how often will you actually use it for searching by any of the normal means of searching?" If the answer is "many times per session" then you can still get some advantage out of a high-percentage return. If the answer is "maybe once per week, if that much" then perhaps you don't care even about a moderate cardinality.

But, you say... Why do I care? Because if you have lots of INSERT INTO or UPDATE or DELETE operations in that table, EVERY CHANGE requires EVERY INDEX to be rewritten. So... indexes cost you in table maintenance/overhead. Lightly used or rarely used indexes don't need to be there. But heavily used indexes, even with poor cardinality, might be worth the effort of keeping them around. It's all relative to usage, so the DB's designer has to decide on utility.

Galaxiom · Aug 17, 2023

The_Doc_Man said:
If you get back 50% of the table, the index would not do much good in searching. I

This is why RDMS like SQL Server support Filtered Indexes. They are configured by the designer to exclude values that cause high cardinality, providing benefit where the index would be an advantage, without the overheads of maintaining an index on common values.

Cardinality is also important factor for the query optimiser. As a table grows the frequency of particular values can change altering the 'Statistics" of the table. The statistics are used by the query optimiser to calculate the plan. When they get stale they need to be updated and the query plan rebuilt. This is done with specific commands in a DBMS.

In Access, editing the query, probably compacting and definitely decompiling the database or copying the objects to a new database will force the query plans to be rebuilt. I don't know what Access (actually the ACE engine) does for statistics. I would assume they do exist in ACE and would be updated in a Compact and Repair of the back end.

The_Doc_Man · Aug 17, 2023

G., I've never seen an article on any statistics kept by the Jet or Ace engines. Not saying there aren't any, but I've never seen them in any non-trivial discussion. I would suspect that they have to be kept within the individual tables in some hidden sub-structure.

ebs17 · Aug 17, 2023

Attached is a script by Michael Kaplan and Julianne Lee for the last point.

Pat Hartman · Aug 17, 2023

@The_Doc_Man I have an old book for Jet 3.5. It has a wealth of information. Sadly there isn't an ACE version. "Jet Database Engine Programmer's Guide, second edition" It talks about statistics and that they are used by the Query Optimizer. They also mentioned that statistics are also requested via ODBC when working with linked tables which I thought was interesting.

MarkK · Aug 17, 2023

Pat Hartman said:
Please do not summarize my posts when you do not understand them.

It is difficult for me to see my own ignorances. Please be patient and kind with me while I learn from you.

Pat Hartman · Aug 17, 2023

MarkK said:
It is difficult for me to see my own ignorances. Please be patient and kind with me while I learn from you.

You are not the one who made the incorrect summary of my advice and you have no need to be snarky. If HalloweenWeed was offended by my request, let him be snarky if he feels the need to. Besides, if you read post #11, you might actually learn something you didn't know.

The_Doc_Man · Aug 18, 2023

Pat Hartman said:
@The_Doc_Man I have an old book for Jet 3.5. It has a wealth of information. Sadly there isn't an ACE version. "Jet Database Engine Programmer's Guide, second edition" It talks about statistics and that they are used by the Query Optimizer. They also mentioned that statistics are also requested via ODBC when working with linked tables which I thought was interesting.

If you read the history, Ace is merely a new version (not a total rewrite) of Jet. So at least some of it might be valid.

isladogs · Aug 18, 2023

ebs17 said:
Attached is a script by Michael Kaplan and Julianne Lee for the last point.

That's a great article as you would expect whenever Michael Kaplan was involved.
I'm only surprised that there was no mention of viewing the query execution plan using the undocumented JET ShowPlanfeature.

Show Plan - Go Faster

Use Jet ShowPlan to view the execution plan of Access queries. This can help optimise query execution

isladogs.co.uk

Perhaps that feature wasn't available at the time that article was written?

Solved Should (some) non-keys in tables be set to "indexed?" (1 Viewer)

Member

Well-known member

Super Moderator

Super Moderator

Member

Member

Well-known member

Well-known member

Super Moderator

Member

Super Moderator

Immoderate Moderator

Super Moderator

Immoderate Moderator

Well-known member

Attachments

Super Moderator

bit cruncher

Super Moderator

Immoderate Moderator

MVP / VIP

Similar threads

Users who are viewing this thread