CJ London's posts inspired me. Those of you who know me know this could be a good thing or a bad thing. I'll try to keep it straightforward.
To understand and properly analyze whether you are getting good performance out of a database operation, you have to understand how it works. Access (the program) takes your query and analyzes its requirements. This analysis occurs when the QueryDef for your query is created. Dynamically created queries work fine but might not be pre-optimized, so the first helpful thing to know is that if you have to build a query around some variable, a "Parameter Query" approach gives you the best of both worlds: a stored Parameter Query can be optimized yet still have variables fed to it at run time.
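As a hedged illustration of that last point, here is one common DAO pattern for feeding a value to a stored Parameter Query from VBA. The query name qryRaisePrices and the parameter prmRegion are invented for this sketch; it assumes the saved query is an action query that declares a prmRegion parameter.

Public Sub RunStoredParameterQuery()
    ' Sketch only: assumes a saved action query "qryRaisePrices" with a parameter [prmRegion].
    Dim db As DAO.Database
    Dim qdf As DAO.QueryDef

    Set db = CurrentDb()
    Set qdf = db.QueryDefs("qryRaisePrices")   ' stored QueryDef, already analyzed/optimized
    qdf.Parameters("prmRegion") = "West"       ' feed the variable at run time
    qdf.Execute dbFailOnError                  ' run it as an action query

    Set qdf = Nothing
    Set db = Nothing
End Sub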
Factors going into the analysis and optimization of a query include whether indexes exist for the table(s) and whether you have single-table or multi-table operations (JOIN or UNION, for example).
The nature of the query also is a factor. SELECT only has to touch its sources once for reading content to form the final recordset, but a SELECT query doesn't DO anything without interacting with another element of Access that will impose speed issues. For instance, SELECT feeds datasheet views, reports, and forms - all of which have formatting delays that will have far greater effect on performance than the raw speed of the query itself. An isolated SELECT query is therefore usually part of some other transformative method for data inspection, and that other method will be the speed driver.
INSERT INTO with a SELECT clause as the source of the insertion touches data twice - once for the SELECT and once for the INSERT. It doesn't return a recordset so doesn't have GUI/print issues.
UPDATE queries also touch twice - once to read and once to write back. Again, no returned recordset, so no GUI/print issues.
DELETE queries have to touch all of the extant records twice, too - once to find the record, once to mark it as DELETED. (If you doubt this, open a query on a table, then delete some of its records in another window, either manually or with a DELETE query, and watch the rows you are looking at change to #DELETED#.) Again, no GUI/print issues because no returned recordsets.
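For reference, the three action-query shapes described above look roughly like this when run from VBA. The table and field names (tblOrders, tblArchive, CustID, Amount, OrderDate, Region) are invented for illustration only.

Public Sub RunActionQueryExamples()
    Dim db As DAO.Database
    Set db = CurrentDb()

    ' INSERT INTO ... SELECT: reads the source rows once, then writes the new rows once
    db.Execute "INSERT INTO tblArchive (CustID, Amount) " & _
               "SELECT CustID, Amount FROM tblOrders WHERE OrderDate < #1/1/2020#;", dbFailOnError

    ' UPDATE: reads each affected row, then writes it back
    db.Execute "UPDATE tblOrders SET Amount = Amount * 1.05 " & _
               "WHERE Region = 'West';", dbFailOnError

    ' DELETE: finds each affected row, then marks it as deleted
    db.Execute "DELETE FROM tblOrders WHERE OrderDate < #1/1/2015#;", dbFailOnError

    Set db = Nothing
End Sub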
So let's analyze a simple case where the speed of the GUI/print elements is not a factor. Let's see how to compute how fast something SHOULD happen. Let's focus on an UPDATE query.
Suppose that your update is going to touch every record in a table for which the average record size is about 128 bytes and you have 4096 records. (You can use the Database Documenter to estimate record size by adding up the field sizes for a given table.)
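If you would rather not add up the Documenter numbers by hand, a rough DAO sketch like this totals the declared field sizes for one table. The table name tblOrders is a placeholder, and the result is only approximate (Text fields report a character count, and Memo/attachment fields are not counted meaningfully).

Public Sub EstimateRecordSize()
    ' Sketch: rough record-size estimate by summing DAO field sizes for one table.
    Dim db As DAO.Database
    Dim fld As DAO.Field
    Dim lngBytes As Long

    Set db = CurrentDb()
    For Each fld In db.TableDefs("tblOrders").Fields
        lngBytes = lngBytes + fld.Size   ' bytes for numeric types, characters for Text
    Next fld

    Debug.Print "Approximate record size: " & lngBytes & " bytes"
    Set db = Nothing
End Sub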
Assume that you have a modern system with a 15 kRPM disk which means you have about 250 RPS on that disk. We will also assume you have enough physical memory that paging or swapping will not be an issue. Given most off-the-shelf systems these days, that should not be a tough assumption.
We will ALSO assume you compacted and repaired your database and that you recently defragged your disk. (Both are important - discussion later.)
Important fact: Access does things in chunks of disk buffers, 4 kbytes each. With 128-byte records, you can fit 32 records per buffer. You have 4096 records, so you will need 128 disk reads (and since this is an UPDATE, 128 disk writes). For a 250 RPS disk, rotational latency will be 1/2 of the time for one revolution, or 2 ms. To do 256 total disk operations you will need 512 ms, or a little over 1/2 second, for the optimum case.
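Here is that arithmetic written out as a quick VBA sketch so you can plug in your own numbers; the constants are just the assumptions from this example.

Public Sub EstimateMinimumUpdateTime()
    ' Sketch: best-case time for an UPDATE touching every record, per the assumptions above.
    Const BUFFER_BYTES As Long = 4096    ' Jet/ACE disk buffer size
    Const RECORD_BYTES As Long = 128     ' average record size (assumption)
    Const RECORD_COUNT As Long = 4096    ' records touched (assumption)
    Const DISK_RPS As Long = 250         ' 15,000 RPM / 60 seconds

    Dim recsPerBuffer As Long, bufferOps As Long
    Dim latencyMs As Double, totalMs As Double

    recsPerBuffer = BUFFER_BYTES \ RECORD_BYTES      ' 32 records per buffer
    bufferOps = (RECORD_COUNT \ recsPerBuffer) * 2   ' 128 reads + 128 writes = 256 operations
    latencyMs = (1000 / DISK_RPS) / 2                ' half a revolution = 2 ms
    totalMs = bufferOps * latencyMs                  ' 256 x 2 ms = 512 ms

    Debug.Print "Best-case UPDATE time: " & totalMs & " ms"
End Sub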
You might ask why I'm not adding in compute time here. Windows uses a device driver that operates in parallel with other programs, so after the first buffer is fetched, modified, and written back, Access and Jet/Ace can overlap their current operations with the write-back of the buffer just updated. Therefore, the instruction overhead is hidden behind the longer delay of disk readiness for the next read or write operation.
In summary on these calculations, you can never do an UPDATE of 4096 records of 128-byte length faster than about 1/2 second under the given conditions. If you see something significantly slower than that, you know that you have missed something. (If you see faster, the odds are that you have a faster disk, as the disk is the biggest contributor to this computation.)
I said to defrag the disk and then compact & repair the DB. The action of a C&R is to create a new, empty database and then to read each table, one at a time, to transfer records from the old copy to the new. This has the effect of linearizing the tables and making them contiguous, which minimizes disk movement (head-seek latency). It also improves the odds that you need ONLY 128 buffer-loads to manage the entire 4096-record table.
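If you ever want to script that step (for example against a closed back-end file), DAO exposes it directly. A minimal sketch with placeholder paths:

Public Sub CompactBackEnd()
    ' Sketch: compact a closed back-end file into a new copy. Paths are placeholders.
    Const SRC As String = "C:\Data\MyBackEnd.accdb"
    Const DST As String = "C:\Data\MyBackEnd_Compacted.accdb"

    DBEngine.CompactDatabase SRC, DST   ' rewrites each table contiguously into the new file
    ' After confirming DST opens correctly, rename it over SRC yourself.
End Sub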
If your database has a lot of "churn" factor (addition and deletion of records), you increase the likelihood that you need more disk buffers to manage the table because it will become scattered within the DB file. That is, a disk buffer might not be 100% dedicated to your table - other tables might be there; so might some deleted records. So your fragmented database detracts from performance.
If your DISK has a lot of "churn" factor as well, don't forget that databases have this thing called "bloat." That is, they tend to grow in size. On a fragmented disk, the appended disk blocks are not necessarily contiguous with the original DB file. This increases the need for more distant seek-arm movement.
Therefore, when folks hear us say "have you compacted & repaired your DB file" or "have you recently defragged your disk," we are talking about addressing the physical factors that detract from performance for any database.
What I just did was show you how a system analyst looks at performance, breaking it down into contributors to that performance and the expected size of their contributions.
CJ London's notes talk about ways to optimize your database layout and to choose the optimum kinds of operation (query vs. recordset/loop, etc.). This note should be taken as a way to know what the proper expectation for performance is.