Subquery Seppuku - Is there a more efficient way to do this?

Swatkins

Registered User.
Local time
Today, 04:28
Joined
Oct 19, 2012
Messages
42
Hi,

I introduced myself on the Intro board, but I'd like to get to the meat of the problem I'm having.

A little background on the problem. At the company where I work there is an old legacy app. It can generate a few canned reports, but it's a bit of a clumsy implementation, and lacks any flexibility. However, it's not a high priority area for the business, so I'm told the enterprise app/dev team will have no interest in revisiting the old app to make changes and updates for us. Instead, what we have now is an Access/SQL based back-end to the underlying tables: we can see those tables and write our own queries, forms, etc., but we can't see any of the queries the app itself uses to generate its reports or anything like that.

I've been asked to create a report that compares the profitability of the widgets we produce across different budget periods and analyze what revenue drivers drive changs in profitability over time.

The data I'm working with is pretty large database (at least by my standards). There are 2 business units I'm concerned with and 4 Categories of widgets. There are thousands of individual widgets. In the database each budget period is a long-range budget plan with many prior years of actual historical info, the current forecast year, and a few dozen future year's worth of projected data. Each new budget period we duplicate the data of the prior budget period, update the actual data for the most recently completed year and then update projected revenue variables.

The problem I'm having isn't when I'm pulling the individual widgets themselves - I have a query for that which runs fairly quickly and smoothly. It's when I'm pulling the revenue variables. There are only about 500-600 rows of data for this (a little over 30-years worth of actual and projected years' data by the 4 categories by the 2 business units by the two budget periods I'm pulling to compare). The problem is that while some of the revenue variables will change for each year within the budget period (both past and for future projected years), some don't change often from year to year, and will remain constant across many projected years. The former sort of variable is easy to query because there's a separate value stored in the tables for each year. But the latter is tricky because for years when the values have not changed from the year prior, the tables store a null value. For these, when I pull the data, I want to see value from the last time the field was updated, holding the budget period, business unit and category constant.

The only way I've figured out how do this so far is with subqueries. The problem is, doing it this way, the query runs so slowly that it cripples my system. It's painful. The past two nights I've left work for the day right after starting the query running, and come back the next morning to find Access is still chugging away.

I know there has to be a way to do this that's more efficient and faster. The app front-end can display the data correctly within the system, and it only takes a few minutes to churn the data instead of many, many hours. But the app can't export the data in a useable way, and I can't see under the hood to figure out how the app does it.

Here's basically what the SQL from my query looks like, which I wrote using the Access query writer; this query pulls from two queries I also wrote which pulls different parts of the data from different tables, and tries to align them together (because it's pulling from these two queries, the null values in the underlying tables have become 0s in the queries I'm pulling from):

Code:
SELECT 
Q_RevVars.TP_BUDGETPER, 
Q_RevVars.BUS_UNIT, 
Q_RevVars.Year, 
Q_RevVars.PART_CATEGORY, 
Q_Avg_WidgetMarket.AvgOfWidget_Market_Size, 
IIf(
 [Q_RevVars].[WidgMkt_Share]<>0,
 [Q_RevVars].[WidgMkt_share],
 (SELECT [Q_SW_ROSVars]![WidgMkt_Share] 
  FROM [Q_SW_RevVars] 
  WHERE [Q_SW_RevVars]![TP_BUDGETPER]=Q_RevVars.TP_BUDGETPER 
  AND [Q_SW_RevVars]![BUS_UNIT] = Q_RevVars.BUS_UNIT 
  AND [Q_SW_RevVars]![PART_CATEGORY] = Q_RevVars.PART_CATEGORY 
  AND [Q_SW_RevVars]![Year]=
   (SELECT MAX([Q_SW_RevVars]![Year]) 
   FROM [Q_SW_RevVars] 
   WHERE [Q_SW_RevVars]![TP_TP_BUDGETPER] = Q_ROSVars.TP_TP_BUDGETPER 
   AND [Q_SW_RevVars]![BUS_UNIT] = Q_RevVars.BUS_UNIT 
   AND [Q_SW_RevVars]![PART_CATEGORY] = Q_RevVars.PART_CATEGORY 
   AND [Q_SW_RevVars]![WidgMkt_Share] <> 0 
   AND [Q_SW_RevVars]![Year] < Q_RevVars.Year))) 
  AS WidgeMkt_Share_ALL, 
Q_RevVars.Widg_UnitPrice, 
IIf(
 [Q_RevVars].[UNITS_PER_SHARE]<>0,
 [Q_RevVars].[UNITS_PER_SHARE],
 (SELECT [Q_SW_RevVars]![UNITS_PER_SHARE] 
  FROM [Q_SW_RevVars] 
  WHERE [Q_SW_RevVars]![TP_BUDGETPER]=Q_RevVars.TP_BUDGETPER 
  AND [Q_SW_RevVars]![BUS_UNIT] = Q_RevVars.BUS_UNIT 
  AND [Q_SW_RevVars]![PART_CATEGORY] = Q_RevVars.PART_CATEGORY 
  AND [Q_SW_RevVars]![Year]=
   (SELECT MAX([Q_SW_RevVars]![Year]) 
   FROM [Q_SW_RevVars] 
   WHERE [Q_SW_RevVars]![TP_BUDGETPER] = Q_RevVars.TP_BUDGETPER 
   AND [Q_SW_RevVars]![BUS_UNIT] = Q_ROSVars.BUS_UNIT 
   AND [Q_SW_RevVars]![PART_CATEGORY] = Q_ROSVars.PART_CATEGORY 
   AND [Q_SW_RevVars]![UNITS_PER_SHARE] <> 0 
   AND [Q_SW_RevVars]![Year] < Q_RevVars.Year) )) 
  AS UNITS_PER_SHARE_ALL, 
IIf(
 [Q_RevVars].[REFUND_RATE]<>0,
 [Q_RevVars].[REFUND_RATE],
 (SELECT [Q_SW_RevVars]![REFUND_RATE] 
  FROM [Q_SW_RevVars] 
  WHERE [Q_SW_RevVars]![TP_BUDGETPER]=Q_RevVars.TP_BUDGETPER 
  AND [Q_SW_RevVars]![BUS_UNIT] = Q_RevVars.BUS_UNIT 
  AND [Q_SW_RevVars]![PART_CATEGORY] = Q_RevVars.PART_CATEGORY 
  AND [Q_SW_RevVars]![Year]=
   (SELECT MAX([Q_SW_RevVars]![Year]) 
   FROM [Q_SW_RevVars] 
   WHERE [Q_SW_RevVars]![TP_BUDGETPER] = Q_RevVars.TP_BUDGETPER 
   AND [Q_SW_RevVars]![BUS_UNIT] = Q_RevVars.BUS_UNIT 
   AND [Q_SW_RevVars]![PART_CATEGORY] = Q_RevVars.PART_CATEGORY 
   AND [Q_SW_RevVars]![REFUND_RATE] <> 0 
   AND [Q_SW_RevVars]![Year] < Q_RevVars.Year))) 
  AS REFUND_RATE_ALL
FROM Q_SW_RevVars AS Q_RevVars INNER JOIN Q_Avg_WidgetMarket 
 ON (Q_RevVars.BUS_UNIT = Q_Avg_WidgetMarket.BUS_UNIT) 
 AND (Q_RevVars.Year = Q_Avg_WidgetMarket.YEAR) 
 AND (Q_RevVars.TP_VBUDGETPER = Q_Avg_WidgetMarket.TP_BUDGETPER)
ORDER BY Q_RevVars.Year;

As a simple example, what I want it to do is when it encounters this:

Code:
Year     Units_Per_Share
2008     457
2009     0
2010     500
2011     525
2012     0
2013     0
2014     0

the query should output this:

Code:
Year     Units_Per_Share
2008     457
2009     457
2010     500
2011     525
2012     525
2013     525
2014     525

Can anyone see the fatal flaw in my query, or does anyone have an idea of a better or more efficient way to achieve the desired results?

Thanks a million,

Stephen
 
If you have repeated recalculations of the same data, I would as number 1 option create a temporary table, fill in the missing (=null) numbers there, and then use that table.

Also, instead of a subquery, I would make a UDF that simply retains the value in each record in a variable declared as STATIC and presents it if the correpsonding field in the current record is null (of course, provided things are ran through in some consistent incrementing-with-time sequence).

How that affects the distribution of work between ACE/JET and the SQL-server I am not sure - but it would be quick to try.

Either of the above could likely be done on the SQL server.

With this I leave the field to those AWF-members who possess greater SQL-server expertise than mine.
 
If you have repeated recalculations of the same data, I would as number 1 option create a temporary table, fill in the missing (=null) numbers there, and then use that table.

Also, instead of a subquery, I would make a UDF that simply retains the value in each record in a variable declared as STATIC and presents it if the correpsonding field in the current record is null (of course, provided things are ran through in some consistent incrementing-with-time sequence).

How that affects the distribution of work between ACE/JET and the SQL-server I am not sure - but it would be quick to try.

Either of the above could likely be done on the SQL server.

With this I leave the field to those AWF-members who possess greater SQL-server expertise than mine.

While my knowledge and understanding of Access has increased dramatically the last year over what it was before, I'm afraid I'm still too much of a neophyte to imagine how to go about doing either of the suggestions you've made.

How would you approach putting this data into a temporary table, and/or how would you run this through a UDF? (I'm guessing the latter would involve some kind VBA coding, but I'm not sure of how to pull it off.)

Thanks for your suggestions.
 
Based on the feedback above, I started fiddling around with a Make Table query and some VBA to try to accomplish what I want in a more efficient manner.

I set up a make-table query that takes the data I was pulling in the query above but without the nested subqueries to translate all the 0s into the correct numbers. I set the Make Table query to sort by TP_BUDGETPER, BUS_UNIT, and Year, in that order. When I view the table after running the Make Table query, it looks pretty much as I expect. Everything's in the order I anticipated, and the later years have 0s populated in WidgMkt_Share, UNITS_PER_SHARE and REFUND_RATE.

So then I turn to VBA, where I've got a Do While loop that iterates through the table I've just created, and on each checks the value of the variables above (as a test case I just set this up to look only at WidgMkt_Share), if it's not 0 it saves the value in a variable that keeps track of the last value for that field. If it's 0 it updates the field with the value of the variable.

After running the VBA, I noticed that my table had been re-arranged: the order was off from what it was before. Then I noticed that a lot of 0 values made it through. I'm still trying to diagnose what went wrong.

So... does anybody have any thoughts on why the table would be ordered differently after I've looped through the table using VBA?
 
Tables have no inherent order!

If you want to loop through a recordset, and rely on some given sequence of records, then the recordset must be based on a query.

Code:
Set rst = db.OpenRecordset("SELECT * FROM tblMyData  ORDER BY SomeFieldOrOther")
 
Thanks. Do I have to worry about whether the query is updatable? If I loop through the records in a query this way, can I update the fields back to the table?
 
I just wanted to say thanks again. Doing it this way has so far proven literally thousands of times more efficient - I've got a table with the correct values stored in a matter of seconds instead of many, many hours.

I've got some more work to do tying everying together, but the problem that had me up against a wall seems to be fully resolved. I appreciate the help and ideas.
 

Users who are viewing this thread

Back
Top Bottom