complicated query issue with many calculated fields (1 Viewer)

terrytek

Registered User.
Local time
Today, 17:19
Joined
Aug 12, 2016
Messages
75
This question is a long two-parter.
(1) I have query with student testing information. I would like to make a new query to show only those students who have taken a post-test. The query I have now is

qryTwoLastTests
Code:
SELECT a.StudentID, a.TestDate, a.TestTypeID, a.TestType, a.SSNum, a.GENum, a.NextForm
FROM qryTwoLastTests AS a INNER JOIN qryTwoLastTests AS b ON a.StudentID = b.StudentID
WHERE (((a.TestTypeID)<>[b].[TestTypeID]) OR ((a.TestDate)<>[b].[TestDate]))
GROUP BY a.StudentID, a.TestDate, a.TestTypeID, a.TestType, a.SSNum, a.GENum, a.NextForm;
I have attached an annotated photo of the query results, which are not exactly what I want (the annotations elaborate). The only way I could figure out only the students who have a post-test was to look for students who have more than one test session, so their Student ID is listed more than once. Problem is, some students take more than one test on a given date (like Reading and Math). I only want the students returned in the query that have two tests of the same [TestTypeID] on two different dates. If necessary, I could add a field where the choices are pretest and posttest, but I am afraid that the end users would not use the field properly (once someone takes a posttest, that posttest now becomes the pretest for the same test type on a later date).

(2) If I could ever get the query in (1) to return the desired results, I could move on to the following query, which is based on query (1) and calculates the difference in [SSNum] and [GENum] for a student's two most recent test dates of the same test type:

Code:
SELECT qryTwoLastTests.StudentID, DMax("TestDate","qryTwoLastTests","StudentID=" & [StudentID]) AS LastTestDate, DLookUp("TestTypeID","qryTwoLastTests","StudentID=" & [StudentID] & " and TestDate = #" & [LastTestDate] & "#") AS LastTT, DLookUp("SSNum","qryTwoLastTests","StudentID=" & [StudentID] & " and TestDate = #" & [LastTestDate] & "#") AS LastSS, DLookUp("GENum","qryTwoLastTests","StudentID=" & [StudentID] & " and TestDate = #" & [LastTestDate] & "#") AS LastGE, DMax("TestDate","qryTwoLastTests","StudentID=" & [StudentID] & " and TestDate < #" & [LastTestDate] & "#") AS PrevTestDate, IIf(IsNull([PrevTestDate]),Null,DLookUp("TestType","qryTwoLastTests","StudentID=" & [StudentID] & " and TestDate= #" & [PrevTestDate] & "#")) AS PrevTT, IIf(IsNull([PrevTestDate]),Null,DLookUp("SSNum","qryTwoLastTests","StudentID=" & [StudentID] & " and TestDate= #" & [PrevTestDate] & "#")) AS PrevSS, IIf(IsNull([PrevTestDate]),Null,DLookUp("GENum","qryTwoLastTests","StudentID=" & [StudentID] & " and TestDate= #" & [PrevTestDate] & "#")) AS PrevGE, ([LastSS]-[PrevSS]) AS SSDiff, ([LastGE]-[PrevGE]) AS GEDiff
FROM qryTwoLastTests
GROUP BY qryTwoLastTests.StudentID
HAVING (((Count(qryTwoLastTests.StudentID))>=2))
ORDER BY qryTwoLastTests.StudentID;
(a) This query either runs VERY slowly or crashes Access 2007. Is that due to all of the calculated fields and/or the fact that the query on which it's based returns results that make some of the calculations null (because the student took more than one test on the same day but neither of them have posttests yet, like student 146 in the photo)? Is there a way to rewrite this query, or break it up, or do what I'm trying to do in VBA instead (I know almost no VBA)? I'm not experienced enough to know.
(b) As I've mentioned earlier, I need this query to also be able to compare two SETS of tests on two different dates, as with student 156 in photo. The DMax in this query can pull the most recent test date, but what if the student has the same most recent test date for a Reading and a Math test? I need to compare both sets of tests.

Thank you for reading my long question, and thank you to this forum for all your help; I've learned a lot already, but have so far to go.....
 

Attachments

  • Capture1.PNG
    Capture1.PNG
    60.2 KB · Views: 133

CJ_London

Super Moderator
Staff member
Local time
Today, 22:19
Joined
Feb 19, 2013
Messages
16,607
first - I presume that results for students who you are not annotating should also be ignored.

second - not shown in your example but can you have the situation of same student, same testtype, but 3 records? and if so do you need to compare record 2 with record 1 and record 3 with record 2? Or just 1 and 3 and ignore record 2?

Would also be helpful to show the original table and some example data plus the sql of your qryTwoLastTests query

With regards slow running queries, this is usually due to poor or non existent indexing and use of domain functions.

I don't understand why you are grouping I would have expected you query to be more like


SELECT a.StudentID, a.TestDate, a.TestTypeID, a.TestType, a.SSNum, a.GENum, a.NextForm
FROM qryTwoLastTests AS a INNER JOIN qryTwoLastTests AS b ON a.StudentID = b.StudentID AND a.TestTypeID=.[TestTypeID] and [a].TestDate<.[TestDate]
 

terrytek

Registered User.
Local time
Today, 17:19
Joined
Aug 12, 2016
Messages
75
Thank you for your reply.

first - I presume that results for students who you are not annotating should also be ignored.

Yes, I just annotated one of the students who had two tests on the same day and no follow-up tests yet as an example.

>>second - not shown in your example but can you have the situation of same student, same testtype, but 3 records? and if so do you need to compare record 2 with record 1 and record 3 with record 2? Or just 1 and 3 and ignore record 2?

The database will certainly have students with 3 or more tests of the same testtype on different dates moving forward, but I only need to compare the most recent 2 tests. The older tests need to be kept in the database for reference but do not factor into the calculation.

>>Would also be helpful to show the original table and some example data plus the sql of your qryTwoLastTests query

(1)I did not realize that I had given the SQL to qryTwoLastTestsWORKING instead of qryTwoLastTests (oops!) Here is qryTwoLastTests:
Code:
SELECT qryTesting.StudentID, qryTesting.TestDate, qryTesting.TestTypeID, qryTesting.TestType, qryTesting.Form, qryTesting.SSNum, qryTesting.GENum, qryTesting.NextForm
FROM qryTesting INNER JOIN qryTesting AS qryTesting_1 ON qryTesting.StudentID=qryTesting_1.StudentID
WHERE (((qryTesting.TestDate) In (SELECT TOP 4 [TestDate] FROM qryTesting WHERE [StudentID]=[qryTesting_1].[StudentID] ORDER BY [TestDate] desc)))
GROUP BY qryTesting.StudentID, qryTesting.TestDate, qryTesting.TestTypeID, qryTesting.TestType, qryTesting.Form, qryTesting.SSNum, qryTesting.GENum, qryTesting.NextForm
HAVING (((Count(qryTesting.StudentID))>=2));
(2)I am attaching a pic of some example data from tblTesting, as well as an intermediate query qryTesting that converts the original SSs and GEs from text (because some of the scores have "+" in them and had to be converted to data type number). Here is the SQL for qryTesting as well:
Code:
SELECT tblStudents.StudentID, tblTesting.TestDate, tblTestType.TestTypeID, tblTestType.TestType, tblTesting.Form, tblTesting.SS, Val([SS] & "") AS SSNum, tblTesting.GE, Val([GE] & "") AS GENum, tblTesting.NextForm
FROM tblTestType INNER JOIN (tblStudents INNER JOIN tblTesting ON tblStudents.StudentID=tblTesting.StudentID) ON tblTestType.TestTypeID=tblTesting.TestTypeID
GROUP BY tblStudents.StudentID, tblTesting.TestDate, tblTestType.TestTypeID, tblTestType.TestType, tblTesting.Form, tblTesting.SS, tblTesting.GE, tblTesting.NextForm
ORDER BY tblStudents.StudentID, tblTesting.TestDate DESC;
>>With regards slow running queries, this is usually due to poor or non existent indexing and use of domain functions.
Indexes are a black box to me. Should I index any field called by the domain functions?

>>I don't understand why you are grouping I would have expected you query to be more like


SELECT a.StudentID, a.TestDate, a.TestTypeID, a.TestType, a.SSNum, a.GENum, a.NextForm
FROM qryTwoLastTests AS a INNER JOIN qryTwoLastTests AS b ON a.StudentID = b.StudentID AND a.TestTypeID=.[TestTypeID] and [a].TestDate<.[TestDate]


I ran your query (results shown in photo qry4)and it did not show all the requisite tests for student 156 (who had both reading and math on 8/1 and again on 8/27, which is not shown), students 188 and 189 each had two tests of the same type on different days (your query only shows one test each), and one of student 190's record is a duplicate.
 

Attachments

  • tblTesting.PNG
    tblTesting.PNG
    57.5 KB · Views: 119
  • qryTesting.PNG
    qryTesting.PNG
    66.6 KB · Views: 115
  • qry4.PNG
    qry4.PNG
    21.7 KB · Views: 146

CJ_London

Super Moderator
Staff member
Local time
Today, 22:19
Joined
Feb 19, 2013
Messages
16,607
Sorry - change the select statement to just SELECT * (both records are on the same line, but current select is only selecting one of them)

Again I see no reason for grouping qryTesting or qryTwoLastTest since you are not summing etc

Alternatively based on your original table try

Code:
 SELECT L.*, P.*
 FROM tblTesting L INNER JOIN tblTesting P 
     ON L.StudentID=P.StudentID AND L.TestTypeID=P.TestTypeID
 WHERE 
     L.TestDate=(SELECT TOP 1 TestDate FROM tblTesting WHERE StudentID=P.StudentID AND TestTypeID=P.TestTypeID ORDER BY TestDate Desc) 
        AND P.Testdate IN =(SELECT TOP 2 TestDate FROM tblTesting WHERE StudentID=P.StudentID AND TestTypeID=P.TestTypeID ORDER BY TestDate Desc) AND L.TestDate<>P.TestDate
 ORDER BY L.StudentID, L.TestTypeID
Ensure fields StudentID., TestTypeID and TestDate are indexed

Aliases L equates to Last and P to Previous

Indexes are a black box to me. Should I index any field called by the domain functions?
domain functions do not use indexes, one of the reasons they are slow. Also by grouping you lose the benefit of indexing
 

terrytek

Registered User.
Local time
Today, 17:19
Joined
Aug 12, 2016
Messages
75
Sorry - change the select statement to just SELECT * (both records are on the same line, but current select is only selecting one of them)

Again I see no reason for grouping qryTesting or qryTwoLastTest since you are not summing etc

Alternatively based on your original table try

Code:
 SELECT L.*, P.*
 FROM tblTesting L INNER JOIN tblTesting P 
     ON L.StudentID=P.StudentID AND L.TestTypeID=P.TestTypeID
 WHERE 
     L.TestDate=(SELECT TOP 1 TestDate FROM tblTesting WHERE StudentID=P.StudentID AND TestTypeID=P.TestTypeID ORDER BY TestDate Desc) 
        AND P.Testdate IN =(SELECT TOP 2 TestDate FROM tblTesting WHERE StudentID=P.StudentID AND TestTypeID=P.TestTypeID ORDER BY TestDate Desc) AND L.TestDate<>P.TestDate
 ORDER BY L.StudentID, L.TestTypeID
Ensure fields StudentID., TestTypeID and TestDate are indexed

Aliases L equates to Last and P to Previous

domain functions do not use indexes, one of the reasons they are slow. Also by grouping you lose the benefit of indexing

As to the grouping you mentioned in post #2: I'm a SQL newbie. When I try to remove all but StudentID from GROUP BY, I get an error from Access "You tried to execute a query that does not include the specified expression 'TestDate' as part of an aggregate function" (I get that error from any field that I remove from the GROUP BY). I have tried to research this error but still do not understand it. But putting the field that throws that error in GROUP BY gets rid of the error.

I had to remove the "=" signs from your query (I'm such a newbie that I'm proud of myself for figuring that out :) ) BUT OTHERWISE IT'S AMAZING--now I just need to figure out how to get the difference between the two scores in the proper format. Hopefully I can do that without assistance--we shall see...
 

CJ_London

Super Moderator
Staff member
Local time
Today, 22:19
Joined
Feb 19, 2013
Messages
16,607
sorry about the = was copy and pasting

with regards your error

tried to execute a query that does not include the specified expression 'TestDate' as part of an aggregate function
its generated in group by queries when the field used in the criteria is not either aggregated in some way or grouped - The solution is to create another column (in this case for a.testdate) and put b.testdate in the criteria and also change the groupby in the dropdown to where

But as I said - there is no need for the group by at all since you aren't aggregating anything and from the definition of your data there should not be any duplicates (if there were, use SELECT DISTINCT instead - in the query properties change Unique Values to Yes)
 

terrytek

Registered User.
Local time
Today, 17:19
Joined
Aug 12, 2016
Messages
75
sorry about the = was copy and pasting

with regards your error

its generated in group by queries when the field used in the criteria is not either aggregated in some way or grouped - The solution is to create another column (in this case for a.testdate) and put b.testdate in the criteria and also change the groupby in the dropdown to where

But as I said - there is no need for the group by at all since you aren't aggregating anything and from the definition of your data there should not be any duplicates (if there were, use SELECT DISTINCT instead - in the query properties change Unique Values to Yes)

So if you have a query where you're aggregating a field (and therefore need the totals row in the Query Design Window), but you want to show other fields in the returned record, you have to do a work-around? Is this a flaw in Access, or is this something that SQL also requires? As you said, sometimes you just don't want to GROUP BY every other field that you just want to show in the result, but Access doesn't allow you to just leave the Totals row blank for a given record; GROUP BY is the default.
 

CJ_London

Super Moderator
Staff member
Local time
Today, 22:19
Joined
Feb 19, 2013
Messages
16,607
So if you have a query where you're aggregating a field (and therefore need the totals row in the Query Design Window), but you want to show other fields in the returned record, you have to do a work-around?
Not saying that at all - just use group by for the other fields if you want to show them. I'm saying you can't apply a criteria to a grouped by column which references a field that has not been included

I wouldn't call it a workaround and no, it is not a flaw in access - just the way group by queries generally work in databases.

If you look at the sql of a group by query it will be something like

SELECT fld1, sum(fld2) as sumoffld2
FROM myTable
WHERE fld3=xxx
GROUP BY fld1
HAVING fld1=zzz

When you put a criteria in the group by or sum column, it appears in the HAVING part of the query - and is 'actioned' after everything has been grouped/aggregated - so if you haven't included that other field, it can't be referenced as a criteria.

However when you put a criteria in a WHERE column it appears in the WHERE part of the sql and is actioned before being grouped - which also means it is general faster because the grouping will then only be applied on the data selected from the where statement, rather than grouping the entire table before applying the criteria.

Further down the line you will find the same thing with aliases and calculated fields

a normal select query

SELECT fld1, fld2/10 as calcfield, calcfield*1.5 as calcfield2
FROM myTable

works

but change it to a group by query and calcfield2 cannot reference calcfield1, instead for calcfield2 you would use

fld2/10 *1.5 as calcfield2

nearest analogy I can think of at this time of night is think of the fields being pieces of paper (perhaps a report) and the group by a folder.

You put the pieces of paper in the folder and give it to your boss. If one of those pieces of paper references another that is not in the folder, the boss becomes confused....That's the HAVING part
 

terrytek

Registered User.
Local time
Today, 17:19
Joined
Aug 12, 2016
Messages
75
Not saying that at all - just use group by for the other fields if you want to show them. I'm saying you can't apply a criteria to a grouped by column which references a field that has not been included

I wouldn't call it a workaround and no, it is not a flaw in access - just the way group by queries generally work in databases.

If you look at the sql of a group by query it will be something like

SELECT fld1, sum(fld2) as sumoffld2
FROM myTable
WHERE fld3=xxx
GROUP BY fld1
HAVING fld1=zzz

When you put a criteria in the group by or sum column, it appears in the HAVING part of the query - and is 'actioned' after everything has been grouped/aggregated - so if you haven't included that other field, it can't be referenced as a criteria.

However when you put a criteria in a WHERE column it appears in the WHERE part of the sql and is actioned before being grouped - which also means it is general faster because the grouping will then only be applied on the data selected from the where statement, rather than grouping the entire table before applying the criteria.

Further down the line you will find the same thing with aliases and calculated fields

a normal select query

SELECT fld1, fld2/10 as calcfield, calcfield*1.5 as calcfield2
FROM myTable

works

but change it to a group by query and calcfield2 cannot reference calcfield1, instead for calcfield2 you would use

fld2/10 *1.5 as calcfield2

nearest analogy I can think of at this time of night is think of the fields being pieces of paper (perhaps a report) and the group by a folder.

You put the pieces of paper in the folder and give it to your boss. If one of those pieces of paper references another that is not in the folder, the boss becomes confused....That's the HAVING part

I think I understand (at least a bit) what you're saying. You were asking why I was GROUPing by certain fields in an earlier post; I guess it was my beginner way of having those fields show in the query results.

Thanks again for all of your help!
 

Simon_MT

Registered User.
Local time
Today, 22:19
Joined
Feb 26, 2007
Messages
2,177
There is another way. Use VBA to run a SQL statement after input by the user and if the count = 1 set a flag to PreTest if the count = 2 set a flag to Post or something along those lines. Basically this is triggered after a Save Record on the Form Close or Add another Record.

I do this to total up transaction in four subsystems into one analysis table. Everything stays in sync. There are thousands of records in these subsystems and yes I could use SQL but this is much more efficient.

Simon
 

terrytek

Registered User.
Local time
Today, 17:19
Joined
Aug 12, 2016
Messages
75
There is another way. Use VBA to run a SQL statement after input by the user and if the count = 1 set a flag to PreTest if the count = 2 set a flag to Post or something along those lines. Basically this is triggered after a Save Record on the Form Close or Add another Record.

I do this to total up transaction in four subsystems into one analysis table. Everything stays in sync. There are thousands of records in these subsystems and yes I could use SQL but this is much more efficient.

Simon

The problem is the tests' Pre-Post status isn't always constant. Eventually, the database will have multiple tests for many users--only the count for new students' tests will be 1.
The most recent test date will be the posttest, and the penultimate date will be the pretest. However, once a new (i.e. later) test is taken, the previous posttest then becomes the pretest.
 

Users who are viewing this thread

Top Bottom