Solved Aggregate Query Returns Different Values When a Field is Included not in Group By

MajP

You've got your good things, and you've got mine.
Local time
Yesterday, 23:23
Joined
May 21, 2018
Messages
8,935
This problem relates to this Thread

I have come across something I have never seen, and do not understand.
The OP has a table of soccer matches with Home Team ID, Away Team ID, Home Team Score, Away Team Score as seperate columns. In order to get the stats I normalize this with a union query and works fine.
Code:
SELECT tblmatches.matchid,
       tblmatches.hometeamid
       AS TeamID,
       awayteamid
       AS OpponentID,
       tblmatches.homescore
       AS Score,
       Iif([homescore] > [awayscore], 3, Iif([homescore] = [awayscore], 1, 0))
       AS
       MatchPoints,
       [homescore] - [awayscore]
       AS MatchDifferential,
       "home"
       AS Location,
       homescore
       AS MatchGoalsFor,
       awayscore
       AS MatchGoalsAgainst,
       Iif([homescore] > [awayscore], 1, 0)
       AS MatchWin,
       Iif([homescore] < [awayscore], 1, 0)
       AS MatchLoss,
       Iif([homescore] = [awayscore], 1, 0)
       AS MatchTie,
       periodid
FROM   tblmatches
UNION
SELECT tblmatches.matchid,
       tblmatches.awayteamid
       AS TeamID,
       hometeamid
       AS OpponentID,
       tblmatches.awayscore
       AS Score,
       Iif([homescore] < [awayscore], 3, Iif([homescore] = [awayscore], 1, 0))
       AS
       MatchPoints,
       [awayscore] - [homescore]
       AS MatchDifferential,
       "away"
       AS Location,
       awayscore,
       homescore,
       Iif([homescore] < [awayscore], 1, 0),
       Iif([homescore] > [awayscore], 1, 0)
       AS MatchLoss,
       Iif([homescore] = [awayscore], 1, 0)
       AS MatchTie,
       periodid
FROM   tblmatches
ORDER  BY 1,
          3 DESC;

So now there is a seperate record for each team's match data.
Using an aggregate query I can then sum or count the values to get the statistics.
This is where it gets strange.
If I do the following it works.

Code:
SELECT qrymatchpoints.periodid,
       qrymatchpoints.teamid,
       Sum(qrymatchpoints.matchpoints) AS TotalPoints,
       Count(qrymatchpoints.matchid)   AS GamesPlayed
FROM   qrymatchpoints
GROUP  BY qrymatchpoints.periodid,
          qrymatchpoints.teamid
ORDER  BY qrymatchpoints.periodid,
          Sum(qrymatchpoints.matchpoints) DESC;
qryTotalPointsByPeriod qryTotalPointsByPeriod

PeriodIDTeamIDTotalPointsGamesPlayed
1​
1​
15​
5​
1​
2​
11​
5​
1​
8​
10​
5​
1​
4​
9​
5​
1​
3​
8​
5​
1​
5​
8​
5​
1​
7​
8​
5​
1​
6​
7​
5​
1​
14​
4​
5​
1​
9​
4​
5​
1​
10​
4​
5​
1​
11​
3​
5​
1​
13​
2​
5​
1​
12​
1​
5​
2​
12​
13​
5​
2​
13​
11​
5​
2​
14​
10​
5​
2​
10​
10​
5​
2​
11​
9​
5​
2​
6​
7​
5​
2​
9​
7​
5​
2​
4​
6​
5​
2​
3​
5​
5​
2​
5​
5​
5​
2​
7​
5​
5​
2​
8​
4​
5​
2​
2​
2​
5​
2​
1​
0​
5​

If I remove the Count of Matches Played by a team field:
Code:
Count(qrymatchpoints.matchid)   AS GamesPlayed

Code:
SELECT qrymatchpoints.periodid,
       qrymatchpoints.teamid,
       Sum(qrymatchpoints.matchpoints) AS TotalPoints
FROM   qrymatchpoints
GROUP  BY qrymatchpoints.periodid,
          qrymatchpoints.teamid
ORDER  BY qrymatchpoints.periodid,
          Sum(qrymatchpoints.matchpoints) DESC;

I get the wrong values for TotalPoints in some records even though the group by has not changed.
qryTotalPointsByPeriod_problem qryTotalPointsByPeriod_problem

PeriodIDTeamIDTotalPoints
1​
1​
12​
1​
8​
10​
1​
4​
9​
1​
2​
8​
1​
3​
8​
1​
5​
8​
1​
7​
8​
1​
6​
7​
1​
14​
4​
1​
9​
4​
1​
10​
4​
1​
11​
3​
1​
13​
2​
1​
12​
1​
2​
12​
13​
2​
13​
11​
2​
14​
10​
2​
11​
9​
2​
6​
7​
2​
9​
7​
2​
10​
7​
2​
4​
6​
2​
3​
5​
2​
5​
5​
2​
7​
5​
2​
8​
4​
2​
2​
2​
2​
1​
0​

The Total Points do not match and are wrong. Ex Period 1 team 1 went from 15 to 12 and Period 1 team 2 went from 11 to 8. That difference is one win (3 points) not included in the summation. Anyone seen anything like this?

Now this is the symptom, but it is a bigger problem. There is an issue trying to do any query based on the aggregate.
Any thoughts would be appreciated.
FYI this is not some type of corruption. This can be replicated in a clean database.
 
Last edited:
That is awesome. Even better, since you have 2 different results you have no idea what the correct TotalPoints is.

Time to dig into qrymatchpoints and manually determine the right TotalPoints by hand and then find the ones dropping out.
 
That is awesome. Even better, since you have 2 different results you have no idea what the correct TotalPoints is.
Time to dig into qrymatchpoints and manually determine the right TotalPoints by hand and then find the ones dropping out.
I know the first query results are correct.
In fact I know which records are not correct and how many records are not rolling up. In the problem which is only a few records in the aggregate seems that one record is dropped in the aggregation.

Potential issues.
This starts with a union query that has a lot of IIF functions and a few calculated fields.
The aggregate is based on the union.
 
I have to admit uncertainty on one aspect of this presentation: What does the ORDER BY do when you don't offer a field name but instead offer numbers? (It appears in the second part of the UNION.) I looked up the Microsoft syntax reference and didn't see "field number" as a recognized variant. I checked a couple of my SQL syntax PDFs from ISO and didn't find that option either. I also found questions about the effect of an ORDER BY clause in one of the parts of a UNION query. I recall a discussion in the forum that the correct way to sort a UNION query was to layer it with the ORDER BY in an outer query based on the UNION query as its record source. It may be extraneous, but I wonder if it is needed and whether it affects the oddity you reported.

Also, have you looked at the query plans of the two outermost queries to see if anything is different that might shed light on the problem?
 
Nulls!

What happens if you try:
Code:
SELECT qrymatchpoints.periodid,
       qrymatchpoints.teamid,
       Sum(qrymatchpoints.matchpoints) AS TotalPoints,
       Count(qrymatchpoints.*)   AS GamesPlayed
FROM   qrymatchpoints
GROUP  BY qrymatchpoints.periodid,
          qrymatchpoints.teamid
ORDER  BY qrymatchpoints.periodid,
          Sum(qrymatchpoints.matchpoints) DESC;
 
I'm suddenly very suspicious of the field MatchID, which is NOT grouped or ordered, not SELECT DISTINCT, and thus very likely to be duplicated by being in both legs of the UNION. It seems to me that it would be counted twice. In fact, since each match is counted twice - once from the HOME team and once from the AWAY team - by the UNION query, "games played" becomes a bit more difficult from that approach.

Since you have 3 points for a win but only 1 point for a tie, you can't do a SUM of the match points and divide by something to get total games played. You could have 0's in the mix that would complicate matters tremendously such that a really abysmal team would appear to have played no games at all. (And of course the brutal sports reporters might agree with that sentiment.)

To get total games played, I might have done SUM( MatchWin + MatchTie + MatchLoss ) (because only 1 of them will be 1 in any given record and no record should ever have three 0's).
 
@cheekybuddha,
Thanks. There are no nulls in the data set. Everything is based on a calculation from two fields (homescore, awayScore) and these default to zero. The query you post is similar to the one that already works. The query that removes the count fields (which is really not needed nor included in the group by) causes problems in the totalpoints.
 
The field MatchID, which is NOT grouped or ordered, not SELECT DISTINCT, and thus very likely to be duplicated by being in both legs of the UNION.

MatchID is duplicated since each match has two teams. Also there is no issue calculating games played. The issue is not including this in a query causes totalpoints to be wrong for certain teams.
qryMatchPoints qryMatchPoints

MatchIDTeamIDOpponentIDScoreMatchPointsMatchDifferentialLocationMatchGoalsForMatchGoalsAgainstMatchWinMatchLossMatchTiePeriodID
1​
3​
10​
2
3​
2​
Home
2​
0​
1​
0​
0​
1​
1​
10​
3​
0
0​
-2​
Away
0​
2​
0​
1​
0​
1​
2​
3​
9​
0
1​
0​
Away
0​
0​
0​
0​
1​
1​
2​
9​
3​
0
1​
0​
Home
0​
0​
0​
0​
1​
1​
3​
3​
12​
3
3​
2​
Home
3​
1​
1​
0​
0​
1​
3​
12​
3​
1
0​
-2​
Away
1​
3​
0​
1​
0​
1​
4​
5​
9​
3
1​
0​
Home
3​
3​
0​
0​
1​
1​
4​
9​
5​
3
1​
0​
Away
3​
3​
0​
0​
1​
1​

So for example team 2 is in matches 1,2,3. However If I group by team then MatchID you only count the match once for each time a team plays. Also there is no issue in the count of matches, it is in the total points.

The file is here
 
The query for complete stats works fine, but any attempt to use that query seems to cause this strange problem
qryTeamStatsByPeriod

This complete data is correct as far as I can tell which is just an aggregate of the union on Team and Period
qryTeamStatsByPeriod


PeriodIDTeamIDTotalPointsGoalBalanceGamesPlayedHomeMatchesAwayMatchesGoalsForGoalsAgainstWinsLossesTies
1​
1​
15​
8​
5​
4​
1​
9​
1​
5​
0​
0​
1​
2​
11​
3​
5​
3​
2​
5​
2​
3​
0​
2​
1​
8​
10​
4​
5​
3​
2​
8​
4​
3​
1​
1​
1​
4​
9​
5​
5​
2​
3​
8​
3​
3​
2​
0​
1​
7​
8​
4​
5​
3​
2​
9​
5​
2​
1​
2​
1​
3​
8​
3​
5​
3​
2​
7​
4​
2​
1​
2​
1​
5​
8​
2​
5​
3​
2​
7​
5​
2​
1​
2​
1​
6​
7​
0​
5​
2​
3​
6​
6​
2​
2​
1​
1​
9​
4​
-2​
5​
2​
3​
4​
6​
0​
1​
4​
1​
14​
4​
-4​
5​
3​
2​
4​
8​
1​
3​
1​
1​
10​
4​
-6​
5​
2​
3​
3​
9​
1​
3​
1​
This is only grouped on TeamID and Period ID. However if I remove certain fields or try to use this query in another query it fails.

See qryTotalPointsByPeriod_Problem
 
have you tried UNION ALL rather that UNION in your query?

UNION will group data from the different tables when all the fields are the same value, UNION ALL doesn't

tbl1
val1..val2..val3
1.......2.......3
4.......5.......6

tbl2
val1..val2..val3
1.......2.......3
7.......8.......9


UNION these two tables will return 3 records since tbl2 has records which match those in tbl1

val1..val2..val3
1.......2.......3
4.......5.......6
7.......8.......9

UNION ALL will return 4 records
val1..val2..val3
1.......2.......3
4.......5.......6
1.......2.......3
7.......8.......9

edit: by the same token, duplicates in the same table with UNION will also be removed

tbl1
val1..val2..val3
1.......2.......3
4.......5.......6
1.......2.......3

tbl2
val1..val2..val3
1.......2.......3
7.......8.......9

UNION will still return 3 records whilst UNION All will return 5
 
Thanks everyone. Attached is the stripped down version.
1. The normalizing union query works fine as far as I can tell. I have not found an issue. "qryMatchPoints"
2. The aggregate query, "qryTeamStatsByPeriod" aggregated on TeamID and PeriodID also works fine and gives the proper stats for each team for 2 different periods.

The issue is that the aggregate query will not return the correct TotalPoints if you do not include the count of games played in the query. This even happens if you make a select query from qryTeamStatsByPeriod. It too has to have a count of games played. But to reiterate that field is not part of the aggregation. I do not think the count of games is relevant, but somehow certain fields work when added to the query.

You can see the issue in
qrySelectFromStats and qryTotalPointsByPeriod. These work and return the correct TotalPoints. See the corresponding queries "_Problem"
Example records that are not correct are
Team 1, Period 1. Correct is 15, incorrect is 12
Team 2 Period 1, Correct is 11, incorrect is 8

qrySelectFromStats_Problem is probably the most Illustrative. This query simply selects fields from the aggregate query "qryTeamStatsByPeriod". The TotalPoints are correct in the "qryTeamStatsByPeriod", but incorrect in the select if you do not include the count of matches field. How could a select query return different results based on which fields you include in the select query?
 

Attachments

Last edited:
Count(*) counts all the rows in a set
Count(somefieldname) counts all the rows in a set where somefieldname is not null

I don't know why that would affect the other count but try changing it anyway. Specifying the name of a field in the Count() might somehow be interfering with the aggregation.
 
Count(*) counts all the rows in a set
Count(somefieldname) counts all the rows in a set where somefieldname is not null

I don't know why that would affect the other count but try changing it anyway. Specifying the name of a field in the Count() might somehow be interfering with the aggregation.
The problem is the count works fine. I want to remove that field completely or understand why that would be needed. Removing that field causes the problem in the Totalpoints.
Code:
SELECT qrymatchpoints.periodid,
       qrymatchpoints.teamid,
       Sum(qrymatchpoints.matchpoints) AS TotalPoints,
       Count(qrymatchpoints.matchid)   AS GamesPlayed
FROM   qrymatchpoints
GROUP  BY qrymatchpoints.periodid,
          qrymatchpoints.teamid
ORDER  BY qrymatchpoints.periodid,
          Sum(qrymatchpoints.matchpoints) DESC;
The above works fine, but if I remove "games played" field the Totalpoints do not all calculate correctly. It seems like an arbitrary field, but for some reason makes the query work correctly.
 
I would just experiment the hell out of it to see what effect it has:

1. Change the field inside it--COUNT(matchpoints), COUNT(teamid), COUNT(etc)

2. Change the function--AVG, MAX(), MIN(), LAST(), ETC()
 
The problem is the count works fine. I want to remove that field completely or understand why that would be needed. Removing that field causes the problem in the Totalpoints.
I think you may have missed the point. Does changing Count(somefield) to Count(*) change the sums of the other columns because it changed the domain such that only rows where somefield is not null are included when you have the Count(somefield) in the query whereas all rows are included if you have Count(*)?
 
Does changing Count(somefield) to Count(*) change the sums of the other columns because it changed the domain such that only rows where somefield is not null are included when you have the Count(somefield) in the query whereas all rows are included if you have Count(*)?
It does nothing because it already works using Count(someField). If Count(someField) is included in the query the results are correct already. Count(*) also causes correct results. The issue occurs when you remove Count(* or somefield) from the query completely. Also there is not a null in any field in the underlying data or any query.

If I get time I am going to put this in SQL Server to verify it is not an issue with Access.
 
As long as you tried the Count(*) and found that didn't change the results. I would have tested it with your database but I couldn't find the queries that were in the original thread. The queries in the upload were different. It doesn't make sense that Count(xyz) was changing the underlying recordset to eliminate not null values which would explain why the sums were different in the two queries, but I've seen some strange things. It was a thought.
 
I ran the problem query in SSMS and it works fine. Results
PeriodID TeamID TotalPoints
1 1 15
1 2 11
1 3 8
1 4 9
1 5 8
1 6 7
1 7 8
1 8 10
1 9 4
1 10 4
1 11 3
1 12 1
1 13 2
1 14 4
2 1 0
2 2 2
2 3 5
2 4 6
2 5 5
2 6 7
2 7 5
2 8 4
2 9 7
2 10 10
2 11 9
2 12 13
2 13 11
2 14 10

In Access the exact same query shows problems in 1,1 / 1,2, / 2/10. Each record is 3 points less than they should be.
qryTotalPointsByPeriod_problem qryTotalPointsByPeriod_problem

PeriodIDTeamIDTotalPoints
1​
1​
12​
1​
2​
8​
1​
3​
8​
1​
4​
9​
1​
5​
8​
1​
6​
7​
1​
7​
8​
1​
8​
10​
1​
9​
4​
1​
10​
4​
1​
11​
3​
1​
12​
1​
1​
13​
2​
1​
14​
4​
2​
1​
0​
2​
2​
2​
2​
3​
5​
2​
4​
6​
2​
5​
5​
2​
6​
7​
2​
7​
5​
2​
8​
4​
2​
9​
7​
2​
10​
7​
2​
11​
9​
2​
12​
13​
2​
13​
11​
2​
14​
10​
I will mark this solved as being some bizarre glitch in the Access Matrix.
 
Which query is correct. Did team 1 get 5 wins (15 points) or only 4 (12 points)

Maybe your query which compares home goals against away goals, doesn't resolve all matches correctly. Sometimes team 1 is the home team, and sometimes it's the away team. I would be inclined to include the total goals scored and conceded to see if that throws any light on it. Maybe in one of the queries you are summing your team as the home team when they are in fact the away team?
 
Also, could you check the games themselves. A win should produce a total points value of 3, and a draw should produce a total of 2, so you can tell how many match points were scored in total. In table 1, your teams have 94 points in total. In table 2, the total is only 88. Are you sure every record is correctly represented in the separate match results table. Maybe 2 games are missing in the second table, or the period value is wrong.
 

Users who are viewing this thread

Back
Top Bottom