Removing Duplication in Complicated Report

  • Thread starter Thread starter Deleted member 73419
  • Start date Start date
D

Deleted member 73419

Guest
Hi,

I have a complicated question which I'm trying to describe simply, so will initially start with the data.

The data I need to represent on the report is contained within three tables. I have managed to cobble together a query which displays the data but there is a lot of duplication which I want to remove for a report so that it is simpler to read and understand.

The three tables are OVR, BND and FNT. For each record in OVR.AAA, I need to find which values lies between BND.B and BND.C
I then need to join BND.ID to FNT.id to list the records which correspond to the values from BND.

This is briefly summarized here:
Untitled 1.png


Query1 is constructed by:

SQL:
SELECT ovr.AAA, FNT.AA, FNT.BB, BND.A, BND.B, BND.C, BND.D
FROM (BND INNER JOIN FNT ON BND.id = FNT.id) INNER JOIN OVR ON (ovr.AAA <= BND.C) AND (ovr.AAA >= BND.B);

Now the complicated part, the report.

Taking just one entry from the query, I've highlighted the areas where the duplication occur which I want to remove:
Untitled 3.png

I want each record to contain the contents of the yellow rectangle. I don't need the duplicate entries from columns AAA, AA, BB, A, B, C & D - I only need each text to appear once.

The actual layout isn't too important in that I have missed the column title labels off but I'm trying to simplify the question.

What is the best way to achieve this? Would nested queries, each filtering out the duplication and then link a report? Or would nested reports be the way to go?

I've attached the database as it contains more data for context.

Many thanks
 

Attachments

Over the years, I've observed that, all too often, attempts to simplify actually obscure reality to the extent that it is harder, not easier, to follow a problem. I know you put a lot of effort into this, but it's actually counter-productive and tends to scare off would-be helpers.
 
I have google on this laptop.
However the entries will have to exist on the FIRST one found, not halfway down as you are showing.

 
Whenever you have multiple rows in tblA that match multiple rows in tblB, you end up with "duplication" since every row from tblA shows each match in tblB.

So - the question becomes - WHY do you have TWO rows in tblB with the same range? Once you can answer that, we can get to how to figure out which ONE you want to match with.
 
Agree with GPGeorge *10. I just can't keep A, AA, AAA straight in my head to try and grasp your issue.

I'm looking at your database query when I filter it to AAA=10347 I see 7 results. Is that what you want? If not, how many records do you expect to see and which ones are they?

Additionally looking at your query why do both of the JOIN conditions include the =

Code:
...(ovr.AAA <= BND.C) AND (ovr.AAA >= BND.B);

That's gonna cause duplication in another manner. Right now you have duplicates because you have duplicate ranges in BND. When both of the above use the equal signs (>= and <=) and a record from OVR has a value that is at the end of one range and the beginning of another, it's going to match both those ranges. For example a record in OVR where AAA=8300 will match both BND.id=1 and BND.id=2. It seems like you either need to drop either <= to < or >= to >.
 
Tables BND and FNT are simplified versions of the published data. Table OVR is a table which is generated by an application where the intention is to match the records from OVR.AAA where the value is between BND.B and BND.C

So in the above example 10347 from OVR.AAA is between 9000 from BND.B and 11300 from BND.C

Table BND.ID is linked to table FNT.ID. Table BND has two records which match the values between 9000 and 11300 corresponding to an ID of 3 & 4.

Table FNT subsequently has a number of records returned for IDs of 3 & 4.

Hence there is duplication in a couple of areas and the main reason for wanting to remove this is the actual dataset that will be worked with will contain thousands of characters of text for some of these columns and hundreds of duplicate values.
Given this is a safety related issue, it makes sense to make it easier for the end user to read and not just glance over after seeing the number of pages in a report and thinking OMG.
 
I still don't understand. Here's a shot of Query 1 of the database you posted filtered to AAA=10347:

1719095926769.png


I assume that is too many results. Which of those do you expect to be returned by the query and why?

You keep mentioning duplicates, but from a technical perspective there are no duplicates--every record is unique. I understand to you with a better understanding of the data there might be "duplicates" but you need to explain to me what constitutes a duplicate and why?
 
Before you can remove the "duplicates", you need to understand what is causing them and so far, we don't.
 
There are no duplicates there.
Certain fields are repeated as the information is extracted that way.
it seems to be a problem that you need to sort out in a report or something

you want to split the information
for each AAA at condition between B and C printed once
And the qualifiers listed individually
aa, bb, a, d, and then further grouped or sorted by, a

I wish that I could tell you the code to sort out the problem but don,t know. But I can see exactly what you mean. your query is good but it needs presenting in a visually more appealing format. gasmans post seems to have part of your solution.

separating the 3a and 4a from the same field could be another problem that will arise.

2 queries night be needed
on field a
1 that sorts p
1 that sorts s

the report combined from both if that is possible
 
Last edited:
I can't open your database, I use Acc2010.
So just the pictures:

A table representation as you requested is not the usual, there are no standard means for this, if you really want to solve it exactly like that, it will be incredibly complex. Maybe you can then create your report output using other things like Excel.

If we stick with Access:
As correctly stated, there are no duplicates (=> across entire records), just repeating individual values, which is technically justified.

1) You can insert a subreport in a report, and another subreport in this subreport. That would be typical database technology, but you have something other than a single table.

Report1 contains the fields AAA, B, C in the RecordSource
Subreport1 contains the fields A and D as well as AAA in the RecordSource for linking
Subreport2 contains the fields AA and BB as well as AAA in the RecordSource for linking

2) It also seems possible to display it directly in a table using a query. I believe that this would require determining different running numbers using different field combinations and then using these to set field combinations with a running number greater than 1 to ZERO.

This is extremely time-consuming, however. I would only start something like this if I were to receive a significant, non-repayable advance payment for pain and suffering.
 
If the data is imported from another source, you don't have to live with it the way it is. You can normalize it on import and that may solve your problem
 
Ok, so I'm nearly there now apart from an inconsistency in ordering.

I decided that the easiest way was to break down into a number of smaller problems as suggested earlier although I chose smaller queries which were bound to (sub)reports rather than smaller reports.

In the attached database, Report 1 contains sub-reports 2 and 3 which reports all the data in the right order apart from the odd occasion where the report order switches:
Screenshot 2024-06-24 205429.png

this is strange behaviour because most of the time the order is correct:
Screenshot 2024-06-24 205404.png

The underlying queries (Q1 drives Report 1, Q2 drives Report 2 and Q3 drives Report 3) all have the same order (ascending).
And, when I run the queries they are all in the correct order so I'm not understanding why somewhere between the query and the report the order changes.
 

Attachments

Are you expecting the BB to be sorted, as your last pic shows it is incorrect?
Access reports take NO notice of source order, needs to be sorted in the report.
 
Are you trying to identify a band?
measure the value
determine which band it falls into?
rising to the band and descending to the band/value?
 
I've been playing around with the Reports OrderBy property and it looks like there is inconsistency.

With the property blank and opening Report 3 in an attempt to narrow down the issue:
Without Sort 1.png

The order of the alphanumeric part is not correct as it should follow A, B, C etc
Opening Report 1, I can see that not only is is the order incorrect as the records should be 9 then 10 but also the alphanumeric order also changes:
Without Sort 2.png

Once I've set the sort order in Report 3:
Sort.png

and re-open Report 3 we can see that the alphanumeric parts are correct:
With Sort 1.png

and now opening Report 1 we can see that although the alphanumeric parts are correct the 9 and 10 are the wrong order.
With Sort 2.png

so, it appears the Order By property is doing something however it looks like it is doing something different depending on whether you open Report 3 or Report 1 which contains Report 3.

The assumption would be that if Report 3 has the correct order, then opening Report 1 which contains Report 3 as a sub-report then it should behave the same???
 
I don't understand what you are saying. Strings are sorted character by character, left to right. Are you saying that 9A comes before 10A? it doesn't because 9 is not less than 1.

Sorting verges on impossible when you mush columns together. If the trailing letter is significant, it can't be last. You need to break the single field into two and then you can sort the letters first and the numbers second if that is how you want them to sort.

Also, as @Gasman pointed out, you need to specify the sort using the reports sorting/grouping property dialog.
 
I don't understand what you are saying. Strings are sorted character by character, left to right. Are you saying that 9A comes before 10A? it doesn't because 9 is not less than 1.

Sorting verges on impossible when you mush columns together. If the trailing letter is significant, it can't be last. You need to break the single field into two and then you can sort the letters first and the numbers second if that is how you want them to sort.

Also, as @Gasman pointed out, you need to specify the sort using the reports sorting/grouping property dialog.
I can see where you're coming from with regard to the ordering of the numbers. With that in mind, I would have thought that ordering with using something like VAL([FNT].[AA]), right([FNT].[AA],1) would be needed since it will strip out the numerical part and enable it to be ordered numerically and then order by the last alphabetic character.

Using this in Report 3:
1719948527935.png

does appear to have the desired effect in that all of them are displayed in the correct order:
1719948584391.png


but that is Report 3 run on its own. Normally this is a sub-report of Report 2 which itself is a sub-report of Report 1.

If I open Report 1, the resulting records for sub-report 3 are not in the correct.

So, what I'm confused with is the difference between the sorting dialog box and the orderBy property? Which one do you use where?

The sorting only gives me the option of sorting by ID and not fields AA or BB. Seems rather strange as they are all in the query on the subreport so why shouldn't I be able to select them!!
 
Ok, so another approach.

I've changed Q3 (which is the query driving Report 3) to this:
Code:
SELECT DISTINCT Q2.ID, FNT.ID, FNT.AA, FNT.BB, Val(FNT.AA) AS A1, Right(FNT.AA,1) AS B1
FROM FNT, Q2
WHERE (((FNT.ID)=[Q2].[ID]))
ORDER BY Q2.ID, FNT.ID;

Now that means I can change the sorting for the report firstly to numeric sort and then alphabetic sort:
1719951933989.png

This gives me this:
1719951966030.png


and just to prove this is actually effecting the results, I can change the numeric order to descending:
1719952016600.png


this does what it should and if I change the alphabetic part to descending:
1719952071122.png


So all looks good until you open Report 1 and sub-report 3 ceases to order correctly:
1719952166004.png

Something in another report appears to be affecting the sort order. Strangely though some of them order correctly:
1719952264356.png
 
Remove the order by from all queries. It is just confusing you.

Then make sure that you have separated the parts of the string into individual columns that will sort numerically if they are actual numbers or as strings if they are strings. Remember, a string that holds a number, sorts like a string, NOT a number. So for a string 100 sorts ahead of 9. Then make sure that the main and subreports ALL have proper sorting/grouping properties.
 
Remove the order by from all queries. It is just confusing you.

Then make sure that you have separated the parts of the string into individual columns that will sort numerically if they are actual numbers or as strings if they are strings. Remember, a string that holds a number, sorts like a string, NOT a number. So for a string 100 sorts ahead of 9. Then make sure that the main and subreports ALL have proper sorting/grouping properties.
The order by has been removed and is solely relying on the sort.

Query Q3 has been changed to remove the numerical part in to A1 and the alphabetical part in to B1:
Code:
SELECT DISTINCT Q2.ID, FNT.ID, FNT.AA, FNT.BB, Val(FNT.AA) AS A1, Right(FNT.AA,1) AS B1
FROM FNT, Q2
WHERE (((FNT.ID)=[Q2].[ID]))
ORDER BY Q2.ID, FNT.ID;
Which are then sorted, is this not the same as separating into individual columns so far as the query is concerned???

I am a but confused with the statement regarding all sub-reports having proper sorting/grouping properties. I don't understand how sub-report 2 has a bearing on the sort of sub-report 3 for example. These two sub-reports data are from two different queries and I don't understand what effect the sorting of sub-report 2 has on sub-report 3. I cannot sort by any of the fields of sub-report 3 from within sub-report 2.
 

Users who are viewing this thread

Back
Top Bottom