Exclude Records where last char is alpha or is the same without alpha

The main table which we are using to exclude the special cases contains 1.3 mil rows, then we are joining to text file(s) containing at least 100k rows. A bit more background, we are trying to fill in gaps in the 1.3 mil row table (historical data - so an id could have up to 10 rows) from the text files so this table will get larger. The query/ies do the following, join the two tables to get those records that match (they are matching on a different code since the text files don't have the id_code, then exclude those records that have data for a specific year and exclude the records that have the alphas and their counterparts that don't have the alphas (these will have to be addressed manually at this point). We have to match against 10 text files (one for each year), due to how the text files were created, the field names are customized so will probably need to use vba to standardize the field names and loop through each year table one at a time.

The sql I used was from you (vbaInet) as I wasn't able to get the sql from CJ_London to work. Probably if I spend a bit more time with it to overcome the syntax errors when customized for our tables it will work, but since your query ran without syntax errors, went with that one.
 
OK. This should run faster.

1. Create a query using this:
Code:
SELECT id_code
FROM [COLOR="blue"]TableName[/COLOR]
WHERE IsNumeric(Right([id_code],1))=True;
2. Create another query based on the one in step 1 like this:
Code:
SELECT id_code
FROM [COLOR="blue"]QueryName[/COLOR]
WHERE (SELECT Count(*) 
       FROM [COLOR="blue"]QueryName [/COLOR]AS A 
       WHERE A.id_code = [COLOR="blue"]QueryName[/COLOR].id_code & "a" OR A.id_code = [COLOR="blue"]QueryName[/COLOR].id_code & "b"
      )=0;
 
Just had another thought, so this should run faster:
Code:
SELECT IIf(IsNumeric(Right([id_code],1)),[id_code],Left([id_code],Len([id_code])-1)) AS Num_id_code
FROM [COLOR="Blue"]TableName[/COLOR]
GROUP BY IIf(IsNumeric(Right([id_code],1)),[id_code],Left([id_code],Len([id_code])-1))
HAVING Sum(Not IsNumeric(Right([id_code],1)))=0;
 
Thanks for the revision. I was able to incorporate into the existing query and performance greatly improved to just under a minute. Also seems like it excluded more rows than the previous version, which turned out to be more accurate in my review comparing the results from last week and today. My supervisor is still out of the office, so will need to wait to see how it goes in production. Below is the sql as it currently stands. I had to add a bunch of trim statements because that field is padded with spaces. Some time back, we tried an update query to remove the spaces, but it didn't seem to work.

Code:
SELECT DISTINCT tbl_schools_xyr.unique_schid, Ccd_05_appended_simplified.ncessch, 
Ccd_05_appended_simplified.schnam05
FROM (Ccd_05_appended_simplified 
INNER JOIN (tbl_schools_xyr 
LEFT JOIN 
     (SELECT tbl_schools_xyr.unique_schid, tbl_schools_xyr.sch_yr 
      FROM tbl_schools_xyr WHERE tbl_schools_xyr.sch_yr="2005_06")  AS MatchYr 
ON tbl_schools_xyr.unique_schid = MatchYr.unique_schid) 
ON Ccd_05_appended_simplified.ncessch = tbl_schools_xyr.ncessch) 
INNER JOIN 
     (SELECT IIf(IsNumeric(Right(trim([unique_schid]),1)),
                 trim([unique_schid]),
                 Left(trim([unique_schid]),Len(trim([unique_schid]))-1)) AS Num_id_code
FROM tbl_schools_xyr
GROUP BY IIf(IsNumeric(Right(trim([unique_schid]),1)),
             trim([unique_schid]),
             Left(trim([unique_schid]),Len(trim([unique_schid]))-1))
HAVING Sum(Not IsNumeric(Right(trim([unique_schid]),1)))=0)  AS ExTurn 
ON tbl_schools_xyr.unique_schid = ExTurn.Num_id_code
WHERE (((MatchYr.unique_schid) Is Null));
 
You're probably better off saving that SQL as a query of it's own perhaps called qryExTurn and using that in the query above. You might find a slight performance improvement if done this way.

Regarding the difference in results, is it possible that you're testing on an updated dataset? Perhaps some records may have been updated.
 
I tried out your suggestion of saving qryExTurn. Looks like it's only 3 seconds faster, so will probably leave inside the query for now (keep down number of queries listed in the Nav Pane). Regarding the results, since my data are static local copies, there are no updates until I am given new files for testing. In any case, the text files by year are static for everyone as those are what we are using to update our in house tables. Perhaps because the earlier attempt was using make table queries at different points in the filtration process, some records may have slipped through the cracks. Appreciate your assistance with this and I'll give an update on the results once in production.
 
If I were you I'll rather have more queries if it will run 3 seconds faster. Production has millions of records so that 3 seconds is a big deal.

With regards the leading spaces update issue, is it possible that the update actually worked but records added afterwards had spaces in them?
 
All the records still have the spaces. Very few records were added after the last time we tried the update query. I'll mention to him about the saved query, but seems like he prefers to have less in the pane and if the performance hit is minimal as in this case.
 
Looks like someone forgot to test the entire dataset after running the updates.

Well if you have more records in production (as is usually the case) that would be a good saving in the long run. Worth mentioning stressing to your Manager or client. Also mention to him/her that subqueries in Access aren't as optimised as they are in other more well-known commercial products.
 
It's a construct used in regular expressions and I'm surprised that it's not mentioned in the help files.

The support for Regular Expressions in Access is limited to ranges which barely scratches the surface of what RE matching can do.

However you can use the full RE capabilities in VBA by setting a reference to Microsoft VBScript Regular Expressions x.x

Create a public function with the RE and use the function in a query.

I posted a simple example here in post 5.
 
vbaInet, this morning I again ran several iterations of the update query (tried trim, rtrim, etc) and non of them removed the padded trailing spaces. I then created a new field and updated it based on the original field and that removed the spaces. I then ran the database documenter and discovered what I think is the reason for the spaces. I noticed that under "Attributes:" the field was set up as "Fixed Size" whereas the new field I created its "Attribute:" shows "Variable Length". I don't seem to readily see where in the properties I can change the Attribute to Variable Length.
Assuming someone knows how to change the Attribute, are there any issues resulting from this that might impact the existing database in its interactions with other tables.

  • Fixed vs Variable length of the field
  • This is a compound Primary Key field (Unique_schid, sch_yr)
  • This field (unique_schid) is also being used as a primary key in a separate table
 
Assuming someone knows how to change the Attribute, are there any issues resulting from this that might impact the existing database in its interactions with other tables.

  • Fixed vs Variable length of the field
  • This is a compound Primary Key field (Unique_schid, sch_yr)
  • This field (unique_schid) is also being used as a primary key in a separate table
I can't advise you on this, you will need to consult your technical documentation. There's must be a reason why this field was set as a fixed length field because it can only be done through code using DDL.
 

Users who are viewing this thread

Back
Top Bottom