Extracting varying length string between two characters

DaniLucas

New member
Local time
Today, 16:10
Joined
May 31, 2016
Messages
6
Hi. I know I've done this before, but for the life of me, can't remember how. I need to be able to extract a string that will vary in length, that will always be between to characters. For example, these are document names:

02-20-00-AI-0035
02-2006-00-HG-0137
02-21-00-JW-1001
04-2115-01-AI-1013

I only need the portions (highlighted in red text) that will always be between the first and second "-" character. This portion of the document name is the "skid number". I need to be able to extract just the skid number from the document name.

If this can be done in an expression written into the design view of a query, that would be most ideal as I am not great with VBA at all.

Thank you!!
 
Hi. Assuming the skid number always starts at the 4th character, then maybe you could try something like this:
Code:
Mid([FieldName],4,InStr(Mid([FieldName],4),"-")-1)
(untested)
 
How about using Split?
tt="02-2006-00-HG-0137"
? split(tt,"-") (1)
2006

I'd write a simple function to return the value.
 
Thank you all! TheDBGuy, your solution seems to work well. I am not sure if they'll ever change the first bit of the document name to be more than two characters in the future, but for now, it always starts with two characters before the first "-". Would there be a way to do this if they did change it, however?
 
Thank you all! TheDBGuy, your solution seems to work well. I am not sure if they'll ever change the first bit of the document name to be more than two characters in the future, but for now, it always starts with two characters before the first "-". Would there be a way to do this if they did change it, however?
Yes, we can use the initial "-" I suppose. One way to do it is by using the Split() function like Gasman suggested earlier, but it will require some VBA code.
 
Yes, use the Split function. As long as they do not move the locations of the skid number you should be OK.
In fact the function could also take the relevant part position, delimiter so it would just be changing the part number index number.

Code:
Function fnSkidNumber(strPart As String, strDelim As String, iIndex As Integer)
Dim strSplitArray() As String

strSplitArray = Split(strPart, strDelim)
fnSkidNumber = strSplitArray(iIndex)

End Function

Code:
? fnSkidNumber("02-2006-00-HG-0137","-",1)
2006
 
Last edited:
A bit late to the party, but I happened across this while looking for something. These forums are a great resource for archiving information, so even if DaniLucas no longer needs help, someone else may benefit.

Here is a routine that illustrates the expression in VBA:
Code:
Sub x()
Const t1$ = "02-20-00-AI-0035"
Const t2$ = "02-2006-00-HG-0137"
Const t3$ = "02-21-00-JW-1001"
Const t4$ = "04-2115-01-AI-1013"
Debug.Print Extract(t1)
Debug.Print Extract(t2)
Debug.Print Extract(t3)
Debug.Print Extract(t4)
End Sub

Function Extract$(x$)
Extract = Mid$(x, InStr(x, "-") + 1, InStr(InStr(x, "-") + 1, x, "-") - InStr(x, "-") - 1)
End Function
To use it in a query, as the OP asked, use the expression:
Code:
Mid([FieldName], InStr([FieldName], '-') + 1, InStr(InStr([FieldName], '-') + 1, [FieldName], '-') - InStr([FieldName], '-') - 1)
If your field names are properly constructed, you do not need the square brackets around the field name. Doing it this way also has the benefit of needing no VBA, and so is likely to considerably faster than any of the other methods mentioned above.
 
Last edited:
Yes, use the Split function. As long as they do not move the locations of the skid number you should be OK.
In fact the function could also take the relevant part position, delimiter so it would just be changing the part number index number.

Code:
Function fnSkidNumber(strPart As String, strDelim As String, iIndex As Integer)
Dim strSplitArray() As String

strSplitArray = Split(strPart, strDelim)
fnSkidNumber = strSplitArray(iIndex)

End Function

Code:
? fnSkidNumber("02-2006-00-HG-0137","-",1)
2006

To be pedantic strSplitArray should be dimmed as Variant as that's what Split returns, and good coding practice suggests the function should be returned explicitly as variant (which is what it returns by default).

(But who cares if it works!) :)
 
My normal response to these questions is to say use a RegEx, however in this example the suggestion to use Split is the perfect answer!

To use a RegEx you'd need to know the complete range of values the 'skid' could be, or extract the second item from a set of matches which Split does far more simply.
 
Last edited:
My normal response to these questions is to say use a RegEx, however in this example the suggestion to use Split is the perfect answer!

To use a RegEx you'd need to know the complete range of values the 'skid' could be, or extract the second item from a set of matches which Split does far more simply.
RegEx and Split both require VBA coding. Using a straight expression does not.

Code:
Mid([FieldName], InStr([FieldName], '-') + 1, InStr(InStr([FieldName], '-') + 1, [FieldName], '-') - InStr([FieldName], '-') - 1)
 
Using a straight expression does not
What is the expression but applied programming?

One could also argue that the expression represents a hard-coding of precise circumstances. Split is a bit more variable, and RegEx, if you can use it reasonably well, is a highly variable magic wand.
 
What is the expression but applied programming?

One could also argue that the expression represents a hard-coding of precise circumstances. Split is a bit more variable, and RegEx, if you can use it reasonably well, is a highly variable magic wand.
Semantics are not the point. You can argue all sorts of things, including the greater capability of harnessing VBA, and it would be true. But using the expression I wrote allows the JET engine to generate the result directly, with no calls to external code.

Try it yourself, on a large dataset. Calling a VBA function is MUCH slower, and you then have code in two places that needs to be kept in sync.

If you can make up an expression that JET understands and evaluates correctly, you will always be better off than invoking a VBA procedure.
 
Last edited:
Try it yourself, on a large dataset.
Don't worry. I do that often.
You're basically right about performance. But performance is relative. If I use RegEx, even an external library, there are no problems that a user would notice when querying record numbers in the 5-digit range.

these are document names
In a query with mass data, I would never use compound expressions like the ones above, but would break them down into atomic expressions and store them in separate table fields, keyword normalization.
In doing so, you would also have to isolate all five individual pieces of information, but then only once.

you then have code in two places that needs to be kept in sync
Wrong. A variable function and its maintenance is one thing, calling a function is something else entirely.
The clarity will increase in any case, in contrast to a Mid-Instr-trallala.
 
But using the expression I wrote allows the JET engine to generate the result directly
Mid, Instr and other functions are not part of Jet-SQL, but are made available via the so-called Expression Service. Therefore, VBA functions, Access functions (Nz, DCount) and self-defined functions can be integrated into queries.

If you do not notice such a transfer and the associated overhead, this does not mean that something like this does not happen.
 
You're basically right about performance. But performance is relative. If I use RegEx, even an external library, there are no problems that a user would notice when querying record numbers in the 5-digit range.
I just tested it on a table with only 20,000 records, and the difference in performance was VERY noticeable - the JET expression was essentially instant, while the VBA function took several seconds.

In a query with mass data, I would never use compound expressions like the ones above, but would break them down into atomic expressions and store them in separate table fields, keyword normalization.
In doing so, you would also have to isolate all five individual pieces of information, but then only once.
Sure, you could do that, but why bother? All you're doing is forcing the engine to compute and store completely pointless intermediate results. It adds complexity, overhead and confusion, while generating nothing positive or useful.

Wrong. A variable function and its maintenance is one thing, calling a function is something else entirely.
The clarity will increase in any case, in contrast to a Mid-Instr-trallala.
Not wrong, as you yourself illustrate with your own words. Calling a function is one thing, maintaining it is another. You have to do TWO things in TWO places, and your only reward is hugely increased run time.

Clarity is increased by keeping things in one place. Spreading them into two places, and one of them completely unnecessary to boot is not a step towards clarity.
 
Mid, Instr and other functions are not part of Jet-SQL, but are made available via the so-called Expression Service. Therefore, VBA functions, Access functions (Nz, DCount) and self-defined functions can be integrated into queries.

If you do not notice such a transfer and the associated overhead, this does not mean that something like this does not happen.
I don't know all the intricacies of JET, but I very much doubt if this statement is true. ALL other SQL languages have their own functions like Mid and Instr. Why would JET be any different? Yes, you can use custom VBA functions, but that does not at all mean that built-in functions go the same route.

If there was such an associated overhead, it would be noticeable. The fact that the performance difference is so large argues against your statement.
 
RexExp:
Interesting question: Was your test with a static RegExp instance or with instantiation per function call?
 
Last edited:
I just tested it on a table with only 20,000 records, and the difference in performance was VERY noticeable - the JET expression was essentially instant, while the VBA function took several seconds.
just curious - Access will display the first few records before getting to the last record - which in an access query datasheet is when the number of records is populated at the bottom of the window. So in relation to your test - was this the case in this observation? or when the first few records were displayed?
 
RexExp:
Interesting question: Was your test with a static RegExp instance or with instantiation per function call?

Note: The Mid/Instr variant for the task setting will still be faster than Regex (with static instance).
I did not use a RegEx for this simple test - just the VBA version of the Mid and Instr functions, so I did not even have the overhead of the RegEx machinery. Even the bare-bones VBA functions were vastly slower than pure JET. Just having to call VBA code, even when it does almost nothing, is apparently a huge amount of extra work.
 
just curious - Access will display the first few records before getting to the last record - which in an access query datasheet is when the number of records is populated at the bottom of the window. So in relation to your test - was this the case in this observation? or when the first few records were displayed?
The first few show up quickly. Getting to the end (having to call the VBA function for every record) took a long time.
 

Users who are viewing this thread

Back
Top Bottom