How to sort alphanumeric strings

Wyndham

New member
Local time
Today, 08:07
Joined
Nov 1, 2006
Messages
4
Access does not have an option to sort alphanumeric strings properly with the result that sorting a column with the following (sort of) data is almost impossible.

Flat A1
Flat A10
Room A13
Room A2
Room A21
B5
Flat 5b
10k1
10 k3
10 k12
10 k20
10 k2

etc

I need to be able to get addresses sorted correctly and they always have a mixture of alphabetic and numeric characters.

The steps I have followed to try and achieve this are as follows:

Create a matchfield containing the data for sorting (typically the street number followed by the flat number/name)

Split the matchfield into separate fields where there are spaces. "Room A13" becomes "Room", "A13"

Then split the fields/columns by separating the alphabetic and numeric portions - i.e. "Room A13" becomes "Room", "A", "13". This I have not been able to achieve successfully.

(Thinking about it the first step of separating by spaces is probably not necessary. All that is needed is to separate the numeric and alphabetic data)).

Once you have separated the data into alaphbetic and numeric content a sort is straightforward.

Why can't Access cope with what I would regard as a fairly basic requirement (i.e. to be able to sort alphanumeric strings correctly).

Does anyone have a suggestion how I can solve this problem until Access is improved?
 
it is sorting properly, it may be that you need to rethink your table structure. If you want the A21 etc value of the property to be sorted on, this needs to be the field value.

Can you split your column to have a accomodation detail, and a separate accommodation type, so you could get


A1 flat
A10 flat
A13 room
A2 room

etc

Note however that you still get problems as it is now sorted A1, A10, A2 instead of A1, A2, A10
 
Many thanks

Many thanks for the quick responses.

The data content actually comes from a government body which provides all address data in the UK and we have to deal with it as it arrives.

The list displayed is unsorted and the results should be that 2 follows 1 and goes before 10, etc.

I would love to be able to split the data but that is not easy - hopefully the link you kindly provided will help.

Wyndham
 
good thinking Brian, basically strip the number out, and concatenate the address behind the stripped out number. (I think?!)
 
I think that's correct Gemma, but it was a long time ago and the old brain cells are a bit slow these days, I just remembered it being dealt with so did a search.

brian
 
Hmm. I have dug out the db I used to test my code and noticed that I used the NZ function despite saying you didn't need it.:confused:

The following code in the grouping/sorting of the report produced the sort order on the attached

=(Val(nz([flda],0)) & Mid([flda],IIf(Val(nz([flda],0))=0,1,Len(Val([flda]))+1)))

Brian
 

Attachments

Little progress

The linked article/discussion was very interesting but unfortunately didn't solve the problem, or maybe it did and I just didn't see it.....

I was however intruiged by the GetString and GetNumeric in the attached query but unfortunately my version of Access 2000 doesn't have these functions.

I have asked another guy in the office to study the discussion and I will report back - many thanks for your help.
 
Just thought I'd give an update.

We have abandoned trying to do this in Access and have sorted the data in Excel and then imported the results back into Access.

It really is disappointing that Access doesn't have the ability to sort something like addresses properly. After all this can't be such an unusual requirement. If any Microsoft programmers see this then can they PLEASE put this on your to do list???
 
out of interest, how can you sort these addresses in Excel in any way that isnt possible in Access
 
Your problem is based in the need to parse the data that comes to you in an unsorted and non-uniform format. For what it is worth, this is one of the hardest problems on the planet. (In that to do it "right" you need to imbue your computer with considerable intelligence.)

Your idea of sorting the data using a divide and conquer method is a good first step. However, you have more items to deal with than simply to break apart the field. Did I really see that you sometimes get the data in a way that lacks even the SPACE character behind each part? That is going to be a really tough nut to crack.

As to how to parse this, the way I would do it is to build a loop that steps through the string one character at a time. As each character comes through, use a Select Case construct to classify the character according to text, digits, blanks, punctuation, and other. Then gather like characters that appear sequentially, use spaces and other characters as "breaks" in the sequence, and store all sequences that appear to be in the same class. This is just standard parsing and the routine isn't that hard to write. Once you have it, it is easy to use it again.

Say you build one of these and try to parse "Flat B2"

So you parse it and get "Flat" (as a sequence of alphabetics), " " as a sequence of spaces (length 1), "B" as a sequence of alphabetics, and "2" as a sequence of digits. Throw out the space, leaving you "Flat", "B", and "2" as the significant parts of your parsing return. This will work for every case except where the data source contains "FlatB2", where you would only get "FlatB" and "2" from the parse. This, I can't help fix.
 
There is another way to attack this problem: Create a sort code based on the field text for the query or table to be sorted by.

One of the keys to understanding how to do this and why it works is to first realize that an Access ascending text sort will sort numbers perfectly well but with one caveat: All the numbers have to be the same length.

For example:

A001
A009
A010
A099
A100

sorts exactly as we would expect but real world data:

A1
A10
A100
A9
A99

gives no satisfaction with a text sort.

Another key is understanding that inserting a character into the string does not change the sort order provided we insert the same charater into the same location in the string:

A11
A110
A1100
A19
A199

By manipulating the character(s) inserted into the field string we can change the text sort order:

A11
A19
A210
A299
A3100

It's relatively simple VBA coding to count how many digits are strung together before either the end of the string or a non-digit character is encountered. Once we know how many digits there are, inserting the text character representing that value into the string in front of the number sequence is a snap.

IOW, we are telling Access to sort all the single digit numbers first, then all the 2 digit numbers, then all the 3 digit numbers, etc.

Does it work for complex alpha-numeric strings? Oh, yeah:

A10A1A1
A1A10A1
A1A1A1
A1A1A10
A1A1A9
A1A9A1
A9A1A1

is a correct ascending text sort but is not intuitive, whereas:

Code:
tTxt	SortCode
A1A1A1	A[b]01[/b]1A[b]01[/b]1A[b]01[/b]1
A1A1A9	A[b]01[/b]1A[b]01[/b]1A[b]01[/b]9
A1A1A10	A[b]01[/b]1A[b]01[/b]1A[b]02[/b]10
A1A9A1	A[b]01[/b]1A[b]01[/b]9A[b]01[/b]1
A1A10A1	A[b]01[/b]1A[b]02[/b]10A[b]01[/b]1
A9A1A1	A[b]01[/b]9A[b]01[/b]1A[b]01[/b]1
A10A1A1	A[b]02[/b]10A[b]01[/b]1A[b]01[/b]1
is much more appealing.

Notice that I've increased the sort codes (in bold) to 2 characters. I've seen Serial Numbers and Model Numbers that come close to stringing 10 digits together, but have yet to see one that strings 99 digits together.

The other funcunality included in the code routine is a leading Zero counter and sort: We can't count leading zeros of a string of numbers as digit place holders. "09", "008", and "5" are all resolved to 1-digit numbers. By concatenating the number of leading zeros each number encountered in the string has (2 characters for every resolved number in the passed string), the routine also sorts by leading zeros. Like this:

Code:
tTxt		SortCode
DA1A		DA011A 00
DA1T-JP1	DA011T-JP011 0000
DA01T-JP1	DA011T-JP011 0100
DA001T-JP1	DA011T-JP011 0200
DA2S-DR02	DA012S-DR012 0001
DA02S-DR2	DA012S-DR012 0100
DA02S-DR002	DA012S-DR012 0102
DA002S-DR2	DA012S-DR012 0200
DA010A		DA0210A 01
DA11T-JP1	DA0211T-JP011 0000
DAS-DR2		DAS-DR012 00
F		F

Code:
Function fSortCode(sPassed As String) As String
Dim sNumTemp As String
Dim sLZTemp As String
Dim sLZCount As String
Dim iNumCount As Integer
Dim sNumCount As String
Dim iLZCount As Integer

Do While Len(sPassed)
    'is the first character a number?
    If Asc(Left(sPassed, 1)) > 47 And Asc(Left(sPassed, 1)) < 58 Then
        iNumCount = 0
        iLZCount = 0
        sNumTemp = ""
        'It's a number, Is it a leading zero?
        Do While Left(sPassed, 1) = "0" And iNumCount = 0
            'Yes - count how many leading zeros...
            iLZCount = iLZCount + 1     'Leading Zero counter
            sPassed = Mid(sPassed, 2)   'loop on next character
        Loop
        sLZCount = Trim(Str(iLZCount))
        If Len(sLZCount) = 1 Then
            sLZCount = "0" & sLZCount
        End If
        sLZTemp = sLZTemp & sLZCount
        'The first char is a number but not (or no longer) a leading zero, Count how many numbers...
        If Len(sPassed) Then
            Do While Asc(Left(sPassed, 1)) > 47 And Asc(Left(sPassed, 1)) < 58
                iNumCount = iNumCount + 1               'digit counter
                sNumTemp = sNumTemp & Left(sPassed, 1)  'remember all digits in this number
                sPassed = Mid(sPassed, 2)               'loop on next character
                If Len(sPassed) = 0 Then Exit Do
            Loop
        End If
        'Leading Zeros now counted, digits now remembered...
        If iNumCount Then 'if there are digits, include how many (iNumCount) and the digits...
            sNumCount = Trim(Str(iNumCount))
            If Len(sNumCount) = 1 Then
                sNumCount = "0" & sNumCount
            End If
            sNumTemp = sNumCount & sNumTemp
        Else
            sNumTemp = "010"    'if there are only leading zeros
        End If
        fSortCode = fSortCode & sNumTemp
    End If
'next character not a number, loop on next character in string...
If Len(sPassed) = 0 Then Exit Do
fSortCode = fSortCode & Left(sPassed, 1)
sPassed = Mid(sPassed, 2)
Loop
fSortCode = fSortCode & " " & sLZTemp

End Function
 

Users who are viewing this thread

Back
Top Bottom