Help with unescaping UTF-8 text

duke217

Registered User.
Local time
Today, 15:10
Joined
Jan 23, 2018
Messages
17
Hello again,

When I make an API call to an external website, it returns something like the following (example):

{"ip":"XX.XXX.XX.XX","location":{"country":"SK","region":"","city":"Rev\u00faca","lat":48.68346,"lng":20.11734,"postalCode":"","timezone":"+01:00"," etc...

I am trying to convert that so that instead of "Rev\u00faca", it stores "Revúca". The external website evidently escapes diacritics as \uXXXX sequences rather than sending the characters themselves.

Any help as to what I should be looking for here?
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 08:10
Joined
Feb 28, 2001
Messages
27,317
Found this article that has some examples and discussion that might help.


I have a similar problem but all I ever do is manually download the file, open it with NOTEPAD, do a SAVE AS in a format such as ANSI, and then process the saved ANSI version rather than the original UTF-8. In my case, diacritical marks aren't that critical.
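For anyone who wants to automate the Notepad step, here is a minimal sketch using the ADODB.Stream object, late-bound so no reference needs to be set. The procedure name and its two path parameters are made up for illustration, and "windows-1252" is assumed to be the ANSI code page you want. Characters that do not exist in that code page will not survive the conversion.

```vba
' Hypothetical helper: re-save a UTF-8 text file as ANSI (Windows-1252).
Public Sub ConvertUtf8FileToAnsi(ByVal SourcePath As String, ByVal TargetPath As String)
    Const adTypeText As Long = 2
    Const adSaveCreateOverWrite As Long = 2
    Dim stm As Object
    Dim sText As String

    ' Read the whole file as UTF-8 text
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeText
    stm.Charset = "utf-8"
    stm.Open
    stm.LoadFromFile SourcePath
    sText = stm.ReadText
    stm.Close

    ' Write the same text back out in the ANSI code page
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeText
    stm.Charset = "windows-1252"
    stm.Open
    stm.WriteText sText
    stm.SaveToFile TargetPath, adSaveCreateOverWrite
    stm.Close
End Sub
```

This only changes the file encoding. It does not touch \uXXXX escapes inside JSON text, which are plain ASCII in either encoding.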
 

duke217

Ok, so I am replying to my own question :) Here is a function that does the job.

Code:
Public Function UnescapeUTF8(ByVal StringToDecode As String) As String
    Dim i As Long
    Dim ch As String, sHex As String

    ' Nothing to decode if the string contains neither "\" nor "%"
    If InStr(1, StringToDecode, "\") = 0 And InStr(1, StringToDecode, "%") = 0 Then
        UnescapeUTF8 = StringToDecode
        Exit Function
    End If

    ' Walk backwards so a replacement never shifts positions still to be checked
    For i = Len(StringToDecode) To 1 Step -1
        ch = Mid$(StringToDecode, i, 1)
        If ch = "\" Or ch = "%" Then
            If Mid$(StringToDecode, i + 1, 1) = "u" Then
                ' \uXXXX or %uXXXX: four hex digits
                sHex = Mid$(StringToDecode, i + 2, 4)
                If IsHexString(sHex, 4) Then
                    StringToDecode = Left$(StringToDecode, i - 1) & ChrW$(CLng("&H" & sHex)) & Mid$(StringToDecode, i + 6)
                End If
            ElseIf ch = "%" Then
                ' %XX: two hex digits, taken as a single character code
                ' (multi-byte UTF-8 sequences are not reassembled)
                sHex = Mid$(StringToDecode, i + 1, 2)
                If IsHexString(sHex, 2) Then
                    StringToDecode = Left$(StringToDecode, i - 1) & ChrW$(CLng("&H" & sHex)) & Mid$(StringToDecode, i + 3)
                End If
            End If
        End If
    Next

    UnescapeUTF8 = StringToDecode
End Function

' True only if s is exactly n characters long and every character is a hex digit
Private Function IsHexString(ByVal s As String, ByVal n As Long) As Boolean
    Dim k As Long
    If Len(s) <> n Then Exit Function
    For k = 1 To n
        If InStr(1, "0123456789ABCDEFabcdef", Mid$(s, k, 1), vbBinaryCompare) = 0 Then Exit Function
    Next
    IsHexString = True
End Function

Paste that into a new module (Option Compare Database), and then you can call the function with UnescapeUTF8("UTF8String")
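As a quick check, something like the sketch below can be run from the same module. The Sub name is made up for illustration, and the input is the city value from the API response in the first post.

```vba
' Hypothetical test routine: run it with F5 or from the Immediate window
Public Sub DemoUnescapeUTF8()
    Dim sResult As String

    sResult = UnescapeUTF8("Rev\u00faca")

    ' \u00fa is code point 250 (ú), so the result should be "Rev" & ChrW$(250) & "ca"
    Debug.Print sResult
    Debug.Print (sResult = "Rev" & ChrW$(250) & "ca")
End Sub
```

The Immediate window can only show characters from the system's ANSI code page, so for letters outside it the comparison line is the more reliable check. Other JSON escapes such as \n, \" or \\ are not handled by the function and would need separate treatment.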
 
