Export directly to UTF-8 from Access using VBA

Jimbo's picture

Since the release of ECC 6.10 the SAP R/3 system supports Unicode, a system that uses multiple bytes to represent characters and allows symbols from non-ASCII alphabets to appear in the data. While ASCII can still be used for data from America and for simple financial data, the transport of data from one system to the next is almost always handled now using Unicode.

The flavors of Unicode

Massive tomes can be written on the origin, history and evolution of Unicode. Suffice it to say that of the many standards, pages, blocks, planes and formats we need only concentrate on the most popular version of the Unicode text file format: UTF-8. This standard is compatible with nearly every Unicode-enabled system and is the standard for SAP ECC 6.

MS Access as a tool for data conversion

MS Access handles Unicode internally automatically and without issue. It is by far the most powerful and useful non-SAP tool for Legacy Data Migration. Visual Basic for Applications (VBA) is a reduced-feature version of Visual Basic and adds the functionality to Access that makes it the programmer's choice. The traditional way to export to text using VBA is with the "Open ... As" statement. This defaults the data the traditional 8-bit standard ASCII encoding. Greek, when exported as ASCII text takes the form of question marks. In fact, the same can be said of most non-Western languages exported to ASCII. It's all Greek to me.

One option for exporting data to UTF-8 from Access is to create a template that Access can use to automatically format the data when it is exported. In Access 2008 this option is almost gone and has been replaced with "Save Export". In fact, the option to choose the delimiter to use is gone and now Access adds minuses and pipes to the data. The data is mostly worthless after that.

Stepping out of MS Access for a moment

Using the CreateObject function, Access can step outside the bounds of traditional VBA. These additional features are what make exporting directly to UTF-8 possible.

Added functionality and convenience

The code below determines what is being passed as the source within Access and then, based on whether the source is a table or selection query, forms the SQL statement on the data. Supporting the direct export of queries makes it easy to sort and filter the data as it is being exported.

Thanks where it is due

The second line in the code is the URL where I found the information I needed to convert a regular export-to-text function from ANSI-only to UTF-8. It has a great explanation of how to export to "Unicode" and I changed the type to "UTF-8" and it worked! If this works for you, please give SAPLSMW a little credit. You can download this and many other tools in our aggregated Toolbox in our downloads section--just click the Downloads tab.

Function ExportToTextUnicode(strTableName As String, strFileName As String, Optional ByVal strDelim As String = vbTab) As Boolean
    'Written by Jimbo at SAPLSMW.com
    'Special thanks: accessblog.net/2007/06/how-to-write-out-unicode-text-files-in.html

    Dim rs As DAO.Recordset, strSQL As String
    Dim nCurrent As Long, nFieldCount As Long, nRecordCount As Long
    Dim RetVal As Variant, nCurRec As Long, dnow As Date, nCurSec As Long
    Dim nTotalSeconds As Long, nSecondsLeft As Long
    Dim strTest As String

    strSQL = "SELECT * FROM " & strTableName & ";"
    'Check to see if strTableName is actually a query.  If so, use its SQL query.
    nCurrent = 0
    Do While nCurrent < CurrentDb.QueryDefs.Count
        If UCase(CurrentDb.QueryDefs(nCurrent).Name) = UCase(strTableName) Then
            strSQL = CurrentDb.QueryDefs(nCurrent).SQL
        End If
        nCurrent = nCurrent + 1
    Loop
    Set rs = CurrentDb.OpenRecordset(strSQL)
    nFieldCount = rs.Fields.Count

    If Not rs.EOF Then
        'Now find the *actual* record count--returns a value of 1 record if we don't do these moves.
        rs.MoveLast
        rs.MoveFirst
    End If

    nRecordCount = rs.RecordCount
    RetVal = SysCmd(acSysCmdInitMeter, "Exporting " & strTableName & " to " & strFileName & ". . .", nRecordCount)
    'Create a binary stream
    Dim UnicodeStream
    Set UnicodeStream = CreateObject("ADODB.Stream")
    UnicodeStream.Charset = "UTF-8"
    UnicodeStream.Open

    For nCurrent = 0 To nFieldCount - 1
        If Right(rs.Fields(nCurrent).Name, 1) = "_" Then
            UnicodeStream.writetext Left(rs.Fields(nCurrent).Name, Len(rs.Fields(nCurrent).Name) - 1) & strDelim
        Else
            UnicodeStream.writetext rs.Fields(nCurrent).Name & strDelim
        End If
    Next

    UnicodeStream.writetext vbCrLf
    nCurSec = Second(Now())

    Do While Not rs.EOF
        nCurRec = nCurRec + 1
        If Second(Now()) <> nCurSec And nCurRec <> rs.RecordCount Then
            nCurSec = Second(Now())
            RetVal = SysCmd(acSysCmdUpdateMeter, nCurRec)
            RetVal = DoEvents()
        End If
        strTest = ""
        For nCurrent = 0 To nFieldCount - 1  'Check for blank lines--no need to export those!
            strTest = strTest & IIf(IsNull(rs.Fields), "", rs.Fields(nCurrent))
        Next
        If Len(Trim(strTest)) > 0 Then  'Check for blank lines--no need to export those!
            For nCurrent = 0 To nFieldCount - 1
                If Not IsNull(rs.Fields(nCurrent).Value) Then
                    UnicodeStream.writetext Trim(rs.Fields(nCurrent).Value)
                End If
                If nCurrent = (nFieldCount - 1) Then
                    UnicodeStream.writetext vbCrLf 'new line.
                Else
                    UnicodeStream.writetext strDelim
                End If
            Next
        End If
        rs.MoveNext
    Loop

    'Check to ensure that the file does't already exist.
    If Len(Dir(strFileName)) > 0 Then
        Kill strFileName  ' The file exists, so we must delete it before it be created again.
    End If
    UnicodeStream.SaveToFile strFileName
    UnicodeStream.Close
    rs.Close
    Set rs = Nothing
    ExportToTextUnicode = True
    RetVal = SysCmd(acSysCmdRemoveMeter)
End Function
Programming Language: 
VBA