Convert RPT files into Excel or Text using VBA

Jimbo's picture

On a large enough project, requests for extracts too large to handle by copy-pasting in the clipboard will arise and need to be exported to an RPT file, but users don't know how to handle RPT files and have a reasonable expectation that the data will arrive in an easy-to-use format like Excel. The necessity that was the mother for the invention of this function was a MARD extract that could not be copy-pasted from SSMS to Excel and wound up being a 3GB RPT file that nevertheless had to make it into the hands of Excel users to analyze and validate loaded data.

Analysis in ExcelThe RPT format is both delimited (by space) and fixed width (padded by spaces) making attempts to import it into Excel very cumbersome with even the most meticulous efforts. Data in columns are shifted left and right and if the number of rows exceeds the maximum allowed by Excel then it throws an error and ignores all but the first million or so rows.

One reason why RPT files are so large is the padding in the form of spaces that makes the data fixed-width. These two snippets drastically reduce the file size by removing the padding.
Out of Memory when trying to copy to clipboard.

Making this task particularly easy is the guide (or template) that is included on line 2 of the RPT file. Where the delimiter is to be applied, a space exists breaking up the long line of minus signs.
The guide has spaces where the fields are broken in the rows.

The code in this function to convert a space-delimited fixed-width RPT file to a tab-delimited TXT file is simple enough to include here in just one piece. The various steps performed in this simple snippet are detailed in the remarks such that no further explanation is required.

Function RPTtoText(strFileIn As String, strFileOut As String) As Boolean
    Dim strGuide As String
    Dim strInput As String, strOutput As String, nCurrent As Long
    
    Rem *** Get guide ***
    Open strFileIn For Input As #1
    Input #1, strGuide
    Input #1, strGuide
    Close #1
    
    Open strFileIn For Input As #1
    
    Open strFileOut For Output As #2
    Do While Not EOF(1)
        Line Input #1, strInput
        If strInput <> strGuide Then
            strOutput = ""
            For nCurrent = 1 To Len(strInput)
                If Mid(strGuide, nCurrent, 1) = " " Then
                    strOutput = Trim(strOutput) & vbTab
                Else
                    strOutput = strOutput & Mid(strInput, nCurrent, 1)
                End If
            Next
            Print #2, Trim(strOutput)
        End If
    Loop
    Close #1
    Close #2
End Function

ExcelThe code to convert a space-delimited fixed-width RPT file to Excel is a little more complex, but documented with remarks well enough to obviate the need for further documentation. A great deal of testing and analysis was required to make this as streamlined an efficient as possible.

Originally, this was recycled code from the RPT → TXT function above which knit in the tabs and then split the tab-delimited row into a string array before individually populated the Excel cells one at a time. The optimization led to the space-delimited RPT row being broken into an existing string array using offset and length values and then populated into a range of Excel cells in a single operation; these optimizations reduced processing time by more than 90%.

Function RPTtoExcel(strFileIn As String, strFileOut As String) As Boolean
    Dim objExcel As Excel.Application
    Set objExcel = CreateObject("Excel.Application")
    Dim wb As Excel.Workbook
    Dim ws As Excel.Worksheet, ws2 As Excel.Worksheet
    Dim nRecordCount As Long, nCurSec As Long
    Dim RetVal As Variant, nCurRec As Long, dnow As Date
    Dim nTotalSeconds As Long, nSecondsLeft As Long
    Dim nGuide(999, 2) As Long, nTotalColumns As Long
    
    Dim strGuide As String, strHeaders As String, strValues() As String, strValue As Variant
    Dim strInput As String, strOutput As String, nCurrent As Long, nCurrentLine
    Dim nCurrentRow As Long, nCurrentColumn As Long
    
    On Error Resume Next
    RetVal = SysCmd(acSysCmdInitMeter, "Transferring " & strFileIn & " to " & strFileOut & ". . .", nRecordCount)
    
    Rem *** Get Guide ***
    Open strFileIn For Input As #1
    Input #1, strHeaders
    strHeaders = Mid(strHeaders, 4, 9999) 'Skip three bytes of garbage
    Input #1, strGuide
    Do While Not EOF(1) 'Count the number of records...
        Input #1, strInput
        nRecordCount = nRecordCount + 1
    Loop
    Close #1
    RetVal = SysCmd(acSysCmdInitMeter, "Converting " & strFileIn & " to " & strFileOut & ". . .", nRecordCount)
    
    Rem *** Build the numeric guide. ***
    'This works as an offset and length system to show how to break up
    'each fixed-width line.
    nGuide(1, 1) = 1 'Start with the offset for the first column (1 of course!)
    strGuide = strGuide & " " 'Trigger addition of the last column.
    For nCurrent = 1 To Len(strGuide)
        If Mid(strGuide, nCurrent, 1) = " " Then 'Split here
            nTotalColumns = nTotalColumns + 1
            nGuide(nTotalColumns, 2) = nCurrent - nGuide(nTotalColumns, 1)   'Length does not include current space.
            nGuide(nTotalColumns + 1, 1) = nCurrent + 1                      'Set the offset for next column.
        End If
    Next

    Open strFileIn For Input As #1
    
    Rem *** Start with a fresh spreadsheet ***
    objExcel.DisplayAlerts = False
    Set wb = objExcel.Workbooks.Add
    wb.Worksheets(3).Delete
    wb.Worksheets(2).Delete
    wb.Worksheets(1).Name = strFileIn
    Set ws = wb.Worksheets(1)
    ws.Name = "Sheet 1"
    
    nCurrentRow = 1
    nCurrentLine = 0
    nCurSec = Second(Now())
    Do While nCurSec = Second(Now()) 'Get to next second...
    Loop
    nCurSec = Second(Now())
    Rem *** Skip the headers and guide ***
    Line Input #1, strInput
    Line Input #1, strInput
    
    Do While Not EOF(1)
        Line Input #1, strInput
        nCurRec = nCurRec + 1
        If Second(Now()) <> nCurSec And nCurRec < nRecordCount Then
            nCurSec = Second(Now())
            nTotalSeconds = nTotalSeconds + 1
            If nTotalSeconds > 3 Then
                RetVal = SysCmd(acSysCmdUpdateMeter, nCurRec)
                RetVal = DoEvents()
            End If
        End If
        strOutput = ""
        If nCurrentRow = 1 Then
            For nCurrent = 1 To nTotalColumns
                ReDim Preserve strValues(nCurrent - 1)
                strValues(nCurrent - 1) = Trim(Mid(strHeaders, nGuide(nCurrent, 1), nGuide(nCurrent, 2)))
                ws.Range(FindExcelCell(nCurrent, nCurrentRow)).NumberFormat = "Text"
                ws.Range(FindExcelCell(nCurrent, nCurrentRow)) = strValues(nCurrent - 1)
                ws.Range(FindExcelCell(nCurrent, nCurrentRow)).Font.Bold = True
                ws.Range(FindExcelCell(nCurrent, nCurrentRow)).Interior.Color = RGB(222, 222, 222)
            Next
        End If
        
        nCurrentRow = nCurrentRow + 1
        
        For nCurrent = 1 To nTotalColumns
            strValues(nCurrent - 1) = Trim(Mid(strInput, nGuide(nCurrent, 1), nGuide(nCurrent, 2)))
        Next
        Rem *** Next line blasts values into a range that fits the data.  For example, if the
        Rem *** nCurrentRow is 17 and there are 24 columns then the range is "A17:X17".
        ws.Range(FindExcelCell(1, nCurrentRow) & ":" & FindExcelCell(nTotalColumns, nCurrentRow)).NumberFormat = "Text"
        ws.Range(FindExcelCell(1, nCurrentRow) & ":" & FindExcelCell(nTotalColumns, nCurrentRow)) = strValues()
        If nCurrentRow = 1000000 Then
            Rem *** Start a new tab once we hit a million rows. ***
            Set ws = wb.Worksheets.Add(after:=wb.Worksheets(wb.Worksheets.Count))
            ws.Name = "Sheet " & wb.Worksheets.Count
            nCurrentRow = 1  'Start at the top and add header row.
        End If
        
    Loop
    Close #1
    'File must be created.  Save and then close without saving.
    objExcel.Workbooks(1).SaveAs (strFileOut)
    objExcel.Workbooks(1).Close (False)
    'Cleanup...
    Set wb = Nothing
    Set ws = Nothing
    Set objExcel = Nothing
    RetVal = SysCmd(acSysCmdRemoveMeter)
End Function

Finally, the FindExcelCell is recycled from this snippet designed to export tables and queries from Access directly to Excel. It returns an alphanumeric value like AJ3487 when x and y coordinates are passed into it.

Function FindExcelCell(nX As Long, nY As Long) As String
    Dim nPower1 As Long, nPower2 As Long
    nPower2 = 0
    If nX > 26 Then
        nPower2 = Int((nX - 1) / 26)
    End If
    nPower1 = nX - (26 * nPower2)
    If nPower2 > 0 Then
        FindExcelCell = Chr(64 + nPower2) & Chr(64 + nPower1) & nY
    Else
        FindExcelCell = Chr(64 + nPower1) & nY
    End If
End Function

The results speak for themselves. Here one can see where the original RPT file is almost 4GB, but the resulting lossless TXT file is one-fifth so big and the Excel file is one-sixth so big.
The RPT file size is greatly reduced when converting to TXT and Excel.

Here the tabs created during the load are shown with one million lines in each tab. This circumvents Excel's 1,048,576 row limitation by breaking the data into pieces, each with 999,999 records and a header row.
The Excel file has multiple tabs, each with a header row and 999,999 records.

To call these in VBA, simply create a sub and call the functions just like this. These can be included in Macros very easily, too!

Sub FixRPTFiles()
    Dim x
    x = RPTtoExcel("c:\stuff\RC_11_MLAN_PBC_07132020.rpt", "c:\stuff\RC_11_MLAN_PBC_07132020.xlsx")
    x = RPTtoText("c:\stuff\RC_02_MARC_PBC_07132020.rpt", "c:\stuff\RC_02_MARC_PBC_07132020.txt")
End Sub

Download:

These functions are included in Jimbo's VBA Automation Toolbox as of version 1.3 updated July 2020. At least something good came out of 2020.

Note:

This snippet of code calls for the Excel Objects Library to be referenced when using Access, so ensure that the version available is referenced as shown below. When using Excel, some subtle code change may be required as this code has not been tested using VBA in Excel.

Programming Language: 
ABAP