Expedite Table Lookups using Internal Tables
Almost any pre-validation of data in LSMW calls for a query against tables in SAP, and some of those tables are enormous, with thousands or even millions of records in older systems. Comparing the data to be loaded against documents already in the system using discrete external keys is a great way to avoid creating duplicate documents and is the best way to facilitate delta loads as obstacles to loading are exposed and removed. It often entails checking the key against fields that are not indexed, though, which can take seconds per document and make an otherwise well-developed LSMW object run for hours or even time out.
Pulling a small subset of records from the table in question into an internal table is a better solution than trying to coax a BASIS team into indexing a single free-text field for a one-time load from a legacy system. It takes a lot of memory and processing power, but because the load only runs once, few people will actually care.
Create the internal table in the Global Data area of the Field Mapping and Conversion Rules. Internal tables are often prefixed with "it_" for easy identification. For the purposes of this white paper, the MKPF table is used. The BKTXT field holds an external key added by the data conversion team and, because it is not indexed, look-ups on this field take as long as 3 seconds each, or about 42 hours for 50,000 records (50,000 × 3 seconds ≈ 150,000 seconds).
This is not the most efficient way to do this, but it only takes a few seconds longer than selecting the records directly into the table and makes it easier to understand what is happening. Here the it_MKPF internal table is defined with the same structure as the MKPF table, and a work area called wa_MKPF is created for working with the data in it_MKPF.
data: it_MKPF type standard table of MKPF initial size 0.
data: wa_MKPF like MKPF.
The second step is to populate the internal table with a small subset of data from the larger MKPF table. Few of the documents have the BKTXT field populated, so filtering on records that have content in the BKTXT field allows for a much smaller and faster subset. It's possible to reduce the subset even further by filtering on a specific prefix used for data conversion, but this bit of code alone shaves more than 95% off of the conversion process run time. Add this code to the Begin of Processing section of the LSMW Field Mapping and Conversion Rules.
select * from MKPF into wa_MKPF where BKTXT ne ''.
  append wa_MKPF to it_MKPF.
endselect.
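For comparison, the more direct variant mentioned above fills the internal table in a single array fetch without the explicit work area; a minimal sketch, assuming the same it_MKPF declaration:

* Direct variant: one array fetch instead of a SELECT/ENDSELECT loop.
select * from MKPF into table it_MKPF where BKTXT ne ''.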
Now, with this smaller subset of pertinent records, a quick search through the internal table pulls up any matches from the data file. This snippet of sample code can be added to the ABAP code for the field that is used as the discrete key; the code needs to be adjusted based on the object and the field used as the discrete key. Free-text fields in obscure, seldom-used screens are great places to store discrete keys.
loop at it_MKPF into wa_MKPF where BKTXT eq MB11S-BKTXT.
  skip_transaction.
endloop.
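If the internal-table lookup itself ever becomes the bottleneck, one possible refinement, not part of the original object, is to sort it_MKPF by BKTXT once in Begin of Processing and replace the loop with a binary READ that only checks for existence; a minimal sketch, assuming that sort has been added:

* Assumes "sort it_MKPF by BKTXT." was added to Begin of Processing.
read table it_MKPF with key BKTXT = MB11S-BKTXT
     binary search transporting no fields.
if sy-subrc eq 0.
  skip_transaction.              "Same LSMW macro as in the loop above.
endif.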
And, with just ten lines of code, the conversion with pre-validation on 50,000 records is cut from 42 hours to 42 seconds. More effort and time can be invested to shave additional seconds from the process on a case-by-case basis, but this code will most likely be used no more than once.
Update: Making code even tighter and more efficient . . .
While working for a client to identify issues with their Master Data, thousands of orphaned Customer Masters were discovered: Ship-To, Bill-To and Payer Customer Masters that were not connected to any Sold-To Customer Master.
By interrogating the Sales Order data, it was possible to identify links that should exist, but it took a lot of optimization to coax the data out of the client's production system. A series of nested loops produced timeouts because there were four million Sales Orders to comb through.
Initially, an attempt was made to select all of the VBAK records into an internal table of type VBAK, but that caused the system to crash when it ran out of memory; with the client's customizations, their VBAK table has over 200 fields and is filled with four million records. A tighter, faster, more efficient method was required.
The optimization started by reducing the internal tables to just the bare requirements. The VBAK table has only five fields that are actually required for the report and the VBPA table has just three, so defining two minimalist internal tables with just those fields was the first step.
data: begin of lVBAK occurs 0,
        VBELN like VBAK-VBELN,
        KUNNR like VBAK-KUNNR,
        VKORG like VBAK-VKORG,
        VTWEG like VBAK-VTWEG,
        SPART like VBAK-SPART,
      end of lVBAK.
data: begin of lVBPA occurs 0,
        VBELN like VBPA-VBELN,
        KUNNR like VBPA-KUNNR,
        PARVW like VBPA-PARVW,
      end of lVBPA.
data: it_TVKOV type standard table of TVKOV initial size 0,
      wa_TVKOV like TVKOV,
      it_KNA1 type standard table of KNA1 initial size 0,
      wa_KNA1 like KNA1,
      it_VBPA type standard table of VBPA initial size 0,
      wa_VBPA like VBPA.
data: cString(40) type c,
      lv_popup_ans type string.
parameters: p_File type string.
The p_File parameter is the output file for the report, which is written to the local computer as the program runs. Writing report deltas to the local hard drive is a great way to recover the report in the event of a crash.
The second step was to populate the minimalist internal tables with the appropriate data from the database. Using the into corresponding fields of addition ensured that only the required fields defined in the first step were populated. Because the program uses nested loops instead of nested selects, the CheckRuntime code can be called at any time. The CheckRuntime snippet of code prevents the system from throwing a "run-time exceeded" error by calling a pop-up on the user's screen.
perform CheckRuntime. "Anchor the initial start time . . .

select * from TVKOV into table it_TVKOV.
select * from VBAK into corresponding fields of table lVBAK.
select * from VBPA into corresponding fields of table lVBPA.
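The CheckRuntime subroutine itself is not shown here; a minimal sketch of one possible implementation, reusing the lv_popup_ans variable declared above, uses GET RUN TIME and the standard POPUP_TO_CONFIRM function module to force a round trip to the front end before the dialog time limit is reached. The ten-minute threshold and the pop-up texts are assumptions.

form CheckRuntime.
* Sketch only: threshold and texts are assumptions, not the original code.
  statics: lv_start type i.
  data:    lv_now   type i.

  get run time field lv_now.        "Microseconds since the first call.
  if lv_start is initial.
    lv_start = lv_now.              "Anchor the initial start time.
    return.
  endif.

* After roughly ten minutes, ask the user to continue; the interaction
* with the front end keeps the dialog work process from exceeding its
* maximum run time.
  if lv_now - lv_start > 600000000.
    call function 'POPUP_TO_CONFIRM'
      exporting
        titlebar      = 'Long-running report'
        text_question = 'Continue processing?'
      importing
        answer        = lv_popup_ans
      exceptions
        others        = 1.
    lv_start = lv_now.
  endif.
endform.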
The last step was to loop through the internal tables to find where in the Sales Orders the Sold-To Customer Masters had been associated with the other Customer Masters. Sorting, de-duplicating and writing the report out to the hard drive are written into the loop.
loop at it_TVKOV into wa_TVKOV.
  loop at lVBAK where VKORG eq wa_TVKOV-VKORG
                and VTWEG eq wa_TVKOV-VTWEG.
    loop at lVBPA where VBELN eq lVBAK-VBELN.
      if lVBPA-KUNNR ne lVBAK-KUNNR.
        wa_VBPA-KUNNR = lVBPA-KUNNR.
        wa_VBPA-PARVW = lVBPA-PARVW.
        wa_VBPA-PARNR = lVBAK-KUNNR.
        concatenate lVBAK-VKORG lVBAK-VTWEG lVBAK-SPART
          into wa_VBPA-LIFNR separated by ','.
        append wa_VBPA to it_VBPA.
      endif.
      perform CheckRuntime. "Prevent timeouts.
    endloop.
  endloop.
  sort it_VBPA.
  delete adjacent duplicates from it_VBPA.
  CALL FUNCTION 'GUI_DOWNLOAD'
    EXPORTING
      FILENAME              = p_File
      FILETYPE              = 'ASC'
      APPEND                = 'X'
*     CODEPAGE              = 'IBM'
      WRITE_FIELD_SEPARATOR = 'X'
      HEADER                = '00'
      TRUNC_TRAILING_BLANKS = 'X'
    TABLES
      DATA_TAB              = it_VBPA
    EXCEPTIONS
      FILE_WRITE_ERROR      = 1
      OTHERS                = 2.
  refresh it_VBPA.
endloop.
Update: Optimizing even further . . .
It turns out that combing through the partner functions in four million Sales Order records using internal tables was much slower than teasing the data out of the system using a table join. A tight internal table with only the necessary fields was enough to hold the requisite records and a single query populated the table in dozens of seconds instead of millions of seconds.
data: begin of it_outData occurs 0,
        KUNNR like VBAK-KUNNR,
        KUNN2 like VBPA-KUNNR,
        PARVW like VBPA-PARVW,
        VKORG like VBAK-VKORG,
        VTWEG like VBAK-VTWEG,
        SPART like VBAK-SPART,
      end of it_outData.

select VBAK~KUNNR VBPA~KUNNR VBPA~PARVW VBAK~VKORG VBAK~VTWEG VBAK~SPART
  into table it_outData
  from VBAK join VBPA on ( VBAK~VBELN eq VBPA~VBELN )
  where VBAK~KUNNR ne VBPA~KUNNR.

sort it_outData.
delete adjacent duplicates from it_outData.

CALL FUNCTION 'GUI_DOWNLOAD'
  EXPORTING
    FILENAME              = p_File
    FILETYPE              = 'ASC'
    APPEND                = 'X'
    WRITE_FIELD_SEPARATOR = 'X'
    HEADER                = '00'
    TRUNC_TRAILING_BLANKS = 'X'
  TABLES
    DATA_TAB              = it_outData
  EXCEPTIONS
    FILE_WRITE_ERROR      = 1
    OTHERS                = 2.