Handle text files with Excel while maintaining data integrity

Jimbo's picture

Too often a team member will use Excel to open a source file for the purpose of changing some content. While manual manipulation of source data is taboo and should be avoided at all costs, it is sometimes unavoidable.

The data can still be handled in a way that maintains its integrity. By using an alternate way of getting the data into and out of Excel, the quality of the data can be assured and the project need not suffer a catastrophic failure. These steps are fast, simple and easy-to-use. When used as standard operating procedure, this method will save hours, alleviate frustration and ensure quality.

The first step is to open the text file in Notepad. In Explorer, find the text file and open by double-clicking on the icon for the file or on the name of the file. Be careful to avoid single-clicking two times instead of double-clicking; single-clicking two times will prompt Explorer to change the name of the file.
Source file for LSMW.

The file will open in Notepad as seen below. This data is already in the perfect format to be loaded into SAP.

  • It has no quotation marks.
  • It is tab-delimited.
  • It has no scientific notation.
  • It has leading zeros.

Source data in text format.

The next step is to copy-paste the information from Notepad into Excel. This process is almost as simple as it sounds. Start by clicking Ctrl-A (select all) and then Ctrl-C (copy). This will put the entire contents of the text file into a temporary storage location called The Clipboard.
Copy the text from notepad into the clipboard.

Now, without closing Notepad, open an instance of Microsoft Excel. In Excel we will paste the data from The Clipboard into Excel, but not before explaining to Excel how the data is to be handled. Right-click on the triangle in the upper-left corner of the spreadsheet (as shown below). In the menu that appears, choose Format Cells...
Format all the cells as text even though text will be pasted in.

When the list of available formats appears, simply double-click on Text. This tells Excel that every cell in the spreadsheet should be handled as text. Without this step Excel will attempt to intuit what format the data should take. Be default, values that consist of numeric digits will be converted to a numeric value and especially long numbers will be converted to scientific notation. Values that look like a date will be converted to date. By telling Excel to handle every cell as text we ensure that the data is not adulterated by Excel.
Format all the cells as text even though text will be pasted in.

With that done, single-click on the A1 cell so that only one cell is selected; skipping this step will cause Excel to throw an error about how the clipboard data and selected area are not the same size. Then click Ctrl-V (paste) to transfer the contents of The Clipboard into Excel. Note that the data appears in Excel without modification or integrity loss.
Text pasted into Excel for transformation

Once done editing the information in Excel, simply reverse the process by copy-pasting the data from Excel into Notepad. In Excel click Ctrl-A and Ctrl-C to copy all of the content from the Excel spreadsheet into The Clipboard.
Text selected in Excel after transformation

Hopefully, notepad is still open with all the contents selected. In this window click Ctrl-V to paste all of the data from The Clipboard into Notepad. This will replace the contents of Notepad with the information from Excel. Note that the data still has all leading zeros and no scientific notation. Also, the data has no quotation marks around the data and is tab-delimited instead of comma delimited. Using File → Save from Excel will "encapsulate" the data from each cell in quotation marks and use a comma as the delimiter instead of tab.
LSMW import text with format preserved

Preserving the integrity of the source data is that easy. Easier still is to avoid using Excel, but that's not always possible.