Processing Complex Text Files with SQL Server

Many organizations work with files that contain header information at the top of the file before the data rows begin. This header information typically contains critical information of the file such as when it was processed, any transaction identity markers or batch ID’s, and other important information. In order to retrieve this information, IT departments usually have to write custom scripts. This makes it challenging to manage as the files can change over time. With EDIS though, processing files like this are very simple and quick to change when need be.

Consider an example file such as “bank_file.txt”. This file contains 4 key pieces of information at the top of the file before the actual data rows begin which are the run date, run time, batch ID, and wave ID. Here is an example of what the file looks like when opened:

Run Date: 2017-04-12
Run Time: 08:07AM

Batch ID: x4372

Wave ID: 411928257282

<< BEGIN DATA LOAD >>

TransactionID ProductID ReferenceOrderID ReferenceOrderLineID TransactionDate TransactionType Quantity ActualCost ModifiedDate
100000 784 41590 0 2013-07-31 00:00:00 W 2 0 2013-07-31 00:00:00
100001 794 41591 0 2013-07-31 00:00:00 W 1 0 2013-07-31 00:00:00

As you can see in this file, the first 11 lines of the file are filled with the information we need and whitespace before the actual data rows begin. To retrieve the 4 pieces of information we need, we will use EDIS function ufn_read_file_line. This function will read a specific line from a text file, simple as that. To get the run date, we can use the function as follows:


DECLARE @file_path nvarchar(1000) = 'C:\bin\stage files\bank_file.txt';
-- The run date is on line 2 of the file
DECLARE @run_dt_line nvarchar(max) = SSISDB.edis.ufn_read_file_line(@file_path,2)
-- now remove the text "Run Date: " and cast the result as a date
DECLARE @run_dt date = CAST(REPLACE(@run_dt_line,'Run Date: ','') as DATE);
PRINT @run_dt
2017-04-12

 

Simple right? Imagine if one day when you receive the file, the line on which the run date is printed changes to line 4 instead of line 2? To fix this, simply change the second argument in EDIS.ufn_read_file_line from 2 to 4, and that’s it.

Below are links to the example text file and a stored procedure to process it in full. Enjoy and remember that all features that were demonstrated in this blog are FREE in EDIS Standard edition. Download EDIS Standard today and give it a try!

Bank File
usp_import_flat_file_with_header_info

 

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *