BI consultants the hard way structuring sql and ssis solution

So you have some Job spec to implement, one or more SSIS packages and related SQL code to implement data extraction, data loading, file moving etc.

A handy approach I developed for the second part is to use the following files as starting point and expand them; I will try to give you good tips but feel free to contact me if you need more details.

The workflow is basically the following:

  1. Understand the Business SPEC
  2. Prototype the SQL solution
  3. Writing some test plan to verify correctness
  4. Deploy the SQL to QA/UAT/PROD

So let’s see point by point what we can reuse to keep a good quality and be quick

1. Understand the Business SPEC

2. Prototype the SQL solution

Well 1 & 2 are all up to you, generally speaking it’s important to clarify as much as possible before starting coding… not the way around.

  • Deploy.SupportTables.sql

Use it if you need tables to implement the logic, avoid missing newly created tables or forget new columns added, the same for support view, udf, proc etc… better to separate in several files

3. Writing some test plan to verify correctness

After some initial tests executed on your local machine, you need to automatize the pkg while approaching the test and deployment phase. I hope you are using some sort of strategy to deploy/push your new code to QA/UAT and finally PROD so this deliverable can be deployed with a simple script (ex SQLCMD)

  • Deploy.PkgConfiguration.sql

You can use it to implement a vey common sep in the SSIS package creation: define the SSIS configuration for the pkg (SSISConfigurations). Without being too formal a SSIS package always needs some way to configure itself such as changing some vars value (ex used in the connection manager); if your pkg has no Configuration it means all it’s hardcoded and this is bad as you cannot deploy the same package on different environments without code changes on it.

  • Deploy.SQLAgentJob_XYZ_NormalMode.sql
  • Deploy.SQLAgentJob_XYZ_OverrideMode.sql

A normal way to execute a package is to define sql agent jobs to call it (DTEXEC), so you can define the scheduling, setup configuration connection, override some parmas at runtime not kept in the configuration of course.

I generally use 2 versions: one for normal mode to implement the biz as usual execution of the package and a second one to reload some data called override mode (it’s quite common to be required to reload some data that was updated in a source file/db …)

The difference usually resides in the way the pkg is configured so it will act/behave in a slightly different way in NormalMode/OverrideMode

  • TEST.usp_NormalMode.sql
  • TEST.usp_OverrideMode.sql

At the end of the testing phase, the Deploy.*.sql should be ready and tested to be deployed. The TEST.*.sql are important so you can reuse them to give to the support and instruct the IT support have to reload data or escalate issues.

  • Deploy.AddAgentJobSchedules.sql

At the very end, update it with the pkg scheduling (if any)

4. Deploy the SQL to QA/UAT/PROD

The last part is the Deployment. Assuming that all the pieces to deploy are ready, you can divide logically the scripts with the following approach:

First DDL code (create/drop/update objects), then DML (fill the tables with configuration etc.) and at the end any Admin code (not related to the logical implementation but necessary sometimes Ex calling an external PROC at the end of the deployment to refresh some data…)

  • Deploy.Helper.bat

This is quite handy, it will walk thought the Deploy.*.sql looking for some text strings you want to keep in mind and listing the USE clause found, so you can be clean and organized splitting the DDL/DML/ADMIN by db, rather then putting all in a huge and unmanageable script

  • Deploy.AdminDMLDDL.Helper.bat

Checking the Deploy.Helper.txt you can call this to create empty placeholders to copy and paste Deploy.*.SQL code from 1, 2 and 3. Note no code change should happen here…

The result would be something like:

  • Deploy.DDL_DB1_deployment_task.txt
  • Deploy.DML_DB1_deployment_task.txt
  • Deploy.Admin_DB1_deployment_task.txt
  • Deploy.DDL_DB2_deployment_task.txt
  • Deploy.DML_DB2_deployment_task.txt
  • Deploy.Admin_DB2_deployment_task.txt

All the code is ready now to be deployed to the next step of the software lifecycle.

Ex

<pre>

/*
 *** TODO change
 #XYZ#
 #DB#
 #Schema.Table#
 #Schema.Proc#
 #Schema.View#
 */</code>

--**********************
 --DDL Deployment Task *
 --**********************

--*************************************************************************************************
 --Prepared By * MARIOA
 --Date * YYYYMMDD
 --Purpose * Support DDL SQL for the #XYZ# ETL job
 --Name * DDL_Deploy#DB#_rFrameSupportSQL
 --*************************************************************************************************

USE #DB#;
 GO

SET ANSI_NULLS ON
 GO

SET QUOTED_IDENTIFIER ON
 GO

-- CREATE TABLE ---

DECLARE @object_type char(2) = 'U' --User Table
 PRINT '#Schema.Table#...'

IF NOT EXISTS (SELECT *
 FROM sys.objects
 WHERE [object_id] = OBJECT_ID('#Schema.Table#')
 AND [type] = @object_type)
 BEGIN

--
 -- DML
 --

PRINT 'New table created on ' + @@SERVERNAME + ' in ' + DB_NAME() + ' database';
 END
 ELSE
 BEGIN
 PRINT 'Table already exists on ' + @@SERVERNAME + ' in ' + DB_NAME() + ' database' + ' - no action taken';
 END
 GO

-- ALTER TABLE ---

DECLARE @object_type char(2) = 'U' --User Table
 PRINT '#Schema.Table#...'

IF EXISTS (SELECT *
 FROM sys.objects
 WHERE [object_id] = OBJECT_ID('#Schema.Table#')
 AND [type] = @object_type)
 BEGIN

--
 -- DML
 --

PRINT 'Table altered on ' + @@SERVERNAME + ' in ' + DB_NAME() + ' database';
 END
 ELSE
 BEGIN
 PRINT 'Table does not exist on ' + @@SERVERNAME + ' in ' + DB_NAME() + ' database' + ' - no action taken';
 END
 GO

-- Proc ---

DECLARE @object_type char(2) = 'P' --Stored Procedure
 PRINT '#Schema.Proc#...'

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'#Schema.Proc#') AND type in (N'P', N'PC'))
 DROP ProcEDURE #Schema.Proc#
 GO

--
 -- DML
 --

PRINT 'Stored proc dropped and recreated on ' + @@SERVERNAME + ' in ' + DB_NAME() + ' database';
 GO

-- VIEW ---
 IF EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'#Schema.View#'))
 DROP VIEW #Schema.View#
 GO

--
 -- DML
 --

PRINT 'View dropped and recreated on ' + @@SERVERNAME + ' in ' + DB_NAME() + ' database';
 GO
</pre>

Some credits to Ian Bradshaw (for the normal/override mode part)

One thought on “BI consultants the hard way structuring sql and ssis solution

  1. Pingback: BI consultants the hard way from specs to ssis | Obar1 aka Mario Amatucci

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s