Friday, September 5, 2014

Compare two tables using CozyRock table difference transformation

Ever had to compare the destination and source for missing updates or inserts? well, I had to because of some load issues related to timing or source updates or whatever, we had some data missing in the warehouse and needed to do data quality checks from source to destination.
After going through developing various methods of doing this and see how time consuming it is to go through each and every record (when you have billions of records in a table), the fastest and easiest way I found out is using the cozyrock table difference transformation.

below is an example of how easy it is to use..
(i am using an example of Customers table and I am getting the fields Account and email)
I am sourcing the Account and email fields from the Customers table in Customers database into a warehouse table  "WH_customers".

Step 1: Write queries to read the Account and Email fields from Customers table and WH_customers table and use the queries in the 2 sources shown in fig 1.

/* from customers DB, since it is transactional and the record will be updated when the customer updates his email on the frontend application */

SELECT Distinct Account, email
FROM    Customers.customers WITH (NOLOCK)

/* from the warehouse table, if you do only inserts and not updates*/

With customers (account, email, srno)
AS
(
select account, email, SRno = RANK ( ) OVER ( partition by account order by insert_date desc )
from dbo.customers
)
select * from customers where srno =1

/* from the warehouse table, if you are doing updates to the emails*/
SELECT Distinct Account, email
FROM    WH_customers WITH (NOLOCK)

Fig 1


Step 2: Configure the table difference transformation as shown in the below figures.


you can create a table called exceptions table and load the exception records coming out of the New and Update flows and build reports or send emails to the business or whatever you want to do once you have exceptions in the table.


Tuesday, August 16, 2011

Rendering SSRS reports from Command Prompt

Subscriptions is one interesting topic in SSRS. We can create subscriptions that run the report, send them over the email or place the reports in any specific path. But what if the report has a parameter and there are more than 100 values for that parameter? Definitely, creating 100 subscriptions is not the best idea. Now, we can do the same thing using some VB script and render the report through command line.


Below is the VB script that I used to run the report, save the report as excel in a folder, so the user can access it.


I named the Vb script as testreportautomation4.rss and saved it in the temp folder.
now, in the command prompt, I give the path .rss file and the link to my report server.



When the above script is executed, the report will be executed for all the parameter values that were given in the script, and the excels are saved in the path that was given in the 'filename' variable.
For more effective reporting, we can create a link to the databse and get the parameter values from the database instead of hard coding in the script.

Saturday, June 11, 2011

Handling the Multi-Valued paramters using XML

When comes to effective reporting, we always come across the multi-valued parameters and handling them in a stored procedure can be done in many ways.

We can use temp tables or a separate function that splits these comma separated values etc...
There is one other way that we can handle this. That is through XML data type which improves the performance of the query with out having to call another separate execution of a function or a procedure.

Let's see how to do this..

Let's say we have a multivalued parameter named 'String' in our report and the value of that parameter is passed to the stored procedure. The values in that parameter will be comma separated. Now we can declare an XML variable in our stored procedure and assign the value of the 'String' parameter to the newly created XML parameter..



as you see in the picture, what i am doing here is I am taking the ',' as delimiter and creating an XML with those values and then I am filtering my main query using the nodes in the XML that we just created.
In case you need the other process, creating a function that splits the comma separated values and inserts them to a temp table and calling that would be the next better option.

In summary, we are not only reducing the amount of code we are writing, but also the execution time..

Also make sure that the tables you are using in the main stored procedure are properly indexed. Mainly, if you are using a String valued column in a where clause, in order to achieve better performance, that column must have an index.

Show Negative numbers in the report in parenthesis

It is a quite common situation where we need to show negative numbers in some format rather than just showing a '-' sign before them. Most common is the '(' , ')'. This is very easy to achieve.

First, let's say this is the value in the expression box for the field..



Now, close the expression box, and open the properties dialogue box.. Then in the 'general' tab, and in the value field, add the same expression which is in the expression box..



Now in the Format tab, in the Format code field, add this expression..



Now when you run the report, you can find the parenthesis around the negative numbers..

Tuesday, April 5, 2011

Connection String expression to save password in the Connection Manager in SSIS

Ever got into a situation where you are not using configuration Files for connection managers and everytime you need to enter the password in the connection manager to run the package? Obviously the package is going to fail if you schedule it as an agent job. A simple turnaround.. Use this expression in the "connection string" Expression box for the connection manager.. "Data Source=yourservername;User ID= youruserID;Password =your password ;Provider=OraOLEDB.Oracle.1" (In my example the provider is oracle) Now as the password in hardcoded in the connection string, when ever the package try's to access the connection manager, the password is available in the connection string expression, and the package will run smoothly even in the agent job.