In this sptep by step SSIS Tutorial I will show you how to use SSIS data profiling task and for the purpose of this tutorial I will use SQL Server 2012. These steps should work in both SQL Server 2008 and 2008 R2 versions.
Prior knowledge: It might be easier for you to follow this tutorial if you have basic knowledge of creating an ssis package for more information visit how to create an ssis package.
We are going to show you how to create in SSIS 2012 data profiling task with column null ratio request.
Below I add a new SSIS package using Solution Explorer.
We can see new package is added you can see it in Solution Exlorer in SSIS Packages folder.
Once new package exists; we can add connection to this package. In this case it will be New ADO.NET connection.
NOTE: It seems SSIS 2012 Data Profiling task works only with ADO.NET connection (does not work with OLEDB connection) and accepts only SQL Tables. So if you have flat files (CSV, TXT) you will have to import data into SQL Table. You can see our other SSIS tutorials for more information how to do it.
In "control flow" I right click empty space in connection manager (bottom part) and clicked New ADO.NET Connectiob.
I got connection Mengar window. At the botom click New buttom to create new connection.
In new pop up window type Server name (I typed localhost = .) and from database drop down box I selected itcouple.
Once my new connection is added I click ok to close the connection manager window.
You can see that in Connection Managers we have new connection called itcouple.
Now that we have connection to our "source" data that should exist in SQL Table, we should create a file connection to ouput results of data profiling task before we actually use the data profiling task.
Now I add New File Connection. To do that right click on Connection Menagers and from the list select New File Connection.
In File Connection Manager Editor we can set up Usage type and in our case we want to "Create file" (if it doesn't exist) and in File text box I typed path and file name where I want data profiling results to appear.
NOTE: The file itself does not have to exist and we chose create file option. The output of profiling task is XML so we could have put file name PageNullRatio.XML but not having extension will still work.
It's time to create the actual SSIS Task with Column NULL Ratio request.
From SSIS Toolbox I drag and droped Data Profiling Task to Control Flow area.
I double clicked on Data Profiling task that displayed Task Editor and in General section I clicked Destination drop down box and select my output file which I created previously and is called PageNullRatio.
Still being in General tab I change OverwriteDestination property to TRUE
NOTE: If you don't change this property and your output file already exists than the package will fail.
Let's move on to Profile Requests section and from ProfileType drop down box I selected new request which is Column Null Ratio Profile Request.
NOTE: I think I had to click in different places before bottom part of the window displayed properties of the new task
When you have your new request selected, the bottom part of the window shows Request Properties where I select Connection Manager which will point to my "source" which is a SQL Table and in my case I select itcouple connection.
In the Table or View property I select PAGE_Page table
and in column I select star as I want to perform the request on all fields in this table.
Now I can run my SSIS data profiling task so I click the green arrow on the toolbar.
My task successfully run as I got green circle with a tick.
That is the end of this tutorial.
Later on I will add a link here to a tutorial that I sitll have to wirte that will contain instructions how to view the results of SSIS Data Profiling task which in our case was Column Null Ration request.