I’m continuing with the basic terminologies to get us going into knowing the details of the SSIS tool in general. In the previous tutorial, I’ve introduce the interface of the SSIS platform. We discussed the control flow, data flow, event handlers and a package it self. As we saw in the last tutorial, the control flow is like the conductor of the train. It orchestrates the execution of the package. One important thing to note is there are many items available at the control flow level, data flow level and  event handlers level, however each level has items specific to functions available at that level. l’ll not be able to go through all the items at various levels today but as we go forward, we will eventually touch every item in this wonderful tool. So to get this conversation going, I decide to give  a general information about the items at each of the levels.

Note, you can’t learn the tool without having it available, so get it installed before you even worry reading this tutorial because there are many things l’m not going to mention here but you can quickly observe it. For example, click on the control tab->view->toolbox. You will realize the many tools available here. I will take each of these tools and briefly explain what it is used for and how to use it. The control flow tab has control flow items, maintenance plans and some general items. The control flow items helps to access high level tasks such as Data Flow Task, Bulk Insert Task, Execute Package Task etc. The maintenance plans helps in doing some regular maintenance works in our database, example backing up the database. Expand each of these sections in the toolbox to see the many tasks available.

When we move to the Data Flow tab, we encounter another range of  tools separated into three major groups. Data Flow Sources are tools that help you point to a source where you are going to extract data. These ranges from database, xml files, excel files etc. Depending on  where the data resides, you’ll use one of these tasks to point at it in other to extract it from there. The next section is the Data Flow Transformations, these are the tools that help us to change data, add a new column, convert the data, clean up the data etc. These section is very important and it  required detail understanding to effectively accomplish your transformation task. The next section is the Data Flow Destinations, these contains tasks that help you load data to the target and as in the source, these varies. As part of the platform, you can install or add plugins such as SharePoint List Adapter which can help your SSIS package to access lists on SharePoint to read data or write data. We will talk a little bit more about these kind of stuff when we get better understand of the tool.

Another important terminology to keep in mind is Variables. Variables as we know in programming is the same in SSIS, they hold a place where we can dynamically place different values and access at run time. For example, you can have a variable age which holds the age of a person based on the date of birth you are reading. So to dynamically get the data for each person, you can have a procedure calculate the age and give it to the age variable and do whatever you want to do with it. We will look into detail how to create variables and the data types for variables and their meaning. As l said, these is a general view of the tool and we will eventually go into detail to learn about every detail necessary to get our job done. Variables can be global or local to a task, a global variable is declared at the control flow level and not specific to any task. A global variable is available to all tasks in the package. To create a global variable, click control flow tab, right click any empty space in the window, and choose variables. The variables window will open and you can create your global variable. A local variable on the other hand is specific to a task  or container in the package. To create a local variable, first click on the task and perform the step we did above to create the global variable.

The last thing I’m covering in this section is connection managers. Connection managers are so important that, you can’t do anything without them. They are the communication link to the sources and destinations of any ETL process. They help you to access the files, databases that are needed for extraction and loading. Without them, you can’t access those sources and you can’t load to any destination. There are many connection managers but the most general ones are OLEDB Connection manager, Excel connection manager, XML connection manger, Flat file connection manager, analysis service connection manager etc. We will start seeing these in use when we start creating our packages but for now, just know what is a connection manager.

Questions, comments and suggestions welcome.

Peace.

Leave a Reply

Your email address will not be published. Required fields are marked *

Name *