Skip to main content

Posts

Showing posts from September, 2014

13th Day

Data Conversion : ** The 3rd method to convert the data is through Data conversion t/r directly. **When we convert the data using type casting by derived/conditional split t/r's we can directly    perform any operations on converted data, but here when we use this t/r a new column is added    in the output. ** Then the above output can be used for further processing. Example: Try with a flat file source as input. Merge Transformation : ** Up to now we have seen only one source as input but we can have multiple sources by merge t/r. ** For merge t/r we can only have 2 i/p sources and that to of  similar structures and same data types ** The i/p to the t/r should be in sorted order based on the same column in both the i/p's Example: Take a flat file source and a oledb source, then convert the column which needs to be merged into the same data type at source level. Link the sources to sort t/r's and define the sort c...

12th Day

Conditional statement in SSIS: same as ternary operator in C ** <condition>? <true>:<false> Eg: Gender=="M" ? "Male": "Female" Multicast Transformation : This t/r creates multiple copies of i/p data. it is useful because every time we need not to go to source. ** This will improve task performance. Aggregate Transformation : It is used to convert the detailed values into aggregated data by using Aggregate functions like sum,avg,max,min,count etc. ** It is Asynchronous transformation. Example: to the i/p data from source group by deptno and perform aggregate funcs. Conversion in SSIS ; * Conversion is very important because data from flat file, excel comes to transformations in the form of string(varchar) , so we cannot perform required operations so it require to convert data. 1) Directly at source level : ** Select flat file source, right click--> show advanced editor. ** Select i/p & o/p properties tab, ...

11th Day

** Conditional split : This transformation is used to split the data based on a condition and it is also a Asynchronous transformation ** Here we check the condition for each and every record and move the data as the o/p for first matching condition, record not satisfying any condition will be moved to default group. Example: Take employee table using sort  t/r eliminate duplicate rows for Sal column and then using conditional split, split the table based on deptno and at destination add another identity column for no of employees for each dept. ** Derived Column : By using this transformation we can add or change existing column data by some expression.. ** It is synchronous transformation. Example: Add another tax column for emp table by writing expression as Sal*0.1,                 Change existing column ename as Upper(ename).  

10th Day

** Generally packages define only structure and it doesn't maintain any data i.e it contains only metadata which defines structure, so the size of the packages will be in the size of kb but process or transforms data in tb. ** Transformations: These are intermediate metadata objects placed b//n source and destination, contains an input and output link. * Synchronous transformations: No of o/p rows will be equal to i/p rows. * Asynchronous transformations: No of o/p rows will be less than or equal to i/p rows. 1. Sort: It serve the purpose similar to order by clause except here we can eliminate redundancy/ transfer directly to the sorting columns * It is used to sort the input data either ascending or descending order based on column(s) * It is a Asynchronous transformation because it eliminates duplicate rows. Example: Load the data from source into a destination with applying sorting on 2 or 3 columns.  

9th day

** For Executing a package use F5 and package enters into debugging mode,      use Shift F5 for stop debugging and once again entering into design mode. Example 2: Load data from Excel sheet into table. Example 3: Load data from one table into another table. Heterogeneous:  Means Source and Destination are of different kind and vice versa..  

8th day

Parameters tab: * It allows to manage the parameters in the package. Event Handlers tab: *It allows to work with event handlers like OnPostExecute, OnPreExecute, OnError,  OnProgress. Package: * Package is a collection of set of executables(or objects) that are used to implement   some business process. * Every package ends with .dtsx extension. Package Explorer tab: * It allows to explore the objects in the package. * Package contains the following objects:  1) Executables 2) Connection managers 3) Precedence Constraint  4)Parameters  5) Event handlers 6) variables 7) log providers Example 1: Design the package to load the data from flat file into a table 1) Drag and drop data flow task into control flow tab and double click the task 2) Drag and drop flat file source and oledb destination onto flow tab and define connection managers

7th day

SSIS Designer: ** Packages are developed using 2 ways the one is through SSIS wizard and second is through Designer. * In SSIS wizard process we create packages in Import and export data tool, Here we create packages using SQL server data tools. * Using Wizard we can develop only simple packages but by using Sql server data tools complex packages can be developed in accordance to required business logic. * Start--> SQL server data tools or open run --> devenv * Select new project and project type as integration services for SSIS application development. * New project contains different windows--  1)Solution explorer: Many projects can be made under one solution and using this we can easily             navigate through projects and under the projects contains packages 2)Tool Box: It contains items placed in the SSIS package. ** Items can be Control flow(tasks and containers) , Data flow(source, transformation, dest) 3) Propert...

6th day

Versions of SQL: From SQL server 2005 on wards complete BI came into existence. We use SQL server 2012 i.e 11.0 version. Difference Between. Difference between Sql server 2008 and 2008r2 Difference between Sql server 2008 and 2012 Installation steps: 1.Run .exe file. 2. In the feature selection select all features except Data quality services, Data quality client, Distributed         Replay controller, Distributed Replay client. 3.select default or named instance. 4.leave default for SQL server 2012, for 2008 select as "NT Authority\System" 5. Select "mixed mode authentication" for DB config, provide password, click on "Add Current user" 6.in analysis services config, click" Add current user" 7. In reporting select "install only".

5th day

Example 2: Load data from excel source into table. if we choose drop and create table option during edit mappings. every time when we run the package table gets created with the latest data and can execute any no of times. We can create a package that run any no of times w/o error by choosing "create table option" also by following procedure. Click on edit sql and edit the text as follows if not exists(select * from sys.tables where name="table name") create.... ............ It is called as [on fly table creation] Now if we run the package with table already existing, no errors occurs and data gets appended Example 3: load data from table to table. If table already exists in destination and we want to transfer data then only "append data option" is chosen . By this option if table not there in destination then error occurs when we execute the package. Limitations of import and export data wizard: 1) only one source and destinatio...

4th day

**SSIS-- GUI based ETL tool * works on client server architecture i.e -- ssis applications which are used for integrating are developed on client machine     and deployed on server machine , so that every one uses those. * Applications are developed by creating packages( collection of all items) * Packages can be designed in two ways 1)SSIS wizard( Import and export data tool):   *only simple packages are developed by import and export data tool *go to "import and export data tool" or run--dtswizard  Executing packages: 1) Open execute package utility or run--dtsexecui 2) By using command line utility--C:\>dtexec/F<path> Example 1: load data from text file into table. choose create table option--- means package runs first time w/o error, but on second execution on wards it raise error because already table exists in database

3rd class

**ETL operations- extract business data from op s/m and make necessary      transformations and Load the data **ETL tools are used perform etl operations. **This can be done using 1) Using GUI based tools : only drag and drop can be used to perform ETL     operations no need to write code    Ex: Informatica, Data stage, Oracle ware house builder, SSIS . 2) Using code based tools: Name itself specify    Ex:SAS ETL tools, Teradata utility tools. ** ETL lifecycle: 1st stage loading into staging tables w/o any transformations 2nd stage loading into dataware house tables with necessary tran...s.

2nd class

** SSAS-SQL Server analyzing service ** It is not used for analyzing, rather than it is multidimensional database. **Data in data ware house normally resists in the form of tables. **Data in table presents in 2 dimensional format, which is difficult to analyze so it        required to represent in 3 dimensional for more. ** In SSAS data is in represented in 3 or more dimensions by creating CUBES      Ex: if we want to see data from two or more tables combined we have to use               joins, rather than using that we create cubes by writing MDX query and then use       those CUBES. **SSAS is MULTIDIMENSIONAL Database. ** CUBE is a MULTIDIMENSIONAL DATABASE object. ** MDX is multidimensional expression language used to create cubes.

1st class

** MSBI---it is used for analyzing a business i.e decision making(how to increase profits, how to increase business ex: providing discounts based on less sold months) ** To analyze a business we need summarized data i.e non detailed data which comes from detailed data ** Detailed data is each and every transaction(OLTP) and it present in operational system. ** MSBI is a set of technologies used for decision making and it consists of SSIS, SSRS, SSAS. ** SSIS- SQL Server integration services , it is an ETL used to load data from operational system into dataware house **Data in operational s/m consists of different forms i.e flatfile, excel, database. We need to integrate and provide them in a single dataware house, it can be done through ETL tools. ** SSRS-SQL Server reporting services, ** User expects data to be user friendly i.e wish to see data in easy reports than writing queries again to extract from Dataware house. So Reporting services help to do this.