Pragmatic Works Blog


Company blog where Pragmatic Works employees write about company happenings and the software business.


Webinar Follow-Up: SSIS Performance Tuning Techniques

Instructor: Mitchell Pearson

Click here to watch the entire webinar.

Click here to view Mitchell's blog.

Thank you everyone for attending my webinar on SSIS Performance Tuning. If you missed the presentation, you can watch it here: http://pragmaticworks.com/LearningCenter/FreeTrainingWebinars/PastWebinars.aspx?ResourceId=683

Below are some of the questions that I received from the Webinar:

Q: Can you give some performance tips in using script tasks too?

A: Yeah, don't use them! Script tasks can be really bad for performance, and many times I have been able to replace someone else's code with native SSIS components. For example, I had a situation where the client was using some very complicated .NET code to parse out columns based on logic. The package was taking over an hour to process 1 million records. I replaced it with a few conditional splits and derived columns, and it now runs in 3 minutes.

Q:  I am assuming that the file formats must be the same for all files when 
using the MultiFlatFile transform, correct?

A: You are absolutely correct. The metadata in each file must match.

Q: PW delivers a 'TF Advanced Lookup Cache Transform' component. What are the benefits of using this component over the Cache Transform covered earlier? It seems that the TF components cannot make use of the same result set when the data set is role based.

A: For basic caching of data I would use the native SSIS Cache Transform. The major advantage you get from the Task Factory component is that you can do very difficult range lookups with ease, and they will perform at a high level. Please see my blog post on this.
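
To make "range lookup" a little more concrete, here is a minimal T-SQL sketch of the kind of matching involved; the FactSales and DimPriceTier names are made up for illustration, and the Task Factory component implements the same idea inside the data flow rather than in a query:

-- Hypothetical range lookup: match each sale to the price tier whose
-- lower and upper bounds contain the sale amount.
SELECT  f.SaleID,
        f.SaleAmount,
        t.TierName
FROM    dbo.FactSales AS f
JOIN    dbo.DimPriceTier AS t
        ON  f.SaleAmount >= t.LowerBound
        AND f.SaleAmount <  t.UpperBound;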

Q: What version of SQL Server is being used?

A: I was using SQL 2012, but everything in the presentation is applicable to 
2005 and 2008.

Q: With the MultiFlatFile connection manager, can you specify specific file types?

A: Yes, the wildcard character can be anywhere in the connection string property, so you could use test*.txt to only pull in text files where the file name begins with test.

Q: Why would you ever not use table or view (fast load) option in the OLEDB 
Destination?

A: Personally, I would almost always use that option. However, with the fast load option records are committed in batches, so if there is a bad record that causes a failure you will not know which record caused the error. With the table or view option each record is committed individually, so you know exactly which record caused the failure.

Thanks again!



Writing Parametrized MDX for Reporting Services

By: Devin Knight

Click here to view Devin's blog.

Writing MDX inside of Reporting Services has been the nemesis of many report writers for far too long. If all you need is the Query Designer to create drag-and-drop datasets, then it's a piece of cake, but have you ever wondered what's going on in all the MDX that the Query Designer creates, or how the parameters it creates inside the MDX actually work? I've heard many report writers who use Analysis Services as a data source (including myself) say it is too difficult to create parameters, and that's why they use the Query Designer. In reality, the real problem is probably that the MDX the Query Designer produces makes it look a lot more difficult than it really is. If you know some rudimentary MDX, you can probably write your datasets yourself instead of using the Query Designer (though there's no problem with sticking with it if it fits your needs). The goal of this article is to guide you through writing a basic MDX statement that can be used for a report query and, more importantly, to show you how to parameterize it.

The two MDX functions that we are going to use for this are:

StrToSet

Returns the set specified in the string expression (MSDN definition). Essentially, it converts the text you write into an MDX set.

StrToMember

Returns the member specified in the string expression (MSDN definition). It converts the text you write into an MDX member.

First we will start with an example that uses StrToSet and then add to it using StrToMember. To follow along with this example you'll need the Adventure Works cube, which can be found at www.codeplex.com. I'll assume you know the basics of Reporting Services, so I won't define each component of the tool that isn't new for this example.

Create a new Report Server Project and a Report that uses the Adventure Works cube as a Data Source. Next, create a Dataset; for this example you can call it CategorySales (this Dataset should use the Adventure Works cube data source). Select Query Designer and click the Design Mode button to begin writing your query. Use the following query to return results without a parameter:

Select
[Measures].[Internet Sales Amount] on Columns
From [Adventure Works]

All this query did was return the total sales, which is useless for a report by itself. Let's say we want to add the product category that the sales belong to. If we were to do this normally in MDX it would look something like this:

Select
[Measures].[Internet Sales Amount] on Columns,
[Product].[Category].Children on Rows
From [Adventure Works]

That's not good enough for our pretend end users, though. They not only want to see the list of product category sales, but they also want it made into a parameter so they can select the category or categories they wish to view from a dropdown box. To do this, first create the parameter by selecting the Query Parameters button.

  • Assign the parameter a name. Ours will be called ProductCategory.
  • Identify the Dimension and Hierarchy it associates with. Ours will be from the Product dimension and the Category attribute hierarchy.
  • Multiple values identifies whether the parameter should accept multiple values.
  • Last, give the parameter a default value. This can be set to Bikes for our example.


After you hit OK to confirm the Query Parameter screen you will need to modify your MDX to take advantage of the parameter you have created.

Select
[Measures].[Internet Sales Amount] on Columns,
StrToSet(@ProductCategory, Constrained) on Rows
From [Adventure Works]

Here we use StrToSet to convert whatever is brought in from the parameter values selected by the end users into something MDX can understand. The Constrained flag ensures that the parameter value contains explicit member names rather than arbitrary MDX expressions.

Hit OK twice to confirm the query so far and make a simple tabular report to view your results. You will notice that a dropdown box parameter was automatically created for the ProductCategory parameter we defined. You may already be familiar with this if you have ever used the drag-and-drop interface for making parameters. Reporting Services automatically creates a Dataset for this parameter dropdown box, which you can view if you right-click on the Datasets folder and select Show Hidden Datasets.


Now let’s take this example one step further.  We will add two more parameters to our query to create a date range.  Hit the Query Parameters button again and create the parameters StartDate and EndDate that reference the Date Dimension and the Date attribute.  These can be defaulted to any date for now. 


Change the MDX to add a Where clause that restricts the dates, this time using StrToMember.

Select
[Measures].[Internet Sales Amount] on Columns,
StrToSet(@ProductCategory, Constrained) on Rows
From [Adventure Works]
Where StrToMember(@StartDate, Constrained) : StrToMember(@EndDate, Constrained)

Hit OK twice to confirm the changes to the Dataset.  Remember that these parameters will automatically create new Datasets and parameters so you should just need to preview the report to see the changes.


I hope this makes writing your own MDX with parameters inside Reporting Services a little easier. 


Webinar Follow-Up: Create a Tactical Plan Using Hadoop & Cloud Tech

Presenter: Rowland Gosling

Click here to view the entire webinar.

Click here to view Rowland's blog.

Thank you to all who attended my online workshop this morning! I had two questions I’d like to address:

Q: Can you use the ODBC Hive solution to insert data into Hadoop?

A: Not with our current technology. Hadoop data is basically read-only; there is no ODBC way to update a row, for instance. It can be updated using tools like Pig, but basically what you're doing is replacing entries.

Q: Can you use CSV/Storage Spaces?

A: I did some checking and the general answer is no. On the other hand, why would you want to? There's blob storage and plain old UNC file paths.

Thanks everyone!



Webinar Follow-Up: Getting Started with Change Data Capture

Presenter: Bradley Schacht

Click here to view the entire webinar.

Click here to view Brad's blog.

Yesterday I presented on Change Data Capture for the Pragmatic Works Free Training on the T's. I covered the basics of what Change Data Capture (CDC) is and showed how to use it. We had a good turnout, so thank you to everyone who attended. If you were unable to join or would like to see the recording you can find the link here. The Q&A can be found below.

Q: Does disabling CDC remove any objects that were added?
A: Yes, when you disable CDC on an individual table, the associated change tracking table as well as the auto-generated functions are removed. If you disable CDC on the database, all change tracking tables and auto-generated functions are removed. The wrapper functions are not removed automatically; you must remove those yourself.
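
For reference, here is a minimal T-SQL sketch of enabling and then disabling CDC on a single table; the dbo.Orders table name is just a placeholder:

-- Enable CDC at the database level (run once per database).
EXEC sys.sp_cdc_enable_db;

-- Enable CDC on a table; this creates the change tracking table and the
-- auto-generated query functions for the dbo_Orders capture instance.
EXEC sys.sp_cdc_enable_table
     @source_schema = N'dbo',
     @source_name   = N'Orders',
     @role_name     = NULL,
     @supports_net_changes = 1;

-- Disabling CDC on the table removes the change tracking table and the
-- auto-generated functions; any wrapper functions you wrote remain.
EXEC sys.sp_cdc_disable_table
     @source_schema    = N'dbo',
     @source_name      = N'Orders',
     @capture_instance = N'dbo_Orders';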

Q: How does the log reader job impact the transactions being written in the application?
A: The log reader only does reads on the transaction log. It does not lock the log or anything like that, so there should be no effect on the applications writing to the tables.

Q: Will CDC work with Simple recovery mode?
A: Yes, CDC works with both full and simple recovery modes.

Q: Can CDC be enabled on MDS (Master Data Services) databases?
A: CDC can be enabled on any database since it just reads the logs. That being said, you do need a primary key or unique index on the tables. I'm not sure I see the value of putting CDC on an MDS database, though; those tables are set up such that you need to go through the views to get data back.

Q: Does CDC work on Hecaton databases?
A: I don't see why not, although I will need to test this as I have not tried it. Ultimately there is still a transaction log involved as far as I know, so I would venture to guess yes. Don't hold me to that, though, as I haven't done any real-world work with Hekaton, so testing and a follow-up blog will be required there. You've definitely piqued my interest.

Q: What do you recommend for getting the first pull of data (initial load), date- or LSN-based queries?
A: I prefer to go the date-based route using the control table. The first time you pull data, pass in NULL for the begin date and you will get back all of the data from the beginning of time. Quick and easy.
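
If you want to see roughly what that looks like without the webinar's wrapper functions, here is a sketch of the date-to-LSN mapping the native CDC functions expect; the dbo_Orders capture instance and the dates are hypothetical:

-- Hypothetical date window; the ISNULL fallback covers "give me everything".
DECLARE @BeginDate DATETIME = '2014-01-01',
        @EndDate   DATETIME = GETDATE();

DECLARE @FromLSN BINARY(10) = ISNULL(
            sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', @BeginDate),
            sys.fn_cdc_get_min_lsn('dbo_Orders'));
DECLARE @ToLSN   BINARY(10) =
            sys.fn_cdc_map_time_to_lsn('largest less than or equal', @EndDate);

-- Return every change captured for the dbo_Orders capture instance in that window.
SELECT *
FROM   cdc.fn_cdc_get_all_changes_dbo_Orders(@FromLSN, @ToLSN, 'all');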

Q: Is CDC enabled on the source system or the destination system (Data Warehouse)?
A: CDC will be enabled on the source system. You want to leverage the deltas on the source side so you have less data to load into the warehouse and less complex logic to do on the warehouse load ETL.

Q: Can you enable CDC on a view?
A: No, CDC can only be enabled on tables.

Q: What version of SQL Server is required?
A: Enterprise Edition of SQL Server is required to use CDC. The feature is available in SQL Server 2008 and later.
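
If you are not sure where you stand, a quick check like this shows the edition and which databases already have CDC turned on:

-- Confirm the edition and see which databases have CDC enabled.
SELECT SERVERPROPERTY('Edition') AS Edition;

SELECT name, is_cdc_enabled
FROM   sys.databases;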

I also did a similar presentation last night for the Orlando SQL Server User Group (OPASS) on CDC with SSIS. That session focused on using Change Data Capture with the new SSIS 2012 components. I have consolidated the slide decks and included both SSIS projects in the download link, which can be found in the giant information box below so no one can miss it. Some notes on what is included:

  • Change Data Capture.pptx – Consolidated slide show for the two presentations.
  • Database Solution Folder – Has the database solution I use to demo CDC. It has all the scripts you should need to create the databases and populate data as well. Scripts are numbered in the order you need to run them. I tried to make some notes in the scripts so you know what is going on and why I did some of the things I did.
  • User Group SSIS – This solution uses the new SSIS 2012 and later components rather than using the wrapper functions that we created in the webinar. There is a sample package for the initial and incremental loads.
  • Webinar SSIS – This is the solution I used during the webinar. It has the old way of doing things using the SCD wizard and then the package we very quickly threw together as time was running out at the end of the webinar.

Thank you again to everyone who joined the webinar or braved the weather to come out to the User Group in Orlando. If you have any follow up questions please hit me up!



Webinar Follow-Up: Dynamic SSRS Reports Using T-SQL & MDX

Presenter: David Siska

Click here to watch the full webinar.

Click here to view David's blog.

SSRS Training Session Questions

We got a couple of questions during the Dynamic SSRS Reports presentation that I didn't have time to address during our typical end-of-session Q&A time. Those questions and my answers are below.

Q: What if we wanted to capture the total amount of time it took to run the report... 3 seconds, 3 minutes, etc.?

A: We can definitely get this kind of information from the SSRS Execution Log.
If we query the Microsoft-supported ExecutionLog3 view, we can see the start and end times of the report, as well as the milliseconds used for each step of the report processing.
In the example from the webinar, the most recent report executed from the cache, saving us quite a bit of time on data retrieval and processing. We can also see that the report did not take much time to process even from a live execution.
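
As a hedged example (your ReportServer database name may differ), a query along these lines returns the timing breakdown per execution:

-- Timing breakdown for the most recent report executions.
SELECT TOP (20)
       ItemPath,
       UserName,
       TimeStart,
       TimeEnd,
       DATEDIFF(MILLISECOND, TimeStart, TimeEnd) AS TotalMilliseconds,
       TimeDataRetrieval,   -- milliseconds spent retrieving data
       TimeProcessing,      -- milliseconds spent processing the report
       TimeRendering,       -- milliseconds spent rendering output
       Source,              -- Live, Cache, Snapshot, etc.
       Status
FROM   ReportServer.dbo.ExecutionLog3
ORDER BY TimeStart DESC;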

Q:  For a subscription report, what if I'd like to change parameters based on to whom the report is distributed? If to Joe, show Kansas data, but if to Tim show Nebraska data?

A: You need a data-driven subscription, which is not available in the Standard edition of SQL Server.
You will set this up in Report Manager using subscriptions, and the design of the report will create the options to pass in parameters.
In step three of the subscription setup process you have the option to enter a query. You can test and run this query in SSMS first, and you can certainly make a more complex query than the example used here.

Step 5 of this process is to define your parameter values. Since this report has only one parameter, the year, I selected the Year column from my previous query. Joining to a table that holds user report preferences or other data that indicates which parameters should be used for each recipient's report (e.g. a sales territory) is a great way to make use of SSRS data-driven subscriptions!
Note that while the prompt states the value is coming from the database, the available values are defined in your supplied query.
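
As a sketch of the kind of subscription query involved (the dbo.ReportSubscribers table and its columns are purely hypothetical), each row returned drives one delivery with its own parameter value:

-- One row per recipient: who gets the report and which state they should see.
SELECT  s.EmailAddress,            -- mapped to the subscription's To field
        s.StateName AS StateParam  -- mapped to the report's state parameter
FROM    dbo.ReportSubscribers AS s
WHERE   s.ReportName = 'SalesByState';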

Q: Is it true that when upgrading from 2008 to 2012 report subscriptions must be recreated manually?

A: I'm of the opinion that if it can go wrong it will, so backups and documentation are great things to have prior to a significant operation such as a version upgrade.
There are several tools and considerations for the Reporting Services portion of a version upgrade, and I'd suggest you refer to the MSDN documentation, which is far more detailed than my answer could be here.

Q: Hi, I always have a problem with cascading parameters where they are date types. Param 1 is Period No., and if the user chooses '1' then Param 2 (Date From) and Param 3 (Date To) are set (this works), but after changing Param 1 to another period, the two cascading date parameters do not get updated. This happens in any version. Thanks!

A: Great question. This is a "by design" aspect of SSRS cascading parameters and can happen with other data types too, depending on the nature of your datasets. There are several workarounds that have already been proposed by the community, and you can find the initial issue report and those workarounds described here.
Boyan Penev has a longer explanation of the issue and a workaround on his blog.


Rename SharePoint Central Administration Database

By Bradley Schacht

Click here to view Brad's entire blog.

SharePoint uses a series of SQL Server databases to store configuration information, security settings, and all the user data that has been created. The Central Administration site is no different and has its own content database that is created when running the SharePoint configuration wizard. Unfortunately, unlike when creating service applications, you do not get to choose the name of this database in the wizard; the database name defaults to SharePoint_AdminContent_<GUID>.


This can be problematic for a number of reasons. Many companies have a set of standards for database names. Other times you will want to change it simply because you have a small OCD issue (that's me). While the task is not as easy as going into SQL Server and renaming the database in Object Explorer, it isn't particularly difficult either. Just follow the steps below and you will clear up that OCD (or naming convention) issue in no time.

Note: these steps should be run from a SharePoint server. You may also need elevated permissions in certain steps to drop databases from the SQL Server.
  1. Run the SharePoint 2013 Management Shell as Administrator.
  2. Make note of what you would like the new database name to be as well as the current web application name for Central Administration.
    New Database: SharePointCentralAdmin
    Web Application: http://kerberos-sp:36000
  3. With that information run the following command. This will create a new content database named SharePointCentralAdmin in my case.
    New-SPContentDatabase -Name SharePointCentralAdmin -WebApplication http://kerberos-sp:36000
  4. The next step is to migrate the contents of the previous database (SharePoint_AdminContent_8449cb1b-4a84-4048-9425-0ec6e783ec37) to the new database (SharePointCentralAdmin). To do this we will need to pass database IDs rather than database names to the commands in the next step. First, find the IDs assigned to each database using the Get-SPWebApplication and Get-SPContentDatabase commands.
    Get-SPWebApplication -Identity http://kerberos-sp:36000 | Get-SPContentDatabase | SELECT ID, Name, WebApplication | Format-List
  5. Make note of the two IDs and which database each belongs to.
    Original Database: c87506a9-b87d-40b8-9582-aac9ee89c8f8
    New Database: f79cb9d8-8e45-4405-82c9-081f58bce7a0
  6. With the IDs in hand use the Get-SPSite and Move-SPSite commands to migrate the content from one database to the other. For each of these commands we need to feed the ID for the database. When running the command you will be prompted to confirm the action, press Y to confirm each prompt or A to accept all the prompts at the beginning.
    Get-SPSite -ContentDatabase c87506a9-b87d-40b8-9582-aac9ee89c8f8 | Move-SPSite -DestinationDatabase f79cb9d8-8e45-4405-82c9-081f58bce7a0
  7. Note that IIS must be restarted. To do so simply type IISReset into the management shell.
  8. Next, to remove the database association from the Central Administration web application run the following command and press Y to confirm each action.
    Remove-SPContentDatabase c87506a9-b87d-40b8-9582-aac9ee89c8f8
  9. Optionally confirm the association has been removed by running the same command from Step 4.
    Get-SPWebApplication -Identity http://kerberos-sp:36000 | Get-SPContentDatabase | SELECT ID, Name, WebApplication | Format-List
  10. Drop the original database with the GUID in the name using SQL Server Management Studio.
  11. Relax, because your OCD is now happy. Also, go tell the DBAs they owe you a cookie for removing the GUID; they will be happy too.

Using PowerShell to Build Your Azure VM Environment

By: Devin Knight

Click here to view Devin's blog.

In a previous post I wrote about Using a Hyper-V VHD in an Azure VM. Today I'd like to show you what my next steps were. My goal with using Azure is to create a lab environment for students that fits the following needs:

  • Easy to create new virtual environments
  • Easy to power down virtual environments
  • Easy to power up virtual environments
  • Easy to delete all virtual environments

The point of this post is to guide you through how to easily create virtual machines for multiple students or employees in a quick and easy script. By the way, I have previously posted how to solve the problem of powering up and powering down the virtual lab environments in my blog about Setting up Automation in Azure.

Creating an Image

These steps will guide you through taking what you learned in my blog about Using a Hyper-V VHD in an Azure VM and creating an image from the VHD you uploaded. You could also create an image from a Virtual Machine that you created from the Azure Gallery. Once an image is created you can then spin up as many virtual machines as you would like from that image.

  • Login to the Azure Management Portal.
  • Navigate to the Virtual Machines page and select Images on the top of the screen.
  • Select Create on the bottom of the screen to build a new image.


  • Name the image and provide the VHD URL location where you are storing your VHD. If you followed my previous blog on uploading your local VHD, this would be the storage account and container that you uploaded to. Note that you must run Sysprep on your virtual machine if you want to create an image from it.


PowerShell to Create your VMs

With the image created you can now either manually create new virtual machines from the image or use PowerShell to scale your solution better. For me, PowerShell made the most sense because I'm not trying to build just one virtual machine; in my case I actually need to build 15 identical virtual machines (for 15 students) based off the same image.

  • If you haven't already, download and install the Windows Azure PowerShell module: http://go.microsoft.com/fwlink/p/?linkid=320376&clcid=0x409
  • Launch Windows Azure PowerShell
  • Before you start using the Windows Azure PowerShell cmdlets, you need to configure connectivity between your machine and Windows Azure. One way to do this is by downloading the PublishSettings file from Windows Azure and importing it. Using the PowerShell prompt enter the following commands:
    • Get-AzurePublishSettingsFile
    • Import-AzurePublishSettingsFile "C:\SubscriptionCredentials.publishsettings"
  • Download my PowerShell script called CreateVMs.ps1. Essentially, you point this script at your image and cloud service, tell it how many VMs you want, and it loops until all the VMs you asked for are created.

param([Int32]$vmcount = 3)

# Create Azure VMs for Class
# Run in a PowerShell window by typing .\CreateVMs.ps1 -vmcount 3

$startnumber = 1
$vmName = "VirtualMachineName"
$password = "pass@word01"
$adminUsername = "Student"
$cloudSvcName = "CloudServiceName"
$image = "ImageName"
$size = "Large"
$vms = @()

for ($i = $startnumber; $i -le $vmcount; $i++)
{
    $vmn = $vmName + $i

    # Build the VM configuration, open RDP, set the admin credentials, and create the VM
    New-AzureVMConfig -Name $vmn -InstanceSize $size -ImageName $image |
        Add-AzureEndpoint -Protocol tcp -LocalPort 3389 -PublicPort 3389 -Name "RemoteDesktop" |
        Add-AzureProvisioningConfig -Windows -AdminUsername $adminUsername -Password $password |
        New-AzureVM -ServiceName $cloudSvcName
}

  • Modify any of the generic parameters I've provided in the file. You'll likely need to modify the default values of the following:
    • $vmName
    • $password
    • $adminUsername
    • $cloudSvcName
    • $image
    • $size (Optional, right now mine creates large VMs so you may want to adjust this)
  • This script also accepts one parameter that you can supply externally. The -VMCount parameter allows you to specify exactly how many virtual machines you would like based on your image. To run this script, use the following command in your PowerShell window; using the number 3 here will produce 3 virtual machines.

.\CreateVMs.ps1 -vmcount 3

  • If I were to run this script without changing any of the embedded parameters it would produce 3 virtual machines with the following names:
    • VirtualMachineName1
    • VirtualMachineName2
    • VirtualMachineName3

Once your virtual machines are created, your next step is to consider whether you want to keep them running all the time or only during business hours. Keep in mind you only pay for the time your virtual machines are running. For my scenario I wanted to turn my machines off during non-business hours and back on in the morning, which can be automated through Azure Automation. Read this post on Setting up Azure Automation.

Click here to view Devin's entire blog.



Webinar Follow-Up: What's New In SSIS 2012 & 2014

Instructor: Kathi Kellenberger

Click here to watch the full webinar.

Click here to view Kathi's blog.

Q:  Can we develop SSIS packages using SSMS or do we have to use SSDT?
A: You must use SSDT. That makes sense; it is a development tool!

Q: Any impact when upgrading from 2008 SSIS to SSIS 2014?
A: Except for just trying it out, I haven’t had the chance to do any real upgrades. It’s important to figure out if you are going to move to the new Project Deployment model and do the planning.  Once I get some more experience with this, I’ll do another webinar just about it.

Q: Can SSIS 2012 sense the environment name and set the environment accordingly?
A: I’m not sure exactly what you are getting at here. In the new project deployment model, you assign the environment.

Q: Can I run a package developed in SSIS 2012 or 2014 in a 2008 R2 environment?
A: I haven’t tried this and a quick search didn’t turn up anything helpful. My gut is telling me that it should work as long as you are using the package deployment model.

Q. Do package configurations still exist in SSIS 2012?
A. Yes, you have the choice of using the package deployment model which gives you the traditional configurations or using the new project deployment model.

Q. Is there a way to make changes to one package without redeploying the entire project?
A. No, in the project deployment model, you must deploy the entire project even if there is just a change to one package.


Webinar Follow-Up: SSIS Tales From the Road

Presenter: Anthony Martin

SQL Server Integration Services is a versatile tool that makes it easy to move and transform data. Sometimes unusual situations present themselves that leave developers scratching their heads. In this webinar,  Anthony will demonstrate how to handle some of these oddball challenges.

Click here to watch the full webinar.

Click here to view Anthony Martin's blog.

Follow-Up Questions:

Q:  Could the file cache be used between packages in the same project?

Yes, the 'Use file cache' option of the Cache Transform can be used between packages that are in the same project or even in different projects.

In order to do this, you'll need to configure the Cache Transform with the 'Use file cache' option and execute the package. This is because the Cache Transform file (.caw extension) is not created until the package is executed. Next, in the package where you want to reuse the cache file, add a Lookup transformation to a Data Flow Task. While configuring the Lookup transformation, select the 'Cache Connection Manager' connection type on the 'General' page. On the 'Connection' page, click 'New' to create a new Cache Connection Manager, select the 'Use File Cache' option, and browse to the location where the file was created by the previous package. Configure the rest of the Lookup transformation just like you normally would and you are all set!

Q:  Are Lookup Transformations better than just using Stored Procedures in the Source component?

If the connection for the source and the lookup are on the same database and correct indexing exists, then a stored procedure in the source component that performs the join for you is probably the appropriate choice. However, if the source and the lookup connections are on different database servers, or are different file types altogether, then the Lookup transformation is one of only a few options available, and it is a highly efficient transformation if configured using the default settings.
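
As a rough sketch of the stored procedure approach (all object names here are hypothetical), the source component would simply call something like this so the data flow receives already-joined rows:

-- Hypothetical source procedure: perform the "lookup" as a join on the source
-- server so no Lookup transformation is needed downstream.
CREATE PROCEDURE dbo.GetSalesWithCustomer
AS
BEGIN
    SET NOCOUNT ON;

    SELECT  s.SaleID,
            s.SaleDate,
            s.Amount,
            c.CustomerName,
            c.Region
    FROM    dbo.Sales AS s
    JOIN    dbo.Customer AS c
            ON c.CustomerID = s.CustomerID;
END;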

Q:  Could you please share the SSIS project and SSMS solution?

I’ve posted both solutions to my OneDrive.  You can access them here.


Introducing DOC xPress Server Edition

Join us for a demo of the new DOC xPress Server Edition! DOC xPress Server brings the metadata capabilities of the DOC xPress desktop application to a web application that you host, making it much easier to share documentation, lineage, and data dictionary information across your organization. Because it provides a web-based user interface, you can give access to all of your business and technical users without requiring them to have a desktop installation.

View Demo Here!


Copyright 2014 by Pragmatic Works