Pragmatic Works Blog


Company blog where some of our Pragmatic Works employees write about company happenings and the software business.


Big Data Needs to Mean Big Insights

Recently the focus on increased insights and data-driven decisions has become paramount, from the boardroom to marketing and even to the production floor. Those insights and decisions were promised by the world of “Big Data”. The hype around big data is unmistakable. You can’t miss it. I think Big Data will perform at the Super Bowl halftime show. Well, maybe not, but you can bet that the stadium, advertisers, teams and vendors are all using big data technologies to be their best for that big game.

I speak with customers that are driven to implement a big data solution, and yet I’ve never seen more people working to learn and implement a technology with so little definition on how they plan to leverage it for their business. They know a few things…

1. Big Data might allow them to increase the value and variety of decisions they make.
2. These technologies might provide a way to get around the limitations of traditional BI systems.
3. They believe it to be a cost effective and rapid time-to-value option
4. They believe their team can adapt and learn the tools quickly
5. Their superiors are counting on it to provide insight into business growth and efficiency opportunities


These sound good in theory, so let’s take them one by one.

1. Big Data might allow them to increase the value and variety of decisions they make.

The foundation of Big Data is a distributed platform and file system such as MapReduce/YARN/Tez and HDFS. These allow you to work with data that is less traditional in form and volume, and through that you may be able to analyze a wider variety of things, possibly faster than if all the data had to go through the same normalization and ETL process. That means we need a properly sized Hadoop platform with all the right tools and a methodology for how we are going to approach these types of data challenges. This takes time and trial and error to develop without guidance from a partner who has been there before. This is why many of our clients are reaching out to get help finding that leg up on accelerating their big data insights process.

2. These technologies might provide a way to get around the limitations of traditional BI systems

I’m not sure about this one. I do believe that in many cases we can deconstruct and prepare data for analysis in a more agile way using some of the tools that the Hadoop ecosystem provides. That said, we have only prepared the data for analysis; we have not actually done anything useful with it. That will require a client tool like Power BI or another analytics platform. Our new data will likely need to be aligned with corporate enterprise BI data to be really useful, so once we’ve found the valuable pieces of that data we end up combining that insight with, and often integrating it into, our enterprise BI environment. So often we’re not getting around limitations, but becoming more agile in our methods of identifying data that’s valuable in the context of our business, and then being more efficient in the size and scope of the data that we eventually need to integrate into our enterprise BI solutions. This helps us control data warehouse sprawl and keeps the EDW valuable for the business.

3. They believe it to be a cost-effective and rapid time-to-value option

Some companies think they will deploy an amazing big data solution on hardware they have lying around. This is fine for a prototype, but the only real way to deploy big data in a cost-efficient, rapid and scalable way is to look to the cloud. The cloud provides a scalable set of services, infrastructure and storage that allows companies to explore big data’s capabilities and value without building out more data center floor space. Because cloud storage and compute are separate, you can take a multi-tenant approach to big data that is virtually impossible to deliver on-premises. This allows for scale-on-demand approaches, saving significant dollars and administrative overhead. Automation capabilities in cloud platforms provide this functionality and allow it to be controlled either by central IT or by the business analysts as needed. Pragmatic Works deploys 90% of our customers’ big data solutions in Microsoft’s Azure cloud for rapid time-to-value and a much better 12-month ROI.

4. They believe their team can adapt and learn the tools quickly

The most significant point of confusion when investing in big data is around all the different tools, services and components that make up the ecosystem. Check out here for more on that. There is Hadoop, Pig, Hive, Oozie, Flume, Storm, HDInsight, Cloudera, Impala, and so many others. Companies don’t know which way to go and are nervous about getting locked into a vendor or a direction. Sometimes they even go so far as to try the completely open source route, thinking that will be better, with no vendor to get locked into. The problem is that without an enterprise behind your enterprise solution, you are left without support, training and a system for upgrades and improvements to the product. That puts a lot of pressure on your team. Many teams are already working full time in their current roles and don’t have time to integrate 20 new technologies into their daily life. The cloud comes to the rescue again with products like Microsoft Azure’s HDInsight, a managed big data service in the cloud. This service can be scaled as needed and combines with Azure’s other secure cloud services (they have added over 200 new services in the last year, wow). It provides all the core Hadoop ecosystem components, NoSQL options, virtual infrastructure and tons of integration options for your applications and enterprise systems. We are seeing customers excited about realizing their big data vision by leveraging a managed platform like this. Best of all, since it’s the cloud, you only pay for what you use. You’re still depreciating those old servers in the data center, aren’t you? I thought so.

5. Their business is counting on it to provide insight into business growth and efficiency opportunities

Now that you’ve got this amazing, distributed, scale-on-demand big data platform in the cloud, what are we doing with the data? Herein lies the big secret of big data: it doesn’t solve all your problems. You still need to do that. What a good big data implementation will allow you to do is iterate through data organization, curation and publication faster, and provide it as a scalable source for more in-depth analysis. Enter machine learning and predictive analytics. These technologies work with the big data ecosystem (and other data) to help you make decisions about what you expect your business, customers or market to do, allowing your stakeholders to plan more appropriately and effectively. These are the technologies that consume this data and provide real insight. Many companies are looking for a “Data Scientist” to drive these initiatives, but Microsoft has done a nice job of working with their internal data scientists and their Azure teams to create a managed and scalable Azure Machine Learning service that really does a nice job of democratizing this technology and making it accessible to companies willing to explore it. Pragmatic Works is doing a lot in this space, helping clients adapt from traditional BI to a more forward-looking approach using predictive analytics. Our customers are seeing significant value and ROI from their investments here. Remember, cloud = scalable, pay for what you use, etc., so this fits that model without having to hire a team of PhDs to run it for you. In many cases we’re able to train the existing team to leverage this platform. That is very exciting for our clients.

As you can see, Big Data is an exciting technology development, but as we watch companies adopt these technologies, we are paying careful attention to how they can be successful and not just initiate another overloaded data project. Our techniques and cloud adoption have resulted in our initiatives having a significantly higher adoption rate than the industry average.

If you want to find out how to take your business forward into a more insight driven world, contact us here at Pragmatic Works.

See you out there…

Adam Jorgensen

President, Pragmatic Works Consulting

Twitter: @ajbigdata


Webinar Follow-Up: SSIS Performance Tuning Techniques

Instructor: Mitchell Pearson

Click here to watch the entire webinar.

Click here to view Mitchell's blog.

Thank you everyone for attending my webinar on SSIS Performance Tuning. If 
you missed the presentation, you can watch it here: http://pragmaticworks.com/LearningCenter/FreeTrainingWebinars/PastWebinars.aspx?ResourceId=683

Below are some of the questions that I received from the Webinar:

Q: Can you give some performance tips in using script tasks too?

A: Yeah, don’t use them! Script tasks can be really bad for performance, and 
many times I have been able to replace someone else’s code with native SSIS 
components. For example, I had a situation where the client was using some very 
complicated .NET code to parse out columns based on logic. The package was 
taking over an hour to process 1 million records. I replaced this with some 
conditional splits and derived columns and it now runs in 3 minutes.

Q:  I am assuming that the file formats must be the same for all files when 
using the MultiFlatFile transform, correct?

A: You are absolutely correct. The metadata in each file must match.

Q: PW delivers a 'TF Advanced Lookup Cache Transform" component.  What are 
the benefits of using this component over the Cache Transform covered earlier?  
It seems that the TF components cannot make use of the same result set when the 
data set is role based.

A: For basic caching of data I would use the native SSIS cache transform. The 
major advantage you get from the Task Factory component is you can do very 
difficult range lookups with ease and  they will perform at a high level. Please 
see my blog post on this.

Q: What version of SQL Server is being used?

A: I was using SQL 2012, but everything in the presentation is applicable to 
2005 and 2008.

Q: With the MultiFlatFile connection manager, can you specify specific file 
types?

A:  Yes, the wildcard character can be anywhere in the connection string 
property. So you could do test*.txt to only pull in text files where the file 
name begins with test.

Q: Why would you ever not use the table or view (fast load) option in the OLE DB 
Destination?

A: Well, I personally would always use that option. However, with the fast 
load option all records are committed for the entire batch, so if there is a 
bad record that causes a failure you will not know which record caused the 
error. With the table or view option each record is committed individually, so 
you know exactly which record caused the failure.

Thanks again!



Writing Parametrized MDX for Reporting Services

By: Devin Knight

Click here to view Devin's blog.

Writing MDX inside of Reporting Services has been the nemesis of many report writers for far too long.  If all you need is the Query Designer to create drag-and-drop datasets, then it’s a piece of cake, but have you ever wondered what’s going on with all the MDX that the Query Designer creates?  Or how the parameters it creates inside that MDX actually work?  I’ve heard many report writers who use Analysis Services as a data source (including myself) say it is too difficult to create parameters, and that’s why they use the Query Designer.  In reality, the real problem is probably that the MDX the Query Designer produces makes it look a lot more difficult than it really is.  If you know some rudimentary MDX you can probably write your datasets yourself instead of using the Query Designer (though there’s no problem with sticking with it if it fits your needs).  The goal of this article is to guide you through writing a basic MDX statement that can be used for a report query and, more importantly, show you how to parameterize it.

The two MDX functions that we are going to use for this are:

StrToSet

Returns the set specified in the string expression (MSDN definition).  Essentially, it converts the text you write into an MDX set.

StrToMember

Returns the member specified in the string expression (MSDN definition).  Essentially, it converts the text you write into an MDX member.

First we will start with an example that uses StrToSet and then add to it using StrToMember.  To follow along with this example you’ll need the Adventure Works cube, which can be found at www.codeplex.com.  I’ll assume you know some basics of Reporting Services, so I won’t define each component of the tool that isn’t new for this example.

Create a new Report Server Project and Report that uses the Adventure Works cube as a Data Source.  Next, create a Dataset; for this example you can call it CategorySales (this Dataset should use the Adventure Works cube data source).  Select Query Designer and click the Design Mode button to begin writing your query.  Use the following query to return results without a parameter:


Select
[Measures].[Internet Sales Amount] on Columns
From [Adventure Works]

All this query did was return the total sales, which is useless for a report by itself.  Let’s say we want to add the product category that the sales belong to.  If we were to do this normally in MDX it would look something like this:

Select
[Measures].[Internet Sales Amount] on Columns,
[Product].[Category].Children on Rows
From [Adventure Works]

That’s not good enough for our pretend end users, though.  They not only want to see the list of product category sales, they also want it made into a parameter so they can select the category or categories they wish to view from a dropdown box.  To do this, first create the parameter by selecting the Query Parameters button.

  • Assign the parameter a name. Ours will be called ProductCategory.
  • Identify the Dimension and Hierarchy it associates with. Ours will be from the Product dimension and the Category attribute hierarchy.
  • Multiple values identifies whether it should accept multiple values.
  • Last, give the parameter a default value. This can be set to Bikes for our example.


After you hit OK to confirm the Query Parameter screen you will need to modify your MDX to take advantage of the parameter you have created.

Select
[Measures].[Internet Sales Amount] on Columns,
StrToSet(@ProductCategory, Constrained) on Rows
From [Adventure Works]

Here we use StrToSet to convert whatever is brought in from the parameter values selected by the end users into something MDX can understand.  The Constrained flag just ensures that the parameter value contains explicit member names rather than an arbitrary MDX expression.

Hit OK twice to confirm the query so far and make a simple tabular report to view your results.  You will notice that a dropdown box parameter was automatically created for the ProductCategory parameter we defined. You may already be familiar with this if you have ever used the drag-and-drop interface for making parameters. Reporting Services automatically creates a Dataset for this parameter dropdown box, which you can view if you right-click on the Datasets folder and select Show Hidden Datasets.


Now let’s take this example one step further.  We will add two more parameters to our query to create a date range.  Hit the Query Parameters button again and create the parameters StartDate and EndDate that reference the Date Dimension and the Date attribute.  These can be defaulted to any date for now. 


Change the MDX to add a where clause that restricts dates, this time using StrToMember.

Select
[Measures].[Internet Sales Amount] on Columns,
StrToSet(@ProductCategory, Constrained) on Rows
From [Adventure Works]
Where StrToMember(@StartDate, Constrained) :StrToMember(@EndDate, Constrained)

Hit OK twice to confirm the changes to the Dataset.  Remember that these parameters will automatically create new Datasets and parameters so you should just need to preview the report to see the changes.


I hope this makes writing your own MDX with parameters inside Reporting Services a little easier. 


Webinar Follow-Up: Create a Tactical Plan Using Hadoop & Cloud Tech

Presenter: Rowland Gosling

Click here to view the entire webinar.

Click here to view Rowland's blog.

Thank you to all who attended my online workshop this morning! I had two questions I’d like to address:

Q: Can you use the ODBC Hive solution to insert data into Hadoop?

A: Not with our current technology. Hadoop data is basically read-only. There is no ODBC way to update a row for instance. It can be updated using things like Pig but basically what you’re doing is replacing entries.

Q: Can you use CSV/Storage Spaces?

A: I did some checking and the general answer is no. On the other hand why would you want to? There’s blob storage and plain old UNC file paths.

Thanks everyone!



Webinar Follow-Up: Getting Started with Change Data Capture

Presenter: Bradley Schacht

Click here to view the entire webinar.

Click here to view Brad's blog.

Yesterday I presented on Change Data Capture for the Pragmatic Works Free Training on the T’s. I covered the basics of what Change Data Capture (CDC) is and showed how to use it. We had a good turnout, so thank you to everyone who attended. If you were unable to join or would like to see the recording, you can find the link here. The Q&A can be found below.
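
As a quick refresher, here is a minimal T-SQL sketch of what enabling CDC looks like. The SalesDemo database and dbo.Customer table are placeholder names for illustration, not the webinar demo:

USE SalesDemo;
GO
-- Enable CDC at the database level (requires sysadmin)
EXEC sys.sp_cdc_enable_db;
GO
-- Enable CDC on a single table; this creates the change table and the query functions
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name = N'Customer',
    @role_name = NULL,               -- no gating role for this sketch
    @supports_net_changes = 1;       -- requires a primary key or unique index
GO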


Q: Does disabling CDC remove any objects that were added?
A: Yes, when you disable CDC on an individual table the associated change tracking table as well as the auto-generated functions are removed. If you disable CDC on the database, all change tracking tables and auto-generated functions are removed. The wrapper functions are not removed automatically; you must remove those yourself.
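
For reference, a hedged sketch of the disable commands being discussed (the object names are placeholders):

-- Disable CDC on one table; its change table and generated functions are dropped
EXEC sys.sp_cdc_disable_table
    @source_schema = N'dbo',
    @source_name = N'Customer',
    @capture_instance = N'dbo_Customer';
GO
-- Disable CDC for the whole database; all change tracking objects are removed
EXEC sys.sp_cdc_disable_db;
GO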

Q: How does the log reader job impact the transactions being written in the application?
A: The log reader only does reads on the transaction log. It does not lock the log or anything like that. There should be no effect on the applications writing to the tables.

Q: Will CDC work with Simple recovery mode?
A: Yes, CDC works with both full and simple recovery modes.

Q: Can CDC be enabled on MDS (Master Data Services) databases?
A: CDC can be enabled on any database as it just reads the logs. That being said, you do need a primary key or unique index on the tables. I’m not sure I see the value of putting CDC on an MDS database, though. Those tables are set up such that you need to go through the views to get data back.

Q: Does CDC work on Hekaton databases?
A: I don’t see why not, although I will need to test this as I have not tried it. Ultimately there is still a transaction log involved as far as I know, so I would venture to guess yes. Don’t hold me to that, though, as I haven’t done any real-world work with Hekaton, so testing and a follow-up blog will be required there. You’ve definitely piqued my interest.

Q: What do you recommend for getting the first pull of data (initial load): date-based or LSN-based queries?
A: I prefer to go the date-based route using the control table. The first time you pull data, pass in NULL for the begin date and you will get back all of the data from the beginning of time. Quick and easy.
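
If you do prefer the LSN route, the built-in mapping functions look roughly like this. The dbo_Customer capture instance and the dates are made up for illustration:

DECLARE @from_lsn BINARY(10), @to_lsn BINARY(10);

-- Map a date range to LSNs
SET @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', '2014-01-01');
SET @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', GETDATE());

-- Pull all changes for that capture instance within the range
SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_Customer(@from_lsn, @to_lsn, N'all');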

Q: Is CDC enabled on the source system or the destination system (Data Warehouse)?
A: CDC will be enabled on the source system. You want to leverage the deltas on the source side so you have less data to load into the warehouse and less complex logic to do on the warehouse load ETL.

Q: Can you enable CDC on a view?
A: No, CDC can only be enabled on tables.

Q: What version of SQL Server is required?
A: An Enterprise edition of SQL Server is required to use CDC. The feature is available in 2008 or later.
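
If you ever need to verify where CDC is already turned on, a quick sanity check like the following works (a general sketch, not something from the webinar):

-- Which databases have CDC enabled?
SELECT name, is_cdc_enabled
FROM sys.databases;

-- Which tables in the current database are tracked by CDC?
SELECT SCHEMA_NAME(schema_id) AS SchemaName, name AS TableName
FROM sys.tables
WHERE is_tracked_by_cdc = 1;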

I also did a similar presentation last night for the Orlando SQL Server User Group (OPASS, website) on CDC with SSIS. That one focused on using Change Data Capture with the new SSIS 2012 components. I have consolidated the slide decks and included both SSIS projects in the download link, which can be found in the giant information box below so no one can miss it. Some notes on what is included:

  • Change Data Capture.pptx – Consolidated slide show for the two presentations.
  • Database Solution Folder – Has the database solution I use to demo CDC. It has all the scripts you should need to create the databases and populate data as well. Scripts are numbered in the order you need to run them. I tried to make some notes in the scripts so you know what is going on and why I did some of the things I did.
  • User Group SSIS – This solution uses the new SSIS 2012 and later components rather than using the wrapper functions that we created in the webinar. There is a sample package for the initial and incremental loads.
  • Webinar SSIS – This is the solution I used during the webinar. It has the old way of doing things using the SCD wizard and then the package we very quickly threw together as time was running out at the end of the webinar.

Thank you again to everyone who joined the webinar or braved the weather to come out to the User Group in Orlando. If you have any follow up questions please hit me up!



Webinar Follow-Up: Dynamic SSRS Reports Using T-SQL & MDX

Presenter: David Siska

Click here to watch the full webinar.

Click here to view David's blog.

SSRS Training Session Questions

We got a couple of questions during the Dynamic SSRS Reports presentation that I didn't have time to address during our typical end-of-session Q&A time. Those questions and their answers are below.

Q: What if we wanted to capture the total amount of time it took to run the report... 3 seconds, 3 minutes, etc..

A: We can definitely get this kind of information from the SSRS Execution Log.
If we query the Microsoft-supported ExecutionLog3 view, we can see the start and end times of the report, as well as the milliseconds used for each step of the report processing.
As we can see in this example, the most recent report executed from the cache, saving us quite a bit of data retrieval and processing time. We can also see that the report did not take much time to process from a live execution.
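
For reference, a query along these lines against the report server catalog will surface those timings (this assumes the default ReportServer database name):

-- Recent report executions with a timing breakdown (times are in milliseconds)
SELECT TOP 20
    ItemPath,
    UserName,
    TimeStart,
    TimeEnd,
    TimeDataRetrieval,   -- time spent retrieving data
    TimeProcessing,      -- time spent processing the report
    TimeRendering,       -- time spent rendering output
    Source,              -- Live, Cache, Snapshot, etc.
    Status
FROM ReportServer.dbo.ExecutionLog3
ORDER BY TimeStart DESC;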

Q:  For a subscription report, what if I'd like to change parameters based on to whom the report is distributed? If to Joe, show Kansas data, but if to Tim show Nebraska data?

A: You need a data-driven subscription (which is not available in the Standard edition of SQL Server).
You will set this up in Report Manager using subscriptions, and the design of the report will create the options to pass in parameters.
In step three of the subscription setup process you have the option to enter a query. You can test and run this query in SSMS first, and you can certainly make a more complex query than the example used here.

Step 5 of this process is to define your parameter values. Since this report has only one parameter, the year, I selected the Year column from my previous query. Joining to a table that holds user report preferences or other data that might indicate which parameters should be used for their report (e.g. a sales territory) is a great way to make use of SSRS data driven subscriptions!
Note that while the prompt states the value is coming from the database, the available values are defined in your supplied query.
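
As an illustration only, a data-driven subscription query might look something like the sketch below. The UserReportPreferences table and its columns are invented for the example, not part of the demo:

-- One row per recipient; the columns feed the subscription's delivery settings and report parameters
SELECT
    p.EmailAddress AS [TO],
    p.StateName    AS StateParameter,   -- e.g. Kansas for Joe, Nebraska for Tim
    'PDF'          AS RenderFormat
FROM dbo.UserReportPreferences AS p
WHERE p.ReportName = 'StateSalesSummary';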

Q: Is it true that when upgrading from 2008 to 2012 that report subscriptions must be recreated manually?

A: I'm of the opinion that if it can go wrong it will, so backups and documentation are great things to have prior to a significant operation such as a version upgrade.
There are several tools and considerations for the Reporting Services portion of a version upgrade, and I'd suggest you refer to the MSDN documentation, which is far more detailed than my answer could be here.

Q: Hi, I always have a problem with cascading parameters where they are date types. Param 1 is Period No., and if the user chooses '1' then Param 2 (Date From) and Param 3 (Date To) are set (this works), but then changing Param 1 to another period, the two cascading date parameters do not get updated. This happens in any version. Thanks.

A: Great question. This is a "By Design" aspect of SSRS cascading parameters and can happen with other data types too, depending on the nature of your datasets. There are several workarounds that have already been proposed by the community, and you can find the initial issue report and those workarounds described here.
Boyan Penev has a longer explanation of the issue and a workaround on his blog.


Rename SharePoint Central Administration Database

By Bradley Schacht

Click here to view Brad's entire blog.

SharePoint uses a series of SQL Server databases to save configuration information, security setups and all the user data that has been created. The Central Administration site is no different and has its own content database that is created when running the SharePoint configuration wizard. Unfortunately, unlike when creating service applications, you do not get to choose the name of this database during the wizard. The database name will default to SharePoint_AdminContent_<GUID>.


This could be problematic for a number of reasons. Many companies have a set of standards for database names. Other times you will want to change it simply because you have a small OCD issue (that’s me). While the task is not as easy as going into SQL Server and renaming the database in Object Explorer, it isn’t difficult either. Just follow the steps below and you will clear up that OCD (or naming convention) issue in no time.

Note these steps should be run from a SharePoint server. You may also need elevated permissions in certain steps to drop databases from the SQL Server.
  1. Run the SharePoint 2013 Management Shell as Administrator.
  2. Make note of what you would like the new database name to be as well as the current web application name for Central Administration.
    New Database: SharePointCentralAdmin
    Web Application: http://kerberos-sp:36000
  3. With that information run the following command. This will create a new content database named SharePointCentralAdmin in my case.
    New-SPContentDatabase –Name SharePointCentralAdmin –WebApplication http://kerberos-sp:36000
  4. The next step is to migrate the contents of the previous database (SharePoint_AdminContent_8449cb1b-4a84-4048-9425-0ec6e783ec37) to the new database (SharePointCentralAdmin). To do this we will need to pass database IDs rather than database names to the commands in the next step. First, find the IDs assigned to each of the databases using the Get-SPWebApplication and Get-SPContentDatabase commands.
    Get-SPWebApplication –Identity http://kerberos-sp:36000 | Get-SPContentDatabase | SELECT ID, Name, WebApplication | Format-List
  5. Make note of the two IDs and which database each belongs to.
    Original Database: c87506a9-b87d-40b8-9582-aac9ee89c8f8
    New Database: f79cb9d8-8e45-4405-82c9-081f58bce7a0
  6. With the IDs in hand, use the Get-SPSite and Move-SPSite commands to migrate the content from one database to the other. Each of these commands needs to be fed the database ID. When running the command you will be prompted to confirm the action; press Y to confirm each prompt or A to accept all the prompts at the beginning.
    Get-SPSite –ContentDatabase c87506a9-b87d-40b8-9582-aac9ee89c8f8 | Move-SPSite –DestinationDatabase f79cb9d8-8e45-4405-82c9-081f58bce7a0
  7. Note that IIS must be restarted. To do so simply type IISReset into the management shell.
  8. Next, to remove the database association from the Central Administration web application run the following command and press Y to confirm each action.
    Remove-SPContentDatabase c87506a9-b87d-40b8-9582-aac9ee89c8f8
  9. Optionally confirm the association has been removed by running the same command from Step 4.
    Get-SPWebApplication –Identity http://kerberos-sp:36000 | Get-SPContentDatabase | SELECT ID, Name, WebApplication | Format-List
  10. Drop the original database with the GUID in the name using SQL Server Management Studio.
  11. Relax, because your OCD is now happy. Also go tell the DBAs they owe you a cookie for removing the GUID; they will be happy too.

Using PowerShell to Build Your Azure VM Environment

By: Devin Knight

Click here to view Devin's blog.

In a previous post I wrote about Using a Hyper-V VHD in an Azure VM. Today I’d like to show you what my next steps were.  My goal with using Azure is to create a lab environment for students that fits the following needs:

  • Easy to create new virtual environments
  • Easy to power down virtual environments
  • Easy to power up virtual environments
  • Easy to delete all virtual environments

The point of this post is to guide you through how to easily create virtual machines for multiple students or employees in a quick and easy script. By the way, I have previously posted how to solve the problem of powering up and powering down the virtual lab environments in my blog about Setting up Automation in Azure.

Creating an Image

These steps will guide you through how to take what you learned in my blog about Using a Hyper-V VHD in an Azure VM and create an image from the VHD you uploaded. You could also create an image from a Virtual Machine that you created from the Azure Gallery. Once an image is created you can then spin off as many virtual machines as you would like from that image.

  • Login to the Azure Management Portal.
  • Navigate to the Virtual Machines page and select Images on the top of the screen.
  • Select Create on the bottom of the screen to build a new image.


  • Name the image and provide the VHD URL location where you are storing your VHD. If you followed my previous blog on uploading your local VHD, then this would be the storage account and container that you uploaded to. You must run Sysprep on your virtual machine if you want to create an image from it.


PowerShell to Create your VMs

With the image created you can now either manually create new virtual machines from the image or use PowerShell to scale your solution better. For me, PowerShell made the most sense because I’m not trying to build just one virtual machine. In my case I actually need to build 15 identical virtual machines (for 15 students) based off the same image.

  • If you haven’t already, download and install the Windows Azure PowerShell module: http://go.microsoft.com/fwlink/p/?linkid=320376&clcid=0x409
  • Launch Windows Azure PowerShell
  • Before you start using the Windows Azure PowerShell cmdlets, you need to configure connectivity between your machine and Windows Azure. One way to do this is by downloading the PublishSettings file from Windows Azure and importing it. At the PowerShell prompt, enter the following commands:
    • Get-AzurePublishSettingsFile
    • Import-AzurePublishSettingsFile "C:\SubscriptionCredentials.publishsettings"
  • Download my PowerShell script called CreateVMs.ps1. Essentially you point this script at your image and cloud service, tell it how many VMs you want, and it loops until all the VMs you asked for are created.

# Create Azure VMs for Class
# Run in a PowerShell window by typing .\CreateVMs.ps1 -vmcount 3
param([Int32]$vmcount = 3)

$startnumber = 1
$vmName = "VirtualMachineName"
$password = "pass@word01"
$adminUsername = "Student"
$cloudSvcName = "CloudServiceName"
$image = "ImageName"
$size = "Large"
$vms = @()
for($i = $startnumber; $i -le $vmcount; $i++)
{
    $vmn = $vmName + $i
    New-AzureVMConfig -Name $vmn -InstanceSize $size -ImageName $image |
        Add-AzureEndpoint -Protocol tcp -LocalPort 3389 -PublicPort 3389 -Name "RemoteDesktop" |
        Add-AzureProvisioningConfig -Windows -AdminUsername $adminUsername -Password $password |
        New-AzureVM -ServiceName $cloudSvcName
}

  • Modify any of the generic parameters I’ve provided in the file.  You’ll likely need to modify the default values of the following:
    • $vmName
    • $password
    • $adminUsername
    • $cloudSvcName
    • $image
    • $size (Optional, right now mine creates large VMs so you may want to adjust this)
  • This script also accepts one parameter whose value you can supply externally. The –VMCount parameter allows you to specify exactly how many virtual machines you would like based on your image. To run this script, use the following command in your PowerShell window; using the number 3 here will produce 3 virtual machines for me.

.\CreateVMs.ps1 -vmcount 3

  • If I were to run this script without changing any of the embedded parameters it would produce 3 virtual machines with the following names:
    • VirtualMachineName1
    • VirtualMachineName2
    • VirtualMachineName3

Once your virtual machines are created, your next step is to consider whether you want to keep them running all the time or only during business hours. Keep in mind you only pay for the time your virtual machines are started. For my scenario I wanted to turn my machines off during non-business hours and back on in the morning, which can be automated through Azure Automation. Read this post on Setting up Azure Automation.

Click here to view Devin's entire blog.



Webinar Follow-Up: What's New In SSIS 2012 & 2014

Instructor: Kathi Kellenberger

Click here to watch the full webinar.

Click here to view Kathi's blog.

Q:  Can we develop SSIS packages using SSMS or do we have to use SSDT?
A: You must use SSDT. That makes sense; it is a development tool!

Q: Any impact when upgrading from 2008 SSIS to SSIS 2014?
A: Except for just trying it out, I haven’t had the chance to do any real upgrades. It’s important to figure out if you are going to move to the new Project Deployment model and do the planning.  Once I get some more experience with this, I’ll do another webinar just about it.

Q: Can SSIS 2012 sense the environment name and set the environment accordingly?
A: I’m not sure exactly what you are getting at here. In the new project deployment model, you assign the environment.

Q: Can I run a package developed in SSIS 2012 or 2014 in a 2008 R2 environment?
A: I haven’t tried this and a quick search didn’t turn up anything helpful. My gut is telling me that it should work as long as you are using the package deployment model.

Q. Do package configurations still exist in SSIS 2012?
A. Yes, you have the choice of using the package deployment model which gives you the traditional configurations or using the new project deployment model.

Q. Is there a way to make changes to one package without redeploying the entire project?
A. No, in the project deployment model, you must deploy the entire project even if there is just a change to one package.


Webinar Follow-Up: SSIS Tales From the Road

Presenter: Anthony Martin

SQL Server Integration Services is a versatile tool that makes it easy to move and transform data. Sometimes unusual situations present themselves that leave developers scratching their heads. In this webinar,  Anthony will demonstrate how to handle some of these oddball challenges.

Click here to watch the full webinar.

Click here to view Anthony Martin's blog.

Follow-Up Questions:

Q:  Could the file cache be used between packages in the same projects?

Yes, the ‘Use file cache’ option of the Cache Transform can be used between packages that are in the same project or even in different projects.

In order to do this, you’ll need to configure the Cache Transform with the ‘Use file cache’ option and execute the package.  This is because the Cache Transform file (.caw extension) is not created until the package is executed.  Next, in the package where you want to reuse the Cache Transform file, add a Lookup transformation to a Data Flow Task.  While configuring the Lookup transformation, select the ‘Cache Connection Manager’ connection type on the ‘General’ page.  On the ‘Connection’ page, click ‘New’ to create a new Cache Connection Manager, select the ‘Use File Cache’ option and browse to the location where the file was created by the previous package.  Configure the rest of the Lookup transformation just like you normally would and you are all set!

Q:  Are Lookup Transformations better than just using Stored Procedures in the Source component?

If the connections for the source and the lookup are on the same database and correct indexing exists, then a stored procedure in the source component that performs the join for you is probably the appropriate choice.  However, if the source and the lookup connections are on different database servers, or are different file types altogether, then the Lookup transformation is one of only a few options available, and it is a highly efficient transformation if configured using the default settings.

Q:  Could you please share the SSIS project and SSMS solution?

I’ve posted both solutions to my OneDrive.  You can access them here.


Copyright 2014 by Pragmatic Works