Pragmatic Works Blog

rss

Company blog for some of our Pragmatic Works employee to write about company happenings and about the software business.


Webcast Follow-Up: Introduction to Columnstore Indexes

So the normal concept of index on a particular column is a key and all rows that have that key in that column. So, with columnstore do we create index on row instead of column?

Columnstore is created across all, or most columns, in a table. Each column is stored individually, so the order of the columns makes no difference.  Each column can be accessed independently from other columns in the table.

A typical fact table is mostly comprised of keys and measures. So if you cannot include keys in the columnstore for the table would you architect the warehouse without foreign key relationships and handle them through a meta-data layer for ad hoc reporting?

Yes, create the data warehouse without foreign key relationships and enforce these business rules through the ETL processes.  Unlike, OLTP databases, data warehouses typically have a limited number of processes that change data in each table.  So ensuring these business rules are followed is much more simple.

Can you explain how data is stored between rowstore and columnstore index with some example data?

I would primarily recommend re-watching the first portion of the webcast again.  That said, in a rowstore, SQL Server stores all of the columns for a row together.  While in a column stored all of the rows fora column are stored together.

Are columnstores linked to SQL Server edition somehow? are they available in standard edition?

Yes, columnstore indexes are only available in Enterprise edition.  As a side note, you can leverage them in SQL Server Azure VMs since they come pre-installed with Enterprise Edition.

Could you please recommend a resource (book) which we could useto practice columnstore indexes?

Currently, the only resource I am aware of is Expert Performance Indexing for SQL Server 2012.


WEBCAST FOLLOW-UP: INTRODUCTION TO CLUSTERED INDEXES AND HEAPS

By: Jason Strate

As a DBA, how can I find good column to create a clustered index on a table given that I don't know my workload on the table.

I would start with the primary key on the table.  That's often one of the best places to start.  From there it’s a matter of how you are using the table, or plan to.  A great resource would be the session I did last year, Choosing Your Clustered Index.

Can you please discuss the difference between clustered and non-clustered indexes?

Non-clustered indexes represent secondary access patterns for accessing your data, where clustered indexes are the primary path for data access.  The webcast How Do Non-Clustered Indexes Improve Performance? can provide some additional information that would be useful.

Could you please provide a good book we could use as reference/review?

The best book out there on indexes would be my book for SQL Server 2012 titled, Expert Performance Indexing for SQL Server 2012.

Is there any real time scenario, where heap is better than clustered index?

If you plan to load a table and then query the data once or twice.  It often takes longer to import and maintain the clustered index that it would be to just use a clustered index.

Is there a general rule or typical scenarios when would youconsider using a non-clustered index, or is there one?

Non-clustered indexes are intended as alternative paths to data on a row.  The clustered index key won't always be the best way to find every row every time.  For more information, check out - How Do Non-Clustered Indexes Improve Performance?.


SSIS Deployment Webinar Followup

By: Kathi Kellenberger

On April 3rd, I gave a webinar for Pragmatic Works Training on the Ts on the new SSIS Project Deployment Model introduced with SQL Server 2012. Be sure to check out the recording if you missed the session. If you would like a copy of my slides and code, it is here. Here are the top 5 questions from the session:


Q: How can you deploy just one package if there is a change?

A: In the project deployment model, you can only deploy the project, not individual packages.

Q: When you set up separate configurations within Visual Studio, what happens with those settings when you deploy the project?

A: When building the project, the ISPAC file will reflect the configuration chosen at the time. Under the BIN folder, you’ll find a folder with the configuration name. The default values from the chosen configuration will be seen when deploying from that ispac file.

Q: Does it make sense to have one set of environments for multiple projects?

A: Yes, it does make sense to do that. For example, your production servers will probably be the production servers for many projects. As I mentioned during the session, you will spend some time figuring out how to best use this model in your shop.

Q: Can some packages in a project be in the old model and some in the new model?

A: No, this is set at the project level. It is possible, however, that some of the packages will have parameters and some will not have parameters. This doesn’t mean that the packages without parameters are using the old package model.

Q: What do you do if someone else is supposed to deploy the project?

A: You would just send the ispac file. Everything that you need to deploy the project is in that file.

Convince Me – Persuasion Techniques That Get Things

By: Kevin Kline

Ever wanted to convince the boss try something new, but didn't know where to start? Ever tried lead your peers only to fail to achieve your goals? This session teaches you the eight techniques of influencing professionals, so that you can innovate and achieve change in your organization.

View the recording here.

Download the PowerPoint here: UG - Convince Me Persuasion Techniques That Get Things Done (Short Form).pptx


Webinar Q&A: Getting Started With Analysis Services

By: Bradley Schacht

I recently presented a free webinar for Pragmatic Works Free Training on the T’s about Getting Started With Analysis Services. We covered some of the differences between the two project types at a high level then built one of each to give everyone an idea of how difficult and how fast each could be developed. If you attended I want to say thank you. It’s always appreciated when people come out for these sessions. If you were unable to join the session here is a link to the recording that you can watch for free. The Q&A can be found below. 


Q: Is it easy to convert from a multidimensional to tabular project (or even tabular to multidimensional)?
A: No. You will need to rebuild the project in the version you want to switch to. They operate and are built very different.

Q: How well does SSAS interact with Microsoft PDW?
A: Obviously this is a little more specific question. My short answer is that SSAS on top of PDW works just like it does against SMP SQL or any other relational database for that matter. The problems start to appear when you run SSAS in ROLAP or DirectQuery mode though. I have a fairly large SSAS multidimensional solution running in ROLAP mode built on PDW for my current client and it is working great. In the current version of PDW (v2 AU .5) the two things I would say though are to avoid parent child hierarchies since the T-SQL issued to satisfy those are not currently supported on PDW and put your fact table in columnstore, it makes a big difference.

Q: How does the “find related tables” button know which tables to bring in?
A: You must have constraints built into your data model. Simply putting a primary key on each table is not enough.

Q: The thought of building a data warehouse is daunting, I understand the concepts but have never built one. Do you have any advice or resources that might help increase my chances of creating a useful data warehouse the first time?
A: There are a lot of really good articles on SQLServerCentral.com and a ton of webinars on PragmaticWorks.com that are free just like the one I delivered this week. Before I go any farther, I work for Pragmatic Works and I do not get commission or referral bonuses or anything like that. That being said, I don’t like to push the company products on my blog because I don’t want to make this a giant advertisement. In this case I will make an exception. Pragmatic Works has a training offering that fits exactly what you are asking and we call it a Business Intelligence Bootcamp. It’s a 1 week, in person training session that we do at Microsoft offices around the country. We go through data modeling, ETL, SSIS and then reporting and show you how all the pieces fit together. The great thing about it is that you get to take the project files back with you for reference and we show you how decisions along the way affect other parts of the project. For instance, why surrogate keys in your data model are important when you get to building your SSAS cube.

Q: How do you add a second data source when using the Tabular model?
A: To add more data sources just click the same button in the top left corner of the screen that we used to add the first set of tables. You can have as many data sources in the model as you would like.

Q: Could you comment on the differences between SSAS Tabular and PowerPivot for SharePoint?
A: A lot of the features are the same between the two. In a nutshell you can think of Tabular as the big brother of PowerPivot. Security is more robust, enterprise backup/DR strategy, model management and refresh is better (partitions) and it can handle much more data. PowerPivot workbooks uploaded to SharePoint are limited to 2GB in size, you don’t have to worry about that with Tabular. A lot of companies are using PowerPivot to prototype or get the projects started, then use the Import PowerPivot Workbook option I mentioned to create the full size Tabular model.

Q: Is Tabular in Excel and what version of Windows is required?
A: Tabular is not in Excel, PowerPivot however is in Excel. The minimum to use PowerPivot is Excel 2010. The system requirements can be found here. I’d rather link to it since Microsoft tends to keep those pages up to date very well and there are constant product changes between Excel and SQL Server. Minimum requirements for SQL 2012 though is Windows Vista SP2 or Windows Server 2008 SP2.

Q: Can there be multiple fact tables (measure groups) in a single cube?
A: Absolutely. I just showed one because of time constraints. There are certain situations that actually require additional measure groups to be added, such as many to many relationships.

Q: When was Tabular introduced? When was PowerPivot introduced?
A: Tabular was first introduced in SQL Server 2012. PowerPivot was first introduced with Excel 2010.

Q: How big is too big for tabular?
A: How big is your server? This is the classic “it depends”. Tabular is built to run with data in memory. If your server only has 4 GB of memory and you are wanting to build a model that has 2TB of data you are likely going to be unhappy. That’s not to say this won’t happen with Multidimensional as well, because it will, but it has lower memory limitations because it is built to have preaggregated values created at processing time and stores the data differently. I’d recommend watching Devin Knight (Twitter|Blog) and Dustin Ryan’s (Twitter|Blog) session on “Choosing the Right Analysis Services: MOLAP vs Tabular” (Free Recording) since they touch on this issue.


Introduction to Power BI Recording and Q&A

By: Devin Knight, @knight_devin

I hope you were able to attend my free webinar on Understanding Microsoft Self-Service BI on March 13, 2014. If you weren’t you can now download the recording and slides.

I usually like to post follow up questions and answer that I wasn’t able to address during the webinar. Here’s several that may help you!

Q: Can you explain quickly how to use Power View on SharePoint 2010?

To use these together you must have SharePoint 2010 SP1 Enterprise and SQL Server 2012 Enterprise or BI Edition. Unfortunately, there’s not a quick way to describe the rest of the configuration so here’s a post that may help.

http://blogs.msdn.com/b/karang/archive/2012/04/13/step-by-step-guide-of-installing-power-view-with-powerpivot-2012.aspx

Q: What devices are supported by mobile BI

The Mobile BI will be available on Windows 8 and Windows RT devices. There is also a native Mobile BI app for iPad under development.

Q: Are MS planning to remove the dependency on Silverlight for Power View (etc.) thus making it possible to render on iPad?

Yes! There are actually some public betas of Power View working on HTML 5 instead of Silverlight. This will be the major enhancement that allows support for the mobile BI app on IPads


Webinar Follow-up: MDX Trek: First Contact Downloads

Greetings. After delivering my MDX Trek: First Contact presentation as part of the Pragmatic Works Free Training series on 3/11, I got some great feedback from an attendee. He pointed out that my single zip file download on my home page for the presentation only contained the SQL Server 2008 R2 version and that I may want to include upgrade instructions for people that have SQL 2012. That was a great point. I have neglected to do much with that download, even after I started delivering this presentation in the SQL Server 2012 tools some time ago. While the MDX syntax is the same, the project would have to be upgraded to be opened in SQL Server Data Tools as the existing zip was in BI Studio. This would require some extra steps and create more work for the target audience (which includes people just getting started with SSAS).

So, to rectify this situation, the home page for my MDX Trek: First Contact presentation now has separate downloads for SQL 2008 R2 and SQL 2012. You can got there now by clicking on the image below. I should have done this a long time ago and apologize for being so late.

Via Mark Vaillancourt's personal blog: http://markvsql.com/

MDX-Trek-First-Contact-SQL-2012.zip

MDX-Trek-First-Contact-SQL-2008-R2.zip


Webinar Q&A: New BI Features in SharePoint 2013

Whats New in SharePoint 2013.pptx
by 
Bradley Schacht

I recently presented a free webinar for Pragmatic Works Free Training on the T’s about new Business Intelligence features in SharePoint 2013. If you attended I want to say thank you. It’s always a fun time presenting and the audience is always a lot of fun to interact with before presenting. If you were unable to join the session here is a link to the recording that you can watch for free. The Q&A can be found below.

Q: Do the improvements mean that we can do security without Kerberos in 2013?
A: Yes and no. The security improvements I showed in this webinar are specific to PerformancePoint and Excel services connection to SQL Server Analysis Services only. This does not affect SSRS typing to connect to any source. So there definitely could still be a need for Kerberos for other connection scenarios. Keep in mind these settings also don’t affect other security models for the rest of your SharePoint farm, just those two specific BI applications.

Q: In the Browser View Options can you only select certain worksheets to deploy to SharePoint so you can essentially hide some of the worksheets from users?
A: That is correct. If you choose to deploy worksheets they will be stored inside the workbook and uploaded, but they will not be visible in the browser to the user.

Q: So what is the best practice for security setup in SP: Using individual credentials and pass it down to data source via SP using Kerberos and delegation of security, or use that single account for all users (Unattended Service Account), or finally pass users credentials as connection property?
A: I’m going to be a consultant here for a moment and say “it depends”. Honestly, it really does. It depends on the data in the report. Human resources data for instance I would say you need to go Kerberos or connection property because access to that data is likely on a per user basis. For instance, I shouldn’t be able to see HR data for the western region if I’m the HR manager for the eastern region. If the data requires less restrictive access, say weather data we have collected then the unattended account is probably just fine since everyone would be connecting under the same credentials and not their own identity.

Q: What version of SQL Server is needed to use SharePoint 2013 features?
A: Power View would need to be SQL Server Reporting Services 2012 or later. The back end for SQL though, to get the PerformancePoint and Excel Services functionality would only need to be 2008R2 SP1 or later.

Q: Do you know if Microsoft plans to add support for multidimensional models for the Excel version of Power View?
A: I don’t have any good insider information on this topic but I think it would be a huge mistake if they didn’t. Look for this in future updates to Excel.

Q: How difficult is to import PerformancePoint 2007 objects into SharePoint 2013?
A: The migration method I showed in the webinar would only work for moving 2013 objects between sites. There is some information about upgrading 2007 objects in this blog post here: http://blogs.msdn.com/b/performancepoint/archive/2010/02/25/upgrading-performancepoint-server-2007-to-pps-2010.aspx

Q: How does do the new security settings you showed affect SSAS security?
A: This will not change your current SSAS security model. You still need to set up roles in your SSAS cubes to handle the mapping between users and the data they can see.

Q: Can Excel and PerformancePoint 2013 show data from SSAS 2005 cubes?
A: Yes.

Q: In the excel services report can you view your pie chart and pivot table in the same view?
A: Not in the gallery view, you would have to use the standard view (aka, not setting the browser view options to individual items).


Q&A and Follow Up from Data Mining Session

Thanks for everyone who attended the webinar on the Introduction to Data Mining. The recording can be found here:   http://pragmaticworks.com/LearningCenter/FreeTrainingWebinars/PastWebinars.aspx?ResourceId=613.  Here are a few of the questions that some of the attendees had:

Q: Do Data Mining professional also need to be well versed in Statistics?

A: It's not necessary to be a beginner to intermediate in data mining. However if you want to become an expert in data mining, learn more about statistics through a college class is very useful. This is very handy to get the most out of it.

Q: Data mining does not clean the data but they use the clean data for prediction- is that right?

A: You are correct. A large part of a data mining project is the cleansing of the data through ETL or through T-SQL. The reason this is so important is the nature of putting data in buckets. For classification, the fewer the buckets the better. For example, if you have dirty data and you have four versions of the country United States, USA, United States of America and US, data mining won't be able to do as good of a job with customers.

Download the Data Mining in SQL Server.pptx PowerPoint


BI xPress 4.7 Adds SSIS Unit Testing

Thanks for those who attended the webinar on Unit Testing with SSIS and BI xPress. In case you couldn't be there, you can see a recording of it below.  

 

The new version can be downloaded here: http://www.pragmaticworks.com/Products/BI-xPress.aspx.

Some of the questions from the audience to point out:

Q: Daniel: Does it support older version of ssis such as 2008?
A: Yes

Q: Aaron: What does the Unit Test actually create?  would need them to be in version control to trigger and run on check-in of .dtsx file(s) then actually fail the build if any of the assertions fail.  would love to see that scenario.
A: It creates an XML file behind the scenes that can be checked into source control along with the packages
Q: would also like the best practice analyzer (code analysis) to break the build if any of the rule fail as well
A: You can integrate it into a build from the command line, so you can respond to the results there.
 
Q: Jaysen: Some of our packages require connection information from higher level packages.  Is it easy to set these sorts of variables?
A: Yes, you can set any variable in the package directly for unit testing purposes.

 


Copyright 2014 by Pragmatic Works