So the normal concept of index on a particular column is a key and all rows that have that key in that column. So, with columnstore do we create index on row instead of column?
Columnstore is created across all, or most columns, in a table. Each column is stored individually, so the order of the columns makes no difference. Each column can be accessed independently from other columns in the table.
A typical fact table is mostly comprised of keys and measures. So if you cannot include keys in the columnstore for the table would you architect the warehouse without foreign key relationships and handle them through a meta-data layer for ad hoc reporting?
Yes, create the data warehouse without foreign key relationships and enforce these business rules through the ETL processes. Unlike, OLTP databases, data warehouses typically have a limited number of processes that change data in each table. So ensuring these business rules are followed is much more simple.
Can you explain how data is stored between rowstore and columnstore index with some example data?
I would primarily recommend re-watching the first portion of the webcast again. That said, in a rowstore, SQL Server stores all of the columns for a row together. While in a column stored all of the rows fora column are stored together.
Are columnstores linked to SQL Server edition somehow? are they available in standard edition?
Yes, columnstore indexes are only available in Enterprise edition. As a side note, you can leverage them in SQL Server Azure VMs since they come pre-installed with Enterprise Edition.
Could you please recommend a resource (book) which we could useto practice columnstore indexes?
Currently, the only resource I am aware of is Expert Performance Indexing for SQL Server 2012.
By: Jason Strate
As a DBA, how can I find good column to create a clustered index on a table given that I don't know my workload on the table.
I would start with the primary key on the table. That's often one of the best places to start. From there it’s a matter of how you are using the table, or plan to. A great resource would be the session I did last year, Choosing Your Clustered Index.
Can you please discuss the difference between clustered and non-clustered indexes?
Non-clustered indexes represent secondary access patterns for accessing your data, where clustered indexes are the primary path for data access. The webcast How Do Non-Clustered Indexes Improve Performance? can provide some additional information that would be useful.
Could you please provide a good book we could use as reference/review?
The best book out there on indexes would be my book for SQL Server 2012 titled, Expert Performance Indexing for SQL Server 2012.
Is there any real time scenario, where heap is better than clustered index?
If you plan to load a table and then query the data once or twice. It often takes longer to import and maintain the clustered index that it would be to just use a clustered index.
Is there a general rule or typical scenarios when would youconsider using a non-clustered index, or is there one?
Non-clustered indexes are intended as alternative paths to data on a row. The clustered index key won't always be the best way to find every row every time. For more information, check out - How Do Non-Clustered Indexes Improve Performance?.
On April 3rd, I gave a webinar for Pragmatic Works Training on the Ts on the new SSIS Project Deployment Model introduced with SQL Server 2012. Be sure to check out the recording if you missed the session. If you would like a copy of my slides and code, it is here. Here are the top 5 questions from the session:
By: Kevin Kline
Ever wanted to convince the boss try something new, but didn't know where to start? Ever tried lead your peers only to fail to achieve your goals? This session teaches you the eight techniques of influencing professionals, so that you can innovate and achieve change in your organization.
View the recording here.
Download the PowerPoint here: UG - Convince Me Persuasion Techniques That Get Things Done (Short Form).pptx
Q: Is it easy to convert from a multidimensional to tabular project (or even tabular to multidimensional)?
A: No. You will need to rebuild the project in the version you want to switch to. They operate and are built very different.
Q: How well does SSAS interact with Microsoft PDW?
A: Obviously this is a little more specific question. My short answer is that SSAS on top of PDW works just like it does against SMP SQL or any other relational database for that matter. The problems start to appear when you run SSAS in ROLAP or DirectQuery mode though. I have a fairly large SSAS multidimensional solution running in ROLAP mode built on PDW for my current client and it is working great. In the current version of PDW (v2 AU .5) the two things I would say though are to avoid parent child hierarchies since the T-SQL issued to satisfy those are not currently supported on PDW and put your fact table in columnstore, it makes a big difference.
Q: How does the “find related tables” button know which tables to bring in?
A: You must have constraints built into your data model. Simply putting a primary key on each table is not enough.
Q: The thought of building a data warehouse is daunting, I understand the concepts but have never built one. Do you have any advice or resources that might help increase my chances of creating a useful data warehouse the first time?
A: There are a lot of really good articles on SQLServerCentral.com and a ton of webinars on PragmaticWorks.com that are free just like the one I delivered this week. Before I go any farther, I work for Pragmatic Works and I do not get commission or referral bonuses or anything like that. That being said, I don’t like to push the company products on my blog because I don’t want to make this a giant advertisement. In this case I will make an exception. Pragmatic Works has a training offering that fits exactly what you are asking and we call it a Business Intelligence Bootcamp. It’s a 1 week, in person training session that we do at Microsoft offices around the country. We go through data modeling, ETL, SSIS and then reporting and show you how all the pieces fit together. The great thing about it is that you get to take the project files back with you for reference and we show you how decisions along the way affect other parts of the project. For instance, why surrogate keys in your data model are important when you get to building your SSAS cube.
Q: How do you add a second data source when using the Tabular model?
A: To add more data sources just click the same button in the top left corner of the screen that we used to add the first set of tables. You can have as many data sources in the model as you would like.
Q: Could you comment on the differences between SSAS Tabular and PowerPivot for SharePoint?
A: A lot of the features are the same between the two. In a nutshell you can think of Tabular as the big brother of PowerPivot. Security is more robust, enterprise backup/DR strategy, model management and refresh is better (partitions) and it can handle much more data. PowerPivot workbooks uploaded to SharePoint are limited to 2GB in size, you don’t have to worry about that with Tabular. A lot of companies are using PowerPivot to prototype or get the projects started, then use the Import PowerPivot Workbook option I mentioned to create the full size Tabular model.
Q: Is Tabular in Excel and what version of Windows is required?
A: Tabular is not in Excel, PowerPivot however is in Excel. The minimum to use PowerPivot is Excel 2010. The system requirements can be found here. I’d rather link to it since Microsoft tends to keep those pages up to date very well and there are constant product changes between Excel and SQL Server. Minimum requirements for SQL 2012 though is Windows Vista SP2 or Windows Server 2008 SP2.
Q: Can there be multiple fact tables (measure groups) in a single cube?
A: Absolutely. I just showed one because of time constraints. There are certain situations that actually require additional measure groups to be added, such as many to many relationships.
Q: When was Tabular introduced? When was PowerPivot introduced?
A: Tabular was first introduced in SQL Server 2012. PowerPivot was first introduced with Excel 2010.
Q: How big is too big for tabular?
A: How big is your server? This is the classic “it depends”. Tabular is built to run with data in memory. If your server only has 4 GB of memory and you are wanting to build a model that has 2TB of data you are likely going to be unhappy. That’s not to say this won’t happen with Multidimensional as well, because it will, but it has lower memory limitations because it is built to have preaggregated values created at processing time and stores the data differently. I’d recommend watching Devin Knight (Twitter|Blog) and Dustin Ryan’s (Twitter|Blog) session on “Choosing the Right Analysis Services: MOLAP vs Tabular” (Free Recording) since they touch on this issue.
By: Devin Knight, @knight_devin
I hope you were able to attend my free webinar on Understanding Microsoft Self-Service BI on March 13, 2014. If you weren’t you can now download the recording and slides.
I usually like to post follow up questions and answer that I wasn’t able to address during the webinar. Here’s several that may help you!
Q: Can you explain quickly how to use Power View on SharePoint 2010?
To use these together you must have SharePoint 2010 SP1 Enterprise and SQL Server 2012 Enterprise or BI Edition. Unfortunately, there’s not a quick way to describe the rest of the configuration so here’s a post that may help.
Q: What devices are supported by mobile BI
The Mobile BI will be available on Windows 8 and Windows RT devices. There is also a native Mobile BI app for iPad under development.
Q: Are MS planning to remove the dependency on Silverlight for Power View (etc.) thus making it possible to render on iPad?
Yes! There are actually some public betas of Power View working on HTML 5 instead of Silverlight. This will be the major enhancement that allows support for the mobile BI app on IPads
Via Mark Vaillancourt's personal blog: http://markvsql.com/
I recently presented a free webinar for Pragmatic Works Free Training on the T’s about new Business Intelligence features in SharePoint 2013. If you attended I want to say thank you. It’s always a fun time presenting and the audience is always a lot of fun to interact with before presenting. If you were unable to join the session here is a link to the recording that you can watch for free. The Q&A can be found below.
Q: Do the improvements mean that we can do security without Kerberos in 2013?
A: Yes and no. The security improvements I showed in this webinar are specific to PerformancePoint and Excel services connection to SQL Server Analysis Services only. This does not affect SSRS typing to connect to any source. So there definitely could still be a need for Kerberos for other connection scenarios. Keep in mind these settings also don’t affect other security models for the rest of your SharePoint farm, just those two specific BI applications.
Q: In the Browser View Options can you only select certain worksheets to deploy to SharePoint so you can essentially hide some of the worksheets from users?
A: That is correct. If you choose to deploy worksheets they will be stored inside the workbook and uploaded, but they will not be visible in the browser to the user.
Q: So what is the best practice for security setup in SP: Using individual credentials and pass it down to data source via SP using Kerberos and delegation of security, or use that single account for all users (Unattended Service Account), or finally pass users credentials as connection property?
A: I’m going to be a consultant here for a moment and say “it depends”. Honestly, it really does. It depends on the data in the report. Human resources data for instance I would say you need to go Kerberos or connection property because access to that data is likely on a per user basis. For instance, I shouldn’t be able to see HR data for the western region if I’m the HR manager for the eastern region. If the data requires less restrictive access, say weather data we have collected then the unattended account is probably just fine since everyone would be connecting under the same credentials and not their own identity.
Q: What version of SQL Server is needed to use SharePoint 2013 features?
A: Power View would need to be SQL Server Reporting Services 2012 or later. The back end for SQL though, to get the PerformancePoint and Excel Services functionality would only need to be 2008R2 SP1 or later.
Q: Do you know if Microsoft plans to add support for multidimensional models for the Excel version of Power View?
A: I don’t have any good insider information on this topic but I think it would be a huge mistake if they didn’t. Look for this in future updates to Excel.
Q: How difficult is to import PerformancePoint 2007 objects into SharePoint 2013?
A: The migration method I showed in the webinar would only work for moving 2013 objects between sites. There is some information about upgrading 2007 objects in this blog post here: http://blogs.msdn.com/b/performancepoint/archive/2010/02/25/upgrading-performancepoint-server-2007-to-pps-2010.aspx
Q: How does do the new security settings you showed affect SSAS security?
A: This will not change your current SSAS security model. You still need to set up roles in your SSAS cubes to handle the mapping between users and the data they can see.
Q: Can Excel and PerformancePoint 2013 show data from SSAS 2005 cubes?
Q: In the excel services report can you view your pie chart and pivot table in the same view?
A: Not in the gallery view, you would have to use the standard view (aka, not setting the browser view options to individual items).
Thanks for everyone who attended the webinar on the Introduction to Data Mining. The recording can be found here: http://pragmaticworks.com/LearningCenter/FreeTrainingWebinars/PastWebinars.aspx?ResourceId=613. Here are a few of the questions that some of the attendees had:
Q: Do Data Mining professional also need to be well versed in Statistics?
A: It's not necessary to be a beginner to intermediate in data mining. However if you want to become an expert in data mining, learn more about statistics through a college class is very useful. This is very handy to get the most out of it.
Q: Data mining does not clean the data but they use the clean data for prediction- is that right?
A: You are correct. A large part of a data mining project is the cleansing of the data through ETL or through T-SQL. The reason this is so important is the nature of putting data in buckets. For classification, the fewer the buckets the better. For example, if you have dirty data and you have four versions of the country United States, USA, United States of America and US, data mining won't be able to do as good of a job with customers.
Download the Data Mining in SQL Server.pptx PowerPoint
Thanks for those who attended the webinar on Unit Testing with SSIS and BI xPress. In case you couldn't be there, you can see a recording of it below.
The new version can be downloaded here: http://www.pragmaticworks.com/Products/BI-xPress.aspx.
Some of the questions from the audience to point out:
Q: Daniel: Does it support older version of ssis such as 2008?