Data sampling for comfortable SSAS 2012 Tabular modelling

My initial goal to work in tabular was to have a comfortable environment to work with. To do this i wanted to work only with a sample of the warehouse data while at the same time take advantage of the tabular environment “bells and whistles”.

Working with SQL queries instead of mapping objects (tables, views) directly has the big disadvantage that you cannot use later the visual interface to pick columns and use filters with the GUI. On the other hand, using partitioning on tables, when partitioning itself is also a SQL query has the same effect. As soon as you want to add a column graphically, the SQL partitioning is reset.

So the data sampling filter must be part of the data source.

A recommendation which seems to be commonly accepted is to try to work with views instead of mapping directly tables from the SQL source. This isolates the Tabular model from the eventual data source changes and allows also to enable/disable sampling inside the view (we suppose that we have control over the data source and can add our views to it). Enabling/disabling sampling via modifying the views is easier than doing it by Filters in Tabular GUI and can be automated in a more simple and safe way.

With this in mind first i chose the sampling unit, which in my case having Clients, Contracts and Contract Facts are the Clients. The idea is to design a sampling based on Clients and that then this sampling propagates following the DBs relational logic to the rest of the DB.

To store the sampling itself i use a “helper” table in some corner of the DB where it does not bother, and there i apply some sampling criteria, in this case a random sample receipt.

   ClientId nvarchar(100)

insert into TEMP_DimClientSample
SELECT ClientId FROM DimClient
(ClientId, NEWID())) as int))
% 100) < 2 -- put here the sample percentage

And then create the views to consume this with selects like:

    -- remove next 2 lines when going into production
    join TEMP_DimClientSample
        on DimClient.ClientId = TEMP_DimClientSample.ClientId


    join DimClient on DimContract.ClientId = DimClient.ClientId
    -- remove next 2 lines when going into production
    join TEMP_DimClientSample
        on DimClient.ClientId = TEMP_DimClientSample.ClientId

Related posts:

Yet another MS BI blog

After working on a Microsoft “multidimensional” BI project (SQL2005, SQL2008, SQL2008R2) for some years i feel the need to get to know deeply the new technology proposals from Microsoft, specifically Tabular and PowerPivot. Because i work in a company that believes strongly on corporate BI, i will be most probably focused on Tabular.

I know i do not have at this point much to share with the MS BI community but i want to go on anyway with this blog because of a couple of reasons. Besides personal ones, I think that maybe a newbie experience can be helpful for somebody out there as everybody’s learning experience is slightly different and it may present old issues from a different point of view (most probably not new but hopefully different).

Its also my first blogging experience (oh my God! a double-newbie blog!).