PowerPivot, Parent/Child and Unary Operators – A possible variation?

While evaluating the possible migration of an accounting/financials cube from Multidimensional to Tabular i came across these posts which helped me make a hyper-jump forward.

This post is an attempt of variation of the technique explained in PowerPivot, Parent/Child and Unary Operators to simulate the Multidimensional Unary Operator. It starts from a slightly different initial situation, where additionally to the Unary Operator for each Account, we know something about the account type itself, in this case if the account expects normally Inflows or Outflows. This is an information which usually is available in this type of setup.

This post builds on the example given in Alberto Ferraris above mentioned post and the techniques explained there in detail are not repeated here.

The basic idea of the approach is to separate calculation from presentation, not using the Unary Operator to make the calculation but just to correct the sign of the calculated amount to suit the user requirement.

We start with a slightly different data set, adding Accounts.InOutFlow representing the expected direction of the cash flow. The AggPaths query from the original dataset we will not need.


CREATE TABLE Accounts (
    AccountId         INT, 
    Account           VARCHAR (100),
    ParentAccountId   INT,
    Aggregator        CHAR(1),
    InOutFlow         char(3)
)

INSERT INTO Accounts (AccountId, Account, ParentAccountId, Aggregator, InOutFlow)
VALUES  
    ( 1, 'Final Total',       NULL, ' ', 'In'), -- Grand Total
    ( 2, 'Revenues',             1, '+', 'In'), --   ADD Total Revenues to Grand Total
    ( 3, 'Courses Provided',     2, '+', 'In'), --     SUM Courses to Revenues
    ( 4, 'Instructors paid',     3, '-', 'Out'), --       SUBTRACT paid instructors from Courses
    ( 5, 'Cathering',            3, '-', 'Out'), --     SUBTRACT cathering from Courses
    ( 6, 'Attendees rates',      3, '+', 'In'), --       SUM attendee rates to Courses
    ( 7, 'Consultancy',          2, '+', 'In'), --     SUM Consultancy to Revenues
    ( 8, 'Expenses',             1, '-', 'Out'), --   SUBTRACT Expenses to Total
    ( 9, 'Travel',               8, '+', 'Out'), --     SUM travel to Expenses
    (10, 'Travel refund',        9, '-', 'In'), --       SUBTRACT travels refund to Travels
    (11, 'Travels paid',         9, '+', 'Out'), --       SUM travels paid to Expenses
    (12, 'Courses Taken',        8, '+', 'Out'); --     SUM courses taken to Expenses

Now in PowerPivot i added 2 calculated columns: LeaveMultiplier and Multiplier.

Unary1

LeaveMultiplier formula is:

LeaveMultiplier=IF(Accounts[InOutFlow] = "Out", -1, 1)

and will return a 1 or -1 to multiply the Transactions[Amount] value in order to get a signed amount which aggregates correctly. It means, all Inflows are considered finally positive and all Outflows are considered finally negative. No matter what the accounts hierarchy models, the Final Total must always be the simple sum of these “signed” values. The “leave” in the name is because the Transaction[Amount] values are only recorded at leave level of the hierarchy, and this signing of amounts is conceptually only relevant at that level as later the amounts aggregate “naturally” up the hierarchy.

With this LeaveMultiplier we can make a measure:

NaturalSumOfAmounts:=IF (
    Accounts[ShowRow], 
    SUMX (Transactions, Transactions[Amount] * RELATED(Accounts[LeaveMultiplier])), 
    BLANK()
)

which will give:

Unary2

Now, while this sums up correctly the Final Total, all the nodes below Expenses do not show the sign that is required. So, now its the moment to work on the presentation of the values.

For this i created a field Multiplier which I will use to show the node aggregates following the requirement in Accounts[Aggregator]:

Multiplier=IF( 
    (Accounts[InOutFLow] = "In" && Accounts[Aggregator] = "-") 
    || (Accounts[InOutFLow] = "Out" && Accounts[Aggregator] = "+"),
    -1, 
    1
)

It says: if it is an Inflow and it should be shown as a negative value, then invert the value. Same if it is an Outflow and it should be shown positive. In all other cases, leave the value as it comes. If we can apply this to the “natural” values shown before the final presentation should be ok.

To achieve this we actually need for each node, at measure calculation time, the Multiplier value. This seems to be a very tough task as we are mixing calculation time with filters with row level. It can be solved in a hierarchy using the ISFILTERED() function and the technique explained in Parent/Child Hierarchies in Tabular with Denali to detect the “BrowseDepth”.

As preparation, we will create additional fields to expand the Multiplier to the different levels of the hierarchy. This supposes that we built the hierarchy following this post looking like:

Unary3

We want to achieve:

Unary4

using formulas like:

Level2Multiplier=LOOKUPVALUE(
    Accounts[Multiplier], 
    Accounts[AccountId], 
    VALUE(PATHITEM ( [Path], 1 ))
)

This will repeat the Multiplier value in a different format, bringing it from row level to a multi-row format, which will allow us to use a measure together with the ISFILTERED() formula to get the right Multiplier value for each node.

The measure looks like:

IsFilteredMultiplier:=IF (
    ISFILTERED ('Accounts'[Level4]), MIN(Accounts[Level4Multiplier]),
    IF (ISFILTERED ('Accounts'[Level3]), MIN(Accounts[Level3Multiplier]),
    IF (ISFILTERED ('Accounts'[Level2]), MIN(Accounts[Level2Multiplier]),
    IF (ISFILTERED ('Accounts'[Level1]), MIN(Accounts[Level1Multiplier])
))))

where the MIN in MIN(Accounts[Level4Multiplier]) is just to use some aggregate function as all values will be the same (we could also use MAX for example).

At this point we can check if IsFilteredMultiplier is working:

Unary5

Finally with a measure like:

NicerSumOfAmounts:=IF (
    Accounts[ShowRow], 
    SUMX (
        Transactions, 
        Transactions[Amount] * RELATED(Accounts[LeaveMultiplier])
    ) * Accounts[IsFilteredMultiplier], 
    BLANK()
)

we will have:

Unary6

which fulfills the original request.

If this technique works well (hope to find out soon) then it could be used as basis for migrating Multidimensional cubes modelling Chart Of Accounts situations that rely on Unary Operator and where we have information about the type of the Accounts in relation with its expected sign. Or in accounting terms if we know for each account the expected “side”, either Debit or Credit, and how it should be shown in the pivot table.

The PowerPivot file is available as UnaryVariant2.

Tabular and PowerPivot resource list

I added a page to this blog where i will try to keep the list of my favorite resources related to Tabular and PowerPivot. Its just amazing how much material is already out there and its difficult to make a selection, but sometimes you read a post which opens a new door or explains something in a completely clear way. I will try to keep track of these entries in the page.

Also there is a list of some more formal material like Books and material from Microsoft. This material may require more discipline to work with but it gives a systematized ground where to stand on.

The page is Tabular and PowerPivot resource list

Data sampling for comfortable SSAS 2012 Tabular modelling

My initial goal to work in tabular was to have a comfortable environment to work with. To do this i wanted to work only with a sample of the warehouse data while at the same time take advantage of the tabular environment “bells and whistles”.

Working with SQL queries instead of mapping objects (tables, views) directly has the big disadvantage that you cannot use later the visual interface to pick columns and use filters with the GUI. On the other hand, using partitioning on tables, when partitioning itself is also a SQL query has the same effect. As soon as you want to add a column graphically, the SQL partitioning is reset.

So the data sampling filter must be part of the data source.

A recommendation which seems to be commonly accepted is to try to work with views instead of mapping directly tables from the SQL source. This isolates the Tabular model from the eventual data source changes and allows also to enable/disable sampling inside the view (we suppose that we have control over the data source and can add our views to it). Enabling/disabling sampling via modifying the views is easier than doing it by Filters in Tabular GUI and can be automated in a more simple and safe way.

With this in mind first i chose the sampling unit, which in my case having Clients, Contracts and Contract Facts are the Clients. The idea is to design a sampling based on Clients and that then this sampling propagates following the DBs relational logic to the rest of the DB.

To store the sampling itself i use a “helper” table in some corner of the DB where it does not bother, and there i apply some sampling criteria, in this case a random sample receipt.

CREATE TABLE TEMP_DimClientSample (
   ClientId nvarchar(100)
)

insert into TEMP_DimClientSample
SELECT ClientId FROM DimClient
WHERE (ABS(CAST(
(BINARY_CHECKSUM
(ClientId, NEWID())) as int))
% 100) < 2 -- put here the sample percentage

And then create the views to consume this with selects like:

SELECT 
    DimClient.*
from 
    DimClient
    -- remove next 2 lines when going into production
    join TEMP_DimClientSample
        on DimClient.ClientId = TEMP_DimClientSample.ClientId

or

SELECT 
    DimContract.*
from 
    DimContract
    join DimClient on DimContract.ClientId = DimClient.ClientId
    -- remove next 2 lines when going into production
    join TEMP_DimClientSample
        on DimClient.ClientId = TEMP_DimClientSample.ClientId

Related posts:

Yet another MS BI blog

After working on a Microsoft “multidimensional” BI project (SQL2005, SQL2008, SQL2008R2) for some years i feel the need to get to know deeply the new technology proposals from Microsoft, specifically Tabular and PowerPivot. Because i work in a company that believes strongly on corporate BI, i will be most probably focused on Tabular.

I know i do not have at this point much to share with the MS BI community but i want to go on anyway with this blog because of a couple of reasons. Besides personal ones, I think that maybe a newbie experience can be helpful for somebody out there as everybody’s learning experience is slightly different and it may present old issues from a different point of view (most probably not new but hopefully different).

Its also my first blogging experience (oh my God! a double-newbie blog!).

Follow

Get every new post delivered to your Inbox.