Tuesday 29 August 2023

BI Self-Service & Power BI

With the evolution Data & Analytics consumption, our lives has become full of data. From ordering food πŸ”πŸ•πŸ° to checking the delivery status, from binge watchingπŸ“ΊπŸ‘€, to online shopping πŸ›’ we are always producing and consuming data. 

Long gone are the days of statis MIS reports and dependency on the IT teams for getting those reports. Even the prebuilt interactive reports are also not adequate and flexible enough for the data hungry workforces. Business users want empowerment πŸ’ͺ, to to create their own insights, or in other words self-service capability. 

Good for us that the current modern BI tools such as Power BI, Tableau, Qlik are quite capable to handle such self-service capability. But with freedom comes responsibility, so the challenges of self-service BI are as,

  • Standard - The organizational standards in terms of metrics definition, dimensional attributes, nomenclatures, UI etc. need to be consistent across.
  • Single source of truth - With everyone creating their own version of truth, it is very important to define the certified version.
  • Governance - To avoid lawlessness there has to be a strong governance structure and defined roles.
I will take the example of Power BI and try to explore how we can achieve self-service with overcoming the above challenges. 

There are 2 parts of the solution residing in Architecture and Access/Roles.

A. Architecture

To achieve this I would go with a layered architecture. Instead of keeping everything together in a single .PBIX or in a single workspace, let us split them to achieve greater flexibility. 


The left most layer in the dam represents the data layer and it is not part of the Power BI. Considering the data can be on-prem or cloud hosted and can be connected to Power BI directly or through Gateway. Again there can be different options of import vs direct query. In the case of later the the UI layer can be directly connected to the data.

Before going to the Power BI layers (from left to right), let us know that a .PBIX file has 3 major components, viz Data transformation (you can access by edit query), Data including model and tables, metrices, and reporting. These 3 components can be split into multiple workspaces spread across 3 layers. 
  1. Dataflow Workspace - This layer need to be created online in Power BI services only. This will contain the data source connectivity and the data transformations.
  2. Dataset Workspace - This will contain the imported data and the data model (relationship and cardinality). Basically these are .PBIX files developed offline but with blank pages.
  3. Reporting Workspace - This will contain the Reports only. These are .PBIX files, which will connect to the Power BI dataset instead of data source.
Now the question comes why do we have to take so much of pain, when we can easily perform everything in a single .PBIX file? πŸ˜•

Now the reason is flexibility to control and govern. From a PBIX file we can connect to single Dataset only, but a single Dataset can connect to multiple Dataflows. The plan is to create separate workspaces for secured and open Dataflows. basically the dimension data which should be available to all can be loaded through Dataflows residing in the open workspaces. Whereas the sensitive factual data can be loaded through different Dataflows residing in different workspaces, based on the departments. While creating a Dataset, these different Dataflows can be accessed based on the privilege and data can be combined. Furthermore based on the sensitivity of the data the Dataset can be put into secured or open workspace. So you can image while connecting from a UI file we get multiple combination of accessing the data, and that is going to play a key part in providing data access for Self-service. Moreover these Dataflows and Datasets can be certified to ensure the credibility.   

In addition to this we need to maintain the RLS (Row Level Security) πŸ” data to ensure data level security.

The PowerApps is like a cherry on the top πŸ’ here, it gives the end users to update the backend data by themselves. So our empowered end users are not only consuming the data, but they are producing also by themselves.

B. Access/Roles

Access / Role will play a major part in the governance, controlling who can access what and have gate keepers to ensure standards and consistency. There are 4 different roles in Power BI:



Let us try to understand and compare each role, what are the privileges they have or don't have.



Best practices for assigning roles:
  • The Admin user to be used for admin related activities only such as creation / deletion of workspaces, adding other admins etc.
  • The users who do not need to perform specific admin related activities, should not be with admin roles. 
  • Member roles can be leveraged for all other controls and deployments. They can even manage security for roles other than admin. 
  • The members should be responsible for all the content published to the larger audience. They should perform checks before final deployment.    
  • Contributors should not be allowed to update an app, that will do away with the benefit of having gate keeper and the standard of the content can be compromised.
  • The access should be given to user groups rather than individual users, that way controlling the access will be easier and there will be people to backfill others. 

Consumption 

Finally we are at the consumption layer. Here we will have 5 different user personas, starting from the top,
  • User 1 : BI Developer / Admin - Able to connect to the DB from Power BI desktop. RLS doesn’t apply. Practically they can do almost everything.
  • User 2 : Data Analyst - Able to connect to the dataflow and dataset from Power BI Desktop.
  • User 3 : BI Power User - Able to edit prebuilt published reports and create own’s. 
  • User 4 : BI Explorer - Able to personalize and share the prebuilt reports. The self-service is limited to to playing around with the prebuilt reports only.
  • User 5 : BI Consumer - Able to view, slice and dice the reports with no self-service capability. 




 

Implementing & Testing Row Level Security in Power BI

I have suffered a great deal of pain while implementing and more so while validating Row Level Security in Power BI. Let me try to capture a...