This is the second article in a series that concentrates on tasks of the Power BI Service administrator.
One of the most exciting things about being a Power BI service administrator is watching what's going on inside "your" service.
It's important to note that there are two different aspects you can watch. The first one focuses on the artifacts inside the service:
- Workspaces
- Dataflows
- Datasets
- Reports
- Dashboards
- Workbooks
- Users (admittedly, it may seem a little weird considering a living person an artifact)
The second aspect focuses on how users are using these artifacts. Monitoring usage metrics often comes with questions like the following one:
Who has opened the report, and how many times?
You have to be aware that storing and analyzing personally identifiable information (PII) might be restricted or prohibited by regulations your organization has to consider. Creating an analytical application that provides answers to this second aspect is not the focus of this article. This article focuses on the first aspect, the mapping of the artifacts inside the service.
Why?
For quite some time, I have been working as a part-time Power BI Service administrator for a client with almost a thousand Power BI Desktop users, thousands of datasets, and thousands of workspaces. There are more than 10,000 distinct users per day. Mapping the artifacts inside the service is the foundation for providing the users with the analytical platform they need to be successful in a world driven by a data culture.
How?
The next picture shows the tools used to build the solution. As you can imagine, we are using Power BI for data analysis and data visualization. Subsequent blog posts will add more complexity to this solution to answer more advanced questions.
The solution consists of two parts: one creates a JSON document and stores this document as a blob in a container (or just a folder, if you want) somewhere in your Azure storage account. The other part is quite familiar, as it consumes the JSON document, creates a Power BI data model, and builds some data visualizations.
If you are wondering why we did not choose a Power BI-only solution, here are some reasons:
- Connecting to a Power BI REST API endpoint from inside Power Query (or dataflows) is not that simple, especially if you want to leverage a Service Principal. This approach does not require a Power BI Pro license
- A JSON document is created that other applications can easily re-use
- Scheduling a runbook from inside Azure Automation is quite powerful
The outlined solution leverages some of the many Azure services:
- Azure Active Directory: as soon as you start to use Power BI, there is an Azure Active Directory. Azure Active Directory manages secure access to all your cloud-based resources.
- Azure Automation: a service that allows automating tasks (Azure Automation – Cloud Automation Service | Microsoft Azure)
- Azure Blob storage: a service that allows storing files (blobs) on a massive scale, but of course, it's also possible to store tiny JSON documents (Azure Blob Storage | Microsoft Azure)
Each of the services mentioned above has its own complexities. You have to be aware that this article is not an introduction to each of these services and cannot explain their intricacies. This article assumes that all the necessary permissions are granted to your account and that you are somewhat familiar with the Azure portal.
The next chapters list all the steps necessary to create the outlined solution.
Service Principal
Using a Service Principal in combination with the Power BI REST API makes developing applications much more straightforward. These applications no longer need to authenticate with users backed by a Pro license. This article explains how you can create a Service Principal:
Ensure that you note (for a short moment) the Application ID and the client secret assigned to the application. The Application ID will be used as the username and the secret as the password when the Azure Automation credential is created.
Azure AD Security Group
After you have created the Service Principal, this article explains how to create an Azure AD security group and add the service principal as a group member.
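If you prefer scripting over the portal, the group and the membership can also be created with a few lines of PowerShell. This is just a minimal sketch, assuming the Az.Resources module is installed; the display names used here are placeholders you have to replace with your own:
# Create the security group (display name and mail nickname are placeholders)
$group = New-AzADGroup -DisplayName "PowerBI-ReadOnlyAdminAPI" -MailNickname "PowerBIReadOnlyAdminAPI"
# Look up the service principal created earlier (placeholder display name)
$sp = Get-AzADServicePrincipal -DisplayName "PowerBI-AdminAPI-Reader"
# Add the service principal as a member of the group
Add-AzADGroupMember -TargetGroupObjectId $group.Id -MemberObjectId $sp.Id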
Power BI Admin Portal
The security group has to be added to the Power BI tenant setting "Allow service principals to use read-only Power BI admin APIs".
Azure Blob store
The Azure Blob store is one of the fundamental storage types in Azure. This article assumes that you can use an already existing storage account (Storage account overview - Azure Storage | Microsoft Docs) and a blob container called powerbi-workspacedata.
Storing the JSON document that contains the information about the Power BI service in the blob container requires the name of the storage account and a key for authentication purposes. The storage account name and the key are also used to access the JSON document from Power BI Desktop to create the data model.
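In case the container does not exist yet, it can be created with a short PowerShell sketch like the following one. This assumes the Az.Storage and Az.Resources modules; the resource group and storage account names are placeholders:
# Fetch the first key of the storage account (placeholder resource group and account names)
$key = (Get-AzStorageAccountKey -ResourceGroupName "rg-powerbi-monitoring" -Name "mystorageaccount")[0].Value
# Build a storage context from the account name and the key
$context = New-AzStorageContext -StorageAccountName "mystorageaccount" -StorageAccountKey $key
# Create the container used by the runbook, without public access
New-AzStorageContainer -Name "powerbi-workspacedata" -Context $context -Permission Off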
The next picture shows how the storage account can be accessed using the Azure Storage Explorer (Azure Storage Explorer – cloud storage management | Microsoft Azure):
Azure Automation
In a nutshell: you can write scripts using PowerShell or Python to automate tasks, and within these scripts, it's possible to interact with other services. These scripts are called runbooks, and runbooks can be scheduled for automatic execution. Getting started with Azure Automation requires an automation account. This account holds credentials and variables that can be used across different runbooks.
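To give you an idea of how the scheduling part looks when scripted instead of clicked: once the runbook described later in this article exists, a daily schedule could be attached with a minimal sketch like the one below, assuming the Az.Automation module; the account, resource group, runbook, and schedule names are placeholders:
# Create a schedule that runs once per day, starting tomorrow at 06:00 (placeholder names)
$params = @{ ResourceGroupName = "rg-powerbi-monitoring"; AutomationAccountName = "aa-powerbi-monitoring" }
New-AzAutomationSchedule @params -Name "DailyWorkspaceSnapshot" -StartTime (Get-Date).Date.AddDays(1).AddHours(6) -DayInterval 1
# Link the schedule to the runbook that creates the JSON document
Register-AzAutomationScheduledRunbook @params -RunbookName "Get-PowerBIWorkspaceData" -ScheduleName "DailyWorkspaceSnapshot"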
Azure Automation - Adding PowerShell modules
PowerShell modules can be added to the library that comes with each automation account. The MicrosoftPowerBIMgmt modules have to be added, as these modules are not available by default. As there are dependencies between the modules, it's necessary to import the modules in a given sequence (a small sketch of how the import could be scripted follows the list). Please be aware that not all modules are needed for the runbook I'm using, but this is just the first article in a series of articles. The proper sequence to import the modules is this:
1. MicrosoftPowerBIMgmt.Profile (≥ 1.0.896)
2. MicrosoftPowerBIMgmt.Admin (≥ 1.0.896)
3. MicrosoftPowerBIMgmt.Workspaces (≥ 1.0.896)
4. MicrosoftPowerBIMgmt.Capacities (≥ 1.0.896)
5. MicrosoftPowerBIMgmt.Data (≥ 1.0.896)
6. MicrosoftPowerBIMgmt.Reports (≥ 1.0.896)
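Importing the modules one by one through the portal works fine, but it can also be scripted. The following is a minimal sketch, assuming the Az.Automation module; the resource group and automation account names are placeholders, and the PowerShell Gallery package URL is used as the content link:
# Import the MicrosoftPowerBIMgmt modules in the required order (placeholder account and resource group names)
$params = @{ ResourceGroupName = "rg-powerbi-monitoring"; AutomationAccountName = "aa-powerbi-monitoring" }
$modules = "MicrosoftPowerBIMgmt.Profile", "MicrosoftPowerBIMgmt.Admin", "MicrosoftPowerBIMgmt.Workspaces",
           "MicrosoftPowerBIMgmt.Capacities", "MicrosoftPowerBIMgmt.Data", "MicrosoftPowerBIMgmt.Reports"
foreach ($module in $modules) {
    New-AzAutomationModule @params -Name $module -ContentLinkUri "https://www.powershellgallery.com/api/v2/package/$module"
    # Wait until the import has finished before starting the next, dependent module
    while ((Get-AzAutomationModule @params -Name $module).ProvisioningState -notin "Succeeded", "Failed") {
        Start-Sleep -Seconds 15
    }
}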
Azure Automation - Credentials
This article explains what Azure Automation credentials are: Manage credentials in Azure Automation | Microsoft Docs
I created a credential named SP_PowerBIReadOnlyAdminAPI.
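The credential can also be created with PowerShell instead of the portal. This is a minimal sketch, assuming the Az.Automation module; the Application ID, the secret, and the resource group and account names are placeholders:
# The Application ID becomes the username, the client secret becomes the password (placeholder values)
$appId = "00000000-0000-0000-0000-000000000000"
$secret = ConvertTo-SecureString "<client secret>" -AsPlainText -Force
$credential = New-Object System.Management.Automation.PSCredential ($appId, $secret)
# Store the credential inside the automation account under the name used by the runbook
New-AzAutomationCredential -ResourceGroupName "rg-powerbi-monitoring" -AutomationAccountName "aa-powerbi-monitoring" `
    -Name "SP_PowerBIReadOnlyAdminAPI" -Value $credential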
Azure Automation - Variables
Azure Automation variables are useful to "ingest" parameters into the runbook without changing the runbook.
There are three variables used by the runbook (a sketch of how they can be created follows the list):
- PowerBITenantID: It's necessary to note that there is no such thing as a Power BI tenant ID. Nevertheless, quite often, it is required to provide this ID. The ID identifies the Azure Active Directory behind the Power BI service. I have to admit that thinking of this ID as a Power BI tenant ID is simpler for me than thinking of an Azure Active Directory used by my service.
- ADLSGen2: This variable stores the name of the storage account.
- ADLSGen2Key1: This variable stores the key to the storage account. A combination of account name and key is necessary to access a container (a folder) inside the blob store.
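Here is the sketch mentioned above. It assumes the Az.Automation module; the values and the resource group and account names are placeholders:
# Create the three Automation variables used by the runbook (placeholder values)
$params = @{ ResourceGroupName = "rg-powerbi-monitoring"; AutomationAccountName = "aa-powerbi-monitoring" }
New-AzAutomationVariable @params -Name "PowerBITenantID" -Value "<Azure AD tenant ID>" -Encrypted $false
New-AzAutomationVariable @params -Name "ADLSGen2" -Value "<storage account name>" -Encrypted $false
# The storage account key is stored as an encrypted variable
New-AzAutomationVariable @params -Name "ADLSGen2Key1" -Value "<storage account key>" -Encrypted $true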
Azure Automation - Runbook
After all this preliminary writing, the next lines show the PowerShell script that accesses the Power BI REST API and retrieves information about the workspaces inside the service. The runbook uses the service principal to authenticate against the Power BI service and stores the JSON document in the container inside the Azure Blob storage.
Please be aware that the PowerShell script contains just the executable minimum of commands; it is not built around PowerShell best practices.
# Get the Tenant ID from the Automation variable called PowerBITenantID
[string]$TenantID = Get-AutomationVariable -Name "PowerBITenantID"
# Get the credential and connect to the Power BI Service
$Credentials = Get-AutomationPSCredential -Name "SP_PowerBIReadOnlyAdminAPI"
$result = Connect-PowerBIServiceAccount -ServicePrincipal -Credential $Credentials -Tenant $TenantID
# Query parameters of the REST API endpoint are prefixed with $; for this reason, $ has to be escaped as %24
# The top query parameter is mandatory; this means that for large environments (> 5000 workspaces), a loop has to be developed
$url = "https://api.powerbi.com/v1.0/myorg/admin/groups?%24top=5000&%24expand=dataflows,datasets,reports,dashboards,workbooks,users"
# The REST API is called via the generic cmdlet Invoke-PowerBIRestMethod with appropriate parameters (like the url)
# instead of the corresponding specialized PowerShell cmdlets;
# this is because it sometimes takes a while until all the parameters available inside the REST API
# are supported by the cmdlets
$result2 = Invoke-PowerBIRestMethod -Url $url -Method GET
# Disconnect the service principal
Disconnect-PowerBIServiceAccount
# Using the $Env:temp folder as temporary storage; this is necessary to "compose" the file
# that will be stored to the Azure blob container
$workspacecontent = $result2 | ConvertFrom-Json
$workspacecontent.value | ConvertTo-Json -Depth 10 | Out-File -FilePath "$Env:temp/temp.json"
# Fetching the storage account name and the storage account key from the Automation variables
$theStorageAccountName = Get-AutomationVariable -Name 'ADLSGen2'
$theStorageAccountKey = Get-AutomationVariable -Name 'ADLSGen2Key1'
# create a context
$Context = New-AzureStorageContext -StorageAccountName $theStorageAccountName -StorageAccountKey $theStorageAccountKey
# moves the temporary file to the blob store, -Force makes sure that an already existing file will be overwritten
Set-AzureStorageBlobContent -Context $Context -Container "powerbi-workspacedata" -File "$Env:temp/temp.json" -Blob "workspaces.json" -Force
I tried to use as many comments as possible inside the script, but if you are not familiar with PowerShell, the above lines may still look weird.
Just one additional note: as the expand parameter is used, the -Depth parameter of the ConvertTo-Json cmdlet is needed (ConvertTo-Json (Microsoft.PowerShell.Utility) - PowerShell | Microsoft Docs). Using the -Depth parameter is necessary because the expand parameter creates a nested JSON document. In preparation for future enhancements of the REST API, the possible depth of nested documents is set to ten.
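One more remark on the comment about the mandatory top parameter: a single call returns at most 5,000 workspaces, so for larger environments the call has to be repeated with an additional skip offset. The following is a minimal sketch of how such a loop could look; the page size, the variable names, and the assumption that the admin/groups endpoint accepts %24skip are mine, not part of the original runbook:
# Page through the workspaces 5000 at a time (assumption: the endpoint supports the %24skip parameter)
$pageSize = 5000
$skip = 0
$allWorkspaces = @()
do {
    $url = "https://api.powerbi.com/v1.0/myorg/admin/groups?%24top=$pageSize&%24skip=$skip&%24expand=dataflows,datasets,reports,dashboards,workbooks,users"
    $page = (Invoke-PowerBIRestMethod -Url $url -Method GET | ConvertFrom-Json).value
    $allWorkspaces += $page
    $skip += $pageSize
} while ($page.Count -eq $pageSize)
# The combined result replaces $workspacecontent.value in the script above
$allWorkspaces | ConvertTo-Json -Depth 10 | Out-File -FilePath "$Env:temp/temp.json"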
These blogs provided some useful knowledge to create the above script and make the solution work:
- Avoiding workspace loops by expanding navigation properties in the GetGroupsAsAdmin API | Microsoft Power BI-Blog | Microsoft Power BI
- How to Copy Files to Azure Blob Storage with PowerShell (adamtheautomator.com)
- PowerShell Basics: $Env: - Environment Variables | Examples & Scripts (computerperformance.co.uk)
The Power BI application
Now it's time to start mapping the service, or the hive, as I call it, because so much is going on all the time.
The next picture shows how to connect to the blob store for sourcing the JSON document to start the mapping:
The next picture shows some columns of the JSON document, focusing on the dataflows, datasets, and users columns. As this content is itself nested JSON, this information materializes as nested lists in the query:
Some transformations and some data modeling later, the next picture shows the first report:
I will use this solution to expand on data modeling and data visualization.
Thanks for reading
Tom