January 18, 2022|tutorial|

how to refresh power bi semantic models (former datasets) from data factory with managed identity.

introduction.

When combining Azure and Power BI to build enterprise data solutions, sooner or later you will find yourself with the requirement to refresh semantic models from Azure Data Factory. The simplest way to do this is by using the Power BI API “refreshes” endpoint, which can easily be called by a web activity from data factory. Before that, the data factory needs to be given appropriate rights on the Power BI workspace to be able to perform the refresh call successfully. The go-to-suggestion to be found on the internet is to create a service principal / app registration that data factory uses to authenticate to Power BI. In fact, it is possible to use data factory’s managed identity to do that – skipping the hassle that comes along with a service principal (i.e. storing and retrieving secrets from Azure Key Vault, administration etc.). With the arising of Fabric the way of authenticating has changed once more. Please check out the article on how to call power bi rest apis from fabric data pipelines if you want to refresh your semantic model from a Fabric data Pipeline. Note, this post focusses on the access / connection part of Azure Data Factory and Power BI. If you’d like to find out how to refresh single tables or even partitions in a Power BI semantic model, check out this blog post here.

[Edit – 2024-04-15 – last week a new activity was released to Fabric Data Pipelines which refreshes whole semantic models. There has been a frequent question of whether the new activity would also work in Azure Data Factory. Well, as per this comment I found on LinkedIn, at least for now, we still need to go for a Web activity as described in this article. But perhaps, this might change in the future, too]

prerequisites.

1. A published semantic model in Power BI Service (Missing data to play around? Check out this blog post to get some proper sample data)
2. An Azure Data Factory
3. User rights to create security groups in Azure and access to the Power BI tenant settings

1. Create a security group in Azure and add the managed identity of the data factory to it

In Azure portal, find Azure Service Groups and click on New group


Give it a name, select the subscription and click create


Click on the created group and search for your data factory under Members > Add members. Then press select.

2. Align Power BI tenant settings to allow service principals to use APIs and add security group

Find the Admin portal in Power BI Service

In Tenant settings, enable to Allow service principals to use Power BI APIs, add the created group and click Apply. Note, make sure your service principal is not laying in a security group that has the setting Allow service principals to use read-only admin APIs enabled because your service principal would then not be allowed to use the POST method on the refreshes REST API call.


Search for the data factory name in the top field and add it with the Member role

3. Build web activity in Azure Data Factory pipeline

In Azure Data Factory Studio, drag a web activity into your pipeline


Fill in the settings accordingly

The URL follows this pattern:

https://api.powerbi.com/v1.0/myorg/groups/[workspaceID]/datasets/[datasetID]/refreshes

Both the workspaceID and the datasetID can be copied from the address / URL bar by clicking onto the semantic model in Power BI Service. See appendix for more information.

The Method is Post

Paste the following into the Body

{“notifyOption” : “NoNotification”}

Pick System Assigned Managed Identity (Legacy: Managed Identity) as Authentication

Lastly, set Resource to

https://analysis.windows.net/powerbi/api

end.

Try debugging the pipeline and check whether a semantic model refresh has started in Power BI Service:


Click on the semantic model Settings in Power BI


Press Refresh history


Note, the API call itself is asynchronous meaning a response is directly sent back. Therefore, you do not know whether the semantic model refresh has actually succeeded. The successful execution of the ADF web activity does only mean that a refresh could be triggered. To check the status of the refresh, you can use the same API call as above, but with the GET method. This call returns a refresh history, where one can filter for the triggered run by using the RequestId from the response of the triggering call.

appendix.

To retrieve the correct workspaceID and datasetID, navigate to the Power BI workspace and click on the semantic model:

In the URL bar, you can now find the workspaceID between “groups/” and “/datasets” (first box: red) and the datasetID between “datasets/” and “/details” (second box: purple):

21 Comments

  1. Rama November 4, 2022 at 10:35 pm - Reply

    Hi!
    I’ve followed the steps, but I keep getting an error: “Invoking Web Activity failed with HttpStatusCode – ‘401 : Unauthorized’, message – ””.
    Any idea what am I doing wrong?

    • tackytechtom November 10, 2022 at 7:56 pm - Reply

      Hi Rama :)

      Thanks for your message. It appears that your data factory still lacks the permission to call the API on the dataset… To be honest it is a hard one to know what exactly the problem is. I’d suggest dropping me a contact message or connect with me on LinkedIn so we can check out the problem together.

      /Tom

  2. Simen November 18, 2022 at 5:22 pm - Reply

    Hi!

    Just a suggestion: We have had issues when using Web Activities in this fashion with AutoResolveIntegrationRuntime to call the Power BI REST API. It may trigger a DDOS Denial Policy in Power BI Service. This can be avoided by using the Self-hosted Integrated Runtime instead of the AutoResolveIntegrationRuntime.

    -Simen

    • tackytechtom November 19, 2022 at 8:33 am - Reply

      Hi Simen!
      That is a great point! I have not experienced this issue myself, probably because I mostly go with self-hosted IRs as well.
      Thanks for sharing! :)

      /Tom

  3. Chris November 22, 2022 at 5:20 pm - Reply

    Hello,

    I followed the instructions.

    When I use this URL: https://api.powerbi.com/v1.0/myorg/groups/%5BworkspaceID%5D/datasets/%5BdatasetID%5D/refreshes

    I get the following error message “Failure type: User configuration issue Details: {“error”:{“code”:”ItemNotFound”,”message”:”Dataset XXXX is not found! please verify datasetId is correct and user have sufficient permissions.”}}”

    When I used this URL: https://api.powerbi.com/v1.0/myorg/datasets/%5BdatasetID%5D/refreshes

    I get the following error message “Failure type: User configuration issue Details: {“Message”:”API is not accessible for application”}”

    Is there a simple way to check if Datafactory can talk to powerBI? Also, my integration runtime is using AutoResolve, what does this mean compared to self hosted IRs?

    Thanks,

    Chris

    • tackytechtom November 22, 2022 at 7:13 pm - Reply

      Hi Chris!

      I updated the blog post with an appendix. This part explains in more detail how to retrieve the correct WorkspaceID and DatasetIDs. Please, check that out and try it again.

      If it still does not work, just shoot me a message or connect with me on LinkedIn and we can look into it together.

      /Tom

    • Henk April 12, 2023 at 6:04 am - Reply

      This error kept me busy as well for a while. I found the trick not descibed in this (more then good) blog.
      In your workspace settings under advanced you will find a checkbox with the text: “Allow contributors to update the app for this workspace”
      You should enable this option. Then your refresh will succeed

      • tackytechtom April 13, 2023 at 3:30 am - Reply

        Hi Henk!

        Thanks for sharing! This will hopefully help others that experience the same issue :)

        /Tom

  4. Matias November 23, 2022 at 9:59 am - Reply

    Hi Tom, I have the same problems that Chris, because i forgot to include the app as a member of the workspace and now it works ok. Moreover, I first got a token and put it instead of authentication.

    • tackytechtom November 23, 2022 at 5:37 pm - Reply

      Hi Matias!

      Great that your pipeline is working! Chris and I had a quick chat and his issue was actually that the service principal was also laying in a security group that had the Allow service principals to use read-only admin APIs enabled. Because of that the POST method could not be invoked. I edited the blog post accordingly to watch out for this one.

      Just one thought about the token. The idea of this blog post is to use managed identity, so you do not need to fetch a token beforehand. :)

      /Tom

  5. Helen December 21, 2022 at 4:28 am - Reply

    Hello,
    Thank you very much. I can do it rightly based on your detailed instruction. I used the Get method to get refresh history.

    I have one question here. Is there any way that I can pull out all of the dataset refresh histories in the tenant? How to overcome the step of adding a security group as a member of all workspaces.

    Many thanks,
    Helen.

    • tackytechtom December 21, 2022 at 4:55 pm - Reply

      Hi Helen :)

      I do not know of any other way than adding the managed identity or the security group to each workspace. You could automate the process of adding the managed identity to each workspace with a Powershell script, though, if that would be of any help?

      /Tom

  6. James July 11, 2023 at 1:14 pm - Reply

    Does this method work on datasets in Power BI Embedded workspaces?

    • tackytechtom July 12, 2023 at 4:29 am - Reply

      Hi James,

      Thanks for your message! :)

      To be honest, I have never worked with Power BI embedded and hence do not dare to give you an answer on this…

      /Tom

  7. Apurve November 22, 2023 at 9:40 pm - Reply

    Hi ! thanks for the excellent blog ! I was wondering if there is a way to refresh ALL datasets in the workspace in single pipeline ? or the only way is to create activity for each ?

    • tackytechtom November 23, 2023 at 9:37 am - Reply

      Hi Dennis,
      I do not think that is possible and you would, indeed, need to loop over it in a for each activity :)
      /Tom

  8. Marty Supple February 5, 2024 at 9:33 pm - Reply

    For some reason getting a 403 Forbidden after following this pattern. Any ideas?

    • tackytechtom February 8, 2024 at 2:01 pm - Reply

      I followed this one up with Marty and he ensured that shortly after it worked. It happened because the change in Azure did not take effect immediately.
      /Tom

  9. Tim June 3, 2024 at 7:39 am - Reply

    Hi TackyTechTom, This has been so useful, thank you for the blog, a real game changer for us instead of using Power automate.

    My question, would you know how to add an IF condition to check if the semantic model is already in a refresh state, and if so, skip the refresh step?
    (I have been researching but yet to find a solution).
    Thank you.

    • tackytechtom June 3, 2024 at 10:31 am - Reply

      Hi Tim :)

      Thanks for your message! Here is what you can do:
      1) call the same (refreshes) endpoint with the GET method (instead of POST). Then you should receive the refresh history.
      2) in your if activity, reference the output of the previous activity
      3) take the very first member of the returned array and check its status. If this status is neither “completed” nor “failed”, then a refresh is currently in-progress, meaning you do not want trigger again, otherwise trigger a refresh.

      You could also implement a loop so that you keep checking after certain intervals whether a refresh is ongoing and only after the last run has been finished, you trigger a new one from the current pipeline run.

      Here some MS documentation that might help:
      https://learn.microsoft.com/en-us/rest/api/power-bi/datasets/get-refresh-history#in-progress-refresh-example

      Hope this helps :)

      /Tom

      • Tim June 3, 2024 at 12:27 pm - Reply

        Brilliant, thank you for the quick response! Seems fairly straight forward. I shall read through the MS docs – I missed those in my search.
        Thanks again.

Leave A Comment