June 30, 2022 | tutorial

how to manage azure-key-vault backed scopes in databricks.

introduction.

If your Azure Databricks notebook needs to authenticate to another system to pull or push data, consider using Databricks’ built-in secret scopes to fetch keys and credentials. Another use case is authenticating to an external Hive metastore (e.g. one backed by an Azure SQL Database), where you can reference the credentials from a secret scope in the cluster configuration instead of storing them in plain text. Databricks offers two alternatives for this scenario: Databricks-backed or Azure Key Vault-backed secret scopes.

While Databricks gives you a user interface to connect your Databricks instance to an Azure Key Vault, you need to write code to manage the scopes and connections afterwards. This blog post provides a simple Python script for exactly this purpose. It has a similar structure to the support script for managing service principals in Azure Databricks. In fact, adding and managing roles and service principals is the logical next step after you have set up secret scopes in Azure Databricks.
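To make the cluster-configuration use case a bit more concrete, here is a minimal sketch of how a secret can be referenced in a Spark config value with the `{{secrets/<scope>/<key>}}` syntax. The scope name is the one used later in this post; the secret names `hive-user` and `hive-password` and the helper `secret_ref` are assumptions made up for this illustration.

```python
def secret_ref(scope, key):
    """Build a secret reference string for use in a cluster's Spark config."""
    return "{{secrets/" + scope + "/" + key + "}}"


# Hypothetical secret names; replace them with the keys stored in your Key Vault.
spark_conf = {
    "spark.hadoop.javax.jdo.option.ConnectionUserName": secret_ref("toms-test-scope", "hive-user"),
    "spark.hadoop.javax.jdo.option.ConnectionPassword": secret_ref("toms-test-scope", "hive-password"),
}

# Print the lines in the "key value" form used in the cluster's Spark config box.
for name, value in spark_conf.items():
    print(name, value)
```

Inside a notebook, the corresponding lookup would be `dbutils.secrets.get(scope=..., key=...)`, which is only available on a Databricks cluster.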


prerequisites.

1. An Azure Databricks instance (the one in this blog post is named toms-databricks)
2. A cluster that you can run
3. An Azure Key Vault (the one in this blog post is named toms-key-vault)


plan of action.

1. Set up connection from Azure Databricks to Azure Key Vault
2. The API script for managing secret-scopes and access in Azure Databricks
3. How to add the Python script to your Azure Databricks workspace


1. Set up connection from Azure Databricks to Azure Key Vault

First, let’s connect Azure Databricks to the Azure Key Vault. For this you need the Vault URI and the Resource ID, both of which you can find in the Properties section of your Key Vault in the Azure portal.

Next, you need to create a secret scope by opening the createScope window of your Azure Databricks instance. Use the following URL snippet. Note, the URL is case sensitive and the last part needs to be written as createScope:

https://<databricks-instance>#secrets/createScope
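Since the URL is easy to get wrong, here is a small sketch that builds it from a workspace URL. The helper name `create_scope_url` is made up for this post, and the example workspace URL is a placeholder in the same format used in the script below.

```python
def create_scope_url(workspace_url):
    """Return the createScope page URL for a workspace URL.

    A trailing '/' is stripped so the fragment is appended directly to the host.
    """
    return workspace_url.rstrip("/") + "#secrets/createScope"


# Placeholder workspace URL; replace it with your own.
print(create_scope_url("https://adb-1234567.89.azuredatabricks.net/"))
```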


After creating the connection, we can go back to the Access policies section of our Key Vault, where AzureDatabricks should now appear under Application with two Secret Permissions.

By the way, this is the only place where you can see the connection in a GUI, in either Azure or Databricks, so it is also the only place where you can manage it through a user interface. To manage the secret scope from the Databricks side, you need to write and execute code. The next section provides a simple script that should get you going quickly.


2. The API script for managing secret-scopes and access in Azure Databricks

You can paste the following Python script into a notebook in Databricks and run it. There is a short explanation per cell underneath the code.

# Databricks notebook source
# MAGIC %md ## maintenance notebook: manage-key-vault-secrets

# COMMAND ----------

# MAGIC %md ### import of packages

# COMMAND ----------

import pandas
import json
import requests

# COMMAND ----------

# MAGIC %md ### define variables

# COMMAND ----------

pat           = 'EnterPATHere'           # paste PAT. Get it from settings > user settings
workspaceURL  = 'EnterWorkspaceURLHere'  # paste the workspace url in the format of 'https://adb-1234567.89.azuredatabricks.net' Note, the URL must not end with '/'

# COMMAND ----------

# MAGIC %md ### list secret scopes

# COMMAND ----------

response = requests.get(workspaceURL + '/api/2.0/secrets/scopes/list',
                        headers={'Authorization': 'Bearer ' + pat,
                                 'Content-Type': 'application/json'})

pandas.json_normalize(json.loads(response.content), record_path = 'scopes')

# COMMAND ----------

# MAGIC %md ### list secrets per scope

# COMMAND ----------

secretScope = 'toms-test-scope'    # paste secretScope. Get secretScopeName from cell above (name)

# pass the scope as a query parameter so that requests URL-encodes it
response = requests.get(workspaceURL + '/api/2.0/secrets/list',
                        params={'scope': secretScope},
                        headers={'Authorization': 'Bearer ' + pat,
                                 'Content-Type': 'application/json'})

pandas.json_normalize(json.loads(response.content), record_path = 'secrets')

# COMMAND ----------

# MAGIC %md ### list ACL for secret scope

# COMMAND ----------

secretScope = 'toms-test-scope'    # paste secretScope. Get secretScopeName from cell above

# pass the scope as a query parameter so that requests URL-encodes it
response = requests.get(workspaceURL + '/api/2.0/secrets/acls/list',
                        params={'scope': secretScope},
                        headers={'Authorization': 'Bearer ' + pat,
                                 'Content-Type': 'application/json'})

pandas.json_normalize(json.loads(response.content), record_path = 'items')

# COMMAND ----------

# MAGIC %md ### set ACLs per secret scope and principal / group

# COMMAND ----------

secretScope = 'toms-test-scope'    # paste secretScope. Get secretScopeName from cell above
principal   = 'users'              # paste user, group or service principal
permission  = 'READ'               # paste permission: READ, WRITE, MANAGE

payload_raw = {
                'scope': secretScope,
                'principal': principal,
                'permission': permission
              }

payload = payload_raw  # already a plain dict, so no JSON round trip is needed

response = requests.post(workspaceURL + '/api/2.0/secrets/acls/put',
                         headers={'Authorization': 'Bearer ' + pat,
                                  'Content-Type': 'application/json'},
                         data=json.dumps(payload))

pandas.json_normalize(json.loads(response.content))

# COMMAND ----------

# MAGIC %md ### delete ACLs per secret scope and principal / group

# COMMAND ----------

secretScope = 'toms-test-scope'    # paste secretScope. Get secretScopeName from cell above
principal   = 'users'              # paste user, group or service principal

payload_raw = {
                'scope': secretScope,
                'principal': principal
              }

payload = payload_raw  # already a plain dict, so no JSON round trip is needed

response = requests.post(workspaceURL + '/api/2.0/secrets/acls/delete',
                         headers={'Authorization': 'Bearer ' + pat,
                                  'Content-Type': 'application/json'},
                         data=json.dumps(payload))

pandas.json_normalize(json.loads(response.content))

# COMMAND ----------
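The cells above repeat the same header and payload construction. As a hedged sketch of how this could be factored out, the following builds the pieces of the ACL put request and rejects permission values other than the three named in the script. The function names `auth_headers` and `acl_put_request` are assumptions made up for this post, not part of any Databricks library, and nothing is sent over the network here.

```python
import json

VALID_PERMISSIONS = {"READ", "WRITE", "MANAGE"}


def auth_headers(pat):
    """Headers used by every call in the script above."""
    return {"Authorization": "Bearer " + pat,
            "Content-Type": "application/json"}


def acl_put_request(workspace_url, pat, scope, principal, permission):
    """Return (url, headers, body) for a POST to /api/2.0/secrets/acls/put."""
    if permission not in VALID_PERMISSIONS:
        raise ValueError("permission must be one of READ, WRITE, MANAGE")
    url = workspace_url.rstrip("/") + "/api/2.0/secrets/acls/put"
    body = json.dumps({"scope": scope,
                       "principal": principal,
                       "permission": permission})
    return url, auth_headers(pat), body


# Placeholder values in the same format as the variables in the script.
url, headers, body = acl_put_request(
    "https://adb-1234567.89.azuredatabricks.net", "EnterPATHere",
    "toms-test-scope", "users", "READ")
print(url)
print(body)
```

In the notebook, the result could then be passed on with `requests.post(url, headers=headers, data=body)`.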

The cells of the script do the following, in order:

1. List the secret scopes that exist in your Azure Databricks instance.
2. List the secrets that Azure Databricks has access to in a given secret scope.
3. List all users, groups and service principals and their permissions for a secret scope.
4. Set an Access Control List entry (READ, WRITE or MANAGE) on a secret scope for a user, group or service principal.
5. Remove a principal's permissions from a secret scope.
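One thing to watch out for with the list cells: if a response body does not contain the expected key (for example `items` when a scope has no ACL entries yet), `pandas.json_normalize` with `record_path` may fail with a `KeyError`. The sketch below shows one way to guard against that. The helper name `records` and the sample bodies are assumptions for illustration and were not captured from a real workspace.

```python
import json


def records(raw_body, key):
    """Return the list stored under `key` in a JSON response body.

    Returns [] when the key is absent and raises RuntimeError when the
    body contains an 'error_code' field.
    """
    data = json.loads(raw_body) if raw_body else {}
    if "error_code" in data:
        raise RuntimeError(f"{data['error_code']}: {data.get('message', '')}")
    return data.get(key, [])


# Illustrative bodies only.
acl_body = b'{"items": [{"principal": "users", "permission": "READ"}]}'
empty_body = b"{}"

print(records(acl_body, "items"))
print(records(empty_body, "items"))
```

In the notebook, `pandas.DataFrame(records(response.content, 'items'))` would then give a table in both cases.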

3. How to add the Python script to your Azure Databricks workspace

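One way is to copy and paste the code into a text editor such as Notepad, save it as yourfilename.py and import it into your Databricks workspace. Because the file starts with the line "# Databricks notebook source", Databricks should import it as a notebook with one cell per COMMAND separator.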

end.

Personally, I prefer having a dedicated maintenance folder with scripts like this (see here for a script managing service principals in Azure Databricks), so I am quickly able to change things just by running the already written and prepared script.
