Note
This article covers Databricks Connect for Databricks Runtime 13.3 LTS and above.
This article describes how to use Databricks Utilities with Databricks Connect for Python. Databricks Connect enables you to connect popular IDEs, notebook servers, and custom applications to Azure Databricks clusters. See What is Databricks Connect?. For the Scala version of this article, see Databricks Utilities with Databricks Connect for Scala.
Note
Before you begin to use Databricks Connect, you must set up the Databricks Connect client.
You use Databricks Connect to access Databricks Utilities as follows:
- Use the WorkspaceClient class's dbutils variable to access Databricks Utilities. The WorkspaceClient class belongs to the Databricks SDK for Python and is included in Databricks Connect.
- Use dbutils.fs to access the Databricks Utilities fs utility.
- Use dbutils.secrets to access the Databricks Utilities secrets utility.
- No Databricks Utilities functionality other than the preceding utilities is available through dbutils.
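As a rough sketch of the access pattern in the preceding list, the two helpers below take a dbutils object (for example, w.dbutils from a WorkspaceClient) and call only the fs and secrets utilities. The helper names, the volume path, and the scope name are illustrative assumptions, not part of Databricks Connect.

```python
from typing import Any, List


def list_paths(dbutils: Any, directory: str) -> List[str]:
    """Return the path of each entry that dbutils.fs.ls reports for a directory."""
    return [entry.path for entry in dbutils.fs.ls(directory)]


def list_secret_keys(dbutils: Any, scope: str) -> List[str]:
    """Return the key names stored in a secret scope (never the secret values)."""
    return [item.key for item in dbutils.secrets.list(scope)]


def main() -> None:
    # Imported here so the helpers above can be reused with any dbutils object.
    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()  # assumes authentication is already configured
    # Hypothetical volume path and scope name; replace with your own.
    print(list_paths(w.dbutils, "/Volumes/main/default/my-volume"))
    print(list_secret_keys(w.dbutils, "my-scope"))
```

Call main() once authentication for the workspace is in place.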
Tip
You can also use the included Databricks SDK for Python to access any available Databricks REST API, not just the preceding Databricks Utilities APIs. See databricks-sdk on PyPI.
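For instance, the same WorkspaceClient can call the Clusters API. The following is a minimal sketch, assuming authentication is already configured; the helper name cluster_names is hypothetical and exists only for illustration.

```python
from typing import Any, List


def cluster_names(workspace_client: Any) -> List[str]:
    """Return the names of the clusters visible to the authenticated user."""
    # clusters.list() wraps the Clusters REST API and yields cluster details.
    return [cluster.cluster_name for cluster in workspace_client.clusters.list()]


def main() -> None:
    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()  # assumes authentication is already configured
    for name in cluster_names(w):
        print(name)
```

Call main() to print one cluster name per line.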
To initialize WorkspaceClient, you must provide enough information to authenticate a Databricks SDK with the workspace. For example, you can:
- Hard-code the workspace URL and your access token directly within your code, and then initialize WorkspaceClient as follows. Although this option is supported, Databricks does not recommend it, because it can expose sensitive information, such as access tokens, if your code is checked into version control or otherwise shared:

      from databricks.sdk import WorkspaceClient

      w = WorkspaceClient(host = f"https://{retrieve_workspace_instance_name()}",
                          token = retrieve_token())

- Create or specify a configuration profile that contains the fields host and token, and then initialize WorkspaceClient as follows:

      from databricks.sdk import WorkspaceClient

      w = WorkspaceClient(profile = "<profile-name>")

- Set the environment variables DATABRICKS_HOST and DATABRICKS_TOKEN in the same way you set them for Databricks Connect, and then initialize WorkspaceClient as follows:

      from databricks.sdk import WorkspaceClient

      w = WorkspaceClient()
The Databricks SDK for Python does not recognize the SPARK_REMOTE environment variable for Databricks Connect.
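Because SPARK_REMOTE alone is not enough for the SDK, a small preflight check can make a missing variable obvious before the client is created. The sketch below covers only the environment-variable option; the SDK supports other authentication methods that this check does not consider, and the helper name missing_auth_variables is a hypothetical one.

```python
import os
from typing import List, Mapping

REQUIRED_VARIABLES = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def missing_auth_variables(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


def main() -> None:
    missing = missing_auth_variables()
    if missing:
        raise SystemExit("Set these environment variables first: " + ", ".join(missing))

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()  # picks up DATABRICKS_HOST and DATABRICKS_TOKEN
    print(w.config.host)
```

Call main() to run the check and then create the client.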
For additional Azure Databricks authentication options for the Databricks SDK for Python, as well as how to initialize AccountClient within the Databricks SDKs to access available Databricks REST APIs at the account level instead of at the workspace level, see databricks-sdk on PyPI.
The following example shows how to use the Databricks SDK for Python to automate Databricks Utilities. This example creates a file named zzz_hello.txt in a Unity Catalog volume's path within the workspace, reads the data from the file, and then deletes the file. This example assumes that the environment variables DATABRICKS_HOST and DATABRICKS_TOKEN have already been set:
from databricks.sdk import WorkspaceClient
w = WorkspaceClient()
file_path = "/Volumes/main/default/my-volume/zzz_hello.txt"
file_data = "Hello, Databricks!"
fs = w.dbutils.fs
fs.put(
  file = file_path,
  contents = file_data,
  overwrite = True
)
print(fs.head(file_path))
fs.rm(file_path)
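If reading the file fails, the preceding example leaves zzz_hello.txt behind. One possible variation, sketched here under the assumption that cleanup should always run, wraps the read in try/finally. The function name write_read_delete is hypothetical.

```python
from typing import Any


def write_read_delete(fs: Any, file_path: str, file_data: str) -> str:
    """Write file_data to file_path, return what fs.head reads back, then delete the file."""
    fs.put(file=file_path, contents=file_data, overwrite=True)
    try:
        return fs.head(file_path)
    finally:
        # Runs even if head raises, so the temporary file is not left behind.
        fs.rm(file_path)
```

With the variables from the preceding example, this could be called as print(write_read_delete(w.dbutils.fs, file_path, file_data)).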
See also Interaction with dbutils in the Databricks SDK for Python documentation.