Saturday, March 8, 2025

Databricks' DABs

If you're writing Databricks code in an IDE that is to be run on an ad hoc basis (rather than via some CI/CD pipeline), you might want to use the Databricks VSCode extension. This will automatically build your Databricks Asset Bundle (DAB) for you. Upon signing in, a databricks.yml file will be created at the root of your project. It contains the minimum amount of information to deploy your code to .bundle in your DBx home directory, under a sub-folder named after the bundle's name field.
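A minimal sketch of what the generated file looks like; the exact contents depend on your workspace and extension version, and the bundle name and host URL below are placeholders:

```yaml
# databricks.yml (sketch; name and host are placeholders)
bundle:
  name: my_project   # deployments land under ~/.bundle/my_project in the workspace

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://adb-1234567890123456.7.azuredatabricks.net
```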

You can also deploy bundles via VSCode. Configure a root_path under workspace in databricks.yml, then press the cloud button on the BUNDLE RESOURCE EXPLORER pane within the Databricks extension:

Upload via the cloud button
the bundle will be uploaded to the workspace and directory specified. You can, of course, use the databricks CLI, but for ad hoc uploads, VSCode is very convenient. By default, deployments go to /Users/${workspace.current_user.userName}/.bundle/${bundle.name}/${bundle.target}.
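Overriding the default location looks something like this; the host and path are placeholders, and root_path supports the same ${...} substitutions as the default:

```yaml
targets:
  dev:
    mode: development
    workspace:
      host: https://adb-1234567890123456.7.azuredatabricks.net
      # override the default ~/.bundle/... location (path is a placeholder)
      root_path: /Workspace/Users/someone@example.com/my_project_dev
```

The same target can then be deployed from the CLI with `databricks bundle deploy -t dev`, which is handy when you want the identical layout from CI/CD later.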

Configuration can be for development or production mode. The advantage of development mode is that it "turns off any schedules and automatic triggers for jobs and turns on development mode for Delta Live Tables pipelines. This lets developers execute and test their code without impacting production data or resources." [1]
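The two modes are set per target; a sketch with placeholder paths (in development mode, Databricks also prefixes deployed resource names with the developer's username so people don't trample each other):

```yaml
targets:
  dev:
    mode: development   # pauses schedules/triggers, prefixes resource names
    default: true
  prod:
    mode: production    # deploys as-is; schedules and triggers stay active
    workspace:
      root_path: /Workspace/Shared/.bundle/prod/my_project
```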

[1] Data Engineering with Databricks Cookbook.
