Get Started with Databricks Platform Administration
Data Engineering
On-Site, Virtual
1 day
In this course, you will learn the basics of platform administration on the Databricks Data Intelligence Platform. It offers a comprehensive overview of Unity Catalog, a vital component for effective data governance within Databricks environments. Divided into five modules, it begins with a detailed introduction to Databricks infrastructure and the Data Intelligence Platform, including an in-depth walkthrough of the Databricks Workspace. You will explore data governance principles within Unity Catalog, covering its key concepts, architecture, and roles. The course then covers managing Unity Catalog metastores and compute resources, including clusters and SQL warehouses. Finally, you'll learn data access control: privileges, fine-grained access, and how to govern data objects. By the end, you will have the essential skills to administer Unity Catalog, implement effective data governance, optimize compute resources, and enforce robust data security strategies.
Objectives
By the end of this course, you'll be able to:
- Describe the available compute options for workloads performed on the Databricks Data Intelligence Platform.
- List the products and features Databricks offers for different data-centric needs within the Databricks Platform.
- Navigate the Databricks Workspace UI.
- Explain the significance of data governance and the role of Unity Catalog in enhancing security and management in a data-driven landscape.
- Analyze the limitations of traditional Databricks security models and how Unity Catalog improves security by separating security elements from workspaces.
- Apply account-level identity management skills by creating, deleting, and assigning metastores in Unity Catalog.
- Evaluate cluster types and configurations, including scalability options and cost optimization strategies within Databricks.
- Implement fine-grained access control methods such as column masking and row filtering to enforce data security in the Unity Catalog.
- Apply principles of data governance by creating and managing data structures and configuring access control in Unity Catalog using Databricks tools.
Prerequisites
The content was developed for participants with these skills/knowledge/abilities:
- Basic knowledge of cloud computing and SQL concepts such as networking basics, SQL commands, aggregate functions, filters and sorting, indexes, tables, and views.
- Basic knowledge of Python programming, Jupyter notebook interface, and PySpark fundamentals.
Course outline
- Databricks Infrastructure
- Databricks Data Intelligence Platform
- Unity Catalog Overview
- Databricks Workspace Walkthrough
- Data Governance in Unity Catalog
- Managing Principals in Unity Catalog
- Managing Unity Catalog Metastores
- Compute Resources and Unity Catalog
- Data Access Control in Unity Catalog