Azure Databricks Level-400 Workshop - Agenda

Target Audience and Scenarios

The Azure Databricks Level-400 Workshop is aimed at upskilling the following audiences:

Data Engineers
Data Scientists
SQL Developers
Developers
Solution Architects
Data Architects

The workshop content is useful in several scenarios:

POCs / AI Hacks - Developers can learn how to connect to Blob Storage, submit jobs, and persist and load ML models. This material can speed up development at the early POC stage.
Self-learning through code samples
Best-Practices for Databricks Clusters (Interactive, Job, High-Concurrency)
Best practices for ADLA to Databricks Migration

Spark and Azure Databricks Overview

A brief introduction to the Spark framework and the history of Big Data technologies, including why Spark has been widely adopted across industries. An overview of the Spark modules, including Spark Core (Map-Reduce), data structures, Streaming, SQL, and GraphX. An introduction to Databricks and the key differentiators of Databricks Spark in terms of performance and its collaborative and interactive features. The benefits of Azure Databricks and its deep integration with the Azure data platform, security, and BI services.

Azure Databricks Architecture

A detailed discussion of the Spark architecture, followed by the Azure Databricks components.

Databricks Workspace, Developer Tools overview

An overview of the Azure Databricks collaborative workspace and its components, followed by a discussion of the Azure Databricks developer tools:

Databricks CLI
Filesystem utilities
Notebook workflow utilities
Widget, Secret, Library utilities
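
As a rough illustration of the utilities listed above, the sketch below reads a notebook parameter through a widget, lists files with the filesystem utilities, and fetches a credential with the secret utilities. `dbutils` only exists inside a Databricks notebook or job, so it is passed in as an argument here. The widget name `input_path`, the secret scope `workshop-scope`, and the key `storage-key` are hypothetical placeholders, not names taken from the labs.

```python
def load_run_settings(dbutils, scope="workshop-scope", key="storage-key"):
    """Gather a widget value, a file listing, and a secret in one place.

    `dbutils` is the object Databricks injects into notebooks. The scope,
    key, and widget names are hypothetical placeholders for this sketch.
    """
    # Widget utilities: declare a text box with a default, then read it back.
    dbutils.widgets.text("input_path", "/mnt/demo/input", "Input path")
    input_path = dbutils.widgets.get("input_path")

    # Filesystem utilities: each entry returned by fs.ls exposes a `.path`.
    files = [entry.path for entry in dbutils.fs.ls(input_path)]

    # Secret utilities: read a credential instead of hard-coding it.
    storage_key = dbutils.secrets.get(scope=scope, key=key)

    return {"input_path": input_path, "files": files, "storage_key": storage_key}
```

Inside a notebook this would be called as `load_run_settings(dbutils)`. Passing `dbutils` explicitly also makes the function easy to exercise outside Databricks with a stand-in object.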

Azure Databricks CLI Lab

Azure Databricks - Developer Tools

Azure Databricks lab for DBUtils, covering Widgets, Notebooks, Library, etc.

Azure Databricks - DB Utils

Reading data from Azure Blob Storage in Databricks jobs

Azure Databricks - Azure Blob Storage
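
One common way to reach Blob Storage from Spark is a `wasbs://` URI plus an account-key setting in the Spark configuration. The helpers below only build those two strings, so they run anywhere. The account name `mystorageacct`, container `raw`, and secret names in the comments are hypothetical, and the lab itself may use a different access method (for example, a mount point).

```python
def blob_account_key_conf(account: str) -> str:
    """Spark configuration key that holds the storage account access key."""
    return f"fs.azure.account.key.{account}.blob.core.windows.net"


def wasbs_uri(container: str, account: str, path: str = "") -> str:
    """Build a wasbs:// URI for a blob (or folder) in the given container."""
    return f"wasbs://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


# On a cluster, where `spark` and `dbutils` already exist, usage might look like:
#   spark.conf.set(blob_account_key_conf("mystorageacct"),
#                  dbutils.secrets.get(scope="workshop-scope", key="storage-key"))
#   df = spark.read.csv(wasbs_uri("raw", "mystorageacct", "events/"), header=True)
```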

Reading data from Azure Data Lake Storage Gen2 in Databricks jobs

Azure Databricks - Azure Data Lake Storage Gen2
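
For Data Lake Storage Gen2, data is typically addressed with an `abfss://` URI, and one possible authentication route is a service principal configured through OAuth settings. The sketch below assembles that URI and the configuration entries as plain Python values. All argument values are placeholders, and the lab may authenticate differently (for example, with an account key).

```python
def abfss_uri(filesystem: str, account: str, path: str = "") -> str:
    """Build an abfss:// URI for a path inside an ADLS Gen2 filesystem."""
    return f"abfss://{filesystem}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def adls_oauth_conf(account: str, client_id: str, client_secret: str, tenant_id: str) -> dict:
    """Spark configuration entries for service-principal (OAuth) access."""
    suffix = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{suffix}": client_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{suffix}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


# On a cluster, each entry would be applied with spark.conf.set(key, value)
# before reading, e.g. spark.read.parquet(abfss_uri("lake", "myadlsacct", "sales/")).
```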

Reading data from Azure Cosmos DB in Databricks jobs

Azure Databricks - Cosmos DB

Databricks Cluster Types and Best Practices

Azure Databricks offers several cluster types: Interactive clusters, Job clusters, and High-Concurrency clusters (formerly known as Serverless pools). This section covers selecting the right cluster type for a given scenario.

Submit Databricks jobs using the CLI and UI

Azure Databricks - Job Submission Lab 1
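
To make the job-cluster versus interactive-cluster choice concrete, the sketch below builds a job definition in the shape used by the Jobs API 2.0 `create` call. With no cluster ID it requests a new job cluster for the run; with one it reuses an existing interactive cluster. The runtime version, VM size, job name, and notebook path are placeholders chosen for illustration, not values from the lab.

```python
import json


def job_settings(name, notebook_path, existing_cluster_id=None):
    """Build a Jobs API 2.0 style payload for a single notebook task."""
    settings = {"name": name, "notebook_task": {"notebook_path": notebook_path}}
    if existing_cluster_id:
        # Reuse an interactive cluster that already exists.
        settings["existing_cluster_id"] = existing_cluster_id
    else:
        # Request a job cluster that is created for the run and then terminated.
        settings["new_cluster"] = {
            "spark_version": "5.3.x-scala2.11",  # placeholder runtime version
            "node_type_id": "Standard_DS3_v2",   # placeholder Azure VM size
            "num_workers": 2,
        }
    return settings


def to_json_text(settings):
    """Serialize the settings so they can be saved as a job definition file."""
    return json.dumps(settings, indent=2)
```

Saved to a file such as `job.json`, the output could then be submitted with the legacy Databricks CLI via `databricks jobs create --json-file job.json`, followed by `databricks jobs run-now --job-id <id>`. Flag names differ in newer CLI releases, so check the version installed.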

Create a workflow pipeline in Azure Data Factory V2 and submit it to Azure Databricks

Azure Databricks - Azure Data Factory V2 Job Pipeline submission
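
In Data Factory V2, a Databricks notebook is usually invoked through a `DatabricksNotebook` activity that points at a Databricks linked service. The sketch below assembles a minimal pipeline definition of that shape as a Python dictionary. The pipeline, activity, and linked-service names are hypothetical, and a real pipeline would normally be authored in the Data Factory UI or deployed as JSON.

```python
def databricks_notebook_pipeline(pipeline_name, linked_service, notebook_path, parameters=None):
    """Build a minimal ADF V2 pipeline definition with one notebook activity."""
    activity = {
        "name": "RunDatabricksNotebook",  # hypothetical activity name
        "type": "DatabricksNotebook",
        "linkedServiceName": {
            "referenceName": linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "notebookPath": notebook_path,
            # Passed to the notebook, where they can be read as widgets.
            "baseParameters": dict(parameters or {}),
        },
    }
    return {"name": pipeline_name, "properties": {"activities": [activity]}}
```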

Databricks Performance

In this section we will discuss the performance improvements made by Azure Databricks.

Spark-SQL Overview

In this section we will discuss ways to work with structured data within Azure Databricks. We will learn the nuances of managed and unmanaged tables and how to integrate external metastores such as Hive.

Create a managed table and work with Spark SQL

Azure Databricks - Managed Tables
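
The practical difference between managed and unmanaged tables shows up in the DDL: an unmanaged (external) table names an explicit `LOCATION`, while a managed table lets Spark choose where the files live. The helper below generates both forms as Spark SQL text. The table name, columns, and path used in the comments are hypothetical, and Parquet is an arbitrary choice of format.

```python
def create_table_sql(table, columns, location=None):
    """Return Spark SQL DDL for a Parquet table.

    Without `location` the table is managed: Spark owns the files, and
    DROP TABLE deletes them. With `location` the table is unmanaged:
    DROP TABLE removes only the metastore entry and leaves the files.
    """
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    ddl = f"CREATE TABLE IF NOT EXISTS {table} ({cols}) USING PARQUET"
    if location:
        ddl += f" LOCATION '{location}'"
    return ddl


# On a cluster, the statement would be executed with spark.sql(...), e.g.:
#   spark.sql(create_table_sql("events", [("id", "INT"), ("ts", "TIMESTAMP")]))
#   spark.sql(create_table_sql("events_ext", [("id", "INT")], "/mnt/demo/events"))
```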

Machine Learning with Azure Databricks

An overview of the Spark MLlib package and an introduction to statistical modeling. You will also learn how to run deep learning models using TensorFlow on Azure Databricks.

Spark MLlib for anomaly detection using Random Forest classification

Azure Databricks - Anomaly Detection

Implement batch predictions within Azure Databricks. You will also learn how to persist a model to Blob Storage and load it back within your Spark jobs.

Azure Databricks - Batch Predictions

Documentation

Azure Databricks Documentation

Link	Description
Azure Databricks - Microsoft	Azure Databricks Microsoft Documentation
Databricks Official Documentation	Azure Databricks official documentation from Databricks
Azure Databricks Sample Labs	Sample Labs in GitHub repository from Mahesh Balija

Mahesh Balija