Working With Databricks DBFS (For Beginners)

DBFS, the Databricks File System, is a distributed file system available in every Databricks workspace. This article explains the underlying concepts of DBFS for Databricks beginners and people who are new to cloud storage. Once you’ve grasped the concepts, the article shows you how to work with DBFS in practice. Even if you’re new to the cloud, you will be … Read more
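One concept worth previewing here: DBFS exposes the same storage through two path styles — Spark APIs use the `dbfs:/` URI scheme, while local Python file APIs use the `/dbfs` FUSE mount. A minimal sketch of converting between them (the file name below is hypothetical):

```python
def to_fuse_path(dbfs_uri: str) -> str:
    """Convert a dbfs:/ URI into the /dbfs local mount path used by Python file APIs."""
    if not dbfs_uri.startswith("dbfs:/"):
        raise ValueError("expected a dbfs:/ URI")
    return "/dbfs/" + dbfs_uri[len("dbfs:/"):]

# Spark reads use the URI form; open() and other local APIs use the mount form.
spark_path = "dbfs:/FileStore/tables/sales.csv"
local_path = to_fuse_path(spark_path)  # "/dbfs/FileStore/tables/sales.csv"
```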

How To Call One Databricks Notebook From Another

You will often want to reuse code from one Databricks notebook in another. This step-by-step beginner guide shows you how to do just that. If you’re new to this technology, don’t worry. I assume only that you know the basics of notebooks in Databricks, and that’s all you need to follow along. Here is a typical … Read more
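As a preview, the two standard mechanisms are `%run ./path` (runs the child inline, sharing the caller’s variables) and `dbutils.notebook.run(path, timeout, params)` (runs the child as a separate job and returns a string). Since `dbutils` only exists inside Databricks, this sketch wraps the call so the pattern is testable anywhere; the notebook path is hypothetical:

```python
def call_notebook(run_fn, path, timeout_seconds=60, params=None):
    """Invoke a child notebook via run_fn (pass dbutils.notebook.run in Databricks)."""
    return run_fn(path, timeout_seconds, params or {})

# Inside a real Databricks notebook you would write:
#   result = call_notebook(dbutils.notebook.run, "/Shared/child", 60, {"env": "dev"})
```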

How To Run Databricks Notebooks In Parallel

There are several ways to run multiple notebooks in parallel in Databricks. You can also launch the same notebook concurrently. If this is a one-off task, you may simply want to use the Workspace interface to create and launch jobs in parallel. However, you can also create a “master” notebook that programmatically calls other notebooks … Read more
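The “master” notebook pattern can be sketched with a thread pool. Threads (rather than processes) fit here because each `dbutils.notebook.run` call is I/O-bound waiting on a remote job. A plain function stands in for the real call so the pattern runs anywhere; the notebook paths are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def run_notebook(path):
    # Real version inside Databricks:
    #   return dbutils.notebook.run(path, timeout_seconds=3600)
    return f"done: {path}"

notebooks = ["/Shared/etl/load_orders", "/Shared/etl/load_customers"]

# Each notebook runs in its own worker thread; map preserves input order.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_notebook, notebooks))
```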

What Is A Databricks Workspace? (Explained)

Are you just getting started with Databricks? Confused about the difference between workspaces, notebooks, and clusters? This article tells you all you need to know about the important concept of a workspace on the Databricks platform. In short, a Databricks workspace is a shared environment where you can collaborate with others … Read more

Quick Intro To Apache Airflow For Managers

Apache Airflow is an open-source platform for creating, scheduling, and monitoring data pipelines. Because Airflow is written in Python, your developers can use Python to define the tasks and dependencies of the steps in a pipeline. They can also use version control, code reuse, and automated tests when developing workflows. Airflow’s scheduler provides an interactive … Read more

What Is A Databricks Notebook? (Explained)

If you’re used to traditional code editors like Visual Studio or even VBA macros in Excel, you may find that notebooks take a little getting used to. I did! A Databricks notebook is a web-based interface to an interactive computing environment. Notebooks are used to create and run code, visualize results, narrate data stories, and … Read more

Databricks Clusters – Explained For Beginners

Clusters are a key part of the Databricks platform. You use clusters to run notebooks and perform your data tasks. They are how you harness the distributed computing power of Spark. This article gives you a clear understanding of what clusters are, how they work, and how to create them. If you’ve just started using … Read more
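To make “creating a cluster” concrete, here is a sketch of the kind of settings you choose, expressed as the JSON body the Databricks Clusters REST API accepts. All values below are illustrative, not recommendations:

```python
# Hypothetical cluster definition; field names follow the Clusters API,
# but every value here is just an example.
cluster_spec = {
    "cluster_name": "beginner-demo",
    "spark_version": "13.3.x-scala2.12",  # a Databricks Runtime version
    "node_type_id": "i3.xlarge",          # cloud VM type (AWS-style name)
    "num_workers": 2,                     # one driver plus two workers
    "autotermination_minutes": 30,        # shut down when idle to save cost
}
```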

Case Statements With Multiple Conditions In Snowflake (Examples)

A CASE statement lets you perform conditional logic in SQL. It’s like an if-then-else structure found in other programming languages. A CASE statement with multiple conditions evaluates more than one condition in its structure. This article is a practical walkthrough of using CASE statements with multiple conditions in Snowflake. It also covers nested CASE statements. … Read more
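As a taste of what the article covers, here is a CASE expression with multiple conditions (the table and column names are hypothetical), with the same logic mirrored in Python to show the if-then-else parallel:

```python
# Branches are tested top to bottom; the first matching WHEN wins.
SQL = """
SELECT order_id,
       CASE
           WHEN amount >= 1000 AND region = 'EU' THEN 'large-eu'
           WHEN amount >= 1000                   THEN 'large'
           ELSE 'standard'
       END AS order_band
FROM orders;
"""

def order_band(amount, region):
    # Same logic as the CASE above, written as if-then-else.
    if amount >= 1000 and region == "EU":
        return "large-eu"
    if amount >= 1000:
        return "large"
    return "standard"
```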

Streams In Snowflake – Beginner Guide

This article walks you through the basics of streams in Snowflake. We start with what they are for and move on to more complex concepts, such as their architecture and how they work. We’ll also dive into what you need to know to get started. Let’s get into it. In Snowflake, streams are … Read more
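To preview the lifecycle: you create a stream on a table, and querying it returns the changed rows along with metadata columns (`METADATA$ACTION`, `METADATA$ISUPDATE`, `METADATA$ROW_ID`); consuming the stream in a DML statement advances its offset. A sketch with hypothetical table names:

```python
# SQL held as a string for illustration; run it in a Snowflake session.
STREAM_SQL = """
CREATE STREAM orders_stream ON TABLE orders;

-- Reading the stream returns change rows plus metadata columns.
-- Using it inside DML consumes the changes and advances the offset.
INSERT INTO orders_audit
SELECT order_id, METADATA$ACTION
FROM orders_stream;
"""
```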

Get Started With Python And Snowflake – Beginner Guide

This tutorial gets you started with creating scripts to read and write data in Snowflake. You will learn how to do this step by step. When you’ve mastered these steps (they’re quite simple), I’ll go into several ways to secure the account details you use for the connection. You can optionally create a virtual environment for practicing. If you want to follow … Read more
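On the point about securing account details, a common first step is reading them from environment variables instead of hard-coding them. The `snowflake-connector-python` call is shown in a comment because the package and a live account are needed to run it; the variable names are a convention, not a requirement:

```python
import os

def connection_params():
    """Collect Snowflake credentials from the environment (empty if unset)."""
    return {
        "account":  os.environ.get("SNOWFLAKE_ACCOUNT", ""),
        "user":     os.environ.get("SNOWFLAKE_USER", ""),
        "password": os.environ.get("SNOWFLAKE_PASSWORD", ""),
    }

# With the package installed (pip install snowflake-connector-python):
#   import snowflake.connector
#   conn = snowflake.connector.connect(**connection_params())
```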