Why is this workshop relevant?

Companies today struggle with the complexity of managing data across data lakes and data warehouses. Extensive data silos make it hard to establish a single source of truth, and maintaining many separate data pipelines is costly. One answer to these complexities is the Data Lakehouse. This platform architecture implements data structures and data management capabilities like those found in a data warehouse directly on top of low-cost, flexible cloud data lake storage. By combining the extensibility of data lakes with the structure required for analytics and BI, a Data Lakehouse provides the best of both worlds.

Objectives

After the workshop, you will understand the concepts and purpose of a Data Lakehouse, how it differs from a data warehouse, and the key technological advancements, such as Delta Lake, that have enabled it. You will be able to assess whether your organization could or should implement this open data management architecture and what benefits it would bring. You will also be able to set up your own Data Lakehouse in the public cloud and connect it to a BI report, putting theory into practice. Along the way, you will learn how the cloud can give you access to services for data science, machine learning, and business analytics projects. In the end, we want you to leave the workshop with a new tool in your toolbelt and a sandbox environment in which to create your very own data-driven solutions.

What does this architecture provide?

  • A unified platform for data ingestion, data storage, and data analysis.
  • Easy storage, access, and analysis of large amounts of structured and unstructured data.
  • Reduced cost of data storage and access.
  • Scalability to accommodate future data growth and usage.

Target audience

We recommend a basic understanding of data architectures and, ideally, some experience with cloud platforms. The target audience includes all data professionals who want to learn how to optimize their data architecture: data scientists, data engineers, and data analysts, as well as leaders and managers of data teams. A Data Lakehouse can be implemented in organizations both small and large, whether you work in a private company or in the public sector.
Note that this technology is not vendor specific: it can be implemented using services from providers including, but not limited to, Microsoft, Amazon, Google, Snowflake, and Databricks. The only equipment you need is a laptop with internet access.

Agenda

13:00 – 13:45 Data Lakehouse Framework
  • Motivation
  • Comparison to Data Warehouse
  • Requirements & Providers
  • Delta Lake – a key technology enabling Data Lakehouses
  • Build Overview
  • Data Lakehouse use cases
13:45 – 14:30 Data Lakehouse Architecture 1
  • Introduction
  • Provisioning the resources
  • Building & testing the Delta Lake tables
14:30 – 14:45 Coffee Break
14:45 – 15:45 Data Lakehouse Architecture 2
  • Building the Data Lakehouse layer
  • Running a BI Report on the Data Lakehouse
15:45 – 16:00 Deep Dive
  • Pros and Cons
  • Takeaways and next steps