About this Event
Citizen Data Engineer: Tiptoeing using Python for Data Engineering
Intended Audience: Citizen Data Engineer, BI Developers, Data Analysts, Analytics Engineers
Level: 200-300 (Intermediate)
Low Code Data transformation, Data modeling, Data analysis, SQL
Duration: 9:00 to 17:00 (including a 60-minute lunch and two 15-minute coffee breaks)
Trainer: Tom Martens (Microsoft Data Platform MVP, Speaker, Book Author, Datamonster)
Prerequisites (some are soft, one is hard):
· Basic understanding of data modeling (star schema or dimensional modeling)
· Ability to access your Fabric-enabled workspace in your Power BI/Fabric tenant from your laptop, so that you can create and query Fabric items such as lakehouses, delta parquet tables, and semantic models. The Fabric-enabled workspace can reside in either a corporate or a private Fabric tenant.
· If you do not have access to your own Fabric-enabled workspace, one will be provided by datamonster e.V.
· Bring your own laptop (no special software is required); Wi-Fi access is a must
· First experience with programming languages like R or Python (helpful but not required)
Course Outcome:
· Create and execute notebooks inside Microsoft Fabric
· Use Python to transform and shape data
· Use Python to write to and read from delta tables
· Use Python to move data to the various stages of a lakehouse
· Understand the benefits of a medallion architecture
Abstract
This course shows how to tackle data engineering tasks using Python and notebooks within Microsoft Fabric. In addition to an introduction to Python, it explains modern lakehouse architecture. You'll be introduced to the skills necessary to succeed as a data engineer and learn how to apply them in practice using Microsoft Fabric.
The course includes practical examples that any Citizen Data Engineer can apply immediately in their day-to-day job, along with many tips aimed specifically at typical data engineering tasks.
Although Microsoft Fabric is used during the course, the approach of using Python with Spark for data engineering carries over to many other platforms that harness the power of Spark.
Agenda
1. Introduction
a. Course Overview
b. Goals and outcomes
c. Setting the expectations
2. Microsoft Fabric workspaces and notebooks
a. What makes a Fabric-enabled workspace unique: about Spark runtimes and environments
b. Introduction to notebooks
3. The relationship of lakehouses and notebooks
a. Creating a lakehouse
b. The default lakehouse of a notebook
4. Adding data to the lakehouse
a. Understanding methods to ingest data to the lakehouse
b. Inspecting data located in the lakehouse
5. Introduction to Python
a. What is Python
b. Python Variables
c. Python Statements
d. Python Control Structures
e. Python Data Structures
6. Introduction to the Delta Parquet format
a. What is the Delta Parquet file format
b. What is a Spark dataframe
7. Creating delta parquet tables using PySpark
a. What is PySpark
b. Writing data as a delta table
8. What is a delta table, and why is it crucial to modern lakehouses
a. Modern Lakehouse and the Medallion architecture
b. Delta tables in the context of Spark and RDDs
c. A Spark dataframe is not the same as a Pandas dataframe
9. After the concepts: shaping data using PySpark
a. Shaping data using PySpark, including method chaining and user-defined functions
b. Introduction to the medallion architecture
c. Using the UPSERT method to move data between the different stages of a medallion architecture
10. Create a semantic model using Direct Lake
a. Understand different storage modes (Import, DirectQuery, Direct Lake)
b. Default semantic model vs custom semantic models
11. Inspecting a semantic model using Sempy
a. What is Sempy
b. Use Sempy to inspect a semantic model
12. Orchestration of notebooks
a. Automate notebook execution
b. Data pipelines and other methods
13. A short quiz and closing
a. Quiz
b. Closing
Parking
Company parking at oh22information services GmbH
- Number: 12 company-owned parking spaces, with 2 EV charging stations
Alternative parking:
Parkplatz am Palastweiher
- Address: Palastweiher, 53639 Königswinter
- Distance: approx. 8-minute walk to Kellerstraße 3
- Cost: €2.00/hour, €7.00 daily maximum
- Maximum stay: no limit
Parkplatz Drachenfelsstraße
- Address: Drachenfelsstraße, 53639 Königswinter
- Distance: approx. 10-minute walk to Kellerstraße 3
- Cost: €3.50/hour, €7.00 daily maximum
- Maximum stay: no limit
Parkhaus Bahnhof
- Address: Bahnhofstraße 17, 53639 Königswinter
- Distance: approx. 9-minute walk to Kellerstraße 3
- Cost: €1.70/hour, €7.00 daily maximum
- Maximum stay: no limit
Parkhaus Maritim Hotel
- Address: Hauptstraße 495, 53639 Königswinter
- Distance: approx. 6-minute walk to Kellerstraße 3
- Cost: €2.50/hour (1st to 10th hour), €27.00 daily maximum
- Maximum stay: no limit
Getting there by public transport
- From Siegburg:
Tram line U66 toward Bad Honnef to the "Königswinter Fähre" stop (travel time approx. 50-60 minutes), then a 300 m walk to Kellerstraße 3.
- From Bonn Hbf:
Tram line U66 toward Bad Honnef to the "Königswinter Fähre" stop (travel time approx. 30 minutes), or regional express (RE8) to Königswinter station (walk of about 10-15 minutes).
- From Cologne:
Direct regional trains (RE8) to Königswinter (travel time approx. 45 minutes); the walk from the station is likewise 10-15 minutes. Alternatively: ICE to Bonn Hbf, then change to the U66.
- From Koblenz:
Regional express RE8 to Königswinter (travel time approx. 50 minutes), then a 10-15 minute walk from the station.
Hotel recommendations
Maritim Hotel Königswinter
- Address: Rheinallee 3, 53639 Königswinter
- Distance: approx. 500 m to Kellerstraße 3
Storyhotel Bergischer Hof Königswinter
- Address: Drachenfelsstraße 33, 53639 Königswinter
- Distance: approx. 190 m to Kellerstraße 3
Das Hotel Krone
- Address: Hauptstraße 374, 53639 Königswinter
- Distance: approx. 150 m to Kellerstraße 3
Event Venue
oh22information services GmbH, Kellerstraße 3, Königswinter, Germany
EUR 376.56 to EUR 502.11