Batch Processing: Module 5 of the Data Engineering Zoomcamp 2024
ãndré just completed Module 5 of the Data Engineering Zoomcamp, an online course organized by DataTalksClub on GitHub.
This module covers batch processing. Its topics include:
Introduction to batch processing
First look at Spark
Working with Group By and Joins in Spark
Spark's DataFrames, Actions, Transformations, and UDFs
Exploring the PySpark API
Big thanks to the DataTalksClub team, especially the instructor, Alexey Grigorev!
#dezoomcamp #dataengineering #learn_in_public
