This repository contains code examples from the book Spark: The Definitive Guide by Chambers and Zaharia. The book sometimes skips preliminary steps that the code needs in order to run, so these examples fill in those steps and give readers a complete, working set of code to follow as they read and learn Spark.
These examples are run with PySpark 3.1.1 (installed via Conda) and Java 1.8. To set up PySpark correctly on Ubuntu 18.04, follow the instructions at: https://github.com/bspivey/Spark-The-Definitive-Guide/blob/main/README.md
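
As a quick check that a local environment matches the versions above, a small script along these lines may help. It is only a sketch and is not part of the book or of this repository: the function names (`parse_java_version`, `installed_pyspark_version`, `installed_java_version`) are hypothetical, and it assumes Python 3.8 or newer (for `importlib.metadata`) and that `java` is on the PATH.

```python
import importlib.metadata
import re
import shutil
import subprocess

EXPECTED_PYSPARK = "3.1.1"    # PySpark version stated in this README
EXPECTED_JAVA_PREFIX = "1.8"  # Java 8 reports its version as 1.8.x


def parse_java_version(banner):
    """Return the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', banner)
    return match.group(1) if match else None


def installed_pyspark_version():
    """Return the installed PySpark version, or None if it is not installed."""
    try:
        return importlib.metadata.version("pyspark")
    except importlib.metadata.PackageNotFoundError:
        return None


def installed_java_version():
    """Return the version of the `java` found on PATH, or None if absent."""
    if shutil.which("java") is None:
        return None
    # `java -version` writes its banner to stderr rather than stdout.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_version(result.stderr)


if __name__ == "__main__":
    pyspark_version = installed_pyspark_version()
    java_version = installed_java_version()
    print("PySpark:", pyspark_version, "(expected", EXPECTED_PYSPARK + ")")
    print("Java:", java_version, "(expected", EXPECTED_JAVA_PREFIX + ".x)")
    if pyspark_version != EXPECTED_PYSPARK:
        print("Warning: PySpark version differs from the one used here.")
    if java_version is None or not java_version.startswith(EXPECTED_JAVA_PREFIX):
        print("Warning: Java version differs from the one used here.")
```

A version mismatch does not necessarily mean the examples will fail; the script only reports whether the environment differs from the one described here.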