`docs/using/specifics.md` (7 additions, 31 deletions)
You can build a `pyspark-notebook` image (and also the downstream `all-spark-notebook` image) with a different version of Spark by overriding the default value of the following arguments at build time:

* `spark_version`: The Spark version to install (`3.0.0`).
* `hadoop_version`: The Hadoop version (`3.2`).
* `spark_checksum`: The package checksum (`BFE4540...`).
* Spark can run with different OpenJDK versions.
  * `openjdk_version`: The version of the OpenJDK (JRE headless) distribution (`11`), see [Ubuntu packages](https://packages.ubuntu.com/search?keywords=openjdk).
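The arguments above can be overridden with `--build-arg` flags at build time. A minimal sketch, assuming the image is built from the root of a `jupyter/docker-stacks` checkout; the tag name is an arbitrary choice, and the checksum placeholder must be replaced with the value published for that release (both are assumptions, not values from this document):

```shell
# Hypothetical build command overriding the default arguments.
# <checksum> is a placeholder: use the checksum published on the
# Apache Spark download page for the chosen release.
docker build --rm --force-rm \
    -t jupyter/pyspark-notebook:spark-3.0.0 ./pyspark-notebook \
    --build-arg spark_version=3.0.0 \
    --build-arg hadoop_version=3.2 \
    --build-arg spark_checksum=<checksum> \
    --build-arg openjdk_version=11
```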
For example, here is how to build a `pyspark-notebook` image with Spark `2.4.7` and verify the installed version:

```bash
# Check the Spark version
docker run -it --rm jupyter/pyspark-notebook:spark-2.4.7 pyspark --version

# Welcome to
#       ____              __
#      / __/__  ___ _____/ /__
#     _\ \/ _ \/ _ `/ __/  '_/
#    /___/ .__/\_,_/_/ /_/\_\   version 2.4.7
#       /_/
#
# Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_275
```

### Usage Examples
The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support the use of [Apache Spark](https://spark.apache.org/) in Python, R, and Scala notebooks. The following sections provide some examples of how to get started using them.
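As a starting point for the examples, either image can be launched like any other docker-stacks image, with the notebook server exposed on port 8888 (an illustrative command, not taken from this document):

```shell
# Start a disposable container and expose the notebook server on port 8888
docker run -it --rm -p 8888:8888 jupyter/pyspark-notebook
```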