Spark on the Linux Cluster

To start an interactive Spark shell, run:


$ module use -a /lrz/sys/share/modules/extfiles
$ module load python    # or: module load python/3.5_intel
$ module load spark
$ pyspark
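Once the shell prompt appears, a short computation can confirm that Spark is working. The following is a minimal sketch, not part of the original instructions: it relies on the SparkContext that the pyspark shell provides as `sc`, and the function names `sum_of_squares` and `expected_sum_of_squares` are illustrative.

```python
# Paste into the running pyspark shell. `sc` is the SparkContext that the
# pyspark shell creates at startup; the function names are illustrative.

def sum_of_squares(sc, n):
    """Sum i*i for i in [0, n) with an RDD on the given SparkContext."""
    return sc.parallelize(range(n)).map(lambda i: i * i).sum()


def expected_sum_of_squares(n):
    """Closed form of the same sum, for cross-checking the Spark result."""
    return (n - 1) * n * (2 * n - 1) // 6

# In the shell, the two values should agree:
#   sum_of_squares(sc, 1000) == expected_sum_of_squares(1000)
```

If the comparison evaluates to True, the shell is able to distribute work and collect results.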



PySpark in a Jupyter Notebook


$ module use -a /lrz/sys/share/modules/extfiles
$ module load python    # or: module load python/3.5_intel
$ module load spark
$ export PYSPARK_DRIVER_PYTHON=jupyter
$ export PYSPARK_DRIVER_PYTHON_OPTS='notebook'
$ pyspark


Because PYSPARK_DRIVER_PYTHON is set to jupyter, the pyspark command should now start a Jupyter Notebook in your web browser instead of the interactive shell. To create a new notebook, click ‘New’ > ‘Notebooks Python [default]’.
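
The two export lines only need to be visible to the pyspark process itself. As a hedged sketch, the same setup could be done from a small Python launcher that leaves the login shell's environment unchanged. It assumes the modules above are already loaded so that `pyspark` is on the PATH; the function names `jupyter_driver_env` and `launch_notebook` are hypothetical.

```python
import os
import subprocess


def jupyter_driver_env(base_env=None):
    """Return a copy of the environment with Jupyter set as the PySpark driver."""
    base = os.environ if base_env is None else base_env
    env = dict(base)
    env["PYSPARK_DRIVER_PYTHON"] = "jupyter"
    env["PYSPARK_DRIVER_PYTHON_OPTS"] = "notebook"
    return env


def launch_notebook():
    """Start pyspark with the Jupyter driver settings.

    Assumes `pyspark` is on PATH, i.e. the spark module has been loaded.
    """
    return subprocess.run(["pyspark"], env=jupyter_driver_env(), check=True)
```

Calling `launch_notebook()` would then be equivalent to running the two export commands followed by pyspark, with the variables applied only to that one process.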