Monitor Apache Spark from Jupyter Notebook
For the Google Summer of Code final report of this project, click here.
SparkMonitor is an extension for Jupyter Notebook that enables live monitoring of Apache Spark jobs spawned from a notebook. The extension provides several features to monitor and debug a Spark job from within the notebook interface itself.
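As a minimal sketch of what gets monitored (assuming a standard PySpark setup; the cell itself needs no SparkMonitor-specific code, since the extension hooks into the kernel), any job run through a SparkContext created in the notebook appears in the extension's live display. The fallback branch below is purely illustrative, for environments without pyspark:

```python
try:
    from pyspark import SparkContext

    # Jobs submitted through this context are picked up by SparkMonitor.
    sc = SparkContext.getOrCreate()
    result = sc.parallelize(range(100)).map(lambda x: x * x).sum()
    sc.stop()
except ImportError:
    # pyspark not installed: compute the same value locally for illustration.
    result = sum(x * x for x in range(100))

print(result)
```

Running a cell like this in a notebook with the extension enabled shows the job's stages and tasks as they execute.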
Installation:

```bash
pip install sparkmonitor
jupyter nbextension install sparkmonitor --py --user --symlink
jupyter nbextension enable sparkmonitor --py --user
jupyter serverextension enable --py --user sparkmonitor
ipython profile create && echo "c.InteractiveShellApp.extensions.append('sparkmonitor.kernelextension')" >> $(ipython profile locate default)/ipython_kernel_config.py
```
To try it out in a Docker container:

```bash
docker run -it -p 8888:8888 krishnanr/sparkmonitor
```
At CERN, the SparkMonitor extension would find two main use cases:

* Distributed analysis with ROOT and Apache Spark using the DistROOT module. Here is an example demonstrating this use case.
* Integration with SWAN, a service for web-based analysis, via a modified container image for SWAN user sessions.