Project Jupyter created an indispensable tool for every data scientist, one that finally made exploring and visualizing data fun: Jupyter Notebooks. Although the Notebooks app works great, the project has built a new, more powerful and extensible front end: JupyterLab.
Tell me, where does it hurt?
Finally, I wanted to make the switch and tried to install JupyterLab 1.0.4 on a Vagrant box (ubuntu/bionic64). Until that point it was all fun and games. As recommended in the official docs, I went with Anaconda to install the various required Python packages, as I didn’t want to drown in a hot lava sea down in dependency hell. In the past, I had good experiences installing Jupyter Notebook using Anaconda. However, I couldn’t get JupyterLab installed with Anaconda this time and wasted several hours scratching my head. Let me give you some impressions:
- Packages I like to use in notebooks (such as plotly for data visualization) ended up in different virtual environments than the actual jupyter package, although I never specified a particular environment name. Python virtual environments are still something of a book with seven seals for me.
- pandas complained in a notebook about old versions of numpy. I had hoped that Anaconda automatically took care of this matter 🙁
- conda warned me about being out of date (although I had installed the latest version available from the Anaconda website). Trying to update it as documented gave me error messages for which my intensive online research turned up no solution.
I’ll be honest: I’m probably simply too stupid to use Anaconda. In my experience, widely used software has rarely turned out to be buggy when I had issues with it; more often the bug was sitting in front of my computer. In fact, Anaconda can install not only Python packages but also the non-Python packages that many Python packages need to work properly. However, after trying very hard for several hours, I chose to take another approach.
Running the installation with pip3
I took the easiest route on the map and installed all my packages with the Python 3 package manager pip, more specifically pip3. One immediate observation was a much quicker download and installation process; it felt orders of magnitude faster. On the other hand, you really need to know which packages you have to install on Ubuntu and for your local Python 3 installation. An article from idroot.us served as a guideline for this setup process.
The commands shown here have been tested on an Ubuntu 18.04 Vagrant box, but should work on other Ubuntu systems as well.
First, we need to install a set of packages on the operating system level to enable the usage of Python 3.
sudo apt update
sudo apt install -y \
    python3 \
    python3-pip \
    python3-dev \
    curl
Running installations with apt on Ubuntu and Debian requires root permissions, so do not forget to use sudo. As not all flavors of Ubuntu out there are fully equipped with all kinds of tools, I added curl to the list of packages to be installed. We’ll need this one later.
Popular packages for Data Science purposes
Working with Notebooks in JupyterLab doesn’t make much sense without some prominent libraries from the Data Science community. These will be installed using the now available pip3:
pip3 install pandas \
    fastparquet \
    pyarrow \
    tables \
    plotly \
    seaborn \
    xlrd
pandas is the most-used library for data analysis in Python.
fastparquet and pyarrow are packages that allow you to persist your raw or processed data to disk in compressed formats that can be reloaded into memory very fast. I personally prefer the HDF format (powered by the tables package), as it is very tolerant of the data types in your data frames and is also amazingly quick at loading gigabytes of data.
plotly and seaborn are great options for visualizing data in your notebook, as well as for interactive plots to be used in presentations. When you install seaborn, another, more basic visualization library (matplotlib) is installed as a dependency; pip3 takes care of that.
xlrd lets you use pandas to read Excel sheets. Pretty useful.
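Before moving on, it can be worth confirming that Python actually finds the freshly installed packages. The following is a minimal sketch, not part of the original setup: check_modules is a hypothetical helper name, and it assumes python3 is on your PATH.

```shell
# Hypothetical helper: report whether each given Python module can be imported.
# Returns 0 if all modules import cleanly, 1 if at least one is missing.
check_modules() {
    missing=0
    for mod in "$@"; do
        if python3 -c "import ${mod}" 2>/dev/null; then
            echo "ok:      ${mod}"
        else
            echo "MISSING: ${mod}"
            missing=1
        fi
    done
    return "$missing"
}
```

For the packages installed above, the call would be: check_modules pandas fastparquet pyarrow tables plotly seaborn xlrd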
JupyterLab requires Jupyter Core to be installed. One part of it is installed on the operating system level, the other one again using pip3:
sudo apt install -y ipython
pip3 install jupyter jupyterlab
In the upcoming steps we will want to use the CLI of the newly installed jupyter. For this purpose we need to make the path of the CLI known to our shell:
export PATH=$PATH:~/.local/bin
source ~/.bashrc
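Note that an export typed into a terminal only lasts for that shell session. If you want the setting to survive a new login, one option is to append it to ~/.bashrc, but only once, so that repeated runs of a setup script do not pile up duplicate lines. A small sketch of that idea (append_once is a hypothetical helper name, not something Jupyter or Ubuntu provides):

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
# $1: the line to add, $2: the file to add it to
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -q: quiet, -x: match whole lines only, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Usage would look like this: append_once 'export PATH="$PATH:$HOME/.local/bin"' ~/.bashrc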
In comparison to the established Notebooks app, JupyterLab has a redesigned component that allows developers to build extensions for JupyterLab and make data analysis even more fun. Installing extensions requires a supported version of NodeJS. We’ll go with the LTS release.
curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.8/install.sh | bash
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"
nvm install --lts
Although JupyterLab is fairly new, its community has already implemented some nice extensions, which we can now install using the jupyter labextension install command:
jupyter labextension install \
    @jupyterlab/toc \
    jupyterlab-chart-editor \
    jupyterlab-spreadsheet
@jupyterlab/toc displays a clickable table of contents in JupyterLab’s sidebar, which is very useful in longer notebooks.
@jupyterlab/plotly-extension is required for Plotly charts to work properly in JupyterLab. That was one obstacle that kept me from switching to JupyterLab until now.
jupyterlab-chart-editor allows a WYSIWYG-like manipulation of charts within the graph.
jupyterlab-spreadsheet makes it possible to display Excel sheets directly in the JupyterLab canvas.
You can display all installed extensions using the command:
jupyter labextension list
UPDATE (2019-01-29): I had to remove the extension @jupyterlab/plotly-extension from my list of extensions to install, so I also updated it in this post. The extension caused the setup process to stall indefinitely (or at least so long that I lost my patience :-)). It also looks like this extension was deprecated. Apparently there is a way to get Plotly set up in JupyterLab, and according to this GitHub Issue it should work, but I haven’t tested it myself yet.
Run JupyterLab as a system service
Now, we will register the command to start JupyterLab as a Linux systemd service.
CAUTION: The procedure shown below will configure JupyterLab to run listening on all network interfaces and without any authentication! Also, it will only be possible to connect to JupyterLab using HTTP. HTTPS would require setting up certificates. If you are looking for instructions to secure your Jupyter installation with Let’s Encrypt, check out this blog post.
# Define home directory and data directory (adjust to your needs)
USER_HOME_DIR=$(echo ~)
DATA_DIR="/mydata"
# Create the directory for user-level systemd unit files
mkdir -p ~/.config/systemd/user/
# Create the service by echoing the text into a service unit file
echo "[Unit]
Description=JupyterLab

[Service]
ExecStart=$USER_HOME_DIR/.local/bin/jupyter lab --no-browser --port=8888 --ip=0.0.0.0 --NotebookApp.token= --notebook-dir=$DATA_DIR
WorkingDirectory=$DATA_DIR

[Install]
WantedBy=default.target" > ~/.config/systemd/user/jupyterlab.service
The above snippet will create the service definition. You may want to adjust the value of DATA_DIR. It points to the path on your Ubuntu box to which JupyterLab will have access. Ideally this is where all your existing notebooks and data sets reside. Make sure the directory exists before you start the service, as the snippet does not create it.
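Since the unit file uses the data directory as both notebook directory and working directory, the service is unlikely to start if that path is missing or not writable for your user. A quick pre-flight check could look like the sketch below; check_data_dir is a hypothetical helper name introduced only for illustration.

```shell
# Hypothetical helper: verify that a directory exists and is writable
# by the current user. Prints a short status line and returns 0 on success.
check_data_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir (create it first, e.g. with mkdir -p)" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir" >&2
        return 1
    fi
    echo "ok: $dir"
}
```

With the value from the snippet above, you would run: check_data_dir /mydata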
To set a password for accessing JupyterLab, run the command jupyter notebook password and enter the password when prompted.
To enable the service, we will use systemctl --user enable. As we are setting the service up as a user-specific service, we also need to run loginctl enable-linger. Otherwise the service would automatically be stopped as soon as there are no more open sessions for our current user:
systemctl --user enable jupyterlab.service
loginctl enable-linger
Here we go!
Let’s start up JupyterLab!
systemctl --user start jupyterlab
A few seconds later you should be able to open JupyterLab on:
http://<IP or hostname>:8888
If you experience any issues here, check the status of the service to get more details:
systemctl --user status jupyterlab
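If you script the setup, you may want to wait until JupyterLab actually accepts connections before moving on. The sketch below polls a TCP port once per second. It is only an illustration: wait_for_port is a hypothetical helper name, and it relies on bash's built-in /dev/tcp redirection, so it needs to run in bash rather than a plain sh.

```shell
# Hypothetical helper: wait until a TCP port accepts connections.
# Relies on bash's built-in /dev/tcp redirection, so run it with bash.
# $1: host, $2: port, $3: maximum number of attempts (default: 30)
wait_for_port() {
    host="$1"
    port="$2"
    tries="${3:-30}"
    attempt=0
    while [ "$attempt" -lt "$tries" ]; do
        # The subshell opens a connection and closes it again when it exits.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep 1
    done
    return 1
}
```

For the service defined in this post, wait_for_port localhost 8888 should return successfully once JupyterLab is listening; if it gives up, the status command above is the place to look next.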
Now that you’ve launched your spaceship towards JupyterLab, I wish you luck working on your notebooks. Live long and prosper! 🙂
Image sources for this page:
- Jupiter with Io: pixabay.com