Setting Up JupyterLab on Ubuntu Using pip3

This post explains how to install and configure JupyterLab on Ubuntu Linux. Do not miss out on one of the greatest tools for working with data.

Project Jupyter created an indispensable tool for every Data Scientist that finally made exploring and visualizing data a fun thing to do: Jupyter Notebooks. Although the Notebooks app works great, they built a new and more powerful and extensible front end: JupyterLab.

Tell me, where does it hurt?

Finally, I wanted to make the switch and tried to install JupyterLab 1.0.4 on a Vagrant box (ubuntu/bionic64). Until that point it was all fun and games. As recommended in the official docs, I went with Anaconda to install the various required Python packages, as I didn’t want to drown in a hot lava sea down in dependency hell. In the past, I had good experiences installing Jupyter Notebook using Anaconda. However, I couldn’t get JupyterLab installed with Anaconda this time and wasted several hours scratching my head. Let me give you some impressions:

  • Packages I like to use in notebooks (such as plotly for data visualization) ended up in other virtual environments than the actual jupyter package, although I never specified a particular environment name. Python virtual environments are still some kind of a book of seven seals for me.
  • Importing pandas in a notebook gave me complaints about old versions of numpy. I had hoped that Anaconda would automatically take care of this 🙁
  • conda warned me about being out of date (although I installed the latest version available from the Anaconda website). Trying to update it as documented gave me error messages for which my intensive online research didn’t deliver any results to solve the problem.

I’ll be honest: I’m probably simply too stupid to use Anaconda. I have found that widely used software is rarely the buggy part when I have issues with it – more often the bug is sitting in front of my computer. In fact, Anaconda can install not only Python packages but also the non-Python packages that many Python packages need to work properly. However, after trying very hard for several hours, I chose to take another approach.

Running the installation with pip3

I took the easiest route on the map and installed all my packages with pip, the Python package manager, more specifically pip3 for Python 3. One immediate observation was a much quicker download and installation process – it felt orders of magnitude faster. On the other hand, you really need to know which packages you need to install on Ubuntu and for your local Python 3 installation. An article from idroot.us served as a guideline to this setup process.

The commands shown here have been tested on an Ubuntu 18.04 Vagrant box, but should work on other Ubuntu systems as well.

Python 3

First, we need to install a set of packages on the operating system level to enable the usage of Python 3.

sudo apt update
sudo apt install -y \
   python3 \
   python3-pip \
   python3-dev \
   curl

Running installations with apt on Ubuntu and Debian requires root permissions. Because of this, do not forget to use sudo. As not all flavors of Ubuntu out there are always fully equipped with all kinds of tools, I added curl to the list of packages to be installed. We’ll need this one later.
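Before moving on, it may be worth confirming that the tools really ended up on your PATH. The following is only a minimal sketch; the helper name have is my own invention and not part of any package:

```shell
# Hypothetical helper: succeeds if the given command can be found on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Report each tool installed above as found or missing.
for tool in python3 pip3 curl; do
    if have "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: missing"
    fi
done
```

If one of the tools is reported as missing, re-running the apt command above is probably the quickest fix.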

Popular packages for Data Science purposes

Working with Notebooks in JupyterLab doesn’t make any sense without some prominent libraries from the Data Science community. These will be installed using the now available pip3 command:

pip3 install \
   pandas \
   fastparquet \
   pyarrow \
   tables \
   plotly \
   seaborn \
   xlrd

While pandas is the most-used library for data analysis in Python, fastparquet and pyarrow are packages that will allow you to persist your raw or processed data to disk into compressed formats which can be reloaded into memory very fast. I personally prefer the HDF format (fueled by the tables package), as it is very tolerant about the data types in your data frames and as it is also amazingly quick when loading Gigabytes of data.

plotly and seaborn are great options to visualize data in your notebook, as well as for interactive plots to be used in presentations. Installing seaborn also pulls in matplotlib, a more basic visualization library, as a dependency – pip3 will take care of that.

xlrd lets you use pandas to read Excel sheets. Pretty useful.
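To see whether all of these libraries can actually be imported (this is also where version conflicts like the numpy complaint mentioned earlier would show up), a small loop can help. This is a sketch under the assumption that python3 is the interpreter pip3 installed into; check_imports is a hypothetical helper name:

```shell
# Hypothetical helper: try to import each module with python3 and report the result.
# The import names happen to match the pip package names used above.
check_imports() {
    for module in "$@"; do
        if python3 -c "import $module" >/dev/null 2>&1; then
            echo "$module: ok"
        else
            echo "$module: import failed"
        fi
    done
}

check_imports pandas fastparquet pyarrow tables plotly seaborn xlrd
```

For any module that fails, running python3 -c "import <module>" by hand shows the full error message.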

JupyterLab

JupyterLab requires Jupyter Core to be installed. One part of it is installed on the operating system level, the other one again using pip3:

sudo apt install -y ipython3
pip3 install jupyter jupyterlab

In the upcoming steps we will want to use the CLI of the newly installed jupyter. For this purpose we need to make the path of the CLI known to our shell:

echo 'export PATH="$PATH:$HOME/.local/bin"' >> ~/.bashrc
source ~/.bashrc
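If you run the setup more than once, the same directory can end up on your PATH several times. A small guard avoids that. This is just a sketch; add_to_path is a hypothetical helper, not something Ubuntu ships:

```shell
# Hypothetical helper: append a directory to PATH only if it is not already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$PATH:$1" ;;      # not present yet, append it
    esac
}

add_to_path "$HOME/.local/bin"
export PATH
```

Wrapping PATH in colons before matching makes sure only whole entries are compared, so /home/me/.local/bin2 is not mistaken for /home/me/.local/bin.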

JupyterLab Extensions

In comparison to the established Notebooks app, JupyterLab has a redesigned extension system that allows developers to build extensions to make data analysis even more fun. Installing extensions requires a supported version of NodeJS. We’ll go with the LTS release.

curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.8/install.sh | bash
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"
nvm install --lts

Although being fairly new, the community around JupyterLab has already implemented some nice extensions, which we can now install using the jupyter labextension install command (be patient, this can take some time …):

jupyter labextension install \
   @jupyterlab/toc \
   jupyterlab-chart-editor \
   jupyterlab-spreadsheet

  • @jupyterlab/toc displays a clickable table of contents in JupyterLab’s sidebar, which is very useful in longer notebooks.
  • jupyterlab-chart-editor allows WYSIWYG-like editing of charts directly within the graph.
  • jupyterlab-spreadsheet makes it possible to display Excel sheets directly in the JupyterLab canvas.
  • @jupyterlab/plotly-extension was required for Plotly charts to work properly in JupyterLab, an obstacle that had kept me from switching to JupyterLab until now. It is no longer part of the command above (see the update below).

You can display all installed extensions using the command: jupyter labextension list

UPDATE (2019-01-29): I had to remove the extension @jupyterlab/plotly-extension from my list of extensions to install, so I also updated this post accordingly. The extension caused the setup process to stall indefinitely (or at least for so long that I lost my patience :-)). It also looks like this extension has been deprecated. Apparently there is a way to get Plotly set up in JupyterLab, and according to this GitHub Issue it should work, but I haven’t tested it myself yet.

Run JupyterLab as a system service

Now, we will register the command to start JupyterLab as a Linux systemd unit.

CAUTION: The procedure shown below will configure JupyterLab to run listening on all network interfaces and without any authentication! Also, it will only be possible to connect to JupyterLab using HTTP. HTTPS would require setting up certificates. If you are looking for instructions to secure your Jupyter installation with Let’s Encrypt, check out this blog post.

Define home directory and data directory (adjust to your needs):

USER_HOME_DIR="$HOME"
DATA_DIR="/mydata"

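Create the directory for user-level systemd unit files (the data directory itself must already exist and be accessible to your user):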

mkdir -p ~/.config/systemd/user/

Create the service by echoing the text into a service unit file:

echo "[Unit]
Description=JupyterLab
[Service]
ExecStart=$USER_HOME_DIR/.local/bin/jupyter lab --no-browser --port=8888 --ip=0.0.0.0 --NotebookApp.token= --notebook-dir=$DATA_DIR
WorkingDirectory=$DATA_DIR
[Install]
WantedBy=default.target" > ~/.config/systemd/user/jupyterlab.service

The above snippet will create the service definition. You may want to adjust the value DATA_DIR. It points to the path on your Ubuntu box, to which JupyterLab will have access. Ideally this is where all your existing notebooks and data sets are residing.
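The same unit text can also be generated with a here-document, which some people find easier to read than one long quoted echo. The sketch below writes to a scratch file first so you can inspect the result before copying it to ~/.config/systemd/user/jupyterlab.service. The function name write_jupyterlab_unit is hypothetical; the unit contents are the ones shown above:

```shell
# Hypothetical helper: write the unit text to the file given as $1,
# using $2 as the home directory and $3 as the data directory.
write_jupyterlab_unit() {
    cat > "$1" <<EOF
[Unit]
Description=JupyterLab
[Service]
ExecStart=$2/.local/bin/jupyter lab --no-browser --port=8888 --ip=0.0.0.0 --NotebookApp.token= --notebook-dir=$3
WorkingDirectory=$3
[Install]
WantedBy=default.target
EOF
}

# Write to a scratch file and print it for inspection.
unit_file="$(mktemp)"
write_jupyterlab_unit "$unit_file" "$HOME" "/mydata"
cat "$unit_file"
```

Passing the directories as arguments keeps the helper independent of which shell variables happen to be set.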

To set a password for accessing JupyterLab, run jupyter notebook password and enter the password when prompted.

To enable the service, we will use systemctl --user enable. As we are setting the service up as a user-specific service, we also need to run loginctl enable-linger. Otherwise the service would be stopped automatically as soon as our user has no open sessions left:

systemctl --user enable jupyterlab.service
loginctl enable-linger
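To double-check that lingering was really switched on, you can ask loginctl for the Linger property of your user. A minimal sketch, assuming a systemd-based system where loginctl is available; linger_enabled is a hypothetical helper name:

```shell
# Hypothetical helper: succeeds only if loginctl reports Linger=yes for the given user.
linger_enabled() {
    [ "$(loginctl show-user "$1" --property=Linger 2>/dev/null)" = "Linger=yes" ]
}

if linger_enabled "$(id -un)"; then
    echo "lingering is enabled"
else
    echo "lingering is not enabled (or loginctl is unavailable)"
fi
```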

Here we go!

Let’s start up JupyterLab!

systemctl --user start jupyterlab

A few seconds later you should be able to open JupyterLab on:

http://<IP or hostname>:8888
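If you would rather not guess how long "a few seconds" is, you can poll the port with curl (installed at the beginning) until something answers. This is a sketch that assumes you run it on the same machine as JupyterLab, so localhost is used; wait_for_url is a hypothetical helper name:

```shell
# Hypothetical helper: poll the URL in $1 up to $2 times,
# sleeping $3 seconds between attempts.
wait_for_url() {
    tries="$2"
    while [ "$tries" -gt 0 ]; do
        if curl --silent --fail --max-time 2 --output /dev/null "$1"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$3"
    done
    return 1
}

if wait_for_url "http://localhost:8888" 5 1; then
    echo "JupyterLab is answering"
else
    echo "no answer yet, check the service status"
fi
```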

If you experience any issues here, check the status of the service to get more details:

systemctl --user status jupyterlab

Final thoughts

Now that you’ve launched your spaceship towards JupyterLab, I wish you luck working on your notebooks. Live long and prosper! 🙂
