2020-11-25

Jupyter and Maven

After upgrading Jupyter Lab, I was using it to edit Java files, which I then compiled with Maven.  To my surprise, I started getting error messages like:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project foo: Compilation failure: Compilation failure: 

[ERROR] /build/src/main/java/foo/bar/.ipynb_checkpoints/Foo-checkpoint.java:[36,8] class Foo is public, should be declared in a file named Foo.java

[ERROR] /build/src/main/java/foo/bar/Foo.java:[36,8] duplicate class: foo.bar.Foo

This happened because Jupyter Lab auto-saves checkpoint copies of the files I edit (nice!) into a .ipynb_checkpoints directory next to the originals, and Maven then attempts to compile those copies as well (not nice).  After much digging around and experimentation, I added the following to my pom.xml file as a child of the project element:

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>**/.ipynb_checkpoints/**</exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
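
To see whether a source tree currently contains any checkpoint directories, a small shell helper along these lines may be handy.  This is only a sketch: the function name list_checkpoint_dirs and the default src root are my own choices, not anything provided by Maven or Jupyter.

```shell
#!/bin/sh
# List every .ipynb_checkpoints directory below the given root (default: src).
# list_checkpoint_dirs is a hypothetical helper name, not a Maven or Jupyter command.
list_checkpoint_dirs() {
    find "${1:-src}" -type d -name .ipynb_checkpoints -print
}
```

Calling list_checkpoint_dirs from the project directory prints one path per checkpoint directory, and prints nothing if the tree is clean.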

Running Jupyter Lab in Docker

Prerequisites: Docker, Bourne shell

I upgraded my Jupyter setup today and I thought I would publish the script I am using to launch it:

#!/bin/sh
cd "$HOME/Documents/jupyter" || exit 1
docker pull jupyter/datascience-notebook
docker build -t jupyter - <<EOD
FROM jupyter/datascience-notebook
RUN pip install --upgrade pip
RUN pip install rdflib
ENTRYPOINT ["jupyter", "lab", "--ip=0.0.0.0", "--allow-root"]
EOD
nohup docker run \
    --rm \
    --name jupyter \
    -v "$HOME"/.jupyter:/home/jovyan/.jupyter \
    -v "$PWD":/home/jovyan/work \
    -v "$HOME"/Documents/GitHub:/home/jovyan/work/GitHub \
    -p 0.0.0.0:8888:8888 \
    jupyter &
cat <<EOT
To find out the token, use:
    docker exec jupyter jupyter notebook list
To set a password, use:
    docker exec -it jupyter jupyter notebook password
To stop use:
    docker exec jupyter jupyter notebook stop
EOT

This launches a Docker container that makes Jupyter Lab available on port 8888.  It maps three host directories into the container as "volumes":
  • $HOME/.jupyter: This stores your Jupyter settings like theme and password.
  • $HOME/Documents/jupyter: Where you will keep your Jupyter Notebooks, found under work in the Jupyter file browser.
  • $HOME/Documents/GitHub: Where you keep your GitHub checkouts, found under work/GitHub in Jupyter.
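
One thing to watch with these mappings: if a host directory does not exist when docker run starts, Docker creates it itself, and on Linux it typically ends up owned by root.  A minimal sketch of a pre-flight step, assuming the same three directories as the script above (ensure_jupyter_dirs is a name I made up, not a Docker or Jupyter command):

```shell
#!/bin/sh
# Create the three host directories that the launch script bind-mounts,
# so that Docker does not have to create them itself.
# ensure_jupyter_dirs is a hypothetical helper, not part of Docker or Jupyter.
ensure_jupyter_dirs() {
    mkdir -p \
        "$HOME/.jupyter" \
        "$HOME/Documents/jupyter" \
        "$HOME/Documents/GitHub"
}
```

Running ensure_jupyter_dirs before the docker run line is harmless if the directories already exist, since mkdir -p leaves existing directories alone.
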
Jupyter requires a token or password to access it.  You can determine the token with the command:

docker exec jupyter jupyter notebook list

It is more convenient to set a password, which you can do with the command:

docker exec -it jupyter jupyter notebook password

If you need to stop Jupyter, you can do so with the command:

docker exec jupyter jupyter notebook stop

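A final tip: If you need to reach other services on your host machine from Jupyter (say, services from other Docker containers mapped to local ports), localhost or 127.0.0.1 will not work, because inside the container those names refer to the container itself.  Use host.docker.internal as the hostname instead.  As far as I know, this name is provided by Docker Desktop; on Linux you may need to start the container with --add-host=host.docker.internal:host-gateway for it to resolve.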

 

2016-04-27

Using virtualenv with a Python CGI script

I've been learning Python recently, and today I found myself writing a CGI script to run under Apache.  The only snag was that I wanted to use a specific package (Template Toolkit) and I didn't want to install it system-wide.  To complicate matters, I wanted to use Python 2.7, without disrupting the system installation of Python 2.6.

The Python community have several ways to do this: one is to install the package just for one user (pip install --user); and another is to use a virtual environment.

I didn't want to do the user-level installation.  The apache user is special, having no (default) shell and /var/www as home.  I don't want to start messing around with /var/www for the benefit of one CGI script.  And I want to be able to test my script from the command-line without having to be the apache user.

So, virtual environments seem to be the way to go, but how do I configure a CGI script to use a virtual environment?  Using the hashbang line to point at the python executable within the virtual environment directory doesn't work.  It just complains about missing shared libraries.  Virtual environments can be entered by sourcing a script, or by writing complex code within the script to mess around with the environment.

I found lots of documentation about setting up Apache to provide Django web services, and that would probably have worked.  But I wanted to get my simple CGI script up and running without having to learn a whole new web services architecture.

The solution I eventually settled on was to put a small Bash script into the virtual environment's bin directory:

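#!/bin/bash
# Directory containing this wrapper, i.e. the virtual environment's bin directory.
DIR=$(dirname "$0")
# Enter the virtual environment.
source "$DIR/activate"
# Run the environment's python under the python27 software collection.
scl enable python27 "$DIR/python \"$@\""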

And then I change the hashbang line in the CGI script to point at this wrapper:

#!/home/username/python/apache/bin/wrapper.sh

Hey presto!  The script runs using Python 2.7 and my virtual environment.  I can test using the same virtual environment.

I'm publishing this here in case someone else is looking to solve the same problem.

Update: I found that this approach doesn't work quite as well for Python 3.5, but the following does:

#!/bin/bash
DIR=$(dirname "$0")
source /opt/rh/rh-python35/enable
source "$DIR/activate"
python "$@"

Update: Changed both scripts to use "$@" in place of plain $@.  This means that quoted arguments remain quoted, which is what you want here, say if you want to use -c.
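
To illustrate the difference, here is a small self-contained sketch.  The function names count_args, forward_quoted and forward_unquoted are made up for this example and are not part of the wrapper scripts above.

```shell
#!/bin/sh
# count_args prints how many arguments it received.
count_args() {
    echo "$#"
}

# forward_quoted passes its arguments on with "$@": each one stays intact.
forward_quoted() {
    count_args "$@"
}

# forward_unquoted passes them on with plain $@: each argument is split
# again on whitespace, so one quoted argument can become several.
forward_unquoted() {
    count_args $@
}

forward_quoted -c "print 'hello world'"     # prints 2
forward_unquoted -c "print 'hello world'"   # prints 4
```

With plain $@, the code passed to -c arrives as three separate words, so Python would only see the first of them as the program to run.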