Creating a Docker Image (conda + Jupyter notebook) for Social Scientists
We can find plenty of code on the internet that serves as a good reference for our research.
However, that code does not always work on my computer.
Differences in the environment, such as library versions and operating systems, are often the cause.
This is NOT a minor issue for social scientists like me, because coding is already a significant challenge for us.
Docker can assist us. It packages the entire environment of the reference code, so we can run the code in the same environment as the author.
What is Docker
Docker's logo hints at what Docker is: a whale transporting containers.
We can put code, libraries, and other necessary resources in a container, and the whale ships the container to our own computer environment, which is not the same as the code author's.
Three components for a container
We need three components: a dockerfile, an image, and a container.
- dockerfile : a script containing the settings used to build the image
- image : a snapshot of everything the dockerfile sets up. (An image is read-only; it cannot be altered.)
- container : a running instance of the image. We can alter the container, which enables us to manipulate the original code for our purposes.
Creating dockerfile
Create dockerfile
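# Pin the Ubuntu release: ubuntu:latest changes over time, and python3.9
# is not packaged for every release (20.04 is assumed here)
FROM ubuntu:20.04
# DEBIAN_FRONTEND=noninteractive keeps apt-get from stopping to ask questions
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y python3.9 \
    python3-pip
RUN pip3 install JPype1 jupyter pandas numpy seaborn scipy matplotlib pyNetLogo SALib
# Create an unprivileged user and work in its home directory
RUN useradd -ms /bin/bash jupyter
USER jupyter
WORKDIR /home/jupyter
EXPOSE 8888
ENTRYPOINT ["jupyter", "notebook", "--allow-root", "--ip=0.0.0.0", "--port=8888", "--no-browser"]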
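This dockerfile is an example. FROM loads a base image, and RUN installs packages or runs other commands while the image is built. USER and WORKDIR set the user and the working directory, EXPOSE documents the port, and ENTRYPOINT is the command that runs when a container starts.
Building this dockerfile creates a Docker image. A container started from that image can then be used for our purposes.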
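Docker turns a dockerfile into an image with the docker build command. Here is a minimal sketch in Python that only assembles and prints that command, so it can be checked before pasting it into a terminal. The image name my-jupyter is hypothetical, and the sketch assumes the file is saved as Dockerfile in the current folder.

```python
import shlex


def docker_build_command(tag, context_dir="."):
    """Return the arguments for `docker build -t <tag> <context_dir>`."""
    # -t names the resulting image; context_dir is the folder that holds the Dockerfile
    return ["docker", "build", "-t", tag, context_dir]


# "my-jupyter" is a hypothetical image name chosen for this example
command = docker_build_command("my-jupyter")
print(shlex.join(command))
```

Once the image is built, starting a container from it with port 8888 published should launch Jupyter notebook, because that is what the ENTRYPOINT line asks for.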
If you search for how to create a Docker image, you will discover additional strange files, such as requirements.txt or yml files, along with complex formats and a glossary of new terms. Even if you manage to collect good examples of those files, a minor difference or omission can bog you down in an endless stream of error messages.
I do not like reading computer references. In many cases, the explanations in a reference were NOT sufficient to help me; I needed yet another reference or explanation to fully grasp the first one. I only want to use computer tools that are useful for my purpose, not to learn new and exciting methods.
As a result, I will not create any unusual files. Instead, I try to create an Image and a Container in a straightforward manner for social scientists who, like me, want to use computer tools rather than study them.
Pull a miniconda Image from Docker Hub and Create a Container
conda is a good starting point for creating a Python environment for our needs.
docker search miniconda3
I am going to use the first miniconda3 image in the search results, continuumio/miniconda3.
docker run -i -t --rm --name condaenv -p 8888:8888 continuumio/miniconda3 /bin/bash
The docker run command runs an Image as a Container. If the image is not on my computer yet, the command first pulls it from Docker Hub and then creates the Container.
- -i -t : keeping the container interactive with a terminal attached
- --rm : removing the container once it exits
- --name condaenv : naming the container condaenv
- -p 8888:8888 : publishing port 8888 of the container on port 8888 of my computer, for Jupyter notebook
- /bin/bash : running bash so that we can enter more commands in the terminal
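The options above can be put together piece by piece. The following is a small sketch in Python, not an official Docker tool: the function docker_run_command and its parameter names are made up for this example, and it only prints the command instead of running it.

```python
import shlex


def docker_run_command(image, name=None, ports=(), interactive=True,
                       remove_on_exit=True, command=None):
    """Assemble a `docker run` command from the options explained above."""
    args = ["docker", "run"]
    if interactive:
        args += ["-i", "-t"]          # interactive session with a terminal
    if remove_on_exit:
        args.append("--rm")           # delete the container when it exits
    if name:
        args += ["--name", name]      # container name
    for host_port, container_port in ports:
        args += ["-p", f"{host_port}:{container_port}"]  # host:container
    args.append(image)                # image to start the container from
    if command:
        args.append(command)          # program to run inside the container
    return args


cmd = docker_run_command("continuumio/miniconda3", name="condaenv",
                         ports=[(8888, 8888)], command="/bin/bash")
print(shlex.join(cmd))
```

With these arguments the printed line is the same docker run command used in this post, which makes it easy to see which option produces which part.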
docker images
docker ps -a
docker images lists the images you have, and docker ps -a lists your containers, including stopped ones. You can also examine them in Docker Desktop.
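If the table printed by docker ps -a is hard to read, Docker can print one JSON record per container with docker ps -a --format "{{json .}}". Below is a minimal sketch in Python that reads such output. The sample line is hypothetical (only the Container ID and names come from this post), and the sketch assumes the records carry the keys ID, Names, Image, and Status.

```python
import json


def parse_docker_ps(output):
    """Parse the text printed by: docker ps -a --format "{{json .}}" """
    containers = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)  # one JSON object per container
        containers.append({
            "id": record.get("ID"),
            "name": record.get("Names"),
            "image": record.get("Image"),
            "status": record.get("Status"),
        })
    return containers


# Hypothetical sample output for a single container
sample = ('{"ID":"40b8d16b7dba","Image":"continuumio/miniconda3",'
          '"Names":"condaenv","Status":"Up 5 minutes"}')
for container in parse_docker_ps(sample):
    print(container["id"], container["name"], container["image"])
```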
Use conda to create a Python environment
conda update conda
conda install python=3.9.6
First of all, update conda with conda update conda. If you require a specific version of Python (for example, 3.9.6), install it with conda install python=3.9.6.
conda install -c conda-forge jpype1
conda install jupyter pandas numpy
pip install SALib seaborn
Install packages with conda install and pip install.
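After the installation, it is worth checking that Python can actually find the packages. Here is a minimal sketch to run with python inside the container. The import names are my assumption: JPype1 is imported as jpype, and Jupyter notebook as notebook.

```python
import importlib.util

# Assumed import names for the packages installed above
MODULES = ["jpype", "notebook", "pandas", "numpy", "SALib", "seaborn"]


def missing_modules(names):
    """Return the names that Python cannot find in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(MODULES)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If a name shows up as missing, install it again with conda install or pip install before moving on.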
Jupyter notebook in the Docker Container
This is the most difficult part. Many posts on how to use Jupyter notebook in Docker can be found easily, but some of them are difficult to follow, while others no longer work.
This approach, I believe, will be the most straightforward and functional for social scientists like me.
# In the Windows PowerShell or VS code
ipconfig
You need the IPv4 address of your computer.
ipconfig displays many addresses. You need the IPv4 address of the adapter that is connected to the internet. Because I use a laptop, I need the address listed under Wireless LAN adapter Wi-Fi (for example, 111.000.1.000).
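If picking the right line out of the ipconfig output is confusing, Python can ask the operating system which address it would use to reach the internet. This is a minimal sketch, to be run on your own computer and not inside the container. The probe address 8.8.8.8 is an assumption (any public address should do), and the function falls back to 127.0.0.1 when there is no network.

```python
import ipaddress
import socket


def local_ipv4(probe_host="8.8.8.8", probe_port=80):
    """Return the IPv4 address this computer uses for outgoing connections."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no data; it only selects the outgoing adapter
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        # No usable network connection: fall back to the loopback address
        return "127.0.0.1"
    finally:
        sock.close()


address = local_ipv4()
# Check that the result really is an IPv4 address before using it
print(ipaddress.IPv4Address(address))
```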
# In the bash terminal of the container
mkdir -p /opt/notebooks
jupyter notebook --notebook-dir=/opt/notebooks --ip='*' --port=8888 --no-browser --allow-root
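mkdir creates a directory for Jupyter notebook.
jupyter notebook has many options. --notebook-dir=/opt/notebooks makes Jupyter notebook start in the new directory. The other options serve one purpose, letting a web browser on my computer reach Jupyter notebook inside the container: --ip='*' listens on all network addresses, --port=8888 matches the published port, --no-browser skips opening a browser inside the container, and --allow-root permits running as the root user.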
Jupyter prints several URLs in the terminal. Do not open them directly; just copy the token (the text after token=). Enter your IPv4 address followed by :8888 (for example, 111.000.1.000:8888) in your web browser. Jupyter then asks for the token. Paste the copied token.
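Copying the token by hand is easy to get wrong, so here is a small sketch in Python that pulls the token out of a URL printed by Jupyter and builds an address for the browser. The URL and the token abc123 are hypothetical, and the sketch assumes that Jupyter accepts the token as a ?token= part of the address, which would skip the token prompt.

```python
from urllib.parse import parse_qs, urlsplit


def extract_token(jupyter_url):
    """Return the value of the token= part of a URL printed by Jupyter."""
    query = parse_qs(urlsplit(jupyter_url).query)
    tokens = query.get("token")
    if not tokens:
        raise ValueError("no token found in the URL")
    return tokens[0]


def browser_url(host_ip, token, port=8888):
    """Build the address to open in the browser on your own computer."""
    return f"http://{host_ip}:{port}/?token={token}"


# Hypothetical URL as printed in the container terminal
printed = "http://127.0.0.1:8888/?token=abc123"
print(browser_url("111.000.1.000", extract_token(printed)))
```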
Update Docker Image and Create Repository
First of all, create your repository on Docker Hub. My repository name is youngjoon5/condajupyter, so the new image must also be named youngjoon5/condajupyter when I push it to Docker Hub.
# In the PowerShell or VS code
docker ps -a
docker commit {Container ID} {new Image name(repository name)}
#ex# docker commit 40b8d16b7dba youngjoon5/condajupyter
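We need the Container ID (shown by docker ps -a) to create a new Image from the Container. docker commit generates a new Image from the Container, that is, the pulled Image plus the changes we made. Because the container was started with --rm, run docker commit from a second terminal while the container is still running; it is removed as soon as it exits. To push the new Image to Docker Hub, its name must be the same as the repository name.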
docker login
docker push youngjoon5/condajupyter
After logging in to Docker Hub (docker login), copy the push command from the Docker Hub repository page and paste it into your terminal.
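The commit, login, and push steps always come in the same order, and the image name has to match the repository name. The sketch below, in Python, checks the name and prints the three commands. The name check is a simplified assumption of mine (lowercase user/repository), not Docker's full rule, and the function name publish_commands is made up for this example.

```python
import re
import shlex

# Simplified assumption: a Docker Hub image name is <user>/<repository> in lowercase
NAME_PATTERN = re.compile(
    r"^[a-z0-9]+(?:[._-][a-z0-9]+)*/[a-z0-9]+(?:[._-][a-z0-9]+)*$"
)


def publish_commands(container_id, image_name):
    """Return the commit, login, and push commands in the order they are run."""
    if not NAME_PATTERN.match(image_name):
        raise ValueError(f"not a lowercase <user>/<repository> name: {image_name!r}")
    return [
        ["docker", "commit", container_id, image_name],  # container -> new image
        ["docker", "login"],                             # sign in to Docker Hub
        ["docker", "push", image_name],                  # upload the new image
    ]


for command in publish_commands("40b8d16b7dba", "youngjoon5/condajupyter"):
    print(shlex.join(command))
```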
Using the new Image
# Pull the new Image and Create a Container
docker run -i -t --rm --name condaenv -p 8888:8888 youngjoon5/condajupyter /bin/bash
# Run Jupyter notebook
jupyter notebook --notebook-dir=/opt/notebooks --ip='*' --port=8888 --no-browser --allow-root