Virtual Environments for Python

Virtual environments are essential for making code reproducible. Whether you want to move your process to a different platform or have collaborators run the same code, virtual environments make sure everyone’s on the same page.

In Python, there are two main virtual environment managers:

  • Anaconda
  • Virtualenv

Anaconda

Anaconda is already installed on the Yens. There is a legacy Python 2 module, anaconda/5.2.0, and two Python 3 modules, anaconda3/5.2.0 and anaconda3/2022.05. To use Anaconda with Python 3, load the anaconda3 module:

$ ml anaconda3

List the loaded modules and check that conda is available:

$ ml

Currently Loaded Modules:
  1) anaconda3/2022.05

$ conda --version
conda 4.12.0

If you are working with a conda environment, remember to load the anaconda3 module first so that the conda executable becomes available.

You are now ready to create your virtual environment.

Create a virtual environment for your process to run

To create a conda environment, run the following command on one of the Yens:

$ conda create -n py310

This will make an environment named py310 in your home directory, in the $HOME/.conda/envs/py310 folder. All Python packages for this environment will be installed under this path.

When making an environment, you can also specify which Python version you want. By convention, we append the Python version to the conda environment name, but you can name it whatever you want:

$ conda create -n py310 python=3.10

If you plan on collaborating on a project with others, we recommend creating a shared conda environment in a project space where all collaborators have access (e.g. /zfs/projects/<project-space>/conda). See this guide on how to create a shared conda environment outside of your user home directory. This is especially useful for large conda environments that need machine learning packages such as tensorflow and pytorch.

When making your environment, you will likely encounter this warning:

==> WARNING: A newer version of conda exists. <==
  current version: 4.12.0
  latest version: 22.11.1

Please update conda by running

    $ conda update -n base -c defaults conda

Because conda is installed system-wide on the Yens, you cannot update it as a user. Simply ignore this warning and carry on.

Activate your environment

Once your environment is ready, you will need to activate it.

After the environment is created, conda will suggest a conda activate command, but please ignore it:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate py310
#
# To deactivate an active environment, use
#
#     $ conda deactivate

The conda activate command does not work properly on the Yens, so we will use source activate instead:

$ source activate py310

After your environment is activated, its name will be prepended to your command prompt like so:

(py310) SUNetID@yenX:~$ 

The python executable will also now be the one inside your conda environment. Let’s check that:

(py310) SUNetID@yenX:~$ which python
/home/users/$USER/.conda/envs/py310/bin/python

(py310) SUNetID@yenX:~$ python --version
Python 3.10.8
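
You can run the same check from inside Python. The snippet below is a minimal sketch; the function name describe_interpreter is only illustrative and not part of any Yens tooling. It reports which interpreter is running and where its environment lives.

```python
import sys


def describe_interpreter():
    """Return basic facts about the Python interpreter running this code."""
    return {
        "executable": sys.executable,  # full path to the python binary
        "prefix": sys.prefix,          # root folder of the active environment
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

With the py310 environment active, the executable and prefix entries should point inside the environment folder rather than at a system-wide Python.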

Install packages into your environment

Make sure your conda environment is active. Then, you can install any packages into it using pip:

(py310) SUNetID@yenX:~$ pip install numpy pandas torch

Our advice is to stick to pip and use conda only if a package is not available from PyPI and cannot be installed with pip, because mixing conda and pip installs can lead to package version conflicts. If you do need conda, run:

(py310) SUNetID@yenX:~$ conda install <my-package> 

Note that Anaconda has different channels, or repositories, from which you can install Python packages. Search Anaconda’s website to find the right channel for your package. Then, you can install the package with

(py310) SUNetID@yenX:~$ conda install <my-package> -c <my-channel>

where <my-channel> is the name of the channel, such as conda-forge or tensorflow.

Running a Python script using your new environment

As long as your environment is activated, you can run python:

(py310) SUNetID@yenX:~$ python my_script.py
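
If you want a script to confirm that it is running in the intended environment, one option is to look at the CONDA_DEFAULT_ENV variable, which conda sets when an environment is activated. The sketch below assumes the environment is named py310 as in this guide; the helper names are hypothetical.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if unset."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV") or None


def is_expected_env(expected, environ=None):
    """Return True when the active conda environment is named `expected`."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    if is_expected_env("py310"):
        print("py310 is active.")
    else:
        print(f"py310 is not active (found: {active_conda_env()!r}).")
```

Placing a check like this near the top of my_script.py gives a readable message when the environment was not activated, instead of an import error later on.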

Saving and sharing your environment

One of the big advantages of virtual environments is sharing and moving the environments.
This is done by saving the environment to a file.

With conda, you can save your active environment by running:

(py310) SUNetID@yenX:~$ conda env export > my_conda_env.yml

You will now have a file named my_conda_env.yml with all the necessary information for conda to build this environment.
If you want to load this environment on a new server or share it with a colleague, you can run the following command:

$ conda env create -f my_conda_env.yml

Making a conda environment into a Yens JupyterHub kernel

If you would like to make any of your conda environments into a JupyterHub kernel, follow this guide.

Deactivate environment when done

When you are finished using your environment, or need to switch to a different environment, you can deactivate it with:

(py310) SUNetID@yenX:~$ source deactivate

Removing a conda environment

To remove an environment, first make sure it is not activated.

Then, run:

$ conda remove -n myenv --all

where myenv is the name of your environment.

Note that you can have multiple conda environments (with different Python versions or for different research projects).

To list them all, run:

$ conda info -e
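
If you need the list of environments inside a Python script, conda can print it as JSON with conda env list --json. The following is a rough sketch, assuming conda is on your PATH (that is, the anaconda3 module is loaded); env_names is an illustrative helper, not a conda API.

```python
import json
import os
import shutil
import subprocess


def env_names(conda_json):
    """Extract environment folder names from `conda env list --json` output."""
    paths = json.loads(conda_json).get("envs", [])
    # Each entry is a full path; the last path component is the folder name.
    # The base environment shows up under the name of its install folder.
    return [os.path.basename(path.rstrip("/")) for path in paths]


if __name__ == "__main__":
    if shutil.which("conda") is None:
        print("conda not found; load the anaconda3 module first.")
    else:
        result = subprocess.run(
            ["conda", "env", "list", "--json"],
            capture_output=True, text=True, check=True,
        )
        print(env_names(result.stdout))
```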

Check out the conda docs for many other useful features!

Virtualenv

Create a virtual environment for your process to run

To create a virtualenv environment, run the following command on one of the Yens:

$ virtualenv --python=/usr/bin/python3.6 TEST

This will make an environment named “TEST” in a TEST folder inside your current directory.

Activate your environment

First, you need to activate your environment using source:

$ source TEST/bin/activate

Install packages into your environment

Then, you can install any packages you need using pip:

$ pip install numpy

Running a Python script using your new environment

Using your environment is very simple - as long as your environment is activated, you can run python normally:

(TEST) yourSUNetID@yenX:~$ python my_script.py

The python command will be specific to your environment. You can verify this with the which command:

(TEST) yourSUNetID@yenX:~$ which python
/path/to/env/TEST/bin/python
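
The same check can be done from within Python. This is a minimal sketch of a common idiom, not something provided by virtualenv itself, and the function name in_virtualenv is illustrative.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtualenv())
    print("environment root:", sys.prefix)
```

This check targets virtualenv-style environments; inside a conda environment it would normally report False, because conda installs a full interpreter rather than pointing back at a base one.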

Saving and moving your environment

One of the big advantages of virtual environments is sharing and moving the environments. This is done by saving the environment to a file.
In virtualenv, you can save your environment by running:

(TEST) yourSUNetID@yenX:~$ pip freeze > requirements.txt

You will now have a file named requirements.txt listing the installed packages and their versions, which pip can use to rebuild your environment.
To rebuild the environment on a new server, create a new virtualenv there, activate it, and install from the file:

$ source <env_name>/bin/activate
(<env_name>)$ pip install -r path/to/requirements.txt
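
After installing, you may want to confirm that the new environment matches the file. The sketch below compares the pinned name==version lines in requirements.txt with what is installed. It assumes Python 3.8 or newer (for importlib.metadata), which is newer than the python3.6 used in the example above, and the helper names are hypothetical.

```python
import os
from importlib import metadata


def parse_pins(lines):
    """Turn 'name==version' lines into a {name: version} dict, skipping the rest."""
    pins = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # blank lines, comments, and non-pinned entries are ignored
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for packages that are missing or differ."""
    return {
        name: (wanted, lookup(name))
        for name, wanted in pins.items()
        if lookup(name) != wanted
    }


if __name__ == "__main__":
    if os.path.exists("requirements.txt"):
        with open("requirements.txt") as handle:
            print(find_mismatches(parse_pins(handle)) or "All pinned packages match.")
```

An empty result means every pinned package is installed at the requested version. Entries that are not simple name==version pins are skipped by this sketch.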

Deactivate environment when done

When you are finished using your environment, or need to switch to a different environment, you can deactivate it with:

$ deactivate

The virtualenv docs have more information on virtualenv.