Virtual Environments for Python
Virtual Environments are essential for making reproducible code. Whether you want to move your process to a different platform, or have collaborators run the same code, virtual environments will make sure everyone’s on the same page.
In Python, there are two main virtual environment managers:
- Anaconda
- Virtualenv
Anaconda
Anaconda is already installed on the Yens. There is a legacy Python 2 module, anaconda/5.2.0, and two Python 3 modules, anaconda3/5.2.0 and anaconda3/2022.05.
To use Anaconda with Python 3, load the anaconda3 module:
$ ml anaconda3
List the loaded modules and check that conda is available:
$ ml
Currently Loaded Modules:
1) anaconda3/2022.05
$ conda --version
conda 4.12.0
You need to load the anaconda3 module every time you log into the Yens, because modules are loaded for the active shell only. If you log out or lose your connection to the Yens, no modules will be loaded by default when you log back in. So, if you are working with a conda environment, remember to load the anaconda3 module first so that the conda executable becomes available.
You are now ready to create your virtual environment.
Create a virtual environment for your process to run
To create a conda environment, run the following command on one of the Yens:
$ conda create -n py310
This creates an environment named py310 in the $HOME/.conda/envs/py310 folder in your home directory. All Python packages for this environment are installed under this path.
When making an environment, you can also specify which Python version you want. By convention, we append the Python version to the conda environment name, but you can name it whatever you want:
$ conda create -n py310 python=3.10
If you plan to collaborate on a project with others, we recommend creating a shared conda environment in a project space that all collaborators can access (e.g. /zfs/projects/<project-space>/conda).
See this guide on how to create a shared conda environment outside of your user home directory. This is especially useful for large conda environments that need machine learning packages such as tensorflow and pytorch.
Do not use conda activate or conda init. They will not work properly on shared Linux clusters such as the Yens or Sherlock.
When making your environment, you will likely encounter this warning:
==> WARNING: A newer version of conda exists. <==
current version: 4.12.0
latest version: 22.11.1
Please update conda by running
$ conda update -n base -c defaults conda
Because conda is installed system-wide on the Yens, you cannot update it as a user. Simply ignore this warning and carry on.
Activate your environment
Once your environment is ready, you will need to activate it.
You will see a suggested conda activate command in the output, but please ignore it:
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate py310
#
# To deactivate an active environment, use
#
# $ conda deactivate
Because conda activate does not work properly on the Yens, we will use source activate instead.
$ source activate py310
After your environment is activated, its name is prepended to your command prompt, like so:
(py310) SUNetID@yenX:~$
The python executable will also now be the one inside your conda environment. Let’s check that:
(py310) SUNetID@yenX:~$ which python
/home/users/$USER/.conda/envs/py310/bin/python
(py310) SUNetID@yenX:~$ python --version
Python 3.10.8
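The same check can be made from inside Python, which is handy when a script is launched by a scheduler or a notebook and you cannot see the prompt. The following is a minimal sketch using only the standard library; the function name describe_interpreter is a hypothetical helper, not part of conda.

```python
import sys


def describe_interpreter():
    """Return where the running Python lives and which version it is."""
    return {
        "executable": sys.executable,  # path of the python binary in use
        "prefix": sys.prefix,          # root folder of the active environment
        "version": ".".join(str(part) for part in sys.version_info[:3]),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(key + ":", value)
```

With the py310 environment active, the executable and prefix entries should point inside $HOME/.conda/envs/py310.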
Install packages into your environment
Make sure your conda environment is active. Then, you can install any packages into it using pip:
(py310) SUNetID@yenX:~$ pip install numpy pandas torch
Our advice is to stick to pip and use conda only if a package is not available from PyPI and cannot be installed with pip, because mixing conda and pip installs can cause package version conflicts. In that case, run:
(py310) SUNetID@yenX:~$ conda install <my-package>
Note that Anaconda has different channels or repos from which you can install python packages. You will need to search on Anaconda’s website to find the right repo. Then, you can install a package with
(py310) SUNetID@yenX:~$ conda install <my-package> -c <my-channel>
where my-channel is the name of the channel or channels, such as conda-forge, tensorflow, etc.
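After installing packages with pip or conda, it can help to confirm which versions actually landed in the environment. The sketch below uses importlib.metadata from the standard library (available since Python 3.8, so it fits the py310 environment); installed_version is a hypothetical helper name, and the package names are simply the ones from the pip example earlier in this section.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # numpy, pandas and torch are the packages from the pip example.
    for name in ("numpy", "pandas", "torch"):
        print(name, installed_version(name) or "not installed")
```

A package reported as "not installed" here usually means the environment was not active when pip install was run.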
Running python script using your new environment
As long as your environment is activated, you can run python:
(py310) SUNetID@yenX:~$ python my_script.py
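Because a script runs with whatever python is first on your path, one option is to have the script itself refuse to run outside the intended environment. The sketch below assumes that activating a named conda environment sets the CONDA_DEFAULT_ENV variable to that name (for environments created by path, the variable may hold the full path instead); require_env and active_conda_env are hypothetical helper names.

```python
import os


def active_conda_env(environ=None):
    """Return the active conda environment name, or None if none is active."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def require_env(expected, environ=None):
    """Raise RuntimeError unless the expected conda environment is active."""
    current = active_conda_env(environ)
    if current != expected:
        raise RuntimeError(
            "expected conda environment {!r}, found {!r}; "
            "run 'source activate {}' first".format(expected, current, expected)
        )


if __name__ == "__main__":
    print("active conda environment:", active_conda_env())
```

Calling require_env("py310") at the top of my_script.py would then stop the script early with a clear message when the environment has not been activated.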
Saving and sharing your environment
One of the big advantages of virtual environments is sharing and moving the environments.
This is done by saving the environment to a file.
With conda, you can save your active environment by running:
(py310) SUNetID@yenX:~$ conda env export > my_conda_env.yml
You will now have a file named my_conda_env.yml with all the necessary information for conda to build this environment.
If you want to load this environment on a new server or share it with a colleague, you can run the following command:
$ conda env create -f my_conda_env.yml
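The exported file also contains a prefix: line that records where the environment lives on the machine it was exported from. A collaborator will not have that path, so some people drop the line before sharing the file. The sketch below shows one way to do that with plain string handling; treat it as an optional cleanup rather than a requirement. The helper name strip_prefix and the shortened example file are hypothetical.

```python
def strip_prefix(yaml_text):
    """Return the environment file text without its top-level 'prefix:' line."""
    kept = [
        line for line in yaml_text.splitlines()
        if not line.startswith("prefix:")
    ]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical, shortened export used only for illustration.
    example = (
        "name: py310\n"
        "channels:\n"
        "  - defaults\n"
        "dependencies:\n"
        "  - python=3.10.8\n"
        "prefix: /home/users/SUNetID/.conda/envs/py310\n"
    )
    print(strip_prefix(example), end="")
```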
Making conda environment into Yens JupyterHub kernel
If you would like to make any of your conda environments into a JupyterHub kernel, follow this guide.
Deactivate environment when done
When you are finished using your environment, or need to switch to a different environment, you can deactivate it with:
(py310) SUNetID@yenX:~$ source deactivate
Removing conda environment
To remove an environment, first make sure it is not activated.
Then, run:
$ conda remove -n myenv --all
where myenv is the name of your environment.
Note that you can have multiple conda environments (with different Python versions or for different research projects).
To list them all, run:
$ conda info -e
Check out the conda docs for many other useful features!
Virtualenv
Create a virtual environment for your process to run
To create a virtualenv environment, run the following command on one of the Yens:
$ virtualenv --python=/usr/bin/python3.6 TEST
This will make an environment named “TEST”.
Activate your environment
First, you need to activate your environment using source:
$ source TEST/bin/activate
Install packages into your environment
Then, you can install any packages you need using pip:
$ pip install numpy
Running python script using your new environment
Using your environment is very simple - as long as your environment is activated, you can run python normally:
(TEST) yourSUNetID@yenX:~$ python my_script.py
The python command will be specific to your environment. You can verify this with the which command:
(TEST) yourSUNetID@yenX:~$ which python
/path/to/env/TEST/bin/python
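A script can make a similar check for itself. The sketch below relies on two interpreter attributes: older virtualenv releases set sys.real_prefix, while newer releases and the standard venv module make sys.prefix differ from sys.base_prefix. The function name in_virtualenv is a hypothetical helper. Note that this test does not detect conda environments, where the two prefixes are normally the same.

```python
import sys


def in_virtualenv():
    """Return True when Python is running inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and the stdlib venv module make sys.prefix
    # differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("inside a virtualenv:", in_virtualenv())
    print("environment root:", sys.prefix)
```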
Saving and moving your environment
One of the big advantages of virtual environments is sharing and moving the environments. This is done by saving the environment to a file.
With virtualenv, you can save your environment by running:
(TEST) yourSUNetID@yenX:~$ pip freeze > requirements.txt
You will now have a file named requirements.txt with all the necessary information for pip to build your environment.
If you want to load this environment on a new server, you can run the following command:
$ source <env_name>/bin/activate
(<env_name>)$ pip install -r path/to/requirements.txt
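To confirm that the rebuilt environment matches the original, one approach is to run pip freeze again on the new server and compare the two files. The sketch below handles only simple name==version lines and skips editable or URL installs; it compares names exactly as written, without normalizing case or punctuation. The helper names parse_pins and diff_pins and the file name requirements_new.txt are hypothetical.

```python
def parse_pins(requirements_text):
    """Map package name -> pinned version for simple 'name==version' lines."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line or line.startswith("-"):
            continue  # skip blanks, options, and editable or URL installs
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def diff_pins(expected, actual):
    """Return {name: (expected_version, actual_version)} for every difference."""
    names = set(expected) | set(actual)
    return {
        name: (expected.get(name), actual.get(name))
        for name in sorted(names)
        if expected.get(name) != actual.get(name)
    }
```

For example, diff_pins(parse_pins(open("requirements.txt").read()), parse_pins(open("requirements_new.txt").read())) returns an empty dictionary when both environments pin the same versions.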
Deactivate environment when done
When you are finished using your environment, or need to switch to a different environment, you can deactivate it with:
$ deactivate
The virtualenv docs have more information on virtualenv.