Virtual environments with Python projects

This page explains how and why a data scientist should use a virtual environment with their data science projects. For more information, refer to venv - Creation of virtual environments in the Python documentation.

Creating a virtual environment and recording a project’s libraries in a requirements file (commonly named requirements.txt) helps ensure that your ML project is reproducible.

Why should I use a virtual environment?

A virtual environment isolates the dependencies needed for different projects. This ensures that you can run different projects on your system without conflict.

For example, you might have one project that runs on Python 3.8 and another that runs on Python 3.7. One way to run both projects on the same system is to create a Python 3.8 virtual environment and a Python 3.7 virtual environment.

Besides providing isolation, a virtual environment also lets you install libraries that are only available for certain Python versions.

Set up and use a virtual environment

Use the venv module to create virtual environments in Python. To create a virtual environment named my_env, run:

python -m venv my_env
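
The same environment can also be created from inside Python, which may be convenient in a setup script. The sketch below is a minimal illustration that assumes the standard-library venv.create function. The helper name create_env and the temporary directory are hypothetical choices for this example, and with_pip=False is used only to keep the example quick and offline (the command above installs pip by default).

```python
import tempfile
import venv
from pathlib import Path


def create_env(parent_dir, name="my_env"):
    """Create a virtual environment called `name` inside `parent_dir`.

    `create_env` is a hypothetical helper for illustration only.
    """
    env_dir = Path(parent_dir) / name
    # Assumption: with_pip=False skips bootstrapping pip so the sketch
    # runs quickly and offline. Pass with_pip=True to include pip.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    # A temporary directory keeps the demonstration from leaving files behind.
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_env(tmp)
        # Each environment contains a pyvenv.cfg file describing its base Python.
        print((env_dir / "pyvenv.cfg").read_text())
```

For everyday work, the command-line form shown above is usually all you need.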

Once the environment has been created, you’ll need to activate it. On macOS and Linux, use the source command (on Windows, run my_env\Scripts\activate instead):

source my_env/bin/activate
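
To confirm that a script is actually running inside a virtual environment, one option is to compare sys.prefix with sys.base_prefix, a check described in the Python venv documentation. The following is a small sketch; in_virtual_environment is a hypothetical helper name.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment.

    Hypothetical helper for illustration only.
    """
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter prefix:", sys.prefix)
    print("Virtual environment active:", in_virtual_environment())
```

Note that this checks the interpreter running the script, so run it with the python command from the activated environment.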

After you activate the environment, you can install all the packages you need for the project. These packages will be installed in the environment you just created.


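With this environment active, running pip freeze displays only the libraries installed in the environment. Saving that output to a requirements file (for example, pip freeze > requirements.txt) records the exact versions, so the environment can be recreated later with pip install -r requirements.txt.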
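
To see what a freeze-style listing contains, the sketch below builds similar name==version lines from Python itself using the standard-library importlib.metadata module. It is an approximation for illustration, not a replacement for pip freeze (which, for instance, omits pip itself by default). The function names freeze_lines and write_requirements are hypothetical.

```python
from importlib.metadata import distributions
from pathlib import Path


def freeze_lines():
    """Return sorted 'name==version' lines for the distributions that the
    current interpreter can see. Hypothetical helper; approximates pip freeze.
    """
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


def write_requirements(path="requirements.txt"):
    """Write the pinned lines to `path` and return it as a Path object."""
    target = Path(path)
    lines = freeze_lines()
    target.write_text("\n".join(lines) + ("\n" if lines else ""), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Print the pins instead of writing a file, to avoid side effects.
    for line in freeze_lines():
        print(line)
```

Run inside an activated environment, the listing reflects only that environment's packages, which is what makes the resulting requirements file useful for reproducing the project.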