Using Python requirements.txt

How To Use the File requirements.txt in Python

Have you ever spent hours debugging a Python program, only to realize that you hadn’t installed all the necessary modules?

In this tutorial, we will learn how to tackle this issue using Python’s requirements.txt file.

What Goes Into a requirements.txt File?

Python programs depend heavily on modules. If you install the wrong version of one of the modules your program needs, you may run into errors when you run it.

This is where the requirements.txt file comes in.

A requirements.txt file lists the modules and packages a Python project needs, together with their versions. It tells people who read the project’s code which modules are necessary to run it, and it simplifies the installation of those modules.
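To make the format concrete, here is a minimal sketch that reads requirements-style text and splits each pinned line into a name and a version. The function name parse_pins and the sample text are made up for illustration, and the sketch only understands the simple name==version form, not the full syntax that pip accepts.

```python
def parse_pins(text):
    """Return {name: version} for lines of the form name==version."""
    pins = {}
    for raw_line in text.splitlines():
        # Drop inline comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "==" not in line:
            continue  # skip blanks, comments, and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


# Hypothetical file contents, used only for this example.
sample = """\
# example requirements file
numpy==1.23.4
pandas==1.5.1
"""
print(parse_pins(sample))
```

Each entry pairs a package name with the exact version that should be installed, which is the information the rest of this tutorial relies on.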

We will see that you can quickly install and update all modules and packages specified in a requirements file using one command.

Mastering how to create and read a requirements file in Python is a must for every programmer, and we will crack the requirements.txt file in this tutorial!

Before starting with this tutorial, create a directory on your machine, and inside it create a simple Python program called requirements_test.py that contains the following code:

import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# create a 3x3 array with numpy
X_train = np.array([[ 1., -1.,  2.],
                    [ 2.,  0.,  0.],
                    [ 0.,  1., -1.]])

# fit this array to StandardScaler
scaler = StandardScaler().fit(X_train)

# transform X_train with this scaler
X_scaled = scaler.transform(X_train)

# Finally, convert this scaled array into a pandas dataframe
df = pd.DataFrame(X_scaled)

# Print the dataframe
print(df)

How Do You Create a Python requirements.txt File?

In Python, the requirements.txt file is generally used in a virtual environment. Before creating a requirements.txt file, we will learn how to create a virtual environment.

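Python 3 includes the venv module in its standard library, so you don’t need to install anything to create a virtual environment. (The separate virtualenv package, which you can install with pip install virtualenv, is an alternative tool and is not required for the commands in this tutorial.)

Open the terminal and create a virtual environment named .sample_env in the same directory as our Python program using the command below: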

python -m venv .sample_env

At this point, you can see that .sample_env has been created in the current working directory.

The command to activate the virtual environment depends on the operating system. On Windows, you can use the following:

.sample_env\Scripts\activate

whereas on Linux or macOS, you need to run the command:

. .sample_env/bin/activate

You can confirm that the virtual environment .sample_env is activated because the command prompt is now prefixed with (.sample_env), as shown below:

$ . .sample_env/bin/activate
(.sample_env) # 
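
You can also check this from Python itself. The following is a minimal sketch, assuming an environment created with the venv module (older releases of the separate virtualenv tool may behave differently); the function name in_virtual_environment is made up for illustration.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment
    # directory while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
print("interpreter prefix:", sys.prefix)
```

If the first line printed says False, the environment is probably not activated in the current terminal session.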

Before executing the program we created in the first part, don’t forget to install the modules using the command:

pip install module_name

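In this case, now that our virtual environment is active, we have to run the following commands one by one. Note that scikit-learn is installed under the name scikit-learn, even though it is imported as sklearn:

pip install numpy
pip install pandas
pip install scikit-learn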

You can now execute the Python program we created at the beginning of this tutorial:

(.sample_env) # python requirements_test.py                 
          0         1         2
0  0.000000 -1.224745  1.336306
1  1.224745  0.000000 -0.267261
2 -1.224745  1.224745 -1.069045

We are now ready to create the requirements file. The pip3 freeze command lists the installed modules with their versions.

We can redirect the output of the pip3 freeze command to a requirements.txt file using the “>” symbol:

pip3 freeze > requirements.txt
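
For illustration, a rough approximation of what the freeze command reports can be written with the standard library’s importlib.metadata module (available from Python 3.8). This is a sketch, not pip’s actual implementation: the function name freeze_like is hypothetical, and pip’s own output can differ in its details.

```python
from importlib import metadata


def freeze_like():
    """Return sorted 'name==version' lines for installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


for line in freeze_like():
    print(line)
```

Run inside the activated virtual environment, this prints one name==version line per installed package, in the same format as the requirements file.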

Verify that the list of modules required by our sample Python program is inside the requirements.txt file.

Here is the list of modules I see in the requirements.txt file:

joblib==1.2.0
numpy==1.23.4
pandas==1.5.1
python-dateutil==2.8.2
pytz==2022.5
scikit-learn==1.1.3
scipy==1.9.3
six==1.16.0
sklearn==0.0
threadpoolctl==3.1.0

We have just created a requirements file with all the information we need to execute our program.

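The pip3 freeze command also lists the modules that the main modules depend on. The Pandas, NumPy, and scikit-learn modules depend on other modules, such as python-dateutil, scipy, and joblib, that we also see in the requirements file. The sklearn==0.0 entry is a placeholder package from PyPI rather than the library itself; it shows up only when the sklearn name is passed to pip install, and it is absent if you install scikit-learn by name.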

Later we will see how you can list only the modules we imported in the project using pipreqs!

Note: the requirements file is essential for your project, so don’t forget to commit it and push it to your code repository.

How to Add a Python Module to the requirements.txt File

The first way to add Python modules to a requirements file is, of course, to edit the requirements.txt file and manually add the modules with their versions.

However, this method is not advised because it is error-prone and hard to manage when a project imports many modules.

Instead, we will introduce pipreqs, which will automatically scan a project and list all the modules and their versions for you!

First, install pipreqs:

pip install pipreqs

Then run pipreqs against the current working directory in which our Python program is located.

pipreqs . --ignore .sample_env --force

Notice that we are using the --ignore flag to make sure pipreqs ignores the virtual environment directory.

Also, we are passing the --force flag to overwrite any existing requirements.txt file.

You will then see a new requirements.txt file created in the current directory after seeing the following output in the command line:

INFO: Successfully saved requirements file in ./requirements.txt

You can now check that the requirements.txt file created in your folder contains module names and versions.

On my machine, the requirements.txt file looks like this:

numpy==1.23.4
pandas==1.5.1
scikit_learn==1.1.3

Note that the version numbers may differ if you follow this tutorial on a different system.

And if you don’t use the --force flag and the requirements file already exists, you will see the following warning:

WARNING: requirements.txt already exists, use --force to overwrite it

It is worth pointing out some of the differences between pip freeze and pipreqs here.

The pip freeze command reports the packages installed in the current environment, whereas pipreqs works from the import statements it finds in the project’s source files.

As a result, pip freeze saves all installed packages regardless of whether the project uses them, while pipreqs lists only the packages that the project imports.
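
The idea of working from import statements can be sketched with the standard library’s ast module. This is only an illustration of the general approach, not the way pipreqs is implemented, and the function name imported_top_level_modules is hypothetical.

```python
import ast


def imported_top_level_modules(source):
    """Return the set of top-level module names imported by the source code."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports such as "from . import helpers".
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


# The import lines of requirements_test.py from this tutorial.
program = """
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
"""
print(sorted(imported_top_level_modules(program)))
```

Note that the scan finds the import name sklearn, while the package is published as scikit-learn, so a real tool also has to translate import names into package names and look up their versions.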

How to Install Python Modules Using Pip and requirements.txt

Now that we know how to create a requirements file, let’s see how to install the Python modules specified in it.

Remember that you can install a module using the pip install command:

pip install module_name

But now that we have created a requirements.txt file, we can pass it to pip and install all the modules with a single command:

pip install -r requirements.txt
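
If you want to trigger the same installation from a Python script, one option is to run pip as a subprocess of the current interpreter, so the modules land in the environment that is running the script. A minimal sketch; the function names pip_install_command and install_requirements are made up for this example.

```python
import subprocess
import sys


def pip_install_command(requirements_path="requirements.txt"):
    """Build the pip command for the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements_path)]


def install_requirements(requirements_path="requirements.txt"):
    """Run pip and raise CalledProcessError if the installation fails."""
    subprocess.run(pip_install_command(requirements_path), check=True)


# Only the command is printed here; call install_requirements() to run it.
print(" ".join(pip_install_command()))
```

Using sys.executable rather than a bare pip avoids accidentally calling a pip that belongs to a different Python installation.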

Let’s do the following to test the installation of the modules using the requirements file:

  • deactivate the virtual environment.
  • delete the virtual environment directory.
  • recreate the virtual environment.
  • activate the new virtual environment.

(.sample_env) # deactivate
$ rm -fr .sample_env
$ python -m venv .sample_env
$ . .sample_env/bin/activate

If you execute the Python program now, you will see the following error:

(.sample_env) # python requirements_test.py 
Traceback (most recent call last):
  File "requirements_test.py", line 1, in <module>
    import numpy as np
ModuleNotFoundError: No module named 'numpy'

The ModuleNotFoundError occurs because the NumPy module is not installed yet.
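
Rather than discovering missing modules one traceback at a time, you can ask Python which of them it can find. The sketch below uses importlib.util.find_spec from the standard library; the function name missing_modules is hypothetical, and the list of names is taken from the imports in requirements_test.py.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found here."""
    return [name for name in module_names if find_spec(name) is None]


# Import names used by requirements_test.py.
needed = ["numpy", "pandas", "sklearn"]
missing = missing_modules(needed)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All modules are available.")
```

In the freshly recreated environment this would report the modules that still need to be installed.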

Now execute the command to install all the modules our Python program depends on using the requirements.txt file.

(.sample_env) # pip install -r requirements.txt
Collecting numpy==1.23.4
  Using cached numpy-1.23.4-cp38-cp38-macosx_10_9_x86_64.whl (18.1 MB)
Collecting pandas==1.5.1
  Using cached pandas-1.5.1-cp38-cp38-macosx_10_9_x86_64.whl (11.9 MB)
Collecting scikit_learn==1.1.3
  Using cached scikit_learn-1.1.3-cp38-cp38-macosx_10_9_x86_64.whl (8.6 MB)
Collecting python-dateutil>=2.8.1
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting pytz>=2020.1
  Using cached pytz-2022.5-py2.py3-none-any.whl (500 kB)
Collecting scipy>=1.3.2
  Using cached scipy-1.9.3-cp38-cp38-macosx_10_9_x86_64.whl (34.2 MB)
Collecting joblib>=1.0.0
  Using cached joblib-1.2.0-py3-none-any.whl (297 kB)
Collecting threadpoolctl>=2.0.0
  Using cached threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
Collecting six>=1.5
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: numpy, six, python-dateutil, pytz, pandas, scipy, joblib, threadpoolctl, scikit-learn
Successfully installed joblib-1.2.0 numpy-1.23.4 pandas-1.5.1 python-dateutil-2.8.2 pytz-2022.5 scikit-learn-1.1.3 scipy-1.9.3 six-1.16.0 threadpoolctl-3.1.0

The modules have been installed successfully.
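
As an optional check, you can compare the pinned versions with what is actually installed. This is a minimal sketch using importlib.metadata from the standard library (Python 3.8 or later). The function name check_pins is hypothetical, only name==version lines are handled, versions are compared as plain strings, and how package names are matched may vary between Python versions.

```python
from importlib import metadata


def check_pins(requirements_text):
    """Return (name, pinned, installed) for every pin that is not satisfied."""
    problems = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if "==" not in line:
            continue  # this sketch only understands name==version lines
        name, pinned = (part.strip() for part in line.split("==", 1))
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != pinned:
            problems.append((name, pinned, installed))
    return problems


# The pins generated by pipreqs earlier in this tutorial.
pins = "numpy==1.23.4\npandas==1.5.1\nscikit_learn==1.1.3\n"
for name, pinned, installed in check_pins(pins):
    print(f"{name}: pinned {pinned}, installed {installed}")
```

If nothing is printed, every pinned package is installed at the pinned version.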

Now, let’s execute our test program:

(.sample_env) # python requirements_test.py
          0         1         2
0  0.000000 -1.224745  1.336306
1  1.224745  0.000000 -0.267261
2 -1.224745  1.224745 -1.069045

It works!

Conclusion

In this tutorial, we’ve learned the critical role the requirements.txt file plays in Python. It is important to know which modules and versions a project uses, and the requirements.txt file tracks that information.

We’ve also learned how to create a requirements file and how to install all modules and packages in a requirements file with a single command.
