Environment Modules at DIPC¶
At the heart of the Donostia International Physics Center's (DIPC) advanced computing infrastructure is the Lmod utility, a system designed to manage software applications, libraries, and compilers across its systems. This guide walks you through the essential commands, insights, and best practices to help you navigate and optimize your usage of environment modules at DIPC.
Environment modules, managed by Lmod at DIPC, are a powerful tool for managing the computational environment on a UNIX-like operating system. They allow you to dynamically manage environment variables within the same shell session, enabling seamless switching of compilers, applications, path definitions, and more. This flexibility allows you to tailor your environment to your specific computational needs.
Module Commands¶
module help¶
Need a quick refresher on module options? Simply use the module help command:
$ module help
For guidance on a specific module, add the module name after module help:
$ module help modulefile
module avail¶
To list all the modulefiles available for loading, the module avail command comes in handy:
$ module avail
Many modulefiles have version numbers and, where multiple versions exist, one is typically designated as the default. To view hidden modulefiles, use the --show-hidden option:
$ module --show-hidden avail
module spider¶
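The module spider command searches all modulefiles known to the system and reports the available versions of a package, together with information on how to load them, such as any modules that must be loaded first:
$ module spider FFTW
To extend the search to hidden modulefiles, use the --show-hidden option:
$ module --show-hidden spider libGLU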
module display¶
To see what a specific modulefile does to your environment, use module display:
$ module display netCDF/4.4.1-intel-2016b
module load¶
The module load command adds one or more modulefiles to your current environment:
$ module load netCDF/4.4.1-intel-2016b FFTW/3.3.6-intel-2016b
module list¶
To list all loaded modulefiles, use module list:
$ module list
module unload¶
The module unload command removes a modulefile from the environment:
$ module unload intel/2022a
module purge¶
To remove all loaded modulefiles, use module purge:
$ module purge
Example: Load, Execute, and Unload a Module¶
Let's look at a typical scenario where you load a module, execute your program, and then unload the module once you're done. For this example, we'll be using the QuantumESPRESSO software package.
# Load the necessary module
$ module load QuantumESPRESSO/6.8-intel-2020a
# Check the modules currently loaded
$ module list
# Execute your QuantumESPRESSO program, for instance, pw.x
$ pw.x < pw.in > pw.out
# Once you're done, unload the module
$ module unload QuantumESPRESSO/6.8-intel-2020a
In the above steps, we first load the required module 'QuantumESPRESSO'. We then confirm that the module is loaded using the module list command. After executing our QuantumESPRESSO program (in this case, pw.x), we no longer need 'QuantumESPRESSO' loaded, so we unload it.
This example should give you a practical understanding of how to use the module commands in your daily work. Remember, the key is to ensure that you only load the modules you need at a given time and unload them once you're done to maintain a clean environment.
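The load, run, unload pattern above can be wrapped in a small Bash helper. This is a minimal sketch rather than a DIPC-provided tool: run_with_module is a hypothetical function name, and the sketch assumes that the module shell function is available in the current session.

```shell
# Hypothetical helper: load a module, run a command, then unload the module.
# Assumes the Lmod `module` function is defined in the current shell.
run_with_module() {
    local mod=$1
    local rc
    shift
    # Stop early if the module could not be loaded.
    module load "$mod" || return 1
    # Run the remaining arguments as the command and remember its exit status.
    "$@"
    rc=$?
    # Unload even if the command failed, then report the command's status.
    module unload "$mod"
    return "$rc"
}

# Example usage (module name and files taken from the example above):
# run_with_module QuantumESPRESSO/6.8-intel-2020a pw.x < pw.in > pw.out
```

Returning the command's own exit status, rather than that of module unload, keeps the helper usable in job scripts that check whether the calculation succeeded.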
Understanding the Impact of Loading a Module on the Environment¶
When you load a module, it modifies your shell environment to set up the software package it represents. Let's examine some of the environment variables that can be affected, using the module display command.
$ module display QuantumESPRESSO/6.8-intel-2021a
help([[
Description
===========
Quantum ESPRESSO is an integrated suite of computer codes
for electronic-structure calculations and materials modeling at the nanoscale.
It is based on density-functional theory, plane waves, and pseudopotentials
(both norm-conserving and ultrasoft).
More information
================
- Homepage: https://www.quantum-espresso.org
]])
whatis("Description: Quantum ESPRESSO is an integrated suite of computer codes
for electronic-structure calculations and materials modeling at the nanoscale.
It is based on density-functional theory, plane waves, and pseudopotentials
(both norm-conserving and ultrasoft).
")
whatis("Homepage: https://www.quantum-espresso.org")
whatis("URL: https://www.quantum-espresso.org")
conflict("QuantumESPRESSO")
load("intel/2021a")
load("HDF5/1.10.7-iimpi-2021a")
load("ELPA/2021.05.001-intel-2021a")
load("libxc/5.1.5-intel-compilers-2021.2.0")
prepend_path("PATH","/scicomp/EasyBuild/CentOS/7.4.1708/Skylake/software/QuantumESPRESSO/6.8-intel-2021a/bin")
PATH¶
The PATH environment variable is a list of directories that your shell searches when trying to execute a command. When you load a module, it often adds the paths to the software's executables to your PATH. This action allows you to call these executables from anywhere without having to specify their full paths.
For instance, loading the QuantumESPRESSO module adds the path to its executables (like pw.x) to your PATH, enabling you to simply type pw.x instead of its full path.
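To make the effect concrete, the following Bash sketch mimics what a prepend_path("PATH", ...) line in a modulefile does when the module is loaded. It is an illustration only: /opt/demo/bin is a made-up directory, and in practice Lmod performs this step for you.

```shell
# Illustration only: emulate what prepend_path("PATH", dir) does on load.
# /opt/demo/bin is a hypothetical install directory, not a real DIPC path.
demo_bin="/opt/demo/bin"
old_path=$PATH

# Put the new directory at the front so its executables are found first.
PATH="$demo_bin:$PATH"

# The first entry of PATH is now the directory that was just prepended.
first_entry=${PATH%%:*}
echo "first PATH entry: $first_entry"

# Restore the original value so this demonstration leaves no trace.
PATH=$old_path
```

Because the shell searches PATH from left to right, prepending is what lets a module's executables take precedence over system-wide ones with the same name.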
LD_LIBRARY_PATH¶
The LD_LIBRARY_PATH environment variable is a list of directories that the dynamic linker searches for shared libraries when a program starts. When you load a module, it may add the paths to the software's libraries to your LD_LIBRARY_PATH. This action allows the dynamic linker to find these libraries when a program is executed.
For example, loading the QuantumESPRESSO module ensures that the system knows where to find the shared libraries that a QuantumESPRESSO executable needs at run time, including those of the dependency modules it loads, such as HDF5 and ELPA.
Other Environment Variables¶
Modules can also set other environment variables that are specific to the software package. For instance, a module could set variables that point to the software's data files or specify options for the software.
In the case of QuantumESPRESSO, there might be environment variables that point to pseudopotential files or that set options for running QuantumESPRESSO executables.
Unloading a Module¶
When you unload a module, it reverses the changes it made to your environment. This action often involves removing the paths it added from the PATH and LD_LIBRARY_PATH variables and unsetting any environment variables it set.
Maintaining a clean environment by unloading modules when they are no longer needed helps prevent potential conflicts between different software packages and helps ensure that your environment is set up correctly when you load a new module.
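As a rough picture of what this reversal involves, the Bash sketch below removes one directory from a colon-separated list. It is only an illustration: remove_path_entry is a hypothetical function, the paths are made up, and Lmod's own bookkeeping is more involved than this.

```shell
# Illustration only: remove one directory from a colon-separated list,
# similar in spirit to what unloading a module does to PATH.
remove_path_entry() {
    local list=$1 entry=$2 result="" item
    local IFS=:
    for item in $list; do
        # Keep every entry except the one being removed.
        if [ "$item" != "$entry" ]; then
            result="${result:+$result:}$item"
        fi
    done
    printf '%s\n' "$result"
}

remove_path_entry "/opt/demo/bin:/usr/bin:/bin" "/opt/demo/bin"
# prints: /usr/bin:/bin
```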
Importance of Toolchains and Their Usage¶
Toolchains are a critical part of software development and deployment on HPC systems. They represent a set of programming tools, including compilers and libraries, that work together to build applications. Consistency is key in the software building process, and toolchains ensure that the same set of tools is used each time software is built, thus improving reliability and preventing hard-to-debug errors caused by slight variations in tool versions.
Understanding the Modulefile Naming Scheme¶
Modulefiles at DIPC follow a standardized naming convention: name/version-toolchain-toolchain-version. This naming scheme shows at a glance the software's version and the toolchain used to build it. The toolchain version also indicates the software's compatibility with other libraries and dependencies. This enables users to match software with the correct dependencies, ensuring smooth operation.
For example, a module for software Foo might have the name Foo/1.0.0-GCCcore-9.3.0. This would mean:
- The software is called Foo
- The version of the software is 1.0.0
- The toolchain used is GCCcore
- The version of the toolchain is 9.3.0
The components of this module name would correspond to the following parts of the software build and runtime environment:
- name: This identifies the software package. It must be matched with compatible versions of libraries and other dependencies.
- version: This is the version of the software package. It should be used with the corresponding versions of dependencies as specified by the software documentation.
- toolchain: This is the toolchain used to build the software. It must be matched with dependencies that were built with the same toolchain.
- toolchain-version: This is the version of the toolchain. The toolchain version further refines the compatibility requirements for dependencies.
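The naming scheme can be taken apart mechanically. The following Bash sketch splits a module name into the four components described above. parse_module_name is a hypothetical helper, and it assumes that neither the toolchain name nor the toolchain version contains a hyphen, so a name such as libxc/5.1.5-intel-compilers-2021.2.0 would need extra handling.

```shell
# Sketch: split "name/version-toolchain-toolchain-version" into its parts.
# Assumes neither the toolchain name nor its version contains a hyphen.
parse_module_name() {
    local full=$1
    local name="${full%%/*}"        # text before the first "/"
    local rest="${full#*/}"         # "version-toolchain-toolchain-version"
    local tc_version="${rest##*-}"  # text after the last "-"
    rest="${rest%-*}"               # drop "-toolchain-version"
    local toolchain="${rest##*-}"   # text after the (new) last "-"
    local version="${rest%-*}"      # everything before "-toolchain"
    printf 'name=%s version=%s toolchain=%s toolchain_version=%s\n' \
        "$name" "$version" "$toolchain" "$tc_version"
}

parse_module_name "Foo/1.0.0-GCCcore-9.3.0"
# prints: name=Foo version=1.0.0 toolchain=GCCcore toolchain_version=9.3.0
```

Parsing from the right-hand side keeps software versions that themselves contain hyphens intact.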
This applies to subtoolchains as well. For instance, if a package Bar is built with the gompi subtoolchain (which is a part of the main foss toolchain), it might be named Bar/2.5.1-gompi-2020b. Here 2020b is the version of the gompi subtoolchain, which is compatible with the foss toolchain version 2020b because it's part of the same software suite released in the second half of 2020 (hence 2020b).
Similarly, if another package Baz is compiled using the iimpi subtoolchain of intel, it might be named Baz/3.3.3-iimpi-2021a. This would be compatible with the intel toolchain version 2021a, because iimpi is part of the intel software suite released in the first half of 2021 (hence 2021a).
This way, users can ensure that they select modules that are mutually compatible. For example, if a user wanted to load modules for software Foo and Bar that belong to the same toolchain generation, here GCCcore version 10.2.0 and the foss toolchain version 2020b (which is based on GCCcore 10.2.0 in the standard EasyBuild toolchain definitions), they would select the following modules:
Foo/1.0.0-GCCcore-10.2.0
Bar/2.5.1-gompi-2020b
Even though Foo and Bar were built with different toolchains, they are likely to be compatible with each other and with other dependencies because gompi is a subtoolchain of foss and both share the same GCCcore base. The Foo/1.0.0-GCCcore-9.3.0 build from the naming example above would instead pair with modules built with foss or gompi version 2020a, which is based on GCCcore 9.3.0.
It's also important to note that if a user attempts to load modules that were built with different toolchains or toolchain versions, there could be compatibility issues. For instance, a user attempting to load Foo/1.0.0-GCCcore-9.3.0 and Baz/3.3.3-iimpi-2021a could encounter problems, as Baz was built with a different toolchain (intel through iimpi) and a different toolchain version (2021a).
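A quick sanity check along these lines can be scripted. The Bash sketch below treats two modules as belonging to the same toolchain generation only when their main toolchain and toolchain version agree. Both function names are hypothetical, and the check is deliberately conservative: it does not know which GCCcore version underlies a given foss or intel release, so it reports GCCcore-based modules as different from foss-based ones even when they are in fact compatible.

```shell
# Map a subtoolchain to the main toolchain it belongs to (per this guide).
toolchain_family() {
    case "$1" in
        foss|gompi)  echo foss ;;
        intel|iimpi) echo intel ;;
        *)           echo "$1" ;;
    esac
}

# Succeed only if both modules use the same toolchain family and version.
# Conservative: it does not know which GCCcore version underlies a toolchain.
same_toolchain_generation() {
    local a="${1#*/}" b="${2#*/}"           # strip the "name/" prefix
    local ver_a="${a##*-}" ver_b="${b##*-}" # toolchain versions
    a="${a%-*}"
    b="${b%-*}"
    local fam_a fam_b
    fam_a=$(toolchain_family "${a##*-}")
    fam_b=$(toolchain_family "${b##*-}")
    [ "$fam_a" = "$fam_b" ] && [ "$ver_a" = "$ver_b" ]
}

if same_toolchain_generation "Bar/2.5.1-gompi-2020b" "Baz/3.3.3-iimpi-2021a"; then
    echo "same toolchain generation"
else
    echo "different toolchain generation"
fi
# prints: different toolchain generation
```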
List of main toolchains used at DIPC¶
The primary toolchains used are foss and intel. These toolchains consist of subtoolchains that include different sets of compilers and libraries.
- foss: The foss toolchain is a free and open-source software toolchain that includes GCC (GNU Compiler Collection), OpenMPI for MPI support, OpenBLAS (optimized BLAS library), LAPACK linear algebra routines, ScaLAPACK parallel linear algebra routines, and FFTW for Fast Fourier Transforms.
- intel: The intel toolchain is a proprietary software toolchain that includes the Intel compilers (icc, ifort), Intel MPI, and Intel MKL (imkl).
The relationship between the main toolchains and their subtoolchains can be visualized as follows:
- foss: This toolchain incorporates the subtoolchains gompi and GCC.
- gompi: The gompi toolchain includes GCC and OpenMPI.
- GCC: The GCC toolchain is also a subtoolchain of foss and is used on its own. It includes the GCC compilers only.
- intel: This toolchain incorporates the subtoolchain iimpi.
- iimpi: The iimpi toolchain includes the Intel compilers and Intel MPI.
- GCCcore: GCCcore is a lower-level subtoolchain that only includes the GCC compilers and is used as a base for the other toolchains, i.e., both foss and intel (through their subtoolchains) are built on top of GCCcore.
Here's a table providing a summary of the toolchains and their components:
Toolchain | Components |
---|---|
foss | GCC, OpenMPI, OpenBLAS, LAPACK, ScaLAPACK, FFTW |
intel | Intel Compilers (icc, ifort), Intel MPI, Intel MKL |
gompi | GCC, OpenMPI |
GCC | GCC Compilers |
gfbf | GCC, FlexiBLAS, FFTW |
iimpi | Intel Compilers, Intel MPI |
GCCcore | GCC Compilers |
Remember that the versions of the components may change depending on the specific toolchain version you're using.
Hidden Modules: An Insight¶
By default, some modulefiles are hidden from the standard module avail or module spider command. This approach is often employed to keep the list of available modules concise and to avoid potential confusion with deprecated or less frequently used modules.
However, these hidden modules can contain valuable tools or software versions. Therefore, if a specific software package seems absent, you may want to check the hidden modules using the --show-hidden option. For instance:
$ module --show-hidden avail
Hidden modules are treated the same as regular modules once loaded into your environment. They can provide essential functionality that may not be available in the visible modules.
Customizing Module Behavior¶
The behavior of the module system is highly customizable, allowing you to tailor it to your specific needs. A key feature of this customizability is the ability to set the MODULEPATH environment variable, which determines the directories that the module command will search for modulefiles.
By default, the MODULEPATH is set to a system directory, which includes modulefiles for all the software installed on the system. However, you may wish to add your personal module directories to the MODULEPATH. This can be useful if you have installed your own software or if you want to maintain a separate environment for a specific project.
To add a directory to the MODULEPATH, use the following command:
$ module use /path/to/your/modulefiles
Now, when you use module avail, it will also list the modulefiles located in /path/to/your/modulefiles.
It's important to note that this change is not permanent and will be reset when you start a new shell session. To make it permanent, you need to add the command to your shell's startup file (e.g., .bashrc or .bash_profile for the Bash shell).
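One way to do this defensively is sketched below. It is only a suggestion: use_personal_modules is a hypothetical function name and $HOME/modulefiles is an assumed location, so adjust both to your own setup. The guard keeps the startup file harmless on machines where the module command or the directory does not exist.

```shell
# Sketch for ~/.bashrc: register a personal modulefile directory, if present.
# "$HOME/modulefiles" is a hypothetical location; adjust it to your own layout.
use_personal_modules() {
    local dir="${1:-$HOME/modulefiles}"
    # Only act when the module command exists and the directory is there.
    if command -v module >/dev/null 2>&1 && [ -d "$dir" ]; then
        module use "$dir"
    fi
}

use_personal_modules
```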
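Additionally, you can unload all currently loaded modules with the module purge command. Note that module purge only unloads modules: directories that you added with module use generally remain in your MODULEPATH for the rest of the session. To remove such a directory again, use the module unuse command with the same path.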
In this way, you can customize your module environment to fit your needs, making the module system a flexible and powerful tool for managing your software environment on DIPC's HPC systems.