Creating and Using Private Modules
Modules are the centralised method of accessing different software on an HPC cluster. By using a variety of modules you can quickly and easily access different versions of applications and create workflows that suit particular projects. The modules offered on Apocrita cover a wide range of applications, but there will always be situations that require something unusual or a relatively niche version of a piece of software.
It is often possible for the user who needs such an edge-case application to build it from source. This raises the question of how to run the application without specifying, every time, the exact location of each of its components and associated libraries.
The simple answer is to use a private module. A private module allows you to specify the location of binaries, libraries and any other files required for an application to function. It also allows you to avoid conflicts with other versions of the application and automatically pull in any associated software modules that might be required.
What does a private module contain?
A private module has several sections that are simple to follow. As an example we'll use the module for the application Salmon, version 1.9.0, installed to ~/local within the aaa001 user's home directory (represented as /data/home/aaa001 in examples below).
The very top line lets the system identify the file as a module. Without the #%Module signature on the first line, your module will not work.
The section that follows this is a series of comments describing the module. In the case of our example we have information regarding the application, version, installation date and author, but this is entirely optional.
The two sections that come next are for querying the module and provide information when using the commands module whatis and module help. Neither of these is essential, but they can be very useful when checking module details.
This brings us to the first of the really helpful parts of a module: the conflict line. This line lets you list any other modules that would cause problems for your module, and prevents your module from being loaded while one of them is active. In our example we simply have conflict salmon listed. This means that this module cannot be loaded while any other version of the Salmon module is already loaded.
The module load line is the exact opposite of the conflict section in that it lets you load other modules that are required for your module to run. Our example has "gcc/8.2.0", meaning that this version of GCC will be loaded before Salmon, overriding the standard system version of GCC. You can list as many modules as you need but you should be aware that some modules will also have their own module requirements, so the longer the list, the more complicated the chain of dependencies becomes.
The final section of the module file deals with the environment variables that need to be set. The cluster uses a number of environment variables to define the locations that need to be searched when you run a command. When you load the module, the prepend-path command adds the location you specify to the front of the environment variable listed. In the case of our example, we can see that this adds locations to the PATH, LIBRARY_PATH and LD_LIBRARY_PATH variables, allowing files to be picked up from those locations.
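As a rough illustration of what the prepend-path lines achieve, the sketch below sets the same three variables by hand in plain shell. It is only an approximation: it does not reproduce the bookkeeping the modules system does when a module is unloaded, and SALMON_ROOT is a variable name introduced here purely for readability.

```shell
# Hypothetical variable, used only to avoid repeating the install prefix.
SALMON_ROOT=/data/home/aaa001/local/salmon/1.9.0-gcc_8.2.0

# Put the Salmon directories at the front of each search path.
# The ${VAR:+:$VAR} form avoids a trailing colon when VAR is empty or unset.
PATH="$SALMON_ROOT/bin${PATH:+:$PATH}"
LIBRARY_PATH="$SALMON_ROOT/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"
LD_LIBRARY_PATH="$SALMON_ROOT/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export PATH LIBRARY_PATH LD_LIBRARY_PATH
```

Doing this by hand only affects the current shell session and has to be undone manually, which is the main reason to let a module manage these variables instead.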
The paths you need to set will vary depending on the application, but they can usually be found in the documentation that comes with it. In our module example for Salmon version 1.9.0, binaries are installed in /data/home/aaa001/local/salmon/1.9.0-gcc_8.2.0/bin and libraries in /data/home/aaa001/local/salmon/1.9.0-gcc_8.2.0/lib.
All of the above can simply be written into a text file using an editor of your choice. Your final module should look something like this:
#%Module
###
### Salmon - 1.9.0 Modulefile
### Application installation date: 2022-08-30
### Creator: aaa001
###
proc ModulesHelp { } {
    puts stderr " Adds Salmon to your environment"
    puts stderr " Version 1.9.0 "
}
module-whatis "adds salmon 1.9.0 to your environment"
conflict salmon
module load gcc/8.2.0
#Adding basic variables
prepend-path PATH /data/home/aaa001/local/salmon/1.9.0-gcc_8.2.0/bin
prepend-path LIBRARY_PATH /data/home/aaa001/local/salmon/1.9.0-gcc_8.2.0/lib
prepend-path LD_LIBRARY_PATH /data/home/aaa001/local/salmon/1.9.0-gcc_8.2.0/lib
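Because a missing #%Module signature stops the module from working, it can be worth checking the first line of a file you have just written. The following is a minimal sketch; has_module_signature is a hypothetical helper name, not part of the modules system.

```shell
# Hypothetical helper: succeeds only if the first line of the given file
# starts with the #%Module signature.
has_module_signature() {
    head -n 1 "$1" | grep -q '^#%Module'
}

# Example call (the path is an assumption about where you saved the file):
#   has_module_signature ~/privatemodules/salmon/1.9.0 && echo "signature found"
```

This only checks the signature; it says nothing about whether the rest of the file is valid.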
How do you access a private module?
Private modules need to be placed in your home directory under a sub-directory named privatemodules. Under this directory you can then create a directory for each application, such as Salmon.
mkdir -p ~/privatemodules/salmon
Please be aware that the name of this directory is what you use when you load the module, and that it is case sensitive, so keeping it simple makes day-to-day usage easier.
Under each application directory you can put any number of module files. We recommend keeping the names short as they will also be used when you load the module. Using the version of the application is generally the simplest approach as this allows you to clearly see what you are loading when you list any modules.
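The layout described above can be sketched as a small shell helper that copies a module file into place under a name matching its version. The function name install_private_module and the optional fourth argument are assumptions made for illustration; only the ~/privatemodules location comes from this guide.

```shell
# Hypothetical helper: copy a module file to <base>/<application>/<version>.
#   $1 = module file you wrote    $2 = application name (case sensitive)
#   $3 = version, used as the file name
#   $4 = base directory (optional, defaults to ~/privatemodules)
install_private_module() {
    _base="${4:-$HOME/privatemodules}"
    mkdir -p "$_base/$2" && cp "$1" "$_base/$2/$3"
}

# Example call (salmon.modulefile is a placeholder file name):
#   install_private_module salmon.modulefile salmon 1.9.0
```

With that example call the module file ends up at ~/privatemodules/salmon/1.9.0, so the name used when loading would be expected to be salmon/1.9.0.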
Once this structure has been set up you can then access it via the modules system using the command:
module load use.own
How do I know if my module file is working?
The simplest check you can run is to see if the module is visible via the command:
module avail
This should list the module under the privatemodules section at the very top of your modules list. This only shows that the module file is available, not whether it actually works. To check that it works, load the module using the module load command and then run echo $PATH. This displays the value of the PATH environment variable, part of which should be the path you added in the module.
Please be aware that if your module only sets LIBRARY_PATH or LD_LIBRARY_PATH, you will need to check those variables with the echo command instead.
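Reading a long PATH by eye is error prone, so the check can also be scripted. The sketch below tests whether a directory appears as a complete entry in a colon-separated list; path_list_contains is a hypothetical helper name, and the directory in the example call is the one from the Salmon module above.

```shell
# Hypothetical helper: succeeds if the directory in $2 is one of the
# colon-separated entries of the list in $1 (for example "$PATH").
path_list_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example call, to be run after loading the module:
#   path_list_contains "$PATH" /data/home/aaa001/local/salmon/1.9.0-gcc_8.2.0/bin \
#       && echo "bin directory is on PATH"
```

The same function works for the library variables by passing "$LIBRARY_PATH" or "$LD_LIBRARY_PATH" as the first argument.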