
Welcome to the QMUL HPC blog

Sizing your Apocrita jobs for quicker results

A typical HPC cluster is full most of the time. This is not such a bad thing, since it means the substantial investment is working hard for the money rather than sitting idle. The less welcome side effect is having to wait too long to get your research results. However, jobs are constantly starting and finishing, and many new jobs run shortly after being added to the queue. If your resource requirements are niche or very large, you will be competing with other researchers for a scarcer resource. Whatever sort of jobs you run, it is important to choose resources carefully in order to get results sooner. For example, requesting fewer cores increases the run time, but may result in a much shorter queuing time.
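The trade-off between queuing time and run time can be sketched with a small calculation. This is only an illustration: the Amdahl's-law scaling model, the `parallel_fraction` value, and the queue waits below are all assumptions made up for the example, not measurements from Apocrita.

```python
def runtime_hours(serial_hours: float, cores: int, parallel_fraction: float = 0.9) -> float:
    """Estimate run time on `cores` cores using Amdahl's law (an assumed model)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return serial_hours * ((1.0 - parallel_fraction) + parallel_fraction / cores)


def time_to_result(queue_wait_hours: float, serial_hours: float, cores: int,
                   parallel_fraction: float = 0.9) -> float:
    """Total time from submission to result: queue wait plus estimated run time."""
    return queue_wait_hours + runtime_hours(serial_hours, cores, parallel_fraction)


# Hypothetical scenarios: (cores requested, guessed queue wait in hours).
scenarios = [(4, 0.5), (48, 12.0)]
for cores, wait in scenarios:
    total = time_to_result(wait, serial_hours=20.0, cores=cores)
    print(f"{cores} cores: about {total:.1f} h from submission to result")
```

With these invented numbers the smaller request finishes first overall, even though the job itself runs for longer. Real queue waits vary, so treat this as a way of thinking about the problem rather than a prediction.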

Performance testing with NVMe storage and Spectrum Scale 5

We have recently procured 120TB of NVMe-based SSD storage from E8 Storage for the Apocrita HPC Cluster. The plan is to deploy it as a replacement for our oldest and slowest scratch storage. We have been testing the new storage extensively, as we expect it to offer new possibilities and advantages within the cluster.

What is the ITSR RSE team?

ITS Research has a Research Software Engineering team. This post introduces the team and how it supports research at Queen Mary University of London. It also explains how to contact the team and why you might want to.

Cluster Hardware Upgrades and Additions

As part of our commitment to regular upgrades to the HPC service, and to keep up with ever-growing demand, we are pleased to announce the addition of new hardware to the Apocrita HPC Cluster for the benefit of all QMUL researchers.

Short queue

In addition to the primary queue, there is a queue designed to minimise waiting times for short jobs and interactive sessions, in response to users who requested the ability to quickly obtain qlogin sessions for quick tests and debugging. This short queue runs on a wider selection of nodes and is automatically selected if your runtime request is 1 hour or less.
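As a rough illustration of the one-hour rule, the sketch below checks whether a runtime request would fall within the short queue limit. It assumes the request is written either as `H:M:S` or as a plain number of seconds, in the style of a Grid Engine `h_rt` value; the function names are hypothetical and the scheduler's own logic may differ.

```python
SHORT_QUEUE_LIMIT_SECONDS = 3600  # "1 hour or less", as stated in the post


def runtime_to_seconds(request: str) -> int:
    """Convert an 'H:M:S' string or a plain seconds string to seconds."""
    parts = [int(p) for p in request.strip().split(":")]
    if len(parts) == 1:
        return parts[0]
    if len(parts) == 3:
        hours, minutes, seconds = parts
        return hours * 3600 + minutes * 60 + seconds
    raise ValueError(f"unrecognised runtime format: {request!r}")


def qualifies_for_short_queue(request: str) -> bool:
    """Return True if the requested runtime is one hour or less."""
    return runtime_to_seconds(request) <= SHORT_QUEUE_LIMIT_SECONDS


for request in ("0:30:0", "1:0:0", "1:0:1", "24:0:0"):
    print(request, qualifies_for_short_queue(request))
```

If a test or debugging session genuinely fits in an hour, requesting no more than that keeps it eligible for the short queue.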

Deprecated modules

We removed some problematic module files. Please check your job scripts for use of these modules:

  • Python: Due to a number of issues with the module installs of Python, versions older than 2.7.14 and 3.6.3 are being removed from Apocrita (python/2.7.13, python/2.7.13-1, python/2.7.13-3, python/3.6.1, python/3.6.2, python/3.6.2-2).
  • Java: version java/1.8.0_121-oracle causes problems with mass thread spawning on the cluster and will be removed. java/1.8.0_152-oracle will remain the default version loaded.
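
One way to check a job script for the modules listed above is a short scan like the following. It is a minimal sketch that assumes modules are loaded on lines of the form `module load NAME` or `module add NAME`; the function name and the example script are hypothetical.

```python
# Module names taken from the list above.
DEPRECATED_MODULES = {
    "python/2.7.13", "python/2.7.13-1", "python/2.7.13-3",
    "python/3.6.1", "python/3.6.2", "python/3.6.2-2",
    "java/1.8.0_121-oracle",
}


def find_deprecated_modules(script_text: str) -> list[tuple[int, str]]:
    """Return (line number, module name) pairs for deprecated modules in a job script."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        # Drop anything after '#', so comments and scheduler directives are ignored.
        tokens = line.split("#", 1)[0].split()
        if len(tokens) >= 3 and tokens[0] == "module" and tokens[1] in ("load", "add"):
            for name in tokens[2:]:
                if name in DEPRECATED_MODULES:  # exact match only
                    hits.append((lineno, name))
    return hits


example_script = """#!/bin/bash
#$ -cwd
module load python/3.6.2
module load java/1.8.0_152-oracle
"""
print(find_deprecated_modules(example_script))  # [(3, 'python/3.6.2')]
```

Matching whole module names, rather than substrings, avoids false alarms on versions that are staying, such as java/1.8.0_152-oracle.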