Skip to content

2025

The next era for Apocrita is here

For much of the year we have been working on a major project to upgrade Apocrita to a new operating system, (Rocky Linux 9, hereafter known as Rocky 9). As part of the project, we have deployed a new package building tool to help us recompile all of the research applications to work on the new system. We are now draining the cluster in batches to upgrade to Rocky 9, so now is a good opportunity to move across if you haven't already, to avoid longer queueing times as we reduce the number of remaining CentOS 7 nodes.

A PyTorch DDP Case Study With ImageNet

In this blog post, we will play about with neural networks, on a dataset called ImageNet, to give some intuition on how these neural networks work. We will train them on Apocrita with DistributedDataParallel and show benchmarks to give you a guide on how many GPUs to use. This is a follow on from a previous blog post where we explained how to use DistributedDataParallel to speed up your neural network training with multiple GPUs.