How to: Uppmax

starting off with Uppmax and Rackham

UPPMAX (Uppsala Multidisciplinary Center for Advanced Computational Science) is Uppsala University’s resource of high-performance computers, large-scale storage and know-how of high-performance computing (HPC). They provide resources for the Swedish Research Community. Their server for non-sensitive data is called Rackham.

Rackham End of Life

Unfortunately Rackham’s end of life has been announced. While researchers can still use this amazing resource until the end of 2024, we have to prepare ourselves for a big change. Fortunately, we expect that what you learn from this guide will still be applicable to its recommended replacement, Dardel in 2025.

UPPMAX Training

UPPMAX offers regular training. For a regularly updated list, please see here.

Applying for an account at NAISS SUPR

In order use the resources at Uppmax, you need to apply for an account. You need to apply for an account at NAISS SUPR.

You will also use this page to log on when appying for projects.

Appyling for projects at UPPMAX

For an extensive guide of how to start with UPPMAX have a look at their getting started guide.

The following is a small rundown of this information based on our experience.

More often than not, small compute and small storage projects are enough for the bioinformatics projects we see at SLU. You must apply for a storage project alongside a compute project if you are planning to do any analyses.

The administrators at NAISS approve projects at least once a month, but often more frequently.

When you have logged in to SUPR, there will be a list of options in the panel on the left, select ‘Proposals’ to see your proposals, scroll down to ‘Rounds and ’Resources’, and click on ‘Go to Rounds’ to get to the application.

From here you can apply for computing and storage projects.

Computing Projects

Select Compute Rounds followed by NAISS Small Compute

You will then select Create New Proposal for NAISS Small Compute 2024

We generally recommend asking for 2 or 3 x1000 core hours per month. This can always be increased later if you realise that you exceed this boundary, but that is quite rare.

Storage Projects

Select Go to Storage Rounds followed by NAISS Small Storage

Here you will have to request the size of the project folder as well as the number of files you will be storing.

As a general rule of thumb:

For folder size: multiply the size of the data you are uploading by 5-7, depending on the analysis you will be doing. If the size of your files decreases with each step of your pipeline (and you remove large intermediate files during your workflow) as with variant calling for example, a factor of 5 is usually more than enough. If you are performing multiple genome alignments from de novo assemblies, for example, your file sizes don’t decrease significantly between each step of the pipeline, then you should multiply by a higher factor of 7 (or higher, depending on what you will be doing).
For number of files: multiply the number of files you are starting out with by 1000. If you are using Nextflow workflows, this number may have to be a bit higher (but should not exceed 5000) as the number of intermediate files produced here is higher due to parallelisation.

In short, consider the analyses you will be doing, what kind of data you are producing, and how big the files are. With these two very general rules, it should be easier to decide on these parameters when applying for projects!

When your storage and compute projects have been approved, you will be able to upload data from your computer, or download from data delivery systems. If you are downloading from SciLifeLab, please see their guide.

Your approved projects will be located under /proj/naiss202x_xy_xyz where xyz are the project identifiers provided to you by UPPMAX. Here it is important to note that your project will be called “NAISS 2024/23-xyz” on the SUPR page, but the projects follow the above naming convention on the server.

After your storage and compute projects have been approved, you can navigate to your folder in the /proj/ directory and upload your data in your storage project folder. DO NOT upload your data to your user home directory as that directory is only intended for things you will use for every project and space is limited.

Logging in for the first time

Please see our guide or a more comprehensive version from the staff of Uppmax.

Loading modules

Rackham uses modules. This ensures that the path is not flooded, and allows for multiple versions of software to be installed simultaneously.

To see all available modules type module avail
If you want to see the modules that match your search term type module spider tool_name. This is not case specific, so you could search “fastqc” to see all modules that match the string “fastqc” (without the inverted commas!)
To load a module type module load tool_name from the module spider results. This is case specific, so be sure to include version numbers, capital letters, and any punctuation present.
A good habit to get into is to run module load bioinfo-tools the moment you log on to ensure that all modules contained within bioinfo-tools are available to load
You can load multiple modules at once by simply listing them, e.g.: module load bioinfo-tools blast/2.14.1+

Check out the list of list of software installed on Uppmax.

To learn more about the module system, have a look at the guide on UPPMAX.

Slurm

UPPMAX uses a job manager called Slurm. Please be sure to read their SLURM user guide before you start working on UPPMAX! The following is simply a short overview of the most commonly used functions by SLUBI, and not an exhaustive list:

To use Slurm, the project ID from your compute project is used for the -A flag. Your storage project ID is where you store data. Here, the / and spaces in the name are replaced by - as you can see below.

More usefull flags and options:

Time flags are in the format of HH:MM:SS
To run an interactive session use interactive -A naiss202x-xy-xyz -t 2:00:00 to launch an interactive session of 2 hours. Time can be changed according to your needs
If you need to run a batch script, set it up so that the information in the UPPMAX guide is in the header with your project number. You submit your script with the sbatch script.sh command
To view the jobs active for your user jobinfo -u your_username
To cancel a job scancel jobnumber. You can get the jobnumber from the jobinfo command above

UPPMAX in a can

UPPMAX in a can is a wonderful resource produced by the scientists at UPPMAX that allows users to run UPPMAX on their local servers.