Advanced Cluster Course (for CRG staff only) 2024
This course consolidates the material presented in the beginner cluster course and expands on the concepts you need to be aware of when optimizing your use of the cluster.
The main message of the course is to embrace the parallelism available within the cluster: pipelines should be built from many small, independent pieces spread across the cluster rather than large, monolithic, long-running jobs that run on a single node. The course will show why this should be done and how to achieve it.
Topics that will be addressed:
- Supercomputers, Beowulf clusters, horizontal vs. vertical scaling
- Hardware considerations
- Multithreaded jobs, parallelism, Amdahl's Law
- Job arrays
- Job dependencies
- Building a pipeline
- Storage issues, treemap
- Job stats, resource estimation
- Scaling analysis
What NOT to expect:
Specific bioinformatics methods or pipeline builders (Nextflow, Snakemake, etc.)
Program:
Session 1
- Supercomputers, Beowulf clusters, horizontal vs. vertical scaling
- Hardware considerations
- Multithreaded jobs, parallelism, Amdahl's Law
- Job arrays
- Job dependencies
Session 2
- Building a pipeline
- Storage issues, treemap
Session 3
- Job stats, resource estimation
- Scaling analysis
Instructors: Emyr James (Head of SIT), with co-trainers Rodny Hernandez, Gabriel Gonzalez, Luis Exposito and Clemente Borges
Dates: 9th, 10th and 16th of April 2024 (10:30-13:30)
Location: Bioinformatics room, CRG Training Centre
Maximum number of participants: 18 (CRG staff only)
Level: Intermediate to Advanced
Registration deadline: 24th of March 2024
Registration HERE
For further information, please email the CRG Training and Academic Office (TAO): training@crg.eu