Containers and Workflow Pipelines for reproducible and automated data analysis
11/06/2020 to 12/06/2020
CRG Training Center
Dates: 11th & 12th June 2020 (change of dates)
Time: 09:30-17:00h
Trainers: CRG Bioinformatics core facility
Location: CRG Training Center (PRBB Patio)
Maximum number of attendees: 19
Registration deadline: 25th May 2020
Description of the course
The first day is dedicated to Linux containers (Docker & Singularity), which are great tools for code portability and analysis reproducibility. You will learn how to build a container from scratch, share it with others, and re-use and modify existing containers.
On the second day, you will learn how to use Nextflow to build scalable and reproducible bioinformatics pipelines and run them on a personal computer, a cluster or the cloud.
The two course days are followed by a 10-day hackathon, during which teams will build real pipelines on topics of interest, and then by a final day of hackathon follow-up and troubleshooting.
Objectives
Containers
- Learn the concepts behind Docker & Singularity containers and the differences between them
- Write a Docker recipe, build a Docker image and run containers from it
- Pull Docker images from, and push them to, Docker Hub
- Dockerfiles and layers; Docker caching
- Working with volumes
- Pull Docker containers as Singularity images
Pipelines
- Understand Nextflow's basic concepts: processes, channels, ...
- Write and run a Nextflow pipeline (using Singularity containers)
Programme:
Day 1: Docker containers
09:00 - 09:30 Introduction to containers
- History of containers, what are containers and why should we use them?
- Containers vs. virtual machines
13:00 - 17:00 Singularity Containers
- Differences between Singularity and Docker: why and when to use one or the other. Pros and cons.
- Singularity recipes
- Building a basic Singularity image
- Pull an image from Docker Hub and run it with Singularity
- Volumes in Singularity
- Use a Singularity image interactively
Day 2:
09:00 - 17:00 Nextflow pipelines
- Run a simple Nextflow pipeline and obtain a thorough understanding of config and pipeline files
- Modify a pipeline and rerun processes
- Theoretical introduction to processes, channels and operators: the basics of Nextflow
- Write and run a simple Nextflow pipeline (e.g. print text, process a simple calculation)
- Including Singularity containers in Nextflow pipelines
Day 3 (TBC)
09:00 - 17:00 Hackathon
- Team presentations of developed pipelines and troubleshooting
Attendees: Researchers who use the Linux command line on a regular basis and have little or no knowledge of containers or workflow pipelines.