Advanced LINUX and Genomics Data Formats
When: 19th and 21st March 2019 (10:00-13:30)
Organizer: Bioinformatics CF
Trainer: Luca Cozzuto
Instructors: Sarah Bonnin, Toni Hermoso
Max number of participants: 16
Application deadline: 12th March 2019
Course Description:
This hands-on course is tailored to biologists with minimal experience of Linux commands (see prerequisites*). It introduces the most common genomics data formats and the Linux commands used to manipulate genomics files, working with real data retrieved from public repositories during the course. By the end of the course you will be able to describe these genomics data formats and parse them from the command line.
*Prerequisites: Linux course for beginners
Detailed Programme:
Day 1.
- Project structure and managing directories (Linux commands: mkdir, mv, cp, rm, find, ls, du)
- FASTA, FASTQ and BED-formatted files: retrieving from public repositories, viewing, handling and manipulating (Linux commands: wget, zip, zcat, cat, head, tail, more, less, diff, merge, sort, uniq, grep, egrep, chmod; regular expressions).
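As a rough illustration of the kind of task covered on Day 1, the sketch below counts records and distinct sequences in a FASTA file using cat, grep, sort and uniq. The file name example.fa and its sequences are made up for this example and are not course material; the course itself works with data downloaded from public repositories.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir"

# A tiny, made-up FASTA file (hypothetical sequences, for illustration only).
cat > example.fa <<'EOF'
>seq1
ACGTACGT
>seq2
GGCCTTAA
>seq3
ACGTACGT
EOF

# Count the records: every FASTA header line starts with ">".
n_records=$(grep -c '^>' example.fa)
echo "records: $n_records"

# Count distinct sequences: drop the headers, sort, collapse duplicates.
# (tr strips the padding that some versions of wc add.)
n_unique=$(grep -v '^>' example.fa | sort | uniq | wc -l | tr -d ' ')
echo "unique sequences: $n_unique"
```

The counting of distinct sequences assumes one sequence per line; in a FASTA file with wrapped sequences the lines would first need to be joined per record.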
Day 2.
- Intro to programming in Linux (shell scripting): awk programming language, environment variables (echo), loops, pipes, output.
- Manipulating tabular and GTF-formatted files (cut, sed, awk).
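The Day 2 topics can be sketched in a similar way. The example below combines awk, cut, a pipe and a for loop on a GTF-like table. The file example.gtf, the source name "demo" and all coordinates are hypothetical, chosen only to keep the example self-contained.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir"

# A tiny, made-up GTF fragment (tab-separated; hypothetical coordinates).
printf 'chr1\tdemo\tgene\t100\t500\t.\t+\t.\tgene_id "g1";\n'  > example.gtf
printf 'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "g1";\n' >> example.gtf
printf 'chr2\tdemo\tgene\t300\t900\t.\t-\t.\tgene_id "g2";\n' >> example.gtf

# awk: keep rows whose third column is "gene" and print chromosome and length.
# GTF coordinates are 1-based and inclusive, hence end - start + 1.
gene_lengths=$(awk -F'\t' '$3 == "gene" { print $1, $5 - $4 + 1 }' example.gtf)
echo "$gene_lengths"

# cut + sort + uniq: list the chromosomes, then loop over them
# and count how many features each one carries.
for chrom in $(cut -f1 example.gtf | sort | uniq); do
    n=$(awk -F'\t' -v c="$chrom" '$1 == c' example.gtf | wc -l | tr -d ' ')
    echo "$chrom: $n features"
done

# Keep the final counts in variables so they can be reused or checked.
n_chr1=$(awk -F'\t' '$1 == "chr1"' example.gtf | wc -l | tr -d ' ')
n_chr2=$(awk -F'\t' '$1 == "chr2"' example.gtf | wc -l | tr -d ' ')
```

Setting the field separator explicitly with -F'\t' matters here because the ninth GTF column contains spaces, which awk would otherwise treat as column boundaries.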
This course is a prerequisite for the course "RNA-seq data analysis for biologists".