Large-scale assessment of transcriptome analysis software
PRESS RELEASE
CRG and UB researchers have taken part in an international consortium of scientists that has carried out a systematic assessment of gene expression analysis software. The results, which appear in two papers in Nature Methods, may inspire new computing approaches to handle current and future technologies for gene expression analysis and its characterisation.
Scientists use a method called RNA sequencing (RNA-seq) to see how genes are being expressed across an entire genome. But how can they analyse this information, and how good is the software they use to do so?
The RNA-seq Genome Annotation Assessment Project (RGASP), an ENCODE-affiliated initiative, evaluated the performance of a wide range of RNA-seq computer programs. The project identified which approaches work well for certain tasks and which areas can be improved.
“To adequately manage sequencing data, we have been working to find new and more sophisticated alternatives to the currently available methods”, states Roderic Guigó, coordinator of the Bioinformatics and Genomics programme at the Centre for Genomic Regulation in Barcelona. “The conclusions we are presenting in these two papers will help us to extract more information from nucleic acid sequencing methods and facilitate the application of these methods to diverse fields such as medicine and biotechnology”, he adds.
“By systematically comparing the existing computational tools to detect genes within genomes, we try to determine whether the new RNA-seq data improve the reliability of gene structure predictions”, explains Josep F. Abril, researcher in the Department of Genetics at the University of Barcelona and member of the Institute of Biomedicine of the same university (IBUB). “Apart from providing excellent new tools for gene prediction, we have also identified the questions we should address in the future and the points we should continue to study to improve these tools”, he concludes.
“We found a striking degree of variability in how these programs handle different aspects of RNA-seq data. Some methods performed well overall, whereas others have clever design features that excel at solving specific problems. We were also able to highlight areas where many of these computational approaches can improve,” says Paul Bertone of EMBL-EBI, who coordinated the study. “This kind of work provides an important resource for the genomics community, and the consortium model was a unique platform to deliver that in a large-scale, systematic way.”
In both studies, developers of leading software programs were invited to participate in a detailed evaluation of computational methods for processing and interpreting RNA-seq data. The framework was based on the Encyclopedia of DNA Elements (ENCODE) Genome Annotation Assessment Project (EGASP), in which the original program developers contribute their results for evaluation. Each of the methods compared in the study performs sequence alignment and transcript reconstruction: essential steps in the analysis of RNA-seq experiments.
The consortium’s systematic, meticulous approach to the performance assessment resulted in findings that can be used to enhance and expand the range of RNA-seq analysis tools that are available for different kinds of studies. They can also be used to inform developments that meet the demands of emerging sequencing technologies.
Roderic Guigó, coordinator of the Bioinformatics and Genomics programme at the Centre for Genomic Regulation (CRG) in Barcelona, and Josep F. Abril, researcher at the Institute of Biomedicine of the University of Barcelona (IBUB) and in the Genetics Department at the same university, contributed to these articles, which the journal Nature Methods highlights this week on its website. In 2011, Dr. Guigó organised the Sequence Mapping and Assembly Assessment Project dnGASP/RGASP3 (SMAAP) workshop, which brought more than 50 experts in sequencing technology and genome alignment methods to Barcelona. It is also worth noting that the two scientists were the only Spanish participants in the Human Genome Project, at a time when Josep F. Abril was a student in Roderic Guigó’s laboratory.
Source articles
- Steijger, T., et al. (2013) Assessment of transcript reconstruction methods for RNA-seq. Nature Methods (in press); published online 3 November. DOI: 10.1038/nmeth.2714.
- Engström, P., et al. (2013) Systematic evaluation of spliced alignment programs for RNA-seq data. Nature Methods (in press); published online 3 November. DOI: 10.1038/nmeth.2722.