Software
SmartCell is a program developed to give an overview of how a network evolves within a whole, single cell. Because it is based on stochastic algorithms, SmartCell needs multiple runs to produce mean results. To help the user, SmartCell is distributed with a graphical user interface for creating models and for analysing and processing the results after the runs.
SmartCell is being developed by Luis Serrano and his team at the CRG in Barcelona.
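Why a stochastic simulator needs multiple runs can be illustrated with a minimal sketch. The code below is not SmartCell's algorithm or interface; it assumes a hypothetical one-species birth-death model (constant production, first-order degradation) simulated with a Gillespie-style scheme, and the function names and parameters are invented for illustration. Each run gives a different molecule count, so the mean is estimated over many runs.

```python
import random


def gillespie_birth_death(k_prod, k_deg, t_end, rng, n0=0):
    """Simulate one stochastic trajectory and return the molecule count at t_end.

    Hypothetical model: production at constant rate k_prod,
    degradation at rate k_deg * n.
    """
    t, n = 0.0, n0
    while True:
        a_prod = k_prod
        a_deg = k_deg * n
        a_total = a_prod + a_deg
        if a_total == 0:
            return n  # nothing can happen any more
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            return n
        # Pick which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1


def mean_over_runs(n_runs, seed=0, **model_params):
    """Repeat the simulation n_runs times and return (mean, per-run results)."""
    rng = random.Random(seed)
    results = [gillespie_birth_death(rng=rng, **model_params) for _ in range(n_runs)]
    return sum(results) / n_runs, results


if __name__ == "__main__":
    mean, results = mean_over_runs(500, k_prod=10.0, k_deg=1.0, t_end=10.0)
    print("first five runs:", results[:5])
    print("mean over runs:", mean)
```

Individual runs scatter around the expected value, which is why a single run is not informative on its own.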
A database for the molecular phenotyping of human SNPs and disease mutations
Single nucleotide polymorphisms (SNPs) are, together with copy number variation, the primary source of variation in the human genome. SNPs are associated with altered response to drug treatment, susceptibility to disease, and other phenotypic variation. Furthermore, during genetic screens for disease-associated mutations in groups of patients and control individuals, the distinction between disease-causing mutations and polymorphisms is often unclear. Annotation of the functional and structural implications of single nucleotide changes thus provides valuable information to interpret and guide experiments.
SNPeffect is a database of non-synonymous SNPs and their predicted effect on the functional and physicochemical properties of the affected proteins. More precisely, SNPeffect analyses the effect of coding, non-synonymous SNPs on three categories of functional and physicochemical properties of the affected proteins: protein structure and dynamics (stability, aggregation, dynamics, etc.), integrity of functional sites, and cellular processing.
SNPeffect was originally developed by Joost Schymkowitz and Frederic Rousseau and their team at the SWITCH Laboratory of VIB in Brussels, Belgium, in collaboration with Luis Serrano and his team at the European Molecular Biology Laboratory in Heidelberg, Germany.
model Selection in Phylogenetics based on algebraic INvariants
sQTLseekeR is an R package to detect splicing QTLs (sQTLs), which are variants associated with changes in the splicing pattern of a gene. Here, splicing patterns are modeled by the relative expression of the transcripts of a gene.
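The notion of relative transcript expression can be sketched as follows. This is an illustration in Python, not code from the sQTLseekeR package (which is written in R); the function name and the input layout are assumptions. Each transcript's expression is divided by the total expression of the gene, so the ratios of one gene sum to 1.

```python
def splicing_ratios(transcript_expression):
    """Return each transcript's share of the gene's total expression.

    transcript_expression: dict mapping transcript ID -> expression value
    for the transcripts of ONE gene in ONE sample (hypothetical layout).
    Returns None when the gene is not expressed, since ratios are undefined.
    """
    total = sum(transcript_expression.values())
    if total == 0:
        return None
    return {tx: value / total for tx, value in transcript_expression.items()}


if __name__ == "__main__":
    # Two samples with the same gene-level expression but different splicing patterns.
    sample_a = {"tx1": 30.0, "tx2": 10.0}
    sample_b = {"tx1": 10.0, "tx2": 30.0}
    print(splicing_ratios(sample_a))
    print(splicing_ratios(sample_b))
```

Working with ratios rather than raw values separates a change in splicing pattern from a change in overall gene expression.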
Starcode is DNA sequence clustering software. Sequence clustering is performed by finding all pairs of sequences whose Levenshtein distance is below a given threshold. Typically, a file containing a set of related DNA sequences is passed as input, together with a parameter specifying the desired cluster distance. Starcode aligns and computes the distance between all the sequence pairs and prints a line for each cluster containing the canonical DNA sequence, the sequence count, and the list of sequences that belong to the cluster.
Starcode has many applications in the field of biology, such as DNA/RNA motif recovery, barcode clustering, sequencing error recovery, etc.
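The idea of clustering sequences by Levenshtein distance can be sketched naively as below. This is not Starcode's implementation, and its actual clustering rules may differ; the sketch assumes an all-against-all comparison, merges any two sequences within the distance threshold into the same cluster, and takes the most abundant member as the canonical sequence. All function names are hypothetical.

```python
from collections import Counter


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cluster_sequences(sequences, max_distance):
    """Return (canonical sequence, read count, members) per cluster, largest first."""
    counts = Counter(sequences)
    unique = sorted(counts)
    parent = {s: s for s in unique}  # union-find forest over unique sequences

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    # Naive all-pairs comparison: quadratic in the number of unique sequences.
    for i, a in enumerate(unique):
        for b in unique[i + 1:]:
            if levenshtein(a, b) <= max_distance:
                parent[find(a)] = find(b)

    clusters = {}
    for s in unique:
        clusters.setdefault(find(s), []).append(s)

    rows = []
    for members in clusters.values():
        canonical = max(members, key=lambda s: (counts[s], s))
        total = sum(counts[s] for s in members)
        rows.append((canonical, total, sorted(members)))
    return sorted(rows, key=lambda row: -row[1])


if __name__ == "__main__":
    reads = ["ACGT"] * 5 + ["ACGA"] * 2 + ["TTTT"] * 3 + ["TTTA"]
    for canonical, count, members in cluster_sequences(reads, max_distance=1):
        print(canonical, count, ",".join(members), sep="\t")
```

The quadratic pairwise loop is only practical for small inputs; it is meant to show the output shape (canonical sequence, count, members), not to be efficient.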
Reference: Velten et al., Nature Cell Biology 2017
Ranked best-in-class by Saelens et al., Nature Biotechnology 2019
SuperFly is a database for the comparative analysis of segmentation gene expression and regulation in dipteran species (flies, midges, and mosquitoes)
SymCurv is a computational ab initio method for nucleosome positioning prediction. It is based on a structural property of natural nucleosome-forming sequences: they tend to be symmetrically curved around a local minimum of curvature.
T-Coffee is a multiple sequence alignment package. You can use T-Coffee to align sequences or to combine the output of your favorite alignment methods (Clustal, Mafft, Probcons, Muscle, etc.) into one unique alignment (M-Coffee). T-Coffee can align protein, DNA and RNA sequences. It is also able to combine sequence information with protein structural information (Expresso), profile information (PSI-Coffee) or RNA secondary structures (R-Coffee).
An algorithm for the prediction of aggregating regions in unfolded polypeptide chains
TANGO is a statistical mechanics computer algorithm developed for the prediction of aggregation nucleating regions in proteins, as well as of the effect of mutations and environmental conditions on the aggregation propensity of these regions.
TANGO is based on the physico-chemical principles of β-sheet formation extended by the assumption that the core regions of an aggregate are fully buried. TANGO was benchmarked against 175 peptides of over 20 proteins and was able to predict the sequences experimentally observed to contribute to the aggregation of these proteins. TANGO also correctly predicts the aggregation propensities of several disease-related mutations in the Alzheimer's β-peptide, human lysozyme and transthyretin, and discriminates between β-sheet tendency and aggregation.
The success of TANGO confirms the model of intermolecular β-sheet formation as a widespread underlying mechanism of protein aggregation and opens the possibility of screening large databases for potential disease-related aggregation motifs, as well as optimizing recombinant protein yields by rationally designing out protein aggregation.
TANGO was originally developed by Luis Serrano and his team at the European Molecular Biology Laboratory in Heidelberg, Germany.
The Flux Capacitor predicts abundances for transcript molecules and alternative splicing events from RNA-Seq experiments. Additionally, there is a simulation pipeline that is capable of simulating whole transcriptome sequencing experiments.
The Flux Simulator aims at modeling RNA-Seq experiments in silico: sequencing reads are produced from a reference genome according to annotated transcripts.
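A heavily simplified sketch of producing reads from annotated transcripts is shown below. It is not the Flux Simulator's pipeline, which this page does not describe in detail; the sketch assumes transcript sequences are already extracted, samples transcripts in proportion to a given abundance, and draws error-free reads at uniformly random positions. The function name, parameters, and data layout are hypothetical.

```python
import random


def simulate_reads(transcripts, abundances, n_reads, read_length, seed=0):
    """Draw reads from transcript sequences, weighted by transcript abundance.

    transcripts: dict mapping transcript ID -> nucleotide sequence (assumed input)
    abundances:  dict mapping transcript ID -> relative abundance (assumed input)
    Returns a list of (transcript ID, start position, read sequence).
    """
    rng = random.Random(seed)
    # Only transcripts long enough to yield a full-length read are eligible.
    eligible = [tx for tx in transcripts if len(transcripts[tx]) >= read_length]
    if not eligible:
        raise ValueError("no transcript is at least read_length long")
    weights = [abundances.get(tx, 0.0) for tx in eligible]

    reads = []
    for tx in rng.choices(eligible, weights=weights, k=n_reads):
        seq = transcripts[tx]
        start = rng.randint(0, len(seq) - read_length)  # inclusive bounds
        reads.append((tx, start, seq[start:start + read_length]))
    return reads


if __name__ == "__main__":
    transcripts = {"tx1": "ACGTACGTACGTAAGG", "tx2": "TTTTGGGGCCCCAAAA"}
    abundances = {"tx1": 3.0, "tx2": 1.0}
    for tx, start, read in simulate_reads(transcripts, abundances, 5, 6):
        print(tx, start, read)
```

A realistic simulation would also have to model library preparation and sequencing errors, which this sketch leaves out.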
The GEM (GEnome Multi-tool) Library is a set of highly optimized tools for indexing and querying very large genomes and files. Provided so far: a very fast exact mapper and an unconstrained split-mapper.
trimAl is a tool for the automated removal of spurious sequences or poorly aligned regions from a multiple sequence alignment. It also includes readAl, a format converter between most alignment formats.
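One simple way to remove poorly aligned regions is to drop columns that consist mostly of gaps. The sketch below illustrates only that idea; it is not trimAl's algorithm, and the function name, the `max_gap_fraction` parameter, and the dictionary input are assumptions made for the example.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence, all of equal
    length, with '-' as the gap character (hypothetical input layout).
    """
    names = list(alignment)
    rows = [alignment[name] for name in names]
    if not rows:
        return {}
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("aligned sequences must all have the same length")

    # Keep a column only if the share of gap characters is within the limit.
    keep = [
        col for col in range(length)
        if sum(row[col] == "-" for row in rows) / len(rows) <= max_gap_fraction
    ]
    return {name: "".join(alignment[name][col] for col in keep) for name in names}


if __name__ == "__main__":
    alignment = {"s1": "AC-GT", "s2": "A--GT", "s3": "ACTG-"}
    for name, seq in trim_gappy_columns(alignment).items():
        print(name, seq)
```

In this example the third column is two-thirds gaps and is removed, while columns with a single gap are kept.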
U12-type introns are spliced by the U12-dependent spliceosome and are present in the genomes of many higher eukaryotic lineages including plants, chordates and some invertebrates. Investigations into the evolution and mechanism of U12-dependent splicing would be facilitated by access to a catalog of such introns. However, due to their relatively recent discovery and a systematic bias against recognition of non-canonical splice sites in general, the introns defined by U12-type splice sites are under-represented in genome annotations. Such under-representation compounds the already difficult problem of determining gene structures. It also impedes attempts to study these introns genome-wide or phylum-wide. The resource described here, the U12 Intron Database (U12DB), aims to catalog the U12 introns of completely sequenced eukaryotic genomes and associate orthologous introns with each other.