Bringing proteomics data to life scientists: A huge but worthy challenge
Juan Antonio Vizcaíno (EMBL-EBI)
September 13th, 09:00-10:00
Marie Curie Room
"My team at the European Bioinformatics institute (EMBL-EBI) is responsible for the PRIDE database, the largest resource worldwide for storing mass spectrometry based proteomics datasets. In recent years, open data policies have generalised in the proteomics field. As a result, there is an unprecedented amount of proteomics data in the public domain, in the order of PBs. This data is increasingly re-used for multiple applications, including the use of “big data” approaches such as machine learning and deep learning. Our strategy is to consistently re-analyse and integrate proteomics data with other omics data types into added-value EMBL-EBI data resources, namely UniProt (for protein sequences and post-translational modifications, starting with phosphorylation), Expression Atlas (for integrating protein and gene expression data) and Ensembl/MGnify (for integrating proteomics information with genomics features). I will explain in detail some ongoing projects.
It is worth highlighting that proteomics data is very rich and can be analysed following many different strategies, often different from those used in the original studies, enabling new biological conclusions. The overall goal of these efforts is to bring proteomics data to life scientists, making it more accessible to non-experts in proteomics. This work involves many different types of challenges at different levels."
Dr. Juan Antonio Vizcaíno is the Proteomics Team Leader at the European Bioinformatics Institute (EMBL-EBI, Cambridge, UK). His team is responsible for the development of the PRIDE database (https://www.ebi.ac.uk/pride/), the world-leading public repository for mass spectrometry (MS) proteomics data, and of related tools and resources. In addition, he co-founded and coordinates the ProteomeXchange Consortium (http://www.proteomexchange.org/), which aims to standardize data submission and dissemination in proteomics resources worldwide. He actively promotes open data policies in the proteomics field and has participated in many studies where public proteomics datasets are reused for different purposes. Over the years, he has also contributed heavily to the development of open proteomics data standard formats and related software, as part of his work in the HUPO Proteomics Standards Initiative (PSI). He also co-leads the ELIXIR Proteomics Community (https://elixir-europe.org/communities/proteomics).