Proteogenomics: a New Integrative Approach for a Better Description of Protein Diversity Found in Soil Microflora
Céline Bland and Jean Armengaud
from: Omics in Soil Science (Edited by: Paolo Nannipieri, Giacomo Pietramellara and Giancarlo Renella). Caister Academic Press, U.K. (2014)
Proteogenomics is a relatively recent field at the junction of genomics and proteomics that refines the genome annotation of model organisms with the help of high-throughput proteomic data. Elucidating the genome of a given microorganism is a prerequisite for a comprehensive view of how it functions. Since the first complete genome of a cellular organism, that of Haemophilus influenzae, was sequenced in 1995, an impressive catalogue of genomes has been reported. Because automatic annotation software is not yet sufficiently reliable, the annotation process should be complemented with experimental data. Alongside the development of high-throughput sequencing techniques, important innovations in tandem mass spectrometry and proteomic approaches have made it possible to analyze thousands of proteins from a given sample. Proteogenomics has proved helpful in discovering new genes that were missed by automatic annotation software, identifying the true translational initiation codon of coding sequences, and characterizing maturation events at the protein level, such as signal peptide processing. Consequently, proteogenomics is now proposed at the earliest stage of a genome sequencing project, as exemplified by the Deinococcus deserti genome. There, unexpected results, such as the reversal of gene sequences in different bacteria or the use of non-canonical start codons for translation in Deinococcus species, are only some of the numerous corrections obtained by proteogenomics. Because the identification of the correct translational start codons is an important issue, we have pointed out the need to develop N-terminal-oriented strategies that reveal experimentally the precise sites of translation initiation. Today, a better description of the protein universe found in soil microflora can be achieved if proteogenomics is performed on a set of representative models from this environment.
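The core idea of mapping peptides back onto a genome can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the standard genetic code, an exact string match, and hypothetical function names (`six_frame_translate`, `find_peptide`). Real workflows match tandem mass spectra against a translated database with dedicated search engines.

```python
# Minimal sketch: locate a peptide in the six-frame translation of a DNA sequence.
# Assumes the standard genetic code; all names here are illustrative only.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDDDEEEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def six_frame_translate(dna):
    """Return the translations of the three forward and three reverse frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def find_peptide(dna, peptide):
    """List (frame, amino-acid offset) pairs where the peptide matches exactly."""
    hits = []
    for frame, protein in sorted(six_frame_translate(dna).items()):
        start = protein.find(peptide)
        while start != -1:
            hits.append((frame, start))
            start = protein.find(peptide, start + 1)
    return hits


if __name__ == "__main__":
    print(find_peptide("ATGGCTAAATGA", "MAK"))
```

A peptide that maps to a region with no annotated gene, or upstream of an annotated start codon, is the kind of evidence that would prompt a correction to the annotation.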