Integrated proteogenomics database

P. aeruginosa MPAO1

Pseudomonas aeruginosa strain MPAO1 (GenBank #CP027857) is an opportunistic human pathogen that belongs to the Gram-negative members of the notorious ESKAPE group of pathogens [1]. MPAO1 is also the parental strain of the widely used transposon (Tn) insertion mutant library from the University of Washington [2]. In early 2018, the only strain at the NCBI annotated as MPAO1 (GenBank #GCF_000247435.1) had been sequenced using Illumina's short-read technology and assembled into 140 contigs [3]. To provide an optimal basis for subsequent functional genomics and evolution studies of P. aeruginosa strain MPAO1, we re-sequenced its genome and assembled the complete sequence purely de novo, combining long PacBio and short Illumina MiSeq reads [4]. A comparative genomics analysis with the PAO1-UW reference strain [5] identified 232 MPAO1-unique, 21 PAO1-unique and 5,534 conserved gene clusters [4]. Compared to the PAO1-UW reference genome, the complete MPAO1 genome sequence harbors several deletions and insertions, including numerous MPAO1-unique genes.

An iPtgxDB was created by hierarchically integrating protein coding sequences from these annotation resources:

Hierarchy | Resource | Link
1 | NCBI RefSeq | CP027857.1; from 12/10/2018
2 | Prodigal [6] | Ab initio gene predictions from Prodigal (v1.12)
3 | ChemGenome [7] | Ab initio gene predictions from ChemGenome (v2.0, http://www.scfbio-iitd.res.in/chemgenome/chemgenomenew.jsp; with parameters: method, Swissprot space; length threshold, 70 nt; initiation codons, ATG, CTG, TTG, GTG)
4 | in silico ORFs | The in silico ORF annotations were generated as described by Omasits and Varadarajan et al. [8]
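
The hierarchy listed above can be illustrated with a minimal sketch. It assumes that predictions from different resources are grouped when they share the same strand and stop coordinate, and that the entry from the highest-ranked resource is kept as the representative of each group. The grouping rule, the `Cds` record and the `integrate` function are hypothetical and used for illustration only; they are not the actual iPtgxDB implementation described in [8].

```python
from collections import defaultdict
from typing import NamedTuple

# Rank of each annotation resource (1 = highest), as listed in the hierarchy.
HIERARCHY = {"RefSeq": 1, "Prodigal": 2, "ChemGenome": 3, "in_silico": 4}


class Cds(NamedTuple):
    """Hypothetical record for one predicted coding sequence."""
    resource: str    # one of the keys of HIERARCHY
    strand: str      # "+" or "-"
    start: int       # coordinate of the first base of the start codon
    stop: int        # coordinate of the last base of the stop codon
    identifier: str


def integrate(annotations):
    """Group predictions by (strand, stop) and rank each group by resource.

    Assumption for this sketch: predictions that end at the same stop codon
    describe the same locus and differ only in their start site.
    Returns a list of (anchor, alternatives) tuples.
    """
    clusters = defaultdict(list)
    for cds in annotations:
        clusters[(cds.strand, cds.stop)].append(cds)

    integrated = []
    for key in sorted(clusters):
        ranked = sorted(clusters[key], key=lambda c: HIERARCHY[c.resource])
        integrated.append((ranked[0], ranked[1:]))
    return integrated


example = [
    Cds("Prodigal", "+", 70, 400, "prod_1"),
    Cds("RefSeq", "+", 100, 400, "ref_1"),
    Cds("in_silico", "+", 40, 400, "orf_1"),
    Cds("ChemGenome", "-", 900, 1500, "chem_1"),
]
result = integrate(example)
```

In this example the RefSeq entry becomes the anchor of the first group, with the Prodigal and in silico predictions retained as alternatives, while the ChemGenome prediction forms its own group because no higher-ranked resource annotates that locus.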

Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq > Prodigal > ChemGenome > in silico. Files were parsed to extract the identifiers, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
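
To make the length threshold concrete, the following sketch enumerates ORFs in all six reading frames and keeps only those of at least 18 aa. It is a simplified start-to-stop scan and not the procedure from [8]: the choice of start codons (the four initiation codons listed for ChemGenome), the standard stop codons, and the function name `find_orfs` are assumptions made for illustration.

```python
# Assumed start codons (those listed for ChemGenome) and standard stop codons.
STARTS = {"ATG", "GTG", "TTG", "CTG"}
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_orfs(seq, min_aa=18):
    """Return (strand, start, end, length_aa) for each ORF of >= min_aa codons.

    start and end are 0-based, end-exclusive offsets on the scanned strand;
    for "-" they refer to the reverse complement. The ORF runs from the first
    start codon after the previous stop up to and including the next stop
    codon; length_aa excludes the stop codon.
    """
    seq = seq.upper()
    reverse_complement = seq.translate(COMPLEMENT)[::-1]
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement)):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if codon in STOPS:
                    if start is not None and (i - start) // 3 >= min_aa:
                        orfs.append((strand, start, i + 3, (i - start) // 3))
                    start = None
                elif start is None and codon in STARTS:
                    start = i
    return orfs


# Toy sequence: one start codon, 20 alanine codons, one stop codon.
toy = "ATG" + "GCT" * 20 + "TAA"
orfs_default = find_orfs(toy)             # threshold of 18 aa
orfs_strict = find_orfs(toy, min_aa=22)   # stricter threshold
```

Raising `min_aa` reduces the number of short, mostly spurious candidates at the cost of missing genuine small proteins, which is why the threshold is left selectable.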

References

  1. Boucher, H.W., et al. 2009. Bad bugs, no drugs: no ESKAPE! An update from the Infectious Diseases Society of America. Clin Infect Dis. 48(1): 1-12.
  2. Jacobs, M. A., Alwood, A., Thaipisuttikul, I., Spencer, D., Haugen, E., Ernst, S., et al. 2003. Comprehensive transposon mutant library of Pseudomonas aeruginosa. Proc Natl Acad Sci U S A. 100(24): 14339–44.
  3. Olivas, A. D., Shogan, B. D., Valuckaite, V., Zaborin, A., Belogortseva, N., Musch, M., et al. 2012. Intestinal tissues induce an SNP mutation in Pseudomonas aeruginosa that enhances its virulence: possible role in anastomotic leak. PLoS One 7(8): e44326.
  4. Varadarajan, A. R., Raymond, N. A., Valentin, J., Castañeda Ocampo, O. E., Somerville, V., Pietsch, F., Buhmann, M. T., Skipp, P., van der Mei, C. H., Ren, Q., Schreiber, F., Webb, S. W., Ahrens, C. H. 2020. An integrated model system to study biofilm-associated adaptation to antimicrobials and resistance evolution in Pseudomonas aeruginosa MPAO1. (manuscript to be submitted Jan 2020).
  5. Stover, C.K., et al. 2000. Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen. Nature. 406(6799): 959-964.
  6. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
  7. Singhal, P., Jayaram, B., Dixit, S.B., and Beveridge, D.L. 2008. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Biophys J 94: 4173-4183.
  8. Omasits, U., Varadarajan, A. R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J. E., Robinson, M. D., Wollscheid, B., and Ahrens, C. H. 2017. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. Genome Research. 27: 2083-2095.

iPtgxDB Release Info

Version: 1
Date: 03.03.2019

Downloads

TAR.GZ (listed with Size, MD5, SHA1)
ZIP (listed with Size, MD5, SHA1)
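
Since MD5 and SHA1 checksums are listed for the downloads, a downloaded archive can be checked before use. The sketch below uses Python's standard `hashlib` module; the helper names, the file name and the expected checksum values in the usage comment are placeholders, not values taken from this page.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) of a file, read in chunks to limit memory use."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def verify_download(path, expected_md5, expected_sha1):
    """Return True only if both digests match the published values."""
    md5_hex, sha1_hex = file_digests(path)
    return (md5_hex == expected_md5.strip().lower()
            and sha1_hex == expected_sha1.strip().lower())


# Usage (placeholder file name and checksums):
# ok = verify_download("iPtgxDB_archive.tar.gz", "<published MD5>", "<published SHA1>")
```

Comparing both digests is redundant from an integrity point of view, but it mirrors the two values published with each archive.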