Integrated proteogenomics database

L. monocytogenes EGD-e

Listeria monocytogenes strain EGD-e (serovar 1/2a; GenBank #CP023861) was derived from strain EGD, which was originally isolated from guinea pigs and used in studies of cell-mediated immunity [1]; it differs quite substantially from EGD [2]. The strain is already available as a RefSeq strain (GenBank #NC_003210) in NCBI, assembled using a mixed strategy of de novo and reference-based assembly. Its genome was nevertheless re-sequenced and assembled purely de novo with a hybrid strategy combining PacBio and Illumina MiSeq reads, to obtain a complete genome sequence for subsequent proteogenomic studies [3].

An iPtgxDB was created by hierarchically integrating protein coding sequences from these annotation resources:

Hierarchy  Resource        Link
1          NCBI RefSeq     CP023861.1; from 23/10/2017
2          Prodigal [4]    Ab initio gene predictions from Prodigal (v1.12)
3          in silico ORFs  In silico ORF annotations, generated as described by Omasits and Varadarajan et al., 2017 [5]

Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq > Prodigal > in silico. Files were parsed to extract the identifier, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
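The length filter and the hierarchy described above can be illustrated with a minimal Python sketch. It is not the pipeline used to build this iPtgxDB: the `Orf` record, the `integrate` function, the toy identifiers and coordinates are all hypothetical, and the rule that two entries describe the same locus when they share strand and stop coordinate is an assumption made for illustration. "Above 18 aa" is read here as strictly longer than 18 residues, which is also an assumption.

```python
from typing import NamedTuple

MIN_LENGTH_AA = 18                               # threshold stated in the text
HIERARCHY = ["RefSeq", "Prodigal", "in_silico"]  # highest priority first


class Orf(NamedTuple):
    """Hypothetical record for one parsed annotation entry."""
    source: str      # one of HIERARCHY
    identifier: str
    strand: str      # "+" or "-"
    start: int
    stop: int
    protein: str     # translated amino acid sequence


def integrate(orfs, min_length=MIN_LENGTH_AA):
    """Keep one entry per locus, preferring the highest-ranked resource.

    Assumption: entries sharing strand and stop coordinate are the same locus.
    """
    rank = {name: i for i, name in enumerate(HIERARCHY)}
    best = {}
    for orf in orfs:
        # Assumption: "above the threshold" means strictly longer than it.
        if len(orf.protein) <= min_length:
            continue
        key = (orf.strand, orf.stop)
        current = best.get(key)
        if current is None or rank[orf.source] < rank[current.source]:
            best[key] = orf
    return sorted(best.values(), key=lambda o: (o.start, o.strand))


# Toy input: three resources annotate the same locus, one locus is
# predicted by Prodigal only, and one in silico ORF is too short.
toy = [
    Orf("RefSeq", "ref_0001", "+", 100, 400, "M" + "A" * 99),
    Orf("Prodigal", "prod_0001", "+", 130, 400, "M" + "A" * 89),
    Orf("in_silico", "orf_0001", "+", 70, 400, "M" + "A" * 109),
    Orf("Prodigal", "prod_0002", "-", 500, 800, "M" + "K" * 99),
    Orf("in_silico", "orf_0002", "+", 900, 950, "M" + "L" * 15),
]

integrated = integrate(toy)
for orf in integrated:
    print(orf.source, orf.identifier)
```

In this toy run the RefSeq entry is kept for the shared locus, the Prodigal-only locus is retained, and the 16 aa in silico ORF is discarded by the length filter. The sketch keeps only one entry per locus and does not model how alternative start sites are represented.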

References

  1. Glaser, P., Frangeul, L., Buchrieser, C., Rusniok, C., Amend, A., Baquero, F., et al. 2001. Comparative genomics of Listeria species. Science 294: 849–852.
  2. Bécavin, C., Bouchier, C., Lechat, P., Archambaud, C., Creno, S., Gouin, E., et al. 2014. Comparison of widely used Listeria monocytogenes strains EGD, 10403S, and EGD-e highlights genomic variations underlying differences in pathogenicity. mBio 5: e00969–14.
  3. Varadarajan, A. R., Pavlou, M., Goetze, S., Grosboillot, V., Shen, Y., Loessner, M. H., Ahrens, C., Wollscheid, B. 2019. A proteogenomic resource enabling rapid quantitative proteotype profiling of Listeria strains using DIA/SWATH. PLoS Pathog. (manuscript in preparation).
  4. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
  5. Omasits, U., Varadarajan, A. R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J. E., Robinson, M. D., Wollscheid, B., and Ahrens, C. H. 2017. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. Genome Research 27: 2083–2095.

iPtgxDB Release Info

Version: 1
Date: 23.10.2017

Downloads

TAR.GZ (Size, MD5, SHA1)
ZIP (Size, MD5, SHA1)