Integrated proteogenomics database

L. monocytogenes ScottA

Listeria monocytogenes strain ScottA (serovar 4b; GenBank #CP023862) was isolated during the 1983 Massachusetts listeriosis outbreak [1]. The strain is already available in NCBI as a RefSeq genome (GenBank #NZ_CM001159), which was assembled using a mix of de novo and reference-based assembly. Its genome was re-sequenced and assembled with a purely de novo hybrid strategy that combines PacBio and Illumina MiSeq reads, yielding a complete genome sequence for subsequent proteogenomic studies [2].

An iPtgxDB was created by hierarchically integrating protein coding sequences from these annotation resources:

Hierarchy   Resource         Link
1           NCBI RefSeq      CP023862.1; from 23/10/2017
2           Prodigal [3]     Ab initio gene predictions from Prodigal (v1.12)
3           in silico ORFs   The in silico ORF annotations were generated as described by Omasits and Varadarajan et al., 2017 [4]
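
As a rough illustration of what an in silico ORF annotation can look like, the sketch below scans all six reading frames for stop-free stretches of a minimum length. It is an assumption-laden simplification, not the published procedure of Omasits and Varadarajan et al.: the function names, the `min_aa` parameter, the stop-to-stop definition of an ORF and the use of the three standard stop codons are illustrative choices, and start codons are ignored.

```python
# Illustrative sketch only; NOT the published iPtgxDB pipeline.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard bacterial stop codons


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def stop_to_stop_orfs(seq, min_aa=18):
    """List stop-free stretches of at least min_aa codons in all six frames.

    Each hit is (strand, frame, start, end). Offsets are 0-based and
    end-exclusive on the scanned strand, and exclude the closing stop codon.
    Stretches that run off the end of the sequence without a stop are skipped.
    """
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = frame
            for pos in range(frame, len(s) - 2, 3):
                if s[pos:pos + 3] in STOP_CODONS:
                    # Codons between the previous stop and this one.
                    if (pos - orf_start) // 3 >= min_aa:
                        hits.append((strand, frame, orf_start, pos))
                    orf_start = pos + 3
    return hits
```

Calling `stop_to_stop_orfs(genome_sequence, min_aa=18)` on an upper-case sequence string returns the candidate stretches. Whether the length cutoff is inclusive is also an assumption here.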

Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq > Prodigal > in silico. Files were parsed to extract the identifier, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
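
The hierarchy RefSeq > Prodigal > in silico can be pictured with a minimal sketch like the one below. It assumes that annotations are grouped by strand and stop position, that the length threshold applies to every source, and that the threshold is inclusive. None of these details are stated here, and the `Orf` record, `stop_key` and `integrate` are hypothetical names, not part of any iPtgxDB software.

```python
from collections import namedtuple

# Hypothetical record for one parsed CDS, pseudogene or ORF entry.
Orf = namedtuple("Orf", "source ident strand start end protein")

HIERARCHY = ["RefSeq", "Prodigal", "in_silico"]  # highest priority first
MIN_AA = 18  # length threshold quoted in the text


def stop_key(orf):
    """Assumption: entries sharing strand and stop position form one locus."""
    return (orf.strand, orf.end if orf.strand == "+" else orf.start)


def integrate(orfs, hierarchy=HIERARCHY, min_aa=MIN_AA):
    """Return {locus: (primary, alternatives)} following the source hierarchy."""
    rank = {name: i for i, name in enumerate(hierarchy)}
    loci = {}
    for orf in orfs:
        if len(orf.protein) < min_aa:  # drop entries below the threshold
            continue
        loci.setdefault(stop_key(orf), []).append(orf)

    integrated = {}
    for key, members in loci.items():
        members.sort(key=lambda o: rank[o.source])
        primary, rest = members[0], members[1:]
        # Keep lower-ranked entries only if they add a different sequence.
        alternatives = [o for o in rest if o.protein != primary.protein]
        integrated[key] = (primary, alternatives)
    return integrated
```

In this sketch a Prodigal prediction that shares its stop with a RefSeq CDS but starts further upstream is kept as an alternative under the RefSeq entry, while an in silico ORF becomes a primary entry only where neither RefSeq nor Prodigal annotates that locus.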

References

  1. Briers, Y., Klumpp, J., Schuppler, M., Loessner, MJ. 2011. Genome sequence of Listeria monocytogenes Scott A, a clinical isolate from a food-borne listeriosis outbreak. J Bacteriol. 193: 4284–4285.
  2. Varadarajan, A. R., Pavlou, M., Goetze, S., Grosboillot, V., Shen, Y., Loessner, M. J., Ahrens, C., Wollscheid, B. 2019. A proteogenomic resource enabling rapid quantitative proteotype profiling of Listeria strains using DIA/SWATH. PLoS Pathog. (manuscript in preparation).
  3. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
  4. Omasits, U., Varadarajan, A. R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J. E., Robinson, M. D., Wollscheid, B., and Ahrens, C. H. 2017. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. Genome Research 27: 2083–2095.

iPtgxDB Release Info

Version: 1
Date: 23.10.2017

Downloads

TAR.GZ (Size, MD5, SHA1)
ZIP (Size, MD5, SHA1)
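
The MD5 and SHA1 values listed for each archive can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names are illustrative, and the file path and expected digests must be supplied by the user from the download page.

```python
import hashlib


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_download(path, expected_md5, expected_sha1):
    """Return True only if both digests match the published values."""
    return (
        file_digest(path, "md5") == expected_md5.strip().lower()
        and file_digest(path, "sha1") == expected_sha1.strip().lower()
    )
```

A call such as `verify_download("archive.tar.gz", md5_from_page, sha1_from_page)` (with a placeholder file name) returns `False` if the file was truncated or altered during transfer.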