Integrated proteogenomics database

L. monocytogenes ScottA_tryptic

Listeria monocytogenes strain ScottA (serovar 4b; GenBank #CP023862) was isolated during the Massachusetts listeriosis outbreak in 1983 [1]. The strain is already available in NCBI as a RefSeq strain (GenBank #NZ_CM001159), assembled using a mixed strategy of de novo and reference-based assembly. Its genome was re-sequenced and assembled purely de novo by combining PacBio and Illumina MiSeq reads, yielding a complete genome sequence for subsequent proteogenomic studies [2].

An iPtgxDB was created by hierarchically integrating protein coding sequences from these annotation resources:

Hierarchy | Resource | Link
1 | NCBI RefSeq | CP023862.1; from 23/10/2017
2 | Prodigal [3] | Ab initio gene predictions from Prodigal (v1.12)
3 | in silico ORFs | The in silico ORF annotations were generated as described by Omasits and Varadarajan et al., 2017 [4]

Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq > Prodigal > in silico. Files were parsed to extract the identifier, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
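The hierarchy and length threshold described above can be illustrated with a much simplified sketch. It is not the iPtgxDB implementation: grouping ORFs by strand and stop position, applying the threshold to every source, and reading "above 18 aa" as a length of at least 18 are all assumptions made for illustration, and the names Orf, PRIORITY and integrate are hypothetical. The actual procedure is described in [4].

```python
from typing import Iterable, List, NamedTuple

# Rank of each annotation source; lower wins. Order taken from the text above.
PRIORITY = {"RefSeq": 0, "Prodigal": 1, "in_silico": 2}
MIN_LENGTH_AA = 18  # threshold quoted in the text


class Orf(NamedTuple):
    source: str    # key of PRIORITY
    strand: str    # "+" or "-"
    start: int     # first base of the start codon
    stop: int      # last base of the stop codon
    protein: str   # translated amino-acid sequence


def integrate(orfs: Iterable[Orf], min_length: int = MIN_LENGTH_AA) -> List[Orf]:
    """Keep one ORF per (strand, stop), preferring the higher-ranked source."""
    best = {}
    for orf in orfs:
        # Assumption: "above the threshold" is read as length >= min_length.
        if len(orf.protein) < min_length:
            continue
        key = (orf.strand, orf.stop)
        current = best.get(key)
        if current is None or PRIORITY[orf.source] < PRIORITY[current.source]:
            best[key] = orf
    # Sort by stop coordinate so the output order is deterministic.
    return sorted(best.values(), key=lambda o: (o.stop, o.strand))


# Made-up example entries, not taken from the ScottA genome.
example = [
    Orf("in_silico", "+", 10, 100, "M" + "A" * 29),
    Orf("RefSeq", "+", 40, 100, "M" + "A" * 19),
    Orf("Prodigal", "-", 150, 200, "M" + "A" * 9),   # too short, dropped
    Orf("Prodigal", "+", 300, 400, "M" + "A" * 24),
]
for orf in integrate(example):
    print(orf.source, orf.strand, orf.start, orf.stop)
```

In this sketch the RefSeq entry replaces the in silico ORF that shares its stop position, which mirrors the RefSeq > Prodigal > in silico ordering. How alternative start sites are handled in the real database is not modelled here.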

References

  1. Briers, Y., Klumpp, J., Schuppler, M., Loessner, M. J. 2011. Genome sequence of Listeria monocytogenes Scott A, a clinical isolate from a food-borne listeriosis outbreak. J Bacteriol. 193: 4284–4285.
  2. Varadarajan, A. R., Pavlou, M., Goetze, S., Grosboillot, V., Shen, Y., Loessner, M. H., Ahrens, C., Wollscheid, B. 2019. A proteogenomic resource enabling rapid quantitative proteotype profiling of Listeria strains using DIA/SWATH. PLoS Pathog. (manuscript in preparation).
  3. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
  4. Omasits, U., Varadarajan, A. R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J. E., Robinson, M. D., Wollscheid, B., and Ahrens, C. H. 2017. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. Genome Research 27: 2083–2095.

iPtgxDB Release Info

Version: 1
Date: 23.10.2017

Downloads

TAR.GZ
Size: 4.0 MB
MD5: 62a24f69f5fd8ddac07a578022653d75
SHA1: f6da120a48abd508e2758ab604dfe365043663e9

ZIP
Size: 4.1 MB
MD5: 3341d56379a381f2885b7c11d8382e4b
SHA1: b7288be1d9af96fdef27b4b56643afc1c6cf4569
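
The MD5 and SHA1 values listed above can be used to check a downloaded archive for corruption. The following is a minimal sketch using Python's standard hashlib module. The helper names digest_of and verify are hypothetical, and the path of the downloaded file has to be supplied by the user, since the page does not state a file name.

```python
import hashlib
import sys

# Checksums copied from the download listing above.
EXPECTED = {
    "tar.gz": {
        "md5": "62a24f69f5fd8ddac07a578022653d75",
        "sha1": "f6da120a48abd508e2758ab604dfe365043663e9",
    },
    "zip": {
        "md5": "3341d56379a381f2885b7c11d8382e4b",
        "sha1": "b7288be1d9af96fdef27b4b56643afc1c6cf4569",
    },
}


def digest_of(stream, algorithm, chunk_size=1 << 20):
    """Hash a binary stream in chunks and return the hex digest."""
    h = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def verify(path, expected):
    """Return True if every listed digest matches the file at `path`."""
    for algorithm, want in expected.items():
        with open(path, "rb") as fh:
            if digest_of(fh, algorithm) != want.lower():
                return False
    return True


if __name__ == "__main__":
    # Usage (script name is hypothetical):
    #   python verify_download.py <archive> <tar.gz|zip>
    if len(sys.argv) == 3:
        path, kind = sys.argv[1], sys.argv[2]
        print("OK" if verify(path, EXPECTED[kind]) else "MISMATCH")
```

Reading in chunks keeps memory use constant regardless of archive size. Both digests are checked because the page publishes both, although a matching SHA1 alone would already be a reasonable integrity check.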