Integrated proteogenomics database

S. meliloti 2011_tryptic

Sinorhizobium meliloti strain 2011 (GenBank NC_020528) is the reference strain [1]; its genome includes two plasmids (NC_020527 and NC_020560).

An iPtgxDB was created by hierarchically integrating protein-coding sequences from three annotation resources (1-3) and three prediction sources (4-6):

Hierarchy | Resource       | Link
1         | NCBI RefSeq    | GCF_000346065.1_ASM34606v1; from 19/05/2017
2         | NCBI Genbank   | GCA_000346065.1_ASM34606v1; from 31/01/2014
3         | Genoscope [2]  | v2.7.3, accessed 14/11/2018
4         | Prodigal [3]   | Ab initio gene predictions from Prodigal (v2.6)
5         | ChemGenome [4] | Ab initio gene predictions from ChemGenome (v2.0, http://www.scfbio-iitd.res.in/chemgenome/chemgenomenew.jsp; with parameters: method, Swissprot space; length threshold, 70 nt; initiation codons, ATG, CTG, TTG, GTG)
6         | in silico ORFs | The in silico ORF annotations were generated as described by Omasits and Varadarajan et al., 2017 [5]
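
To give a feel for what an in silico ORF annotation involves, the following is a minimal sketch of a six-frame ORF scan. It is not the procedure of reference [5]. It simply assumes that an ORF runs from the first initiation codon after a stop codon to the next in-frame stop, reuses the four initiation codons listed for ChemGenome, and applies the 18 aa threshold inclusively. The function names (`find_orfs`, `orfs_in_frame`) and the standard stop codon set are assumptions made for illustration.

```python
# Illustrative six-frame ORF scan; NOT the method used to build the iPtgxDB.
START_CODONS = {"ATG", "CTG", "TTG", "GTG"}  # initiation codons listed for ChemGenome
STOP_CODONS = {"TAA", "TAG", "TGA"}          # assumption: standard stop codons
MIN_AA = 18                                  # length threshold used for this iPtgxDB

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame):
    """Yield (start, end) for the longest ORF ending at each in-frame stop.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    """
    first_start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            if first_start is not None:
                yield first_start, i + 3
            first_start = None
        elif first_start is None and codon in START_CODONS:
            first_start = i


def find_orfs(seq, min_aa=MIN_AA):
    """Return (strand, frame, start, end, n_aa) for ORFs of at least min_aa.

    Minus-strand coordinates refer to the reverse-complemented sequence.
    Treating the threshold as inclusive is an assumption.
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame):
                n_aa = (end - start) // 3 - 1  # exclude the stop codon
                if n_aa >= min_aa:
                    found.append((strand, frame, start, end, n_aa))
    return found
```

Calling `find_orfs` on a genome sequence string returns one tuple per candidate ORF. A production pipeline would additionally map minus-strand coordinates back to the forward strand and handle ambiguous bases.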

Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq > Genbank > Genoscope > Prodigal > ChemGenome > in silico. Files were parsed to extract the identifier, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries. For more detail on how we generate iPtgxDBs and how the identifiers can be interpreted, please see reference [5].
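
The hierarchy can be illustrated with a small sketch. It assumes, for illustration only, that annotations from different resources are treated as the same locus when they share strand and stop position, and that the entry from the highest-ranked resource becomes the anchor. The record layout (`source`, `strand`, `stop`, `protein`) and the function name `integrate` are hypothetical; reference [5] describes the actual procedure and identifier scheme.

```python
# Simplified sketch of hierarchical integration; field names are hypothetical.
HIERARCHY = ["RefSeq", "Genbank", "Genoscope", "Prodigal", "ChemGenome", "in silico"]
RANK = {name: i for i, name in enumerate(HIERARCHY)}
MIN_AA = 18  # length threshold used for this iPtgxDB


def integrate(annotations, min_aa=MIN_AA):
    """Group annotations by (strand, stop) and pick the top-ranked one as anchor.

    `annotations` is an iterable of dicts with keys
    'source', 'strand', 'stop' and 'protein' (amino acid string).
    Treating the length threshold as inclusive is an assumption.
    """
    clusters = {}
    for ann in annotations:
        if len(ann["protein"]) < min_aa:
            continue  # drop entries below the length threshold
        clusters.setdefault((ann["strand"], ann["stop"]), []).append(ann)

    integrated = {}
    for key, members in clusters.items():
        members.sort(key=lambda a: RANK[a["source"]])  # highest priority first
        anchor = members[0]
        # Keep lower-ranked entries only if they add a different sequence
        alternatives = [m for m in members[1:] if m["protein"] != anchor["protein"]]
        integrated[key] = {"anchor": anchor, "alternatives": alternatives}
    return integrated
```

With this layout, a RefSeq entry and a longer Prodigal prediction ending at the same stop yield one cluster whose anchor is the RefSeq entry, with the Prodigal variant retained as an alternative.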

References

  1. Sallet, E., Roux, B., Sauviac, L., Jardinaud, M. F., Carrere, S., Faraut, T., de Carvalho-Niebel, F., Gouzy, J., Gamas, P., Capela, D., Bruand, C. and Schiex, T. 2013. Next-generation annotation of prokaryotic genomes with EuGene-P: application to Sinorhizobium meliloti 2011. DNA Res 20(4): 339-354.
  2. Vallenet, D., Belda, E., Calteau, A., Cruveiller, S., Engelen, S., Lajus, A., Le Fevre, F., Longin, C., Mornico, D., Roche, D. et al. 2013. MicroScope--an integrated microbial resource for the curation and comparative analysis of genomic and metabolic data. Nucleic Acids Res 41: D636-647.
  3. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
  4. Singhal, P., Jayaram, B., Dixit, S.B., and Beveridge, D.L. 2008. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Biophys J 94: 4173-4183.
  5. Omasits, U., Varadarajan, A. R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J. E., Robinson, M. D., Wollscheid, B., and Ahrens, C. H. 2017. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. Genome Res 27: 2083-2095.

iPtgxDB Release Info

Version: 1
Date: 19.09.2019

Downloads

TAR.GZ
Size: 11.9 MB
MD5: f34997a8e1fbdca5716edbfd88cef509
SHA1: 2ea4bb2577192d3009b13f40bce704671abddc61

ZIP
Size: 12.2 MB
MD5: 8deb79ffffedfb9fa18b9e83dd05ce01
SHA1: 66d7c7b662a9c0dd4c1a4ea8de4eed8261332194
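
A downloaded archive can be checked against the checksums listed above. The sketch below uses only Python's standard `hashlib` module. The helper names (`file_digest`, `verify`) are illustrative, and the local file path is whatever name the archive was saved under, which is not specified here.

```python
import hashlib

# Checksums as published in the release info above
EXPECTED = {
    "TAR.GZ": {
        "md5": "f34997a8e1fbdca5716edbfd88cef509",
        "sha1": "2ea4bb2577192d3009b13f40bce704671abddc61",
    },
    "ZIP": {
        "md5": "8deb79ffffedfb9fa18b9e83dd05ce01",
        "sha1": "66d7c7b662a9c0dd4c1a4ea8de4eed8261332194",
    },
}


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, archive_type):
    """Return True only if both MD5 and SHA1 match the published values."""
    expected = EXPECTED[archive_type]
    return all(file_digest(path, alg) == digest for alg, digest in expected.items())
```

For example, `verify("path/to/archive.tar.gz", "TAR.GZ")` returns `True` only when the file at that (hypothetical) path matches both published digests.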