Integrated proteogenomics database

B. diazoefficiens USDA110_tryptic

Bradyrhizobium diazoefficiens USDA 110 (GenBank accession NC_004463) is a widely used model organism for studying rhizobial symbiosis [1].

An iPtgxDB was created by hierarchically integrating protein-coding sequences from the following annotation resources:

Hierarchy | Resource | Link
1 | NCBI RefSeq | NC_004463.1; from 22/07/2013
2 | Ensembl | Ensembl Genomes project (GCA_000011365.1, Feb/2011)
3 | Genoscope [2] | NC_004463, accessed 09/09/2013
4 | CMR [3] | J. Craig Venter Institute's Comprehensive Microbial Resource (CMR)
5 | Prodigal [4] | Ab initio gene predictions from Prodigal (v2.5)
6 | ChemGenome [5] | Ab initio gene predictions from ChemGenome (v2.0, http://www.scfbio-iitd.res.in/chemgenome/chemgenomenew.jsp; with parameters: method, Swissprot space; length threshold, 70 nt; initiation codons, ATG, CTG, TTG, GTG)
7 | in silico ORFs | In silico ORF annotations were generated as described by Omasits and Varadarajan et al., 2017 [6]

Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq > Ensembl > Genoscope > CMR > Prodigal > ChemGenome > in silico. The annotation files were parsed to extract the identifiers, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
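The idea of a ranked hierarchy combined with a length filter can be illustrated with a minimal Python sketch. It is not the pipeline used to build this iPtgxDB. The `Orf` record, the function name `pick_anchor_annotations`, and above all the rule that annotations sharing a strand and stop coordinate describe the same locus are assumptions made for illustration. The ranks follow the resource table, and the 18 aa threshold is read here as "at least 18 aa", which is also an assumption.

```python
from dataclasses import dataclass

# Resource ranks as listed in the resource table (1 = highest priority).
HIERARCHY = ["RefSeq", "Ensembl", "Genoscope", "CMR",
             "Prodigal", "ChemGenome", "in silico"]
RANK = {name: rank for rank, name in enumerate(HIERARCHY, start=1)}

MIN_LENGTH_AA = 18  # threshold from the text; ">=" is an assumed reading


@dataclass(frozen=True)
class Orf:
    """Hypothetical record for one annotated or predicted ORF."""
    resource: str    # one of the names in HIERARCHY
    identifier: str
    strand: str      # "+" or "-"
    start: int
    stop: int
    protein: str     # translated amino acid sequence


def pick_anchor_annotations(orfs, min_length=MIN_LENGTH_AA):
    """Keep one annotation per locus, taken from the best-ranked resource.

    ASSUMPTION: a locus is keyed by (strand, stop coordinate). This is an
    illustrative simplification, not the documented integration procedure.
    """
    best = {}
    for orf in orfs:
        if len(orf.protein) < min_length:
            continue  # drop ORFs below the length threshold
        key = (orf.strand, orf.stop)
        current = best.get(key)
        if current is None or RANK[orf.resource] < RANK[current.resource]:
            best[key] = orf
    # Report in genome order, independent of strand.
    return sorted(best.values(), key=lambda o: (min(o.start, o.stop), o.identifier))
```

In this sketch a RefSeq entry and a Prodigal prediction that end at the same stop codon collapse into a single locus represented by the RefSeq entry, while a prediction at a locus not covered by any higher-ranked resource is kept as it is.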

References

  1. Kaneko, T., Nakamura, Y., Sato, S., Minamisawa, K., Uchiumi, T., Sasamoto, S., Watanabe, A., Idesawa, K., Iriguchi, M., Kawashima, K., Kohara, M., Matsumoto, M., Shimpo, S., Tsuruoka, H., Wada, T., Yamada, M., and Tabata, S. 2002. Complete genomic sequence of nitrogen-fixing symbiotic bacterium Bradyrhizobium japonicum USDA110. DNA Res 9(6): 189-197.
  2. Vallenet, D., Belda, E., Calteau, A., Cruveiller, S., Engelen, S., Lajus, A., Le Fevre, F., Longin, C., Mornico, D., Roche, D. et al. 2013. MicroScope--an integrated microbial resource for the curation and comparative analysis of genomic and metabolic data. Nucleic Acids Res 41: D636-647.
  3. Peterson, J.D., Umayam, L.A., Dickinson, T., Hickey, E.K., and White, O. 2001. The Comprehensive Microbial Resource. Nucleic Acids Res 29(1): 123-125.
  4. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
  5. Singhal, P., Jayaram, B., Dixit, S.B., and Beveridge, D.L. 2008. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Biophys J 94: 4173-4183.
  6. Omasits, U., Varadarajan, A.R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J.E., Robinson, M.D., Wollscheid, B., and Ahrens, C.H. 2017. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. bioRxiv, Cold Spring Harbor Labs Journals.
iPtgxDB Release Info

Version: 1
Date: 09.09.2013

Downloads

Format | Size | MD5 | SHA1
TAR.GZ | 18.5 MB | f57949d39a47b324a8a77feeba8a43e1 | 76a88012df77f8daf995ddc13d58581cbd325c18
ZIP | 19.0 MB | 91a8557ce067592eefe3cbaa585a5b76 | 5ccd535b20a378ed3105344629c126f26b181648
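
A downloaded archive can be checked against the MD5 and SHA1 digests listed in the release info. The sketch below uses Python's standard `hashlib` module and the digests published for the TAR.GZ archive. The local file name `iPtgxDB_download.tar.gz` is hypothetical and should be replaced with the actual name of the downloaded file.

```python
import hashlib
from pathlib import Path

# Digests published in the release info for the TAR.GZ archive.
EXPECTED = {
    "md5": "f57949d39a47b324a8a77feeba8a43e1",
    "sha1": "76a88012df77f8daf995ddc13d58581cbd325c18",
}


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in chunks so the archive need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED):
    """Map each algorithm name to True if the file's digest matches."""
    return {alg: file_digest(path, alg) == want.lower()
            for alg, want in expected.items()}


if __name__ == "__main__":
    archive = Path("iPtgxDB_download.tar.gz")  # hypothetical local file name
    if archive.exists():
        print(verify(archive))
    else:
        print(f"{archive} not found; download the archive first")
```

For the ZIP archive, pass a dictionary holding the ZIP digests from the release info as the `expected` argument.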