Integrated proteogenomics database

B. producta DSM2950_tryptic

The extended simplified human intestinal microbiota (SIHUMIx) consists of eight bacterial species from the human intestine: Anaerostipes caccae (DSMZ 14662), Bacteroides thetaiotaomicron (DSMZ 2079), Bifidobacterium longum (NCC 2705), Blautia producta (DSMZ 2950), Clostridium butyricum (DSMZ 10702), Clostridium ramosum (DSMZ 1402), Escherichia coli K-12 (MG1655) and Lactobacillus plantarum (DSMZ 20174). It thus serves as a model community for analyzing microbial interactions in the gut [1].

A tryptic iPtgxDB of Bifidobacterium longum (NCC 2705) was created by hierarchically integrating protein-coding sequences from the following annotation resources:

Hierarchy | Resource | Source details
1 | NCBI RefSeq | NC_004307.2 / NC_004943.1; from 06-APR-2020
2 | Prodigal [2] | ab initio gene predictions from Prodigal (v2.6)
3 | ChemGenome [3] | ab initio gene predictions from ChemGenome (v2.0, http://www.scfbio-iitd.res.in/chemgenome/chemgenomenew.jsp); parameters: method = Swissprot space; length threshold = 70 nt; initiation codons = ATG, CTG, TTG, GTG
4 | in silico ORFs | in silico ORF annotations generated as described by Omasits, Varadarajan et al., 2017 [4] (v2.0); only ORFs above a selectable length threshold (here 18 aa) were considered
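
To illustrate what a length-filtered in silico ORF annotation can look like, the following is a minimal Python sketch. It assumes that an ORF is simply a stop-to-stop stretch in one of the six reading frames and that the standard amino acid assignments apply; the actual procedure in [4] may define ORFs differently (for example, by requiring a start codon) and also records genomic coordinates. The function names and the `min_aa` parameter are hypothetical.

```python
from itertools import product

# Standard codon table in NCBI order (bases T, C, A, G); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def in_silico_orfs(genome, min_aa=18):
    """Yield (strand, frame, protein) for stop-to-stop ORFs of >= min_aa residues.

    Assumption: stop-to-stop ORFs in all six frames; 18 aa mirrors the
    threshold quoted in the table above.
    """
    genome = genome.upper()
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(seq[frame:])
            for segment in protein.split("*"):
                if len(segment) >= min_aa:
                    yield strand, frame, segment
```

Raising `min_aa` shrinks the search space at the cost of missing short proteins, which is the trade-off the selectable threshold controls.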

The iPtgxDB was created using the hierarchy RefSeq > Prodigal > ChemGenome > in silico. Files were parsed to extract the identifier, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
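
The hierarchy can be pictured with a small Python sketch. It assumes that annotations from different resources describe the same locus when they share strand and stop coordinate, and that the highest-ranked resource supplies the primary (anchor) entry; this matching rule, the `Annotation` record, and the `integrate` function are illustrative assumptions rather than the pipeline's actual implementation.

```python
from collections import defaultdict
from typing import NamedTuple

# Resource order taken from the hierarchy stated above (highest priority first).
HIERARCHY = ["RefSeq", "Prodigal", "ChemGenome", "in silico"]
RANK = {name: i for i, name in enumerate(HIERARCHY)}


class Annotation(NamedTuple):
    """Hypothetical minimal record for one parsed CDS or pseudogene entry."""
    source: str
    identifier: str
    strand: str
    start: int
    stop: int


def integrate(annotations):
    """Cluster annotations by (strand, stop) and rank members by resource.

    Returns a list of (anchor, alternatives) pairs, where the anchor comes
    from the highest-ranked resource present in the cluster.
    """
    clusters = defaultdict(list)
    for ann in annotations:
        clusters[(ann.strand, ann.stop)].append(ann)

    integrated = []
    for key in sorted(clusters):
        members = sorted(clusters[key], key=lambda a: RANK[a.source])
        integrated.append((members[0], members[1:]))
    return integrated
```

For example, a RefSeq entry and a Prodigal prediction that end at the same stop codon on the same strand would form one cluster, with the RefSeq entry as anchor and the Prodigal prediction kept as an alternative start site.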

References

  1. Becker, N., Kunath, J., Loh, G., and Blaut, M. 2011. Human intestinal microbiota: Characterization of a simplified and stable gnotobiotic rat model. Gut Microbes 2: 25-33. doi:10.4161/gmic.2.1.14651.
  2. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
  3. Singhal, P., Jayaram, B., Dixit, S.B., and Beveridge, D.L. 2008. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Biophys J 94: 4173-4183.
  4. Omasits, U., Varadarajan, A.R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J.E., Robinson, M.D., Wollscheid, B., and Ahrens, C.H. 2017. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. bioRxiv, Cold Spring Harbor Labs Journals.

iPtgxDB Release Info

Version: 1
Date: 10.08.2020

Downloads

Format | Size | MD5 | SHA1
TAR.GZ | 10.5 MB | 93bc5d8768880635215d62e7eb28e876 | 838b61e5edbf397178d3fc2c3af8bdeadea1db15
ZIP | 10.8 MB | c158a1b6c42e1468fd873af38151960b | f6418ac862ddbf68a50d87c61f9d58ab8da92e4e
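
The published MD5 and SHA1 values can be used to check a downloaded archive for corruption. The following is a minimal Python sketch using the standard `hashlib` module; the helper names `file_digest` and `verify` are hypothetical, as is any local file name you pass in, since the page does not state the archive names.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify(path, expected, algorithm="md5"):
    """Compare a file's digest with the checksum published for the release."""
    return file_digest(path, algorithm) == expected.strip().lower()
```

For the TAR.GZ archive, one would call `verify(local_path, "93bc5d8768880635215d62e7eb28e876")` for MD5 or `verify(local_path, "838b61e5edbf397178d3fc2c3af8bdeadea1db15", "sha1")` for SHA1, where `local_path` is wherever the download was saved.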