Integrated proteogenomics database

C. ramosum DSM1402_tryptic

The extended simplified human intestinal microbiota (SIHUMIx) consists of eight bacterial members of the human intestine: Anaerostipes caccae (DSMZ 14662), Bacteroides thetaiotaomicron (DSMZ 2079), Bifidobacterium longum (NCC 2705), Blautia producta (DSMZ 2950), Clostridium butyricum (DSMZ 10702), Clostridium ramosum (DSMZ 1402), Escherichia coli K-12 (MG1655), and Lactobacillus plantarum (DSMZ 20174). It thus represents a model community for analyzing microbial interactions [1].

A tryptic iPtgxDB of Clostridium ramosum (DSM 1402) was created by hierarchically integrating protein coding sequences from the following annotation resources:

Hierarchy | Resource | Link
1 | NCBI RefSeq | CP036346.1 from 22-FEB-2019 (NCBI Prokaryotic Genome Annotation Pipeline, PGAP v.4.7)
2 | Prodigal [2] | ab initio gene predictions from Prodigal (v2.6)
3 | ChemGenome [3] | ab initio gene predictions from ChemGenome (v2.0, http://www.scfbio-iitd.res.in/chemgenome/chemgenomenew.jsp; with parameters: method, Swissprot space; length threshold, 70 nt; initiation codons, ATG, CTG, TTG, GTG)
4 | in silico ORFs | in silico ORF annotations generated as described by Omasits and Varadarajan et al., 2017 [4] (v2.0). Only ORFs above a selectable length threshold (here 18 aa) were considered.
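The length threshold applied to in silico ORFs can be illustrated with a minimal sketch. It is not the procedure of Omasits and Varadarajan et al.: the start-codon set (borrowed from the ChemGenome parameters above), the stop-codon set, the start-to-stop scanning rule, and the function names `orfs_in_strand` and `reverse_complement` are all assumptions made for illustration. Only the 18 aa cutoff comes from the text.

```python
# Illustrative only: codon sets and scanning rule are assumptions, not the published method.
STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "CTG", "TTG", "GTG"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement so the opposite strand can be scanned too."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_aa=18):
    """Yield (start, end) for ORFs on one strand, 0-based and half-open.

    In each of the three frames, an ORF runs from the first start codon
    seen after the previous stop to the next in-frame stop codon
    (stop included in the coordinates). ORFs encoding fewer than
    min_aa amino acids (stop excluded) are dropped.
    """
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if codon in STOPS:
                if start is not None and (pos - start) // 3 >= min_aa:
                    yield start, pos + 3
                start = None
            elif start is None and codon in STARTS:
                start = pos
```

Calling `orfs_in_strand(reverse_complement(seq))` would cover the other strand; the coordinates returned are then relative to the reverse-complemented sequence.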

The iPtgxDB was created using the hierarchy RefSeq > Prodigal > ChemGenome > in silico. Files were parsed to extract the identifier, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
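The priority order can be sketched as follows. This is a simplified illustration, not the iPtgxDB implementation: it assumes that predictions from different resources belong together when they share strand and stop position, which the text above does not state. The `Cds` record and the `integrate` function are hypothetical names.

```python
from collections import defaultdict
from typing import NamedTuple

# Priority order taken from the text: RefSeq > Prodigal > ChemGenome > in silico.
PRIORITY = ["RefSeq", "Prodigal", "ChemGenome", "in silico"]
RANK = {name: i for i, name in enumerate(PRIORITY)}


class Cds(NamedTuple):
    """Hypothetical minimal record for one parsed CDS or pseudogene entry."""
    source: str
    identifier: str
    strand: str
    start: int
    stop: int


def integrate(annotations):
    """Group entries by (strand, stop) and order each group by resource rank.

    Returns a list of (anchor, alternatives) pairs, where the anchor is the
    entry from the highest-ranked resource. The grouping key is an assumption.
    """
    clusters = defaultdict(list)
    for cds in annotations:
        clusters[(cds.strand, cds.stop)].append(cds)
    merged = []
    for key in sorted(clusters):
        members = sorted(clusters[key], key=lambda c: RANK[c.source])
        merged.append((members[0], members[1:]))
    return merged
```

With this layout, a RefSeq entry always becomes the anchor of its group, and lower-ranked predictions that differ only in start position are kept as alternatives rather than discarded.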

References

  1. Becker, N., Kunath, J., Loh, G. & Blaut, M. Human intestinal microbiota: Characterization of a simplified and stable gnotobiotic rat model. Gut Microbes 2, 25-33, doi:10.4161/gmic.2.1.14651 (2011).
  2. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
  3. Singhal, P., Jayaram, B., Dixit, S.B., and Beveridge, D.L. 2008. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Biophys J 94: 4173-4183.
  4. Omasits, U., Varadarajan, A. R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J. E., Robinson, M. D., Wollscheid, B., and Ahrens, C. H. 2017. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. bioRxiv, Cold Spring Harbor Labs Journals.

iPtgxDB Release Info

Version: 1
Date: 10.08.2020

Downloads

TAR.GZ
Size: 3.5 MB
MD5: b4e1eeb51ddbd0fe53d652cea2e05af4
SHA1: 48c4229fac152e60ca32b457ce3963cf9d9a85e4

ZIP
Size: 3.6 MB
MD5: 353287718ea4031c58ae4ee9b9973fcb
SHA1: 53e5f139312c97bbf022ace8813855d130c04b8d
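
The MD5 and SHA1 values listed above can be used to check that a downloaded archive is intact. A minimal sketch using Python's standard `hashlib` module follows; the function names `file_digest` and `verify_download` are illustrative, and the local file name in the usage note is hypothetical because the page does not state the archive names.

```python
import hashlib

# Checksums copied from the release info above.
PUBLISHED = {
    "tar.gz": {
        "md5": "b4e1eeb51ddbd0fe53d652cea2e05af4",
        "sha1": "48c4229fac152e60ca32b457ce3963cf9d9a85e4",
    },
    "zip": {
        "md5": "353287718ea4031c58ae4ee9b9973fcb",
        "sha1": "53e5f139312c97bbf022ace8813855d130c04b8d",
    },
}


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, archive_type):
    """Return True only if both the MD5 and SHA1 digests match the published values."""
    expected = PUBLISHED[archive_type]
    return all(file_digest(path, alg) == value for alg, value in expected.items())
```

For example, `verify_download("iptgxdb_download.tar.gz", "tar.gz")` (hypothetical file name) should return `True` for an uncorrupted copy of the TAR.GZ archive.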