Integrated proteogenomics database

Bacteria iconA. caccae DSM14662_tryptic

The extended simplified human intestinal microbiota (SIHUMIx) consists of eight bacterial members (Anaerostipes caccae (DSMZ 14662); Bacteroides thetaiotaomicron (DSMZ 2079); Bifidobacterium longum (NCC 2705); Blautia producta (DSMZ 2950); Clostridium butyricum (DSMZ 10702); Clostridium ramosum (DSMZ 1402); Escherichia coli K-12 (MG1655); Lactobacillus plantarum (DSMZ 20174)) of the human intestine and thus represents a model community to analyze such microbial interactions [1].

A tryptic iPtgxDB of Anaerostipes caccae (DSM 14662) was created by hierarchically integrating protein coding sequences from the following annotation resources:

Hierarchy Resource Link
1 NCBI RefSeq CP036345.1; from 22-FEB-2019 (NCBI Prokaryotic Genome Annotation Pipeline (PGAP v.4.7)
2 Prodigal [2] ab initio gene predictions from Prodigal (v2.6)
3 ChemGenome [3] ab initio gene predictions from ChemGenome (v2.0, http://www.scfbio-iitd.res.in/chemgenome/chemgenomenew.jsp; with parameters: method, Swissprot space; length threshold, 70 nt; initiation codons, ATG, CTG, TTG, GTG)
4 in silico ORFs in silico ORF annotations were generated as described by Omasits and Varadarajan et al., 2017 (v2.0, Only ORFs above a selectable length threshold (here 18 aa) were considered.)

The iPtgxDB was created using the hierarchy RefSeq > Prodigal > ChemGenome > in silico. Files were parsed to extract the identifier, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.

References

  1. Becker, N., Kunath, J., Loh, G. & Blaut, M. Human intestinal microbiota: Characterization of a simplified and stable gnotobiotic rat model. Gut Microbes 2, 25-33, doi:10.4161/gmic.2.1.14651 (2011).
  2. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
  3. Singhal, P., Jayaram, B., Dixit, S.B., and Beveridge, D.L. 2008. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Biophys J 94: 4173-4183.
  4. Omasits, U., Varadarajan, A. R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J. E., Robinson, M. D., Wollscheid, B., and Ahrens., C. H. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. bioRxiv, Cold Spring Harbor Labs Journals, 2017.
iPtgxDB Release Info
Versions

Version

1
Versions

Date

10.08.2020

Downloads icon Downloads

Compression icon

TAR.GZ

File icon

Size

5.7 MB
Data icon

MD5

a5c94b1460fb176997571001038b7fdb
Data icon

SHA1

d47b7f12f046c578e412b668451e79a6b0b13f18
Compression icon

ZIP

File icon

Size

5.9 MB
Data icon

MD5

920518acce7404537d799cd2c9656eef
Data icon

SHA1

01cacf8e3bdea2635ed3cca0197d6a5109516fc0