Integrated proteogenomics database

B. henselae Houston-1_tryptic

Bartonella henselae strain ATCC 49882 (Houston-1; GenBank accession NC_005956), isolated from an HIV-positive patient, is the reference strain [1].

An integrated proteogenomics database (iPtgxDB) was created by hierarchically integrating protein-coding sequences from the following annotation resources:

Hierarchy | Resource | Link
1 | NCBI RefSeq 2015 | GCA_000046705.1_ASM4670v1; from 07/30/2015
2 | NCBI RefSeq 2013 | Bartonella_henselae_Houston_1_uid57745; from 06/10/2013
3 | Ensembl | Ensembl's Genomes project (GCA_000046705.1, Feb/2015)
4 | Genoscope [2] | v2.7.3, accessed 03/09/2016
5 | Prodigal [3] | Ab initio gene predictions from Prodigal (v2.6)
6 | ChemGenome [4] | Ab initio gene predictions from ChemGenome (v2.0, http://www.scfbio-iitd.res.in/chemgenome/chemgenomenew.jsp; with parameters: method, Swissprot space; length threshold, 70 nt; initiation codons, ATG, CTG, TTG, GTG)
7 | in silico ORFs | The in silico ORF annotations were generated as described by Omasits and Varadarajan et al., 2017 [5]

Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq 2015 > RefSeq 2013 > Ensembl > Genoscope > Prodigal > ChemGenome > in silico, matching the ranks in the table above. Files were parsed to extract the identifiers, coordinates, and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
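
The length filter and the hierarchical choice between resources can be illustrated with a short Python sketch. It is not the iPtgxDB implementation. The Orf record, the integrate function, and the rule that entries on the same strand sharing a stop position describe the same locus are all assumptions made for illustration, as is treating the 18 aa threshold as inclusive. The example passes only a subset of the resources listed above.

```python
from dataclasses import dataclass

MIN_LENGTH_AA = 18  # threshold from the text; treated as inclusive here (assumption)


@dataclass(frozen=True)
class Orf:
    """Hypothetical record for one parsed annotation entry."""
    resource: str    # annotation source, e.g. "RefSeq 2015"
    identifier: str
    left: int        # smaller genomic coordinate (1-based)
    right: int       # larger genomic coordinate (1-based)
    strand: str      # "+" or "-"
    sequence: str    # protein sequence, one-letter code


def integrate(orfs, hierarchy, min_length=MIN_LENGTH_AA):
    """Keep one entry per locus, preferring resources earlier in `hierarchy`."""
    rank = {name: position for position, name in enumerate(hierarchy)}
    best = {}
    for orf in orfs:
        # Drop entries shorter than the length threshold.
        if len(orf.sequence) < min_length:
            continue
        # Assumption: same strand + same stop position = same locus.
        stop_position = orf.right if orf.strand == "+" else orf.left
        key = (orf.strand, stop_position)
        current = best.get(key)
        if current is None or rank[orf.resource] < rank[current.resource]:
            best[key] = orf
    return sorted(best.values(), key=lambda o: (o.left, o.strand))


# Toy entries: two annotations of one locus with different starts,
# plus a 10 aa in silico ORF that falls below the threshold.
example = [
    Orf("Ensembl", "ens_1", 100, 402, "+", "M" + "A" * 99),
    Orf("RefSeq 2015", "ref_1", 130, 402, "+", "M" + "A" * 89),
    Orf("in silico", "orf_1", 500, 532, "+", "M" + "A" * 9),
]
selected = integrate(example, ["RefSeq 2015", "Ensembl", "in silico"])
for orf in selected:
    print(orf.resource, orf.identifier, orf.left, orf.right)
```

In this toy input the RefSeq 2015 entry is kept for the shared locus because it ranks higher than Ensembl, and the short in silico ORF is discarded by the length filter.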

References

  1. Alsmark, C.M., Frank, A.C., Karlberg, E.O., Legault, B.A., Ardell, D.H., Canback, B., Eriksson, A.S., Naslund, A.K., Handley, S.A., Huvet, M. et al. 2004. The louse-borne human pathogen Bartonella quintana is a genomic derivative of the zoonotic agent Bartonella henselae. Proc Natl Acad Sci U S A 101: 9716-9721.
  2. Vallenet, D., Belda, E., Calteau, A., Cruveiller, S., Engelen, S., Lajus, A., Le Fevre, F., Longin, C., Mornico, D., Roche, D. et al. 2013. MicroScope--an integrated microbial resource for the curation and comparative analysis of genomic and metabolic data. Nucleic Acids Res 41: D636-647.
  3. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
  4. Singhal, P., Jayaram, B., Dixit, S.B., and Beveridge, D.L. 2008. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Biophys J 94: 4173-4183.
  5. Omasits, U., Varadarajan, A.R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J.E., Robinson, M.D., Wollscheid, B., and Ahrens, C.H. 2017. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. bioRxiv, Cold Spring Harbor Labs Journals.

iPtgxDB Release Info

Version: 1
Date: 09.11.2016

Downloads

TAR.GZ
Size: 3.1 MB
MD5: 8b1af7fdd86cdce01722d6a2eb5bfab3
SHA1: 67ae0d395b5e18330e2a2b741d1283b2601d3745

ZIP
Size: 3.2 MB
MD5: bd95ee622fec42658fd30e87e1def2ab
SHA1: 6613aedce3eae06ba3d9b71d280a34ba3bc7a6a5
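
The MD5 and SHA1 values listed above can be used to check that a downloaded archive is intact. The following Python sketch computes both digests with the standard hashlib module and compares them with the published values. The function names are chosen for illustration, and the archive file name in the usage comment is hypothetical, since the page does not state one.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) for the file at `path`, read in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def matches_published(path, expected_md5, expected_sha1):
    """True only if both digests equal the published values (case-insensitive)."""
    md5, sha1 = file_digests(path)
    return md5 == expected_md5.lower() and sha1 == expected_sha1.lower()


# Usage with the TAR.GZ checksums listed above; the file name is hypothetical:
# ok = matches_published(
#     "downloaded_iptgxdb.tar.gz",
#     "8b1af7fdd86cdce01722d6a2eb5bfab3",
#     "67ae0d395b5e18330e2a2b741d1283b2601d3745",
# )
```

A mismatch in either digest suggests an incomplete or altered download, in which case the archive should be fetched again.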