Bacillus subtilis strain 168 (GenBank #NC_000964.3) is one of the best-studied strains of this widely used Gram-positive prokaryotic model organism [1].
The iPtgxDB was created by hierarchically integrating protein-coding sequences from the following annotation resources:
Hierarchy | Resource | Link |
---|---|---|
1 | NCBI RefSeq 2018 | GCA_000009045.1_ASM904v1; from 15/01/2018 |
2 | NCBI RefSeq 2017 | GCF_000009045.1_ASM904v1; from 21/05/2017 |
3 | Genoscope [2] | v2.7.3, accessed 17/07/2018 |
4 | IMG [3] | Integrated Microbial Genomes (IMG) initiative of the Joint Genome Institute (JGI); Taxon ID: 646311909, from 17/07/2018 |
5 | Prodigal [4] | Ab initio gene predictions from Prodigal (v2.6) |
6 | ChemGenome [5] | Ab initio gene predictions from ChemGenome (v2.0, http://www.scfbio-iitd.res.in/chemgenome/chemgenomenew.jsp; with parameters: method, Swissprot space; length threshold, 70 nt; initiation codons, ATG, CTG, TTG, GTG) |
7 | in silico ORFs | The in silico ORF annotations were generated as described by Omasits and Varadarajan et al., 2017 [6] |
Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq 2018 > RefSeq 2017 > Genoscope > JGI > Prodigal > ChemGenome > in silico. Files were parsed to extract the identifiers, coordinates and sequences of bona fide protein-coding sequences (CDSs) and pseudogene entries. For extensions or reductions of already annotated CDSs, sequences were included only up to the first ArgC cleavage site, which allows such proteins to be identified from proteomics data obtained with any of the alternative proteases.
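The integration steps described above can be illustrated with a minimal Python sketch. It is not the iPtgxDB implementation. The `Orf` record, the grouping of ORFs by strand and stop coordinate, the inclusive length comparison, and the simplified ArgC rule (cleavage C-terminal to every arginine, with no exceptions) are all assumptions made for illustration. Only the resource order and the 18 aa threshold come from the text.

```python
from dataclasses import dataclass

# Resource priority as stated in the text (earlier = higher priority).
# "JGI" refers to the IMG resource listed in the table.
HIERARCHY = ["RefSeq 2018", "RefSeq 2017", "Genoscope", "JGI",
             "Prodigal", "ChemGenome", "in silico"]
RANK = {name: i for i, name in enumerate(HIERARCHY)}

MIN_LENGTH_AA = 18  # threshold from the text; the inclusive comparison is an assumption


@dataclass(frozen=True)
class Orf:
    """Hypothetical record for one annotated or predicted ORF."""
    resource: str
    strand: str
    start: int
    stop: int      # assumption: ORFs sharing strand and stop are variants of one locus
    protein: str


def truncate_at_first_argc(protein: str) -> str:
    """Keep the sequence up to and including the first arginine (R).

    Simplified ArgC rule: cleavage C-terminal to R, no exceptions.
    Sequences without R are returned unchanged.
    """
    idx = protein.find("R")
    return protein if idx == -1 else protein[: idx + 1]


def integrate(orfs):
    """Return one (anchor, variants) pair per locus.

    The anchor is the ORF from the highest-ranked resource; every other
    ORF with a different start is kept as a variant, truncated at the
    first ArgC site.
    """
    loci = {}
    for orf in orfs:
        if len(orf.protein) < MIN_LENGTH_AA:
            continue  # below the length threshold
        loci.setdefault((orf.strand, orf.stop), []).append(orf)

    result = []
    for key in sorted(loci):
        members = sorted(loci[key], key=lambda o: RANK[o.resource])
        anchor = members[0]
        seen_starts = {anchor.start}
        variants = []
        for other in members[1:]:
            if other.start in seen_starts:
                continue  # same start already covered by a higher-ranked resource
            seen_starts.add(other.start)
            variants.append((other, truncate_at_first_argc(other.protein)))
        result.append((anchor, variants))
    return result
```

With a RefSeq 2018 CDS and a longer Prodigal prediction that share a stop coordinate, `integrate` would keep the RefSeq entry as the anchor and store only the N-terminal part of the Prodigal extension, up to its first arginine.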
iPtgxDB Release Info | |
---|---|
Version | 1 |
Date | 06.02.2020 |