Sinorhizobium meliloti strain 2011 (Genbank #NC_020528) is the reference strain [1], which includes two plasmids (#NC_020527 and #NC_020560).
An iPtgxDB was created by hierarchically integrating protein-coding sequences from three annotation resources (1-3) and three sets of predictions (4-6):
Hierarchy | Resource | Source / Version |
---|---|---|
1 | NCBI RefSeq | GCF_000346065.1_ASM34606v1; from 19/05/2017 |
2 | NCBI Genbank | GCA_000346065.1_ASM34606v1; from 31/01/2014 |
3 | Genoscope [2] | v2.7.3, accessed 14/11/2018 |
4 | Prodigal [3] | Ab initio gene predictions from Prodigal (v2.6) |
5 | ChemGenome [4] | Ab initio gene predictions from ChemGenome (v2.0; with parameters method: Swissprot, length threshold: 70 nt, initiation codons: ATG, CTG, TTG, GTG) |
6 | in silico ORFs | The in silico ORF annotations were generated as described by Omasits and Varadarajan et al., 2017 [5] |
Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq > Genbank > Genoscope > Prodigal > ChemGenome > in silico. Files were parsed to extract the identifiers, coordinates and sequences of bona fide protein-coding sequences (CDSs) and pseudogene entries. For extensions or reductions of already annotated CDSs, the full sequence was included, which allows such proteins to be identified using proteomics data obtained from undigested samples. For more detail on how we generate iPtgxDBs and how the identifiers can be interpreted, please see reference [5].
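The hierarchical integration can be illustrated with a minimal Python sketch. It is not the pipeline of reference [5]: the names `Orf`, `stop_key` and `integrate` are hypothetical, the coordinate convention is invented for the example, and grouping ORFs by a shared stop codon on the same strand is an assumption that this page does not state. Reading "above" the threshold as strictly longer than 18 aa is likewise an interpretation.

```python
from dataclasses import dataclass

# Resource priority as given in the hierarchy above (lower rank wins).
HIERARCHY = ["RefSeq", "Genbank", "Genoscope", "Prodigal", "ChemGenome", "in silico"]
RANK = {name: i for i, name in enumerate(HIERARCHY)}
MIN_LENGTH_AA = 18  # threshold quoted in the text; strict ">" is an assumption


@dataclass(frozen=True)
class Orf:
    """Hypothetical record for one annotated or predicted ORF."""
    resource: str
    strand: str   # "+" or "-"
    start: int    # 1-based, inclusive (assumed convention)
    end: int
    protein: str


def stop_key(orf):
    """Assumed grouping key: same strand and same stop-codon position."""
    stop = orf.end if orf.strand == "+" else orf.start
    return (orf.strand, stop)


def integrate(orfs, min_length_aa=MIN_LENGTH_AA):
    """Pick one anchor per group by hierarchy; keep other starts as variants."""
    clusters = {}
    for orf in orfs:
        if len(orf.protein) > min_length_aa:
            clusters.setdefault(stop_key(orf), []).append(orf)

    entries = []
    for key in sorted(clusters):
        members = sorted(clusters[key], key=lambda o: RANK[o.resource])
        anchor = members[0]          # highest-ranked resource in the group
        seen = {anchor.protein}
        variants = []
        for orf in members[1:]:
            if orf.protein in seen:  # identical to an entry already kept
                continue
            seen.add(orf.protein)
            longer = len(orf.protein) > len(anchor.protein)
            variants.append(("extension" if longer else "reduction", orf))
        entries.append({"anchor": anchor, "variants": variants})
    return entries


# Toy input: two resources disagree on the start of one ORF; a third ORF is too short.
example = [
    Orf("Prodigal", "+", 91, 300, "M" + "A" * 68),
    Orf("RefSeq", "+", 121, 300, "M" + "A" * 58),
    Orf("in silico", "+", 400, 432, "M" + "A" * 9),
]
entries = integrate(example)
```

In this toy input the RefSeq model becomes the anchor, the longer Prodigal model is kept with its full sequence as an extension, and the 10 aa in silico ORF is dropped by the length filter.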
iPtgxDB Release Info | |
---|---|
Version | 1 |
Date | 02.02.2022 |