Listeria monocytogenes strain ScottA (serovar 4b; GenBank #CP023862) was isolated during the Massachusetts listeriosis outbreak in 1983 [1]. Although the strain is already available in NCBI as a RefSeq strain (GenBank #NZ_CM001159), assembled using a mixed strategy of de novo and reference-based assembly, its genome was re-sequenced and assembled purely de novo by combining PacBio and Illumina MiSeq reads, to obtain a complete genome sequence for subsequent proteogenomic studies [2].
An iPtgxDB was created by hierarchically integrating protein coding sequences from these annotation resources:
Hierarchy | Resource | Link / Description |
---|---|---|
1 | NCBI RefSeq | CP023862.1; from 23/10/2017 |
2 | Prodigal [4] | Ab initio gene predictions from Prodigal (v1.12) |
3 | in silico ORFs | The in silico ORF annotations were generated as described by Omasits and Varadarajan et al., 2017 |
Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq > Prodigal > in silico. Files were parsed to extract the identifier, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
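The hierarchical integration described above can be illustrated with a minimal Python sketch. It is not the published iPtgxDB pipeline. It makes several assumptions that are not stated in the text: annotations that share a strand and stop position are treated as the same locus, the length threshold is applied to every source and read as strictly greater than 18 aa, and the `Orf` record, the `integrate` function and the source labels are hypothetical names chosen for illustration.

```python
from typing import NamedTuple

# Source priority as given in the text: RefSeq > Prodigal > in silico.
# Lower number means higher priority. The labels are illustrative.
HIERARCHY = {"RefSeq": 1, "Prodigal": 2, "in_silico": 3}

# Length threshold from the text (18 aa). Treating "above" as strictly
# greater than 18 is an assumption.
MIN_LENGTH_AA = 18


class Orf(NamedTuple):
    """Hypothetical minimal record for one annotated protein-coding ORF."""
    source: str   # key into HIERARCHY
    strand: str   # "+" or "-"
    start: int    # coordinate of the start codon
    stop: int     # coordinate of the stop codon
    protein: str  # translated amino acid sequence


def integrate(orfs):
    """Keep one annotation per locus, preferring the higher-ranked source.

    Assumption: ORFs sharing strand and stop position describe the same locus.
    """
    best = {}
    for orf in orfs:
        if len(orf.protein) <= MIN_LENGTH_AA:
            continue  # too short to be considered
        locus = (orf.strand, orf.stop)
        current = best.get(locus)
        if current is None or HIERARCHY[orf.source] < HIERARCHY[current.source]:
            best[locus] = orf
    return sorted(best.values(), key=lambda o: (o.stop, o.strand))


# Toy input with made-up coordinates and sequences.
example = [
    Orf("in_silico", "+", 100, 400, "M" * 100),
    Orf("RefSeq", "+", 160, 400, "M" * 80),     # same stop, higher priority
    Orf("Prodigal", "-", 900, 500, "M" * 30),
    Orf("in_silico", "+", 1000, 1030, "M" * 10),  # below the threshold
]
merged = integrate(example)
```

In this toy input, the RefSeq entry replaces the in silico ORF that ends at the same stop position, the Prodigal prediction is kept because no higher-ranked source covers its locus, and the 10 aa ORF is dropped by the length filter.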
iPtgxDB Release Info | |
---|---|
Version | 1 |
Date | 23.10.2017 |