Listeria monocytogenes strain ScottA (serovar 4b; GenBank accession CP023862) was isolated during the Massachusetts listeriosis outbreak in 1983. The strain is already available in NCBI as a RefSeq strain (GenBank accession NZ_CM001159), assembled with a mixed strategy of de novo and reference-based assembly. Its genome was nevertheless re-sequenced and assembled with a purely de novo hybrid strategy that combines PacBio and Illumina MiSeq reads, in order to obtain a complete genome sequence for subsequent proteogenomic studies.
An iPtgxDB was created by hierarchically integrating protein coding sequences from these annotation resources:
|1||NCBI RefSeq||CP023862.1; from 23/10/2017|
|2||Prodigal||Ab initio gene predictions from Prodigal (v1.12)|
|3||in silico ORFs||The in silico ORF annotations were generated as described by Omasits and Varadarajan et al., 2017|
Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq > Prodigal > in silico. Files were parsed to extract the identifier, coordinates and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
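The hierarchical integration described above can be illustrated with a minimal Python sketch. It is not the iPtgxDB implementation. It assumes that entries ending at the same stop position on the same strand describe the same locus, that the length threshold applies to entries from every source, and that "above 18 aa" means strictly longer than 18 residues. The names `Orf`, `stop_key` and `integrate` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass

# Priority order taken from the hierarchy stated in the text.
HIERARCHY = ["RefSeq", "Prodigal", "in_silico"]
# Length threshold quoted in the text (in amino acids).
MIN_LENGTH_AA = 18


@dataclass(frozen=True)
class Orf:
    source: str      # one of HIERARCHY
    identifier: str
    strand: str      # "+" or "-"
    start: int       # 1-based, inclusive, start <= stop
    stop: int        # 1-based, inclusive
    protein: str     # translated sequence without a stop symbol


def stop_key(orf):
    """Assumed grouping key: strand plus the coordinate of the 3' end."""
    three_prime = orf.stop if orf.strand == "+" else orf.start
    return (orf.strand, three_prime)


def integrate(orfs):
    """Keep, for each stop position, the entry from the highest-ranked source."""
    rank = {name: i for i, name in enumerate(HIERARCHY)}
    best = {}
    for orf in orfs:
        # Assumption: "above the threshold" is read as strictly longer.
        if len(orf.protein) <= MIN_LENGTH_AA:
            continue
        key = stop_key(orf)
        current = best.get(key)
        if current is None or rank[orf.source] < rank[current.source]:
            best[key] = orf
    # Return the surviving entries in genomic order.
    return sorted(best.values(), key=lambda o: (o.start, o.stop))
```

In this sketch a RefSeq entry and a Prodigal prediction that share a stop position collapse to the RefSeq entry, while an in silico ORF survives only where no higher-ranked source annotates the same stop.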
|iPtgxDB Release Info|