# *Escherichia coli* BW25113

| iPtgxDB Release     ||
| ------: | :--------: |
| Version | 1          |
| Date    | 26.09.2016 |

## Downloaded from
[/database/annotations/e-coli-bw25113](/database/annotations/e-coli-bw25113)

This is the parental strain (GenBank accession [CP009273](https://www.ncbi.nlm.nih.gov/nuccore/CP009273)) of the widely used *Escherichia coli* Keio gene knockout collection [1].

An iPtgxDB was created by hierarchically integrating protein coding sequences from the following annotation resources:

| Hierarchy | Resource | Details |
| :---: | :--- | :--- |
| 1 | NCBI RefSeq | CP009273.1; from 30/10/2014 |
| 2 | IMG [2] | Integrated Microbial Genomes (IMG) initiative of the Joint Genome Institute (JGI); Ga0058822, from 12/08/2014 |
| 3 | Prodigal [3] | *Ab initio* gene predictions from Prodigal (v2.6) |
| 4 | ChemGenome [4] | *Ab initio* gene predictions from ChemGenome (v2.0, http://www.scfbio-iitd.res.in/chemgenome/chemgenomenew.jsp; with parameters: method, Swissprot space; length threshold, 70 nt; initiation codons, ATG, CTG, TTG, GTG) |
| 5 | *in silico* ORFs [5] | The *in silico* ORF annotations were generated as described by Omasits and Varadarajan et al., 2017 [5] |

Only ORFs above a selectable length threshold (here 18 aa) were considered. The iPtgxDB was created using the hierarchy RefSeq > IMG > Prodigal > ChemGenome > *in silico* ORFs. The annotation files were parsed to extract the identifiers, coordinates, and sequences of bona fide protein-coding sequences (CDS) and pseudogene entries.
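The length filter and the annotation hierarchy described above can be illustrated with a minimal Python sketch. It is not the actual iPtgxDB pipeline: the `Orf` record, the `integrate` function, and the rule of treating annotations on the same strand with the same stop coordinate as competing calls for one ORF are all assumptions made for illustration. Whether the 18 aa threshold is inclusive is also an assumption (the sketch keeps ORFs of 18 aa or longer). `IMG` denotes the JGI annotation at hierarchy level 2.

```python
from dataclasses import dataclass

# Hierarchy as listed in the table above; a lower index means higher priority.
HIERARCHY = ["RefSeq", "IMG", "Prodigal", "ChemGenome", "in silico"]
RANK = {name: i for i, name in enumerate(HIERARCHY)}

MIN_LENGTH_AA = 18  # threshold used for this release (inclusive here: an assumption)


@dataclass(frozen=True)
class Orf:
    """Hypothetical record for one annotated ORF (illustration only)."""
    source: str      # one of HIERARCHY
    identifier: str
    strand: str      # "+" or "-"
    start: int
    stop: int
    protein: str     # translated amino acid sequence


def integrate(orfs, min_length_aa=MIN_LENGTH_AA):
    """Keep one ORF per (strand, stop), preferring the highest-ranked source.

    Assumption: annotations sharing strand and stop coordinate describe the
    same ORF and differ only in their predicted start site.
    """
    best = {}
    for orf in orfs:
        if len(orf.protein) < min_length_aa:
            continue  # drop ORFs below the length threshold
        key = (orf.strand, orf.stop)
        current = best.get(key)
        if current is None or RANK[orf.source] < RANK[current.source]:
            best[key] = orf
    # Return in a deterministic order: by stop coordinate, then strand.
    return sorted(best.values(), key=lambda o: (o.stop, o.strand))
```

Called with a list of `Orf` records from several resources, `integrate` returns a single representative per stop position, so a RefSeq entry takes precedence over a Prodigal prediction that ends at the same coordinate.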

### References
1. Baba, T., Ara, T., Hasegawa, M., Takai, Y., Okumura, Y., Baba, M., Datsenko, K.A., Tomita, M., Wanner, B.L., and Mori, H. 2006. Construction of *Escherichia coli* K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2: 2006.0008.
2. Markowitz, V.M., Mavromatis, K., Ivanova, N.N., Chen, I.M., Chu, K., and Kyrpides, N.C. 2009. IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics 25: 2271-2278.
3. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
4. Singhal, P., Jayaram, B., Dixit, S.B., and Beveridge, D.L. 2008. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Biophys J 94: 4173-4183.
5. Omasits, U., Varadarajan, A.R., Schmid, M., Goetze, S., Melidis, D., Bourqui, M., Nikolayeva, O., Quebatte, M., Patrignani, A., Dehio, C., Frey, J.E., Robinson, M.D., Wollscheid, B., and Ahrens, C.H. 2017. An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics. bioRxiv, Cold Spring Harbor Labs Journals.
