ID | Length a | Start – End | Overlap | Vector sequence b | Vector sequence c | Total a+b+c | % vector contribution | Protein Aa | Protein M.W. (kDa) | e-value | Bit score | GC ratio (i) | GC ratio (ii) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
eka1 | 104 | 70,283 – 70,386 | No | 381 | 157 | 642 | 83.8 | 214 | 23.5 | >10 | * | 39.4 | 50.0 |
eka2 | 138 | 3,651,282 – 3,651,704 | Yes, 32% | 381 | 90 | 609 | 77.3 | 203 | 22.1 | >10 | * | 42.0 | 48.3 |
eka3 | 432 | 348,779 – 349,210 | No | 381 | 90 | 903 | 52.1 | 301 | 33.7 | 6e-04 | 46.2 | 47.0 | 48.6 |
eka4 | 105 | 49,681 – 49,785 | No | 381 | 90 | 576 | 81.7 | 192 | 20.9 | >10 | * | 49.5 | 50.0 |
eka5 | 141 | 57,173 – 57,313 | No | 381 | 90 | 612 | 76.9 | 204 | 22.2 | >10 | * | 43.2 | 50.8 |
eka6 | 96 | 70,285 – 70,380 | No | 381 | 90 | 567 | 83.1 | 189 | 20.5 | >10 | * | 39.6 | 48.3 |
- Start – End gives the genomic location of each selected sequence. 'a' is the length of the original genomic insert; 'b' and 'c' are the vector-contributed prefix and suffix DNA sequences, respectively. Total (a+b+c) is the entire DNA sequence expressed as protein. The pBAD vector contribution to the final protein sequence is given as a percentage. Aa is the number of amino acid residues in the synthesized protein. M.W. is the isotopically averaged molecular weight in kilodaltons (kDa). (i) is the GC ratio of the genomic insert, and (ii) is the GC ratio of the complete DNA sequence (vector + genomic DNA) expressed as protein. A large e-value together with a bit score approaching zero (*) indicates very low sequence similarity of the eka proteins to known protein sequences.
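The derived columns can be recomputed from a, b and c alone. The Python sketch below is illustrative only and is not the authors' method. It makes two assumptions: that the vector contribution is (b + c) / (a + b + c), and that Aa is the total length divided by three. Both are consistent with the tabulated values but are not stated explicitly in the table notes. The `Construct` class and its property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    """One table row; all lengths are in nucleotides."""
    name: str
    a: int  # genomic insert length
    b: int  # vector-contributed prefix length
    c: int  # vector-contributed suffix length

    @property
    def total(self) -> int:
        # Entire expressed DNA sequence: insert plus both vector parts.
        return self.a + self.b + self.c

    @property
    def vector_percent(self) -> float:
        # Assumption: vector share = (b + c) / (a + b + c), as a percentage.
        return 100.0 * (self.b + self.c) / self.total

    @property
    def residues(self) -> int:
        # Assumption: one amino acid residue per nucleotide triplet.
        return self.total // 3


# Values of a, b and c copied from the table.
CONSTRUCTS = [
    Construct("eka1", 104, 381, 157),
    Construct("eka2", 138, 381, 90),
    Construct("eka3", 432, 381, 90),
    Construct("eka4", 105, 381, 90),
    Construct("eka5", 141, 381, 90),
    Construct("eka6", 96, 381, 90),
]

for con in CONSTRUCTS:
    print(f"{con.name}: total={con.total} "
          f"vector={con.vector_percent:.1f}% Aa={con.residues}")
```

Under these assumptions the totals and residue counts reproduce the table exactly, and the vector percentages agree with the tabulated values to within 0.1 percentage points; the small differences suggest the published percentages were not all rounded the same way.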