
Table 1 Description of eka sequences

From: Synthesizing non-natural parts from natural genomic template

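| ID | Length a | Start – End | Overlap | Vector sequence b | Vector sequence c | Total a+b+c | % vector contribution | Protein Aa | Protein M.W. (kDa) | e-value | Bit score | GC ratio (i) | GC ratio (ii) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eka1 | 104 | 70,283 – 70,386 | No | 381 | 157 | 642 | 83.8 | 214 | 23.5 | > 10 | * | 39.4 | 50.0 |
| eka2 | 138 | 3,651,282 – 3,651,704 | Yes, 32% | 381 | 90 | 609 | 77.3 | 203 | 22.1 | > 10 | * | 42.0 | 48.3 |
| eka3 | 432 | 348,779 – 349,210 | No | 381 | 90 | 903 | 52.1 | 301 | 33.7 | 6e-04 | 46.2 | 47.0 | 48.6 |
| eka4 | 105 | 49,681 – 49,785 | No | 381 | 90 | 576 | 81.7 | 192 | 20.9 | > 10 | * | 49.5 | 50.0 |
| eka5 | 141 | 57,173 – 57,313 | No | 381 | 90 | 612 | 76.9 | 204 | 22.2 | > 10 | * | 43.2 | 50.8 |
| eka6 | 96 | 70,285 – 70,380 | No | 381 | 90 | 567 | 83.1 | 189 | 20.5 | > 10 | * | 39.6 | 48.3 |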

  1. Start – End gives the genomic location of each selected sequence. 'a' is the length of the original genomic insert; 'b' and 'c' are the lengths of the vector-contributed prefix and suffix DNA sequences, respectively. Total (a+b+c) is the length of the entire DNA sequence expressed into protein. The pBAD vector contribution to the final protein sequence is given as a percentage. Aa is the number of amino acid residues in the synthesized protein. M.W. is the isotopically averaged molecular weight in kilodaltons (kDa). (i) is the GC ratio of the genomic insert, and (ii) is the GC ratio of the complete DNA sequence (vector + genomic DNA) expressed into protein. The large e-value and the extremely small bit score approaching zero (*) indicate very low sequence similarity of the eka proteins to known protein sequences.
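
The derived columns can be cross-checked from a, b, and c alone. The sketch below is illustrative only and rests on two assumptions that the footnote does not state explicitly: that the vector contribution is (b + c) / (a + b + c), and that Aa is the total length divided by three (one residue per codon). Both happen to reproduce the tabulated values, the percentages to within 0.1. The `Construct` class and `ROWS` list are hypothetical names introduced here, not part of the original work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    """One table row: insert and vector lengths as listed in Table 1."""
    name: str
    a: int  # length of the genomic insert
    b: int  # length of the vector-contributed prefix
    c: int  # length of the vector-contributed suffix

    @property
    def total(self) -> int:
        # Total a+b+c: the entire expressed DNA sequence
        return self.a + self.b + self.c

    @property
    def vector_percent(self) -> float:
        # ASSUMPTION: vector contribution = (b + c) / total, as a percentage
        return 100.0 * (self.b + self.c) / self.total

    @property
    def residues(self) -> int:
        # ASSUMPTION: one amino acid residue per three nucleotides
        return self.total // 3


# Values of a, b, c copied from Table 1
ROWS = [
    Construct("eka1", 104, 381, 157),
    Construct("eka2", 138, 381, 90),
    Construct("eka3", 432, 381, 90),
    Construct("eka4", 105, 381, 90),
    Construct("eka5", 141, 381, 90),
    Construct("eka6", 96, 381, 90),
]

if __name__ == "__main__":
    for row in ROWS:
        print(f"{row.name}: total={row.total}, "
              f"vector={row.vector_percent:.2f}%, Aa={row.residues}")
```

The computed percentages differ from the published ones only in the last digit, which suggests a mix of rounding and truncation in the table rather than a different formula.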