Fig. 3 | Journal of Biological Engineering

Fig. 3

From: High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping


Sequencing result analysis and data recovery. (A) The quality value at each base position along the reads. The first half of the x axis corresponds to read 1 and the second half to read 2. (B) The error rate at each base position along the reads. The first half of the distribution corresponds to read 1 and the second half to read 2. (C) The base content at each base position along the reads. A/T/G/C denote the nucleotide types, and N denotes a lost nucleotide, which can be any one of A/T/G/C. The distribution is split between the two reads. Note that for (A), (B) and (C), read 1 and read 2 are obtained by randomly sequencing from either end of each sequence. (D) The experimental procedure for data recovery. The amplified and prepared synthetic oligo samples are sequenced using Illumina HiSeq sequencing technology. In five sets of down-sampling trials, randomly chosen portions of the raw sequence reads, of different sizes, are sent to the decoder, where the stored files are recovered. (E) The number of correctly recovered sequences against the coverage. Black circle markers represent sequences recovered before RA decoding, and diamond markers represent sequences recovered after RA decoding. Among the diamond markers, red ones represent partial recovery and green ones represent full recovery.
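The down-sampling step in panel (D) and the coverage axis in panel (E) can be illustrated with a minimal sketch. It is not the authors' pipeline. The function names `downsample_reads` and `coverage`, the five fractions, and the toy reads are all hypothetical, and coverage is assumed here to mean the average number of reads per designed oligo, which the caption does not define.

```python
import random


def downsample_reads(reads, fraction, seed=0):
    """Return a random subset of reads, drawn without replacement.

    `fraction` is the share of raw reads to keep (assumed parameter).
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so a trial can be repeated
    k = max(1, round(len(reads) * fraction))
    return rng.sample(reads, k)


def coverage(num_reads, num_designed_oligos):
    """Average reads per designed oligo (assumed definition of coverage)."""
    return num_reads / num_designed_oligos


# Toy data: 10 hypothetical oligos, each sequenced 20 times (200 reads).
designed = [f"OLIGO{i}" for i in range(10)]
reads = [oligo for oligo in designed for _ in range(20)]

# Five illustrative down-sampling sizes; the real sizes are not given.
for fraction in (0.1, 0.25, 0.5, 0.75, 1.0):
    subset = downsample_reads(reads, fraction, seed=42)
    print(fraction, len(subset), coverage(len(subset), len(designed)))
```

In an actual experiment, each subset would be passed to the decoder, and the number of correctly recovered sequences would be plotted against the resulting coverage.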
