Fig. 4 | BMC Genomics

Fig. 4

From: The effect of variant interference on de novo assembly for viral deep sequencing

The effect of genome length and read length on de novo assembly of simulated variants across a range of percentage identities (PID). a and b Comparison of genome lengths. Six different genome lengths were assembled, and the final contig counts were tallied across varying PID thresholds (75–99.6%). For the simulated genome lengths of 2 kb, 10 kb, 100 kb, and 1 Mb, the average contig number at each PID was plotted. Panel (b) shows a close-up view of the region where interference was most prominent. For all six genome lengths and each of the 13 iterations, variant interference (VI) consistently occurred in the same PID range (99.00–99.24%). The assembly transitions from VD to VI at a PID of 99.00% and from VI to VS at a PID of 99.24%. In addition, the longer the genome, the more contigs were produced during VI. c The relationship between genome length and the total number of contigs produced. Data from panel (a) were plotted on a logarithmic scale. The total number of contigs produced depends significantly on genome size (r² = 0.967; p-value < 0.0001). d and e The effect of read length on variant assembly with a genome size of 100 kb. Simulated data with four different read lengths were created and assembled, and the final contig counts were tallied across varying PID thresholds (75–99.6%). Panel (e) shows a close-up view of the region where interference was most apparent. With longer reads, the VI PID range was much narrower than with shorter reads.
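
The two PID thresholds reported in the caption for panels (a) and (b) can be read as a simple three-way classification. The sketch below is illustrative only and is not the authors' code: the function name `assembly_regime` and the constant names are hypothetical, and treating both boundary values as part of the VI range is an assumption based on the stated range of 99.00–99.24%. The caption also indicates that the VI range shifts with read length, so these constants should not be assumed to hold for other read lengths.

```python
# Thresholds as reported in the caption for the genome-length comparison.
VD_TO_VI_PID = 99.00  # assembly transitions from VD to VI at this PID
VI_TO_VS_PID = 99.24  # assembly transitions from VI to VS at this PID


def assembly_regime(pid: float) -> str:
    """Return 'VD', 'VI', or 'VS' for a percentage identity between variants.

    Assumption: both threshold values are counted as VI, matching the
    caption's stated VI range of 99.00-99.24%.
    """
    if not 0.0 <= pid <= 100.0:
        raise ValueError(f"PID must be between 0 and 100, got {pid}")
    if pid < VD_TO_VI_PID:
        return "VD"
    if pid <= VI_TO_VS_PID:
        return "VI"
    return "VS"


if __name__ == "__main__":
    # Values spanning the PID range tallied in the figure (75-99.6%).
    for pid in (75.0, 98.5, 99.1, 99.6):
        print(pid, assembly_regime(pid))
```

Keeping the thresholds as named constants makes it straightforward to substitute a different VI range, for example one estimated for another read length.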
