Fig. 2 | BMC Genomics

Fig. 2

From: BASE: a practical de novo assembler for large genomes using long NGS reads

Remove branches in backward extension tree. In the backward extension tree, we try to remove erroneous, repetitive and heterozygous branches to obtain the consensus sequence of the extended region. As an example, consider a node v with two child nodes a and b. First, combined with L(v), we obtain TL(v) for a and GL(v) for b to detect erroneous branches between a and b. We incrementally calculate the depth of the sub-sequences of a (sub-a_i, of length i): T, TA, TAT, …, and of b (sub-b_i, of length i): G, GA, GAT, …, until the depth of sub-a_i falls below the user-defined threshold τ. During this process, if Dep(sub-a_i) is significantly smaller than Dep(sub-a_{i-1}), Dep(sub-a_i) is significantly smaller than d_i, and Dep(sub-b_i) is significantly larger than Dep(sub-a_i), then branch a is treated as an erroneous or repetitive branch. When there is no erroneous signal, we further try to remove a branch that might be caused by heterozygosity. After obtaining the two consensus sequences of the sub-trees rooted at a and b, we compare them to find the matched region and obtain its depth. We then use this depth to calculate a base depth and compare it with the base depth calculated from the depth of the initial seed. If the two sequences are highly similar and the two base depths are close to each other, we treat a as a heterozygous branch if W(a) is smaller than W(b).
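
The erroneous-branch test described in this caption can be illustrated with a minimal Python sketch. It is not the BASE implementation, and several things are assumptions made only for illustration: depth is counted naively as the number of reads containing a sub-sequence, "significantly smaller" is modelled by a hypothetical `ratio` parameter, the expected depth d_i is supplied by a caller-provided `expected_depth` function, and the three-way signal is checked before the τ stop condition. The names `depth` and `is_suspect_branch` are likewise hypothetical.

```python
from typing import Callable, List, Optional


def depth(seq: str, reads: List[str]) -> int:
    """Naive depth (assumption): number of reads containing seq."""
    return sum(1 for read in reads if seq in read)


def is_suspect_branch(
    a: str,
    b: str,
    reads: List[str],
    tau: int,
    expected_depth: Callable[[int], float],
    ratio: float = 0.5,  # hypothetical cutoff for "significantly smaller"
) -> bool:
    """Return True if branch a shows the erroneous/repetitive signal against b."""
    prev_dep_a: Optional[int] = None
    for i in range(1, min(len(a), len(b)) + 1):
        dep_a = depth(a[:i], reads)  # Dep(sub-a_i)
        dep_b = depth(b[:i], reads)  # Dep(sub-b_i)
        if (
            prev_dep_a is not None
            and dep_a < ratio * prev_dep_a          # sharp drop from Dep(sub-a_{i-1})
            and dep_a < ratio * expected_depth(i)   # well below d_i
            and dep_a < ratio * dep_b               # b is much better supported
        ):
            return True
        if dep_a < tau:  # stop extending once support for a falls below tau
            break
        prev_dep_a = dep_a
    return False


# Toy data: ten reads support the G branch, a single read supports the T branch.
reads = ["GATCC"] * 10 + ["TAGGG"]
print(is_suspect_branch("TAT", "GAT", reads, tau=1, expected_depth=lambda i: 10.0))
```

In this toy data set the T branch loses almost all of its support at length 2 while the G branch keeps its depth, so the T branch is flagged. A real assembler would query an index over the reads rather than scan them, and would use its own statistical definition of a significant difference.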
