Figure 1 | BMC Genomics

Figure 1

From: DB2: a probabilistic approach for accurate detection of tandem duplication breakpoints using paired-end reads

A flowchart summarizing the framework implemented by DB2. Because the distance between the aligned ends of a concordantly mapped read pair approximates the real fragment length, we first extract the concordant read pairs from the BAM files and use them to obtain the empirical fragment length distribution. The everted (RF) read pairs, which are indicative of tandem duplications, are also extracted. Each RF pair is combined with the empirical fragment length distribution to represent the feasible breakpoints of the tandem duplication that induced it. Next, DB2 clusters the read pairs that may have been induced by the same tandem duplication, and thereby finds distinct tandem duplications along with their potential breakpoints. It scores each potential breakpoint using the empirical fragment length distribution and selects the highest-scoring one as the putative breakpoint of each tandem duplication. After the conflict resolution step eliminates likely false positives, the final set of tandem duplications is reported to the user.
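
The scoring idea in the caption can be illustrated with a minimal sketch. It is not DB2's actual scoring function, which the caption does not specify; it assumes that a candidate breakpoint is scored by the summed log-probability of the fragment lengths it implies for the RF pairs in a cluster. The function names, the `floor` probability for unseen lengths, and the toy coordinates are all hypothetical.

```python
from collections import Counter
from math import log


def empirical_distribution(fragment_lengths):
    """Relative frequency of each fragment length seen in concordant pairs."""
    counts = Counter(fragment_lengths)
    total = sum(counts.values())
    return {length: n / total for length, n in counts.items()}


def implied_fragment_length(start, end, rev_end, fwd_start):
    """Fragment length implied by a tandem duplication of [start, end].

    The fragment spans the junction between the two copies: it covers
    fwd_start..end in the first copy and start..rev_end in the second.
    """
    return (end - fwd_start + 1) + (rev_end - start + 1)


def log_score(start, end, rf_pairs, dist, floor=1e-9):
    """Assumed score: sum of log-probabilities of the implied lengths.

    `floor` (hypothetical) stands in for lengths never observed in the
    concordant pairs, so that log(0) is avoided.
    """
    return sum(
        log(dist.get(implied_fragment_length(start, end, r, f), floor))
        for r, f in rf_pairs
    )


# Toy data: fragment lengths from concordant pairs, and one cluster of
# RF pairs given as (reverse-read end, forward-read start) coordinates.
dist = empirical_distribution([300, 300, 300, 310, 290, 300, 305, 295])
rf_pairs = [(1100, 1900), (1120, 1920)]

good = log_score(1000, 2098, rf_pairs, dist)  # implies length 300 for both pairs
poor = log_score(1000, 2150, rf_pairs, dist)  # implies length 352 for both pairs
print(good > poor)
```

In this toy model the implied length depends only on the span `end - start` for a fixed pair, so several candidate breakpoints can tie; a fuller treatment would also constrain candidates by the read positions themselves.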
