Methods overview of MID. a shows the construction of alignment regions. Head (A) and tail (G) substrings of read serve as a pair of anchors. Segment C in reference sequence denotes a deletion in read, segment D denotes an inversion in read and segment F denotes a translocation. A set of segment (B, C, D, E, F) in the reference sequence, as well as a set of segment (B, −D, F’, E) in the read sequence, form the target regions. b shows how to use a partition and recombine strategy to transfer the pair of overlapping MSs (A, B) to a set of non-overlapping MSs {(A, b), (a, B), (a, b)}. Segment A and B have overlapping region c, and segment a and b are generated by cutting region c off the original segment A and B. c shows the max-score path approach. The path starting with MSP
[1] and ending with MSP
[4] denotes a short read to be detected, and the path in red is the max-score path. MSP
[1], MSP
[2], MSP
[3], MSP
[4] are on the forward strand, and MSP
[−3] is on the reverse strand. MID starts with MSP
[1] and extends the path to MSP
[2], then if MSP
[3] and MSP
[−3] can both be matched, MID records both path candidates and ends with MSP
[4], therefore we have two path candidates {1, 2, 3, 4} and {1, 2, −3, 4}, then {1, 2, 3, 4} instead of {1, 2, −3, 4} will be chosen after scoring due to the reverse penalty for MSP
[−3] on the reverse strand. Otherwise if only MSP
[−3] is successfully matched, we have path candidates {1, 2, 4} and {1, 2, −3, 4}, after scoring {1, 2, −3, 4} would be chosen owing to the gap penalty, since path candidate {1, 2, 4} contains much more gaps. Therefore MSP
[−3] would be reported as detected MI