
Table 2 Summary of our methods for lengths 5, 21, and 10 to refer to 1- and 2-mismatch and 1- and 2-gap sequences

From: Perfect Hamming code with a hash table for faster genome mapping

| length | condition | #keys | #words | ratio | f(s, K) when s = c(s) | f(s, K) when s ≠ c(s) |
|---|---|---|---|---|---|---|
| 5 | 1-mismatch | 6.625 | 16 | 41.4% | 1 + 15x | 1 + 15x + 42x^2 + 54x^3 |
| 5 | 2-mismatches | 27.25 | 106 | 25.7% | 1 + 15 + 90x^2 + 210x^3 + 180x^4 | 1 + 15 + 90x^2 + 170x^3 + 156x^4 |
| 5 | 1-gap | 3.25 | 4 | 81.3% | 4 + 12x | 4 + 60x |
| 5 | 2-gaps | 10 | 16 | 62.5% | 16 + 36x + 108x^2 | – ∗1 |
| 21 | 1-mismatch | 30.53 | 64 | 47.7% | 1 + 63x | 1 + 63x + 210x^2 + 1710x^3 |
| 21 | 2-mismatches | 611.31 | 1954 | 31.3% | 1 + 63x + 1890x^2 + 4410x^3 + 34020x^4 | 1 + 63x + 1890x^2 + 5650x^3 + 31500x^4 |
| 21 | 1-gap | 3.81 | 4 | 95.3% | 4 + 60x | 4 + 252x |
| 21 | 2-gaps | 13.87 | 16 | 86.7% | 16 + 84x + 540x^2 | 16 + 48x + 960x^2 |
| 10: Serialize | 1-mismatch | 12.25 | 31 | 39.5% | 1 + 30x + 225x^2 | 1 + 30x + 170x^2 + 538x^3 + 1089x^4 + 1620x^5 ∗2 |
| 10: Parallelize | 1-mismatch | 13.25 | 31 | 44.1% | 1 + 30x | 1 + 30x + 84x^2 + 108x^3 ∗3 |

∗1: s always includes one code word.

∗2: Neither the first half nor the second half is a code word. The reference formula when one of the two halves is a code word is 1 + 30x^2 + 267x^2 + 684x^3 + 810x^4.

∗3: Neither the first half nor the second half is a code word. The reference formula when one of the two halves is a code word is 1 + 30x^2 + 42x^2 + 54x^3.
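The #words entries in the mismatch rows agree with a plain count of the sequences that differ from a fixed DNA sequence of length n in at most k positions, over a four-letter alphabet. The sketch below reproduces those entries under that assumption. It is not the authors' code, and the function name `mismatch_neighborhood_size` is hypothetical. The gap rows are not covered.

```python
from math import comb


def mismatch_neighborhood_size(length: int, max_mismatches: int,
                               alphabet_size: int = 4) -> int:
    """Count sequences within `max_mismatches` substitutions of a fixed sequence.

    Assumption: choose i of the `length` positions to change, and give each
    chosen position one of the (alphabet_size - 1) other letters.
    """
    return sum(
        comb(length, i) * (alphabet_size - 1) ** i
        for i in range(max_mismatches + 1)
    )


# (length, allowed mismatches) pairs taken from the mismatch rows of the table.
for length, k in [(5, 1), (5, 2), (21, 1), (21, 2), (10, 1)]:
    print(length, k, mismatch_neighborhood_size(length, k))
```

For the five (length, mismatches) pairs above, the printed counts are 16, 106, 64, 1954, and 31, matching the #words column.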