
Table 6. Edit cluster for unidirectional promoters. Word-based clusters, built with the edit distance metric, for the two most overrepresented words in the unidirectional promoters. Rank 1 refers to the word ACCCGCCT and Rank 2 to the word CTTCTTTC.

From: Word-based characterization of promoters involved in human DNA repair pathways

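(a) Rank 1

| Word | S | ES | O | EO | S·ln(S/ES) | Rev. Comp. | Position | Palindrome |
|---|---|---|---|---|---|---|---|---|
| ACCCGCCT | 4 | 0.716577 | 4 | 0.727273 | 6.87826 | AGGCGGGT | 19440 | No |
| AGCCGGCT | 3 | 0.805285 | 3 | 0.818182 | 3.94551 | AGCCGGCT | 14 | Yes |
| AGGCGCCT | 3 | 1.11427 | 3 | 1.13636 | 2.97124 | AGGCGCCT | 92 | Yes |
| AAGCGCCT | 4 | 2.15617 | 4 | 2.22727 | 2.47184 | AGGCGCTT | 5872 | No |
| ACCTGCAT | 2 | 0.592063 | 2 | 0.6 | 2.43458 | ATGCAGGT | NA | No |
| ... | | | | | | | | |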

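(b) Rank 2

| Word | S | ES | O | EO | S·ln(S/ES) | Rev. Comp. | Position | Palindrome |
|---|---|---|---|---|---|---|---|---|
| CTTCTTTC | 5 | 1.7686 | 5 | 1.81818 | 5.19624 | GAAAGAAG | 13567 | No |
| TCTTCTTC | 4 | 1.30438 | 4 | 1.33333 | 4.48225 | GAAGAAGA | NA | No |
| CCTCTTTA | 2 | 0.282982 | 2 | 0.285714 | 3.91104 | TAAAGAGG | NA | No |
| CTTTTTCA | 3 | 0.917377 | 3 | 0.933333 | 3.55455 | TGAAAAAG | NA | No |
| GTTCATTC | 2 | 0.359828 | 2 | 0.363636 | 3.43055 | GAATGAAC | NA | No |
| ... | | | | | | | | |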
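The derived columns in both panels can be recomputed from the words and the S and ES values. The sketch below is illustrative only, not the authors' software. The function names are hypothetical. It reads the score column as S multiplied by ln(S/ES), which matches the tabulated values. The table does not say which edit distance variant was used, so Levenshtein distance is assumed here.

```python
import math

# Translation table for the Watson-Crick complement of each base.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Complement every base, then reverse the word."""
    return word.translate(COMPLEMENT)[::-1]


def is_palindrome(word: str) -> bool:
    """A word is a (reverse-complement) palindrome if it equals its own reverse complement."""
    return word == reverse_complement(word)


def score(s: float, es: float) -> float:
    """Score column as read from the header: S * ln(S / ES)."""
    return s * math.log(s / es)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed variant): minimum substitutions, insertions, deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# (word, S, ES) copied from the first rows of panel (a).
rows = [
    ("ACCCGCCT", 4, 0.716577),
    ("AGCCGGCT", 3, 0.805285),
    ("AGGCGCCT", 3, 1.11427),
]

seed = rows[0][0]
for word, s, es in rows:
    print(word, reverse_complement(word), is_palindrome(word),
          round(score(s, es), 5), edit_distance(seed, word))
```

Each printed line gives the word, its reverse complement, the palindrome flag, the recomputed score, and the distance to the seed word ACCCGCCT, so it can be compared against the corresponding table row.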
