
Table 5. Edit clusters for bidirectional promoters. Word-based clusters for the two most overrepresented words in the bidirectional promoters, according to the edit distance metric. Rank 1 refers to the word TCGCGCCA and Rank 2 to the word TCCCGGGA.

From: Word-based characterization of promoters involved in human DNA repair pathways

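(a) Rank 1

| Word | S | ES | O | EO | S·ln(S/ES) | RevComp. | Position | Palindrome |
|---|---|---|---|---|---|---|---|---|
| TCGCGCCA | 4 | 0.918299 | 4 | 0.9375 | 5.88611 | TGGCGCGA | 12538 | No |
| TCGCCCCA | 3 | 0.805161 | 3 | 0.820513 | 3.94598 | TGGGGCGA | 2834 | No |
| TAGCTCCA | 2 | 0.352982 | 2 | 0.357143 | 3.46897 | TGGAGCTA | NA | No |
| TCTCGCGA | 2 | 0.438673 | 2 | 0.444444 | 3.0343 | TCGCGAGA | 4937 | No |
| TCGCCACA | 2 | 0.455424 | 2 | 0.461538 | 2.95935 | TGTGGCGA | 4669 | No |
| ... | | | | | | | | |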

        

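(b) Rank 2

| Word | S | ES | O | EO | S·ln(S/ES) | RevComp. | Position | Palindrome |
|---|---|---|---|---|---|---|---|---|
| TCCCGGGA | 8 | 3.97165 | 8 | 4.26667 | 5.60208 | TCCCGGGA | 2 | Yes |
| TCCCGGCT | 6 | 2.54354 | 6 | 2.66667 | 5.14921 | AGCCGGGA | NA | No |
| ATCCGGGA | 2 | 0.395077 | 2 | 0.4 | 3.24364 | TCCCGGAT | NA | No |
| TCTCGCGA | 2 | 0.438673 | 2 | 0.444444 | 3.0343 | TCGCGAGA | 4937 | No |
| TTCCTGGA | 2 | 0.493082 | 2 | 0.5 | 2.80045 | TCCAGGAA | 9505 | No |
| ... | | | | | | | | |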
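
The derived columns in the tables above can be recomputed from the words themselves. The following is a minimal sketch, not the authors' implementation. It assumes that the score column is S multiplied by the natural logarithm of S/ES (this reading reproduces the tabulated values), that "RevComp." is the standard DNA reverse complement, that "Palindrome" marks words equal to their own reverse complement, and that the edit distance is the Levenshtein distance (the caption does not say which variant was used). All function names are hypothetical.

```python
import math

# Watson-Crick base pairing used to build the reverse complement.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(word: str) -> str:
    """Return the reverse complement of a DNA word (uppercase ACGT only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(word))


def is_palindrome(word: str) -> bool:
    """A DNA word is treated as palindromic if it equals its reverse complement."""
    return word == reverse_complement(word)


def score(s: float, es: float) -> float:
    """Assumed form of the score column: S * ln(S / ES)."""
    return s * math.log(s / es)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic row-by-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    seed = "TCGCGCCA"  # Rank 1 word from the table
    print(reverse_complement(seed))
    print(is_palindrome("TCCCGGGA"))
    print(round(score(4, 0.918299), 5))
    print(edit_distance(seed, "TCGCCCCA"))
```

Applied to the first row of each panel, these helpers agree with the tabulated reverse complement, palindrome flag, and score for the rows checked here. How far a word may be from the seed and still join its cluster is not stated in this table, so the sketch only computes distances and does not apply a cutoff.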
