Table 1 Motifs identified from the analysis of predominant uncapped 5′-ends in Arabidopsis and rice degradome libraries

From: Beyond cleaved small RNA targets: unraveling the complexity of plant RNA degradome data

| Group | Library (a) | Region (b) | Motif (c) | E-value (d) | Position (e) | Site (f) |
|---|---|---|---|---|---|---|
| 1 RTGATGA | TWF (At) | IGR | KRTGATGA | 7.60E-22 | 5 | 28(1000) |
| | Tx4F (At) | intron | RATGATGA | 2.00E-06 | 4 | 13(770) |
| | INF9311a (Os) | intron | RTGATGA | 7.70E-05 | 6 | 20(817) |
| | NPBs (Os) | IGR | DRTGATGA | 6.40E-24 | 5 | 37(1000) |
| | NPBs (Os) | intron | RTGATGAD | 2.00E-11 | 6 | 20(1000) |
| 2 TGTAHAKA | TWF (At) | 3′UTR | TGTAHATA | 2.00E-82 | 4 | 110(1000) |
| | Tx4F (At) | 3′UTR | TGTAHAKW | 4.40E-52 | 3 | 72(1000) |
| | INF9311a (Os) | intron | YTGTAMAK | 1.10E-21 | 3 | 55(817) |
| | INF9311a (Os) | CDS | TGTACAG | 1.20E-07 | 4 | 27(1000) |
| | INF9311a (Os) | 3′UTR | YTGTAHAK | 1.00E-376 | 3 | 320(1000) |
| | INF939 (Os) | 3′UTR | HTGTAMWK | 3.50E-135 | 3 | 119(1000) |
| | NPBs (Os) | 3′UTR | YTGTAMAK | 1.30E-164 | 3 | 174(1000) |
| | NPBs (Os) | IGR | TGTAHAKW | 5.70E-26 | 4 | 62(1000) |
| | NPBs (Os) | intron | TGTACAKA | 1.30E-22 | 4 | 55(1000) |
| 3 AATAAA | Tx4F (At) | 3′UTR | AAYAAARV | 2.30E-10 | 4 | 60(1000) |
| 4 CACACACA | INF939 (Os) | CDS | CACACACA | 1.10E-01 | -1 | 15(599) |
| | INF939 (Os) | 3′UTR | CACACACA | 2.70E-01 | -1 | 9(1000) |
| 5 ATGTATGT | Col-0 (At) | 3′UTR | ATGTATGT | 1.70E-38 | -1 | 103(499) |
| 6 GTCTRGTG | Tx4F (At) | IGR | GTCTRGTG | 6.10E-05 | 16 | 12(1000) |
| 7 CAGAC | NPBs (Os) | 3′UTR | MCAGAC | 5.60E-02 | 1 | 40(1000) |
| 8 AAAAAAAA | INF9311a (Os) | IGR | AAAAAAAA | 2.40E-07 | 12 | 16(1000) |
| 9 GTCCGAC | Tx4F (At) | CDS | AGTCCGAC | 9.20E-21 | -8 | 35(1000) |
| | INF9311a (Os) | CDS | AGYCCGAC | 1.50E-64 | -8 | 81(1000) |
| | INF939 (Os) | CDS | AGTCCGAC | 4.60E-31 | -8 | 60(599) |
| | INF939 (Os) | 3′UTR | RSYCCRAC | 1.30E-07 | -8 | 59(1000) |
| | NPBs (Os) | CDS | ASKCCGAC | 8.90E-258 | -8 | 298(1000) |
| | NPBs (Os) | 3′UTR | VBCCGACH | 8.90E-51 | -7 | 85(1000) |
| | NPBs (Os) | intron | SKCCGACH | 1.10E-09 | -7 | 30(1000) |
| 10 GATCCAAC | AxIDT (At) | 3′UTR | GATCCAAM | 4.50E-03 | -8 | 10(793) |
| | AxIRP (At) | CDS | GRTCCAAC | 1.00E-126 | -8 | 121(1000) |
| | AxIRP (At) | 5′UTR | RATCCAAC | 5.00E-19 | -8 | 49(1000) |
| | AxIRP (At) | intron | GRTCCAAC | 7.10E-01 | -8 | 18(1000) |
| | AxSRP (At) | CDS | GATCCAAC | 8.40E-40 | -8 | 45(1000) |
| | AxSRP (At) | 5′UTR | GATCCAAC | 9.80E-07 | -8 | 22(1000) |
| | AxSRP (At) | intron | GATCCAAC | 3.70E-01 | -8 | 15(1000) |
| 11 GACGATC | Col-0 (At) | 3′UTR | VMGACGAT | 3.40E-02 | -9 | 15(499) |
| | ein5 (At) | 3′UTR | CGACGATY | 3.20E-06 | -8 | 23(153) |
| | ein5 (At) | CDS | SGACGWTY | 1.50E-03 | -8 | 17(476) |

  1. (a) At: Arabidopsis; Os: rice.
  2. (b) IGR, UTR and CDS indicate the intergenic region, the untranslated region and the coding sequence, respectively.
  3. (c) Syntax for multiple bases: B = C/G/T, D = A/G/T, H = A/C/T, K = G/T, M = A/C, R = A/G, S = G/C, V = A/C/G, W = A/T, Y = C/T.
  4. (d) The E-value is the estimated number of (equally or more significant) motifs that one would expect to find by chance if the input sequences were shuffled.
  5. (e) Position indicates the predominant position of the first nucleotide of the motif relative to the uncapped 5′-end revealed by deep sequencing, which was set to position 1. Upstream positions are given as negative values and downstream positions as positive values.
  6. (f) The numbers indicate how many sites possess the indicated motif at the specified position, out of the number of input sequences used for MEME analysis (in parentheses).
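
The degenerate-base syntax, the position convention and the site counts described in the footnotes can be illustrated with a short Python sketch. This is not the authors' pipeline: the function names, the `end_index` parameter and the toy sequences are hypothetical. It also assumes that there is no position 0, so that position -1 is the base immediately upstream of the uncapped 5′-end at position 1; the footnotes do not state this explicitly.

```python
import re

# Degenerate-base codes as listed in the table footnote, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "D": "AGT", "H": "ACT", "K": "GT", "M": "AC",
    "R": "AG", "S": "CG", "V": "ACG", "W": "AT", "Y": "CT",
}


def motif_to_regex(motif):
    """Translate a motif such as 'KRTGATGA' into a compiled regular expression."""
    parts = []
    for base in motif:
        bases = IUPAC[base]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


def position_to_index(position, end_index):
    """Map a table position to a 0-based string index.

    end_index is the 0-based index of the uncapped 5'-end (position 1).
    Assumption: there is no position 0, so -1 is the base just upstream.
    """
    if position == 0:
        raise ValueError("position 0 is not defined under this convention")
    if position > 0:
        return end_index + position - 1
    return end_index + position


def count_sites(sequences, motif, position, end_index):
    """Count sequences whose motif starts exactly at the given position."""
    pattern = motif_to_regex(motif)
    start = position_to_index(position, end_index)
    stop = start + len(motif)
    hits = 0
    for seq in sequences:
        if start < 0 or stop > len(seq):
            continue  # the motif window falls outside this sequence
        if pattern.fullmatch(seq[start:stop]):
            hits += 1
    return hits


# Toy sequences (made up): 10 upstream bases, then the 5'-end at index 10.
toy = [
    "CCCCCCCCCCACGTGATGATGACC",  # GATGATGA at position 5 -> matches KRTGATGA
    "CCCCCCCCCCACGTTGTGATGACC",  # TGTGATGA at position 5 -> matches
    "CCCCCCCCCCACGTCATGATGACC",  # CATGATGA at position 5 -> C is not K
]
print(count_sites(toy, "KRTGATGA", 5, 10), "of", len(toy))  # prints: 2 of 3
```

The printed pair mirrors the layout of the Site column, for example 28(1000): the number of sequences carrying the motif at the stated position, followed by the number of input sequences.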