Fig. 3 | BMC Genomics

Fig. 3

From: Comprehensive genome-wide identification of angiosperm upstream ORFs with peptide sequences conserved in various taxonomic ranges using a novel pipeline, ESUCA

Schematic representation of BLAST-based search for uORFs conserved between homologous genes. In the third step of ESUCA, tBLASTn searches are conducted against a transcript sequence database that consists of assembled EST/TSA contigs, unclustered singleton EST/TSA sequences and RefSeq RNAs, using original uORF sequences as queries (uORF-tBLASTn). The shaded regions in the open boxes show the tBLASTn-matching regions. Asterisks represent stop codons. (i) The downstream in-frame stop codon closest to the 5′-end of the matching region of each uORF-tBLASTn hit is selected. (ii) The 5′-most in-frame ATG codon located upstream of the stop codon is selected. The ORF beginning with the selected ATG codon and ending with the selected stop codon is extracted as a putative uORF. In the fourth step of ESUCA, the downstream sequences of putative uORFs in the transcript sequences are subjected to mORF-tBLASTn analysis. Transcript sequences matching the original mORF with an E-value less than 10⁻¹ are extracted. (iii) For each of the uORF-tBLASTn and mORF-tBLASTn hits, the upstream in-frame stop codon closest to the 5′-end of the matching region is selected. (iv) The 5′-most in-frame ATG codon located downstream of the selected stop codon is identified as the initiation codon of the putative partial or intact mORF. If the putative mORF overlaps with the putative uORF, the uORF-tBLASTn and mORF-tBLASTn hit is discarded as a uORF-mORF fusion type.
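
The codon-selection rules (i) to (iv) can be sketched in a few lines of Python. This is a minimal illustration and not the ESUCA implementation. The function names, the 0-based coordinates, and the example sequence are hypothetical, and the sketch assumes the standard stop codons (TAA, TAG, TGA) and that the 5′-end of each tBLASTn-matching region is already known and sets the reading frame. For rule (ii) it further assumes that the upstream scan for ATG codons ends at the nearest upstream in-frame stop codon, so that the extracted ORF is uninterrupted. Hits with no in-frame stop codon upstream of the mORF match are not handled here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def extract_putative_uorf(seq, match_start):
    """Rules (i)-(ii): return (start, end) of the putative uORF, or None.

    match_start is the 0-based index of the first nucleotide of the
    uORF-tBLASTn matching region; end is exclusive and includes the stop.
    """
    seq = seq.upper()
    # (i) first in-frame stop codon at or downstream of the match 5'-end
    stop = None
    for i in range(match_start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            stop = i
            break
    if stop is None:
        return None
    # (ii) walk upstream in frame, remembering the 5'-most ATG seen
    # before an upstream in-frame stop codon (assumed boundary)
    atg = None
    i = stop - 3
    while i >= 0:
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            break
        if codon == "ATG":
            atg = i
        i -= 3
    return None if atg is None else (atg, stop + 3)


def find_putative_morf_start(seq, match_start):
    """Rules (iii)-(iv): return the index of the putative mORF ATG, or None."""
    seq = seq.upper()
    # (iii) nearest in-frame stop codon upstream of the match 5'-end
    i = match_start - 3
    while i >= 0 and seq[i:i + 3] not in STOP_CODONS:
        i -= 3
    if i < 0:
        return None  # no upstream stop: case not covered by this sketch
    # (iv) 5'-most in-frame ATG downstream of that stop codon
    for j in range(i + 3, len(seq) - 2, 3):
        codon = seq[j:j + 3]
        if codon == "ATG":
            return j
        if codon in STOP_CODONS:
            return None
    return None


def is_fusion(uorf, morf_start):
    """A hit is a uORF-mORF fusion type if the mORF starts inside the uORF."""
    return morf_start < uorf[1]


# Hypothetical transcript: a uORF in one frame, a mORF start in another.
example = "CC" + "ATGGCTATGAAATAA" + "GG" + "TGA" + "ATGCCCGGGTAG"
uorf = extract_putative_uorf(example, match_start=11)
morf_start = find_putative_morf_start(example, match_start=25)
```

In this toy transcript the uORF match begins at the `AAA` codon and the mORF match at the `CCC` codon. A hit would be kept only if `is_fusion(uorf, morf_start)` is false.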
