The dirigent multigene family in Isatis indigotica: gene discovery and differential transcript abundance

Abstract

Background

Isatis indigotica Fort. is one of the most commonly used plants in traditional Chinese medicine. Its antiviral compound is a lignan, whose formation involves dirigent (DIR) proteins. DIR proteins are members of a large protein family that imparts stereoselectivity on the phenoxy radical-coupling reaction, yielding optically active lignans from two molecules of E-coniferyl alcohol. They exist in almost every vascular plant. However, the DIR and DIR-like protein gene family in I. indigotica has not yet been analyzed in detail. This study reports the first discovery and analysis of this gene family in I. indigotica.

Results

Analysis of the I. indigotica transcription profiling database revealed a family of 19 unique full-length DIR and DIR-like proteins. Sequence analysis showed that the I. indigotica DIR and DIR-like proteins (IiDIRs) are all-β-strand proteins, typically with a signal peptide at the N-terminus. Phylogenetic analysis of the 19 proteins indicated that the IiDIR genes cluster into three distinct subfamilies, DIR-a, DIR-b/d and DIR-e, of a larger plant DIR and DIR-like gene family. Gene-specific primers were designed for the 19 unique IiDIRs and used to evaluate patterns of constitutive expression in different organs. Most IiDIR genes showed comparatively higher expression in roots and flowers than in stems and leaves.

Conclusions

New DIR and DIR-like proteins were discovered from the transcription profiling database of I. indigotica through bioinformatics methods for the first time. Sequence characteristics and transcript abundance of these new genes were analyzed. This study will provide basic data necessary for further studies.

Background

Isatis indigotica Fort. is one of the most commonly used plants in traditional Chinese medicine because of its anti-inflammatory and antiviral activities [1]. Its leaves, called “Daqingye” (Folium Isatidis), are used to treat high fever, epidemic parotitis, pharyngitis and erysipelas. Its root is the well-known Chinese medicine “Banlangen” (Radix Isatidis), which is widely used in China for flu and infections of the upper respiratory tract. During the 2003 epidemic of severe acute respiratory syndrome (SARS), Banlangen showed potential for the prevention of SARS [2]. However, the antiviral compounds of I. indigotica remained unknown until Li [3] found that lariciresinol isolated from this plant was useful for the treatment of influenza A1 virus.

Lariciresinol is a lignan that has been widely studied and is reported to possess a number of biological activities, including antimicrobial, antioxidant, anti-inflammatory and anti-estrogenic properties, which may reduce the risk of cardiovascular diseases as well as certain types of cancer [4–9]. The precursor of lariciresinol is pinoresinol, which is formed from E-coniferyl alcohol by the action of dirigent (DIR) proteins [10].

Dirigent (Latin: dirigere, to guide or align) proteins are members of a large family that imparts stereoselectivity on the phenoxy radical-coupling reaction. These proteins capture free-radical intermediates derived from E-coniferyl alcohol (only E-coniferyl alcohol, not p-coumaryl or sinapyl alcohols, which differ only in the degree of aromatic methoxylation [10]) and orient them so as to enable 8-8’ coupling with concomitant intramolecular cyclization, affording optically active (+)- or (−)-pinoresinol [11–14]. In the absence of DIR proteins, only non-specific radical-radical coupling occurs at the 8-8’, 8-5’ or 8-O-4’ positions, resulting in racemic lignan products [12–14].

DIR proteins exist in almost every vascular plant [15]. Ralph and coworkers [16] suggested that the DIR proteins be subdivided into five groups: the DIR-a, DIR-b, DIR-c, DIR-d and DIR-e subfamilies. As more DIR proteins were identified, the DIR-b and DIR-d subfamilies were merged and the DIR-f and DIR-g subfamilies were added [17]. However, only members of the DIR-a subfamily have been studied for their biochemical functions; the other proteins are referred to as DIR-like proteins. The DIR and DIR-like protein gene family in I. indigotica has not yet been analyzed in detail. Within the framework of a transcription profiling study of I. indigotica [18], 19 full-length IiDIRs (the dirigent or dirigent-like protein genes of I. indigotica) were mined using bioinformatics methods. Here we report an inventory and sequence analysis of the IiDIR gene family, together with its phylogenetic relationships. A detailed quantitative real-time PCR analysis of constitutive expression in I. indigotica tissues is described for the 19 IiDIRs. Finally, we provide a transcriptome analysis of IiDIRs based on data from samples treated with methyl jasmonate (MeJA) at different time points.

Results

Discovery of IiDIRs from the I. indigotica transcription profiling database

Using TBLASTN and BLASTN (Basic Local Alignment Search Tool 2.2.26) to search the I. indigotica transcription profiling database with published DIR and DIR-like protein sequences, we obtained 19 putative IiDIR sequences (Additional file 1). The best-hit homologous genes of these 19 sequences are summarized in Additional file 2. The numbering and subfamily designation of the IiDIR genes were based on the topology of the 19 IiDIRs together with 178 other DIRs, according to Ralph [17] and Arasan [19]. Typical dirigent domains were found in these 19 IiDIR protein sequences through the Simple Modular Architecture Research Tool (SMART, http://smart.embl-heidelberg.de/) [20] (Additional file 3).

Sequence analysis

The predicted open reading frames (ORFs) of the 19 cDNAs encoded proteins ranging in length from 183 aa (IiDIR1) to 414 aa (IiDIR19). The predicted molecular masses of the 19 IiDIRs ranged from circa 20.17 kDa (IiDIR8) to 39.94 kDa (IiDIR19), and the predicted pI values ranged from 4.79 (IiDIR16) to 9.85 (IiDIR8) (Additional file 3).

The TargetP 1.1 server (http://www.cbs.dtu.dk/services/TargetP/) [21] and the WoLF PSORT subcellular localization software (http://www.genscript.com/psort/wolf_psort.html) [22] predicted that most of the 19 IiDIRs are targeted to the secretory pathway, either through the default pathway for extracellular release or for possible final localization in the vacuole, chloroplast or cytoplasm. Signal peptide prediction showed that most IiDIRs have a 20–30 aa signal peptide at the N-terminus; the exceptions are IiDIR13, IiDIR15 and IiDIR18. The NetNGlyc 1.0 server (http://www.cbs.dtu.dk/services/NetNGlyc/) [23] predicted N-glycosylation sites (Asn), a feature of secreted proteins, in all IiDIRs except IiDIR12/13/14/15/16/17 (Additional file 3). The SMART results showed that IiDIR3/4/12/13/14/15/17/19 have a transmembrane region. The molecular formula of each protein was calculated with ProtParam (http://web.expasy.org/protparam/) [24]. The protein with the most sulfur atoms was IiDIR11 (12 sulfur atoms), whereas IiDIR16 and IiDIR17 each had only three (Additional file 3).
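As a rough illustration of what an N-glycosylation site search looks for, the sketch below scans a protein sequence for the canonical Asn-X-Ser/Thr sequon (X being any residue except Pro). This is only a pattern match: NetNGlyc scores candidate sites with trained predictors, so a sequon found here is a candidate, not a prediction. The function name and the toy sequence are hypothetical and are not taken from the IiDIR data.

```python
import re

# Candidate N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The zero-width lookahead allows overlapping sequons to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(protein):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]

# Hypothetical toy sequence (not an IiDIR protein).
toy = "MNGSANPTNAT"
print(find_sequons(toy))  # [2, 9]; the Asn at position 6 is skipped because X is Pro
```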

Pairwise sequence similarities among the predicted amino acid sequences of the 19 IiDIRs ranged from a low of 14.4% identity (IiDIR4 vs. each of IiDIR13, IiDIR14 and IiDIR15) to a high of 98.1% (IiDIR14 vs. IiDIR15) (Table 1). IiDIR14 and IiDIR15 are an example of closely related proteins sharing amino acid identity greater than 98% that may represent within-species alleles.

Table 1 Sequence relatedness of IiDIRs

Secondary structures of IiDIRs

Additional file 4 presents the secondary-structure diagrams of the 19 IiDIRs, predicted with NetSurfP (http://www.cbs.dtu.dk/services/NetSurfP/) [25]. The 19 IiDIR proteins could be divided into several groups according to their secondary structures. The main differences among the IiDIRs lay in the residues preceding the first β-strand. By this criterion, IiDIR16/17/18/19 were clearly distinct from IiDIR1 to IiDIR15. Among the first 15 IiDIRs, IiDIR13/14/15 differed from the others in the shape of the region between the first and second β-strands. IiDIR12 differed from the others in the shape of the last β-strand. IiDIR1/2/3/4 differed from the other IiDIRs in having a smooth coil curve before the first β-strand.

To check the accuracy of the NetSurfP predictions, the secondary structures were also predicted with PSIPRED (http://bioinf.cs.ucl.ac.uk/psipred/) and Phyre2 (http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index) [26]. The positions of the β-strands were in overall agreement with those predicted by NetSurfP (data not shown).

Tertiary structures and homologs of IiDIRs

Figure 1 shows the predicted three-dimensional structures of the 19 IiDIR proteins. The structures were modeled using the Phyre2 server (http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index) [26]. For all 19 query sequences, the same three top-scoring proteins were found, all of which belong to the allene oxide cyclase-like protein (AOC) family. The AOC barrel-like protein d2brja1, which shared only 17–26% sequence identity with the IiDIRs, was predicted as a DIR homolog with about 98% probability, followed by two hypothetical proteins with similar probabilities (d1zvca1 and c4h69A, Table 2). IiDIR14 and IiDIR18 each showed the highest Phyre2 confidence, 98.3%, with the template d2brja1.

Figure 1 Cartoon-style model of IiDIRs derived from prediction.

Table 2 Probability and sequence identity of the predicted homologs of IiDIRs

Phylogenetic analysis of the IiDIRs

To obtain clues about the evolutionary relationships and topology of the IiDIRs, a multiple sequence alignment of the amino acid sequences encoded by the 19 full-length cDNAs was used to build a Neighbor-Joining (NJ) tree with 1000 bootstrap replicates and complete deletion of gaps/missing data (Figure 2). The 19 IiDIRs were clearly separated into three distinct groups based on sequence relatedness. The amino acid sequences of IiDIR1/2/3/4 clustered into Group 1, while IiDIR5/6/7/8/9/10/11 clustered into the second group. These two groups were consistent with the groupings based on secondary structure. IiDIR13/14/15/16/17/18/19 clustered into another group. IiDIR12 fell between Group 1 and the other IiDIRs.

Figure 2 Neighbor-Joining (NJ) phylogenetic tree of the 19 IiDIRs. The values on the branches are bootstrap proportions, which indicate the percentage of 1000 repetitions of the analysis in which this particular branching was obtained. Branch lengths are proportional to the evolutionary distances between sequences.

To test the reliability of the NJ tree, a Maximum Likelihood (ML) analysis was also carried out to generate a phylogenetic tree, using default parameters and likewise 1000 bootstrap replicates (Additional file 5, −ln = 3970.33, model: WAG + F). The two trees had similar topologies with three clusters, indicating that the two methods were in good agreement.

To better understand the divergence and similarity of DIR and DIR-like protein sequences between I. indigotica and other plants, a provisional molecular phylogenetic tree was constructed from a multiple sequence alignment of sequences from various plant species. These comprised 29 genes from Brassica rapa, 25 genes from Arabidopsis thaliana, 54 genes from Oryza sativa, 35 genes from spruce, 9 genes from Thuja plicata, and an additional 27 DIRs identified from a variety of species, including pea, cotton, corn and sesame [17, 19]. In this tree, the subfamilies defined by Ralph [17] are shown in different colors. However, only the DIR-a, DIR-c and DIR-f subfamilies clustered separately. The DIR-e and DIR-b/d subfamilies were mixed with genes from the DIR-g subfamily. These DIR-g subfamily genes were all from B. rapa. The phylogenetic tree indicated that the IiDIRs cluster into three groups, DIR-a, DIR-b/d and DIR-e (Figure 3). IiDIR1/2/3/4 grouped into subfamily DIR-a, along with 5 A. thaliana genes and 6 B. rapa genes. IiDIR5/6/7/8/9/10/11 grouped into subfamily DIR-b/d, along with 14 A. thaliana genes and 16 B. rapa genes. IiDIR13/14/15/16/17/18/19 grouped into subfamily DIR-e, along with 6 A. thaliana genes, and were mixed with 4 BrDIRs from the DIR-g subfamily. IiDIR12 fell just outside the DIR-e cluster; we nevertheless assigned it to subfamily DIR-e.

Figure 3 Phylogenetic tree of plant DIR and DIR-like protein sequences. Amino acid sequences of 197 dirigent or dirigent-like (DIR) proteins were analyzed by Maximum Likelihood (ML) using MEGA 5.05 (−ln = 2880.02, model: WAG + F). Subfamilies DIR-a, DIR-b/d, DIR-c, DIR-e, DIR-f and DIR-g are indicated by pink, yellow, green, purple, sky blue and pink-purple shading, respectively. The AtDIRs are colored in red and the BrDIRs in dark green. IiDIRs are marked as normal. DIR nomenclature is as follows: Ah, Arachis hypogaea; As, Agrostis stolonifera; At, Arabidopsis thaliana; Br, Brassica rapa; Fi, Forsythia intermedia; Gb, Gossypium barbadense; Hv, Hordeum vulgare; Ii, Isatis indigotica; Nb, Nicotiana benthamiana; Os, Oryza sativa; P, Picea glauca, Picea sitchensis or P. glauca x engelmannii; Pp, Podophyllum peltatum; Ps, Pisum sativum; Sb, Sorghum bicolor; Si, Sesamum indicum; So, Saccharum officinarum; Ta, Triticum aestivum; Tan, Tamarix androssowii; Th, Tsuga heterophylla; Tp, Thuja plicata and Zm, Zea mays.

Sequence comparison

The IiDIR sequences were analyzed to determine whether any putative functions could be inferred. The topology analyses showed that the IiDIRs belong to the DIR-a, DIR-b/d and DIR-e subfamilies. Previous studies have focused only on the function of the DIR-a subfamily, and DIRs from the other subfamilies are classified as DIR-like proteins. According to Pickel [11, 27], AtDIR5 and AtDIR6 from the DIR-a subfamily of A. thaliana differ from DIRs found earlier, such as FiDIR1 and TpDIR7 [28]. The first DIR, from Forsythia suspensa, was found to guide E-coniferyl alcohol to form (+)-pinoresinol [12], and many other DIRs have the same function [29]. In the presence of AtDIR6, however, the final product formed from E-coniferyl alcohol was the enantiomer (−)-pinoresinol. In the topology tree of DIRs from different species (Figure 3), IiDIR2/3/4 were adjacent to AtDIR6, and IiDIR1 was next to AtDIR5. These observations suggest that IiDIR1/2/3/4 might have functions similar to those of AtDIR6 and AtDIR5.

Sequence comparisons between AtDIR6 and IiDIR2/3/4, as well as among AtDIR5, IiDIR1, FiDIR1 and TpDIR7, were performed with ClustalX 2.1 [30]. IiDIR2 had 93.05% identity with AtDIR6, while IiDIR3 and IiDIR4 had only 68.45% and 66.31% identity with AtDIR6, respectively. This suggests that IiDIR3 and IiDIR4 are more distantly related to AtDIR6 than IiDIR2 is. IiDIR1 had 90.66% identity with AtDIR5. Residue conservation is shown in Figure 4.

Figure 4 Sequence comparison between DIRs from Forsythia intermedia, Thuja plicata, Arabidopsis thaliana and Isatis indigotica. Residues conserved in all of the sequences are indicated in black. Sequence conservation between A. thaliana and I. indigotica is highlighted in blue. Conservation between T. plicata and F. intermedia is highlighted in green. Conservation between AtDIR6, IiDIR2, IiDIR3 and IiDIR4 is highlighted in red. Conservation between AtDIR5 and IiDIR1 is highlighted in yellow. Conservation between AtDIR6 and IiDIR2 is highlighted in gray. Predicted N-terminal signal peptides are shown in underlined italics.

To examine the sequence features of the IiDIRs, a sequence comparison between the 19 IiDIRs and the 29 BrDIRs was carried out as well. Like the 29 BrDIRs, the 19 IiDIRs showed five well-conserved motifs in their amino acid sequences (Figure 5).

Figure 5 Five conserved characteristic motifs (I–V) of dirigent proteins in IiDIR and BrDIR protein sequences.

Transcript abundance analysis of IiDIRs in different organs

Since the transcript abundance of a gene is often correlated with its function, the relative constitutive abundance of the 19 IiDIRs was quantified in total RNA isolated from roots, stems, leaves and flowers by real-time PCR using gene-specific primers (Additional file 6). The organ-specific expression of each IiDIR gene was normalized to actin as the control and compared with root as the reference using the 2^−ΔΔCt method. The transcript abundance levels are shown in Figure 6.

Figure 6 Quantitative real-time PCR analysis of constitutive IiDIR transcript abundance in different organs. Transcript abundance of each IiDIR gene is normalized to actin as the control and compared with root as the reference using the 2^−ΔΔCt method. Values obtained by real-time PCR represent mean ± SEM (n = 3).

Based on real-time PCR analysis, five IiDIRs (IiDIR2/5/10/15/18) displayed the highest transcript abundance in all tissues. Another five IiDIRs (IiDIR3/4/11/13/17) showed higher transcript abundance in roots and flowers than in stems and leaves. IiDIR6 abundance was higher in leaves and IiDIR7 abundance was higher in flowers. IiDIR1/12/13/16/19 were hardly expressed in leaves. The remaining two IiDIRs (IiDIR8 and IiDIR9) were barely detected in any tissue (Additional file 7).

Compared with its transcript abundance in roots, IiDIR7 was more than 500-fold higher in flowers. All of these genes except IiDIR6 were expressed at lower levels in leaves than in other tissues. Most IiDIRs, such as IiDIR2/10/11/14/15/17/18, had comparatively higher transcript abundance in roots and flowers than in stems and leaves. The transcript abundance of IiDIR4 and IiDIR8 was higher in stems and flowers than in roots and leaves. IiDIR12 and IiDIR16 were expressed more in stems (Figure 6).

Transcript abundance analysis after treatment with MeJA

Hairy roots of I. indigotica were treated with MeJA for different lengths of time to induce changes in gene transcript abundance. The transcript abundance of the IiDIRs is shown in Figure 7. IiDIR15 was not tested in this experiment. IiDIR1/2/4/5/11 were down-regulated at 1, 3, 6, 12 and 24 h compared with 0 h. IiDIR8/9/10/16 were up-regulated at 1, 3, 6, 12 and 24 h compared with 0 h. The remaining genes were up- or down-regulated at different time points. IiDIR8 and IiDIR9 were barely expressed in roots, stems, leaves or flowers, but both were up-regulated after treatment with MeJA, and the up-regulation lasted until the end of the experiment. This indicates that IiDIR8 and IiDIR9 may take part in the defense response. IiDIR6/7/12/16/19 were expressed at low levels in roots. After treatment with MeJA, they were up-regulated at different time points, and the up-regulation lasted for a period of time.

Figure 7 Heat map of IiDIR transcript expression in hairy roots after treatment with MeJA. The color bar indicates fold-change expression differences on a natural log scale (treatment/control). Hairy roots of I. indigotica were treated with MeJA for 0, 1, 3, 6, 12 and 24 h. The 0 h sample is designated as the control.

Discussion

DIR and DIR-like proteins belong to a multigene family. They are found in all major terrestrial plants [15] and are considered to have developed, during the adaptation of aquatic plants to the terrestrial environment, an important enzymatic reaction for the production of lignin and lignans [31]. The number of DIR genes differs among plant species: there are 25 DIRs in A. thaliana, 29 DIRs in B. rapa and 54 DIRs in rice [17, 19]. In this study, 19 DIRs were discovered in I. indigotica by bioinformatics methods for the first time.

The ORF predictions, combined with the tree topology, show that the DIR amino acid sequences are divergent, ranging from the shortest protein of 183 aa (IiDIR1) to the longest of 414 aa (IiDIR19). Members of the DIR-e subfamily are longer than DIRs from the other subfamilies, ranging from 224 aa (IiDIR18) to 414 aa (IiDIR19). IiDIRs in the DIR-a subfamily range from 183 aa (IiDIR1) to 188 aa (IiDIR2, IiDIR3 and IiDIR4). The IiDIRs in the DIR-b/d subfamily are similar in length to those of the DIR-a subfamily, ranging from 186 aa (IiDIR5 and IiDIR6) to 191 aa (IiDIR12).

Sequence analysis of the IiDIRs with currently available web-based bioinformatics tools (http://www.cbs.dtu.dk) indicated that most IiDIRs have cleavable N-terminal signal peptides of 20 aa to 30 aa (Additional file 4), suggesting an extracellular localization. These IiDIRs are therefore likely to be secreted proteins. N-glycosylation sites (Asn) are a feature of secreted proteins and have been found in FiDIR1, the first and best characterized DIR protein [32]. Thirteen of the 19 IiDIRs have more than one Asn site, which also indicates that most IiDIRs are likely to be secreted proteins.

The secondary and tertiary structures show that the IiDIRs are all-β-strand proteins. All the IiDIRs have potential β-strands separated by regions of coil, and the β-strands give the IiDIR proteins a barrel-like shape. The only α-helix is at the N-terminus and appears to be the signal peptide. Both the NetSurfP and PSIPRED predictions of secondary structure show that the IiDIRs are all-β-strand proteins, in agreement with previous studies [11]. Halls et al. [14] showed by circular dichroism analysis that the (+)-pinoresinol-forming dirigent protein from F. intermedia is primarily composed of β-sheet and loop structures.

Hitherto, DIRs have not been crystallized [11], and X-ray or NMR structures remain unavailable. Homologous proteins with known structures can serve as templates for modeling DIRs. Phyre2 was therefore used in intensive modeling mode to predict the tertiary structures of the IiDIRs and to search for their homologs. The results indicate that all the IiDIRs are barrel-like proteins.

The topological tree (Figure 3) shows that the IiDIRs fall into the DIR-a, DIR-b/d and DIR-e subfamilies. PDIR17 is separated from the DIR-b/d subfamily and clusters with the DIR-g subfamily, while OsDIR1/2/3/4/9/49 from the DIR-g subfamily cluster with the DIR-b/d subfamily. This is in agreement with Ralph’s studies [16, 17]. Ralph subsequently found that several sequences from the previously distinct DIR-d cluster merged with the former DIR-b subfamily to form the new DIR-b/d subfamily, leaving the rice DIR-like proteins of the former DIR-d subfamily as a separate subfamily, DIR-g [17]. In this study, members of the DIR-b/d and DIR-g subfamilies are intermixed again, which might be a result of the larger set of DIR genes analyzed.

It should be noted that the transcript abundance of most IiDIRs is comparatively higher in roots and flowers than in stems and leaves. This is in accordance with Arasan’s findings for B. rapa DIRs [19]. DIR genes are well known to participate in lignin biosynthesis. The organ-specific transcript abundance of the IiDIRs observed in this study therefore suggests that IiDIRs may play roles in specific organs through lignin formation and participate in the developmental processes of I. indigotica. These organs also share characteristics that make them particularly prone to stress, so they need to protect themselves against it. To examine IiDIR transcript abundance in roots under MeJA stress, differential expression of IiDIR genes was mined from the I. indigotica expression profiling database [18]. The data suggest that IiDIR8 and IiDIR9 may take part in the defense response: they are barely expressed in roots, stems, leaves or flowers, but both are up-regulated after treatment with MeJA, and the up-regulation lasts until the end of the experiment.

Conclusions

In this study, 19 DIRs were identified from the I. indigotica transcription profiling database for the first time. The sequence characteristics and transcript abundance of these 19 full-length IiDIRs were analyzed. The results showed that IiDIR1 and IiDIR2 are similar to AtDIR5 and AtDIR6, respectively, and might therefore have the ability to produce (−)-lignans. The organ-specific results, with higher expression in roots and flowers than in stems and leaves, indicate that roots and flowers may synthesize more lignin during plant development. IiDIR6/7/8/9/12/16/19 were up-regulated after treatment with MeJA, suggesting that they may take part in the defense response. These results provide basic data necessary for further studies.

Methods

Discovery of IiDIRs from the transcription profiling database

Following 454 pyrosequencing for I. indigotica transcription profiling, paired-end Solexa sequencing was carried out to maximize sequence diversity. All of the data were assembled to provide a new database for the discovery of IiDIRs. The database consisted of 65,196 unigenes with an average length of 1,503 bp. The largest unigene was 20,383 bp long and the smallest was 351 bp. Of all the unigenes, 30,131 were annotated [18].

To obtain all of the DIR sequences in the I. indigotica database, 1715 protein sequences, 1047 nucleotide sequences and 193 EST records of DIRs from other plants were downloaded from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/) using the search word “dirigent”. All 2955 sequences were used as queries to search the I. indigotica transcription profiling database with the basic local alignment search tool (TBLASTN or BLASTN) at an E-value cutoff of 1e-5 to identify all candidate IiDIR sequences. After removal of sequences with an alignment length of less than 500 bp (the gene length of DIRs was longer than 550 bp), 314 candidate IiDIR sequences remained. Because the search word “dirigent” did not return dirigent proteins exclusively, all 314 candidate sequences were searched against NCBI with BLAST using default parameters to remove non-dirigent sequences. The NCBI BLAST results were then used to search the I. indigotica database again to recover any IiDIR sequences that had been missed.
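The E-value and alignment-length filtering described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes the search results were saved in BLAST's standard 12-column tabular format, applies the two cutoffs from the text to the values exactly as reported in that table, and uses made-up hit lines and unigene names.

```python
import csv
import io

# Column order of standard BLAST tabular output (12 columns).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def candidate_unigenes(tabular_text, max_evalue=1e-5, min_length=500):
    """Return subject IDs that have at least one hit passing both cutoffs."""
    keep = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        if float(hit["evalue"]) <= max_evalue and int(hit["length"]) >= min_length:
            keep.add(hit["sseqid"])
    return keep

# Hypothetical hits: only the first passes both the E-value and length cutoffs.
example = "\n".join([
    "\t".join(["q1", "unigene_0001", "78.5", "560", "120", "2", "1", "560", "30", "589", "3e-40", "210"]),
    "\t".join(["q1", "unigene_0002", "81.0", "320", "60", "1", "1", "320", "10", "329", "1e-20", "150"]),
    "\t".join(["q2", "unigene_0003", "45.0", "510", "280", "5", "1", "510", "5", "514", "0.002", "40"]),
])
print(candidate_unigenes(example))  # {'unigene_0001'}
```

The surviving IDs would still need the second, manual step described in the text (a BLAST search against NCBI to discard non-dirigent sequences), which this sketch does not attempt.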

To verify the reliability of the candidate IiDIRs, the Simple Modular Architecture Research Tool (SMART, http://smart.embl-heidelberg.de/) [20] was used with default parameters to find the dirigent domain in each IiDIR amino acid sequence.

Sequence analysis

All the DIR cDNA sequences mined from the database were analyzed for their basic characteristics. NCBI Open Reading Frame Finder (ORF Finder) (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) and Vector NTI Advance (TM) 11.0 were used to identify the complete ORF of each sequence. MW and pI were predicted from the entire ORFs with Vector NTI Advance (TM) 11.0. The TargetP 1.1 program (http://www.cbs.dtu.dk/services/TargetP/) [21] and the WoLF PSORT server (http://www.genscript.com/psort/wolf_psort.html) [22] were used to predict the presence of N-terminal signal peptides and the localization of the mature protein. The molecular formula of each IiDIR protein was predicted by ProtParam (http://web.expasy.org/protparam/). Multiple protein sequence alignments of the IiDIRs and BrDIRs were made with ClustalX 2.1. All analyses were carried out using default settings.

Prediction of the secondary and tertiary structures of IiDIRs

Secondary structure predictions were performed with NetSurfP (http://www.cbs.dtu.dk/services/NetSurfP/) [25], PSIPRED (http://bioinf.cs.ucl.ac.uk/psipred/) and Phyre2 (http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index) [26] using default parameters. The tertiary structures were predicted with Phyre2 in intensive modeling mode. The same program was also used to search for homologs of the IiDIRs. The amino acid sequences of the IiDIRs were used as the target sequences.

Computation of pairwise distances and phylogenetic analysis

Sequence similarities among the 19 full-length amino acid sequences were computed in MEGA 5.05 using the p-distance. All of the phylogenetic trees were built using MEGA 5.05 with 1000 bootstrap replicates. CONSENSE, also from MEGA 5.05, was used to create a consensus tree. Bootstrap values above 50% were added to the trees generated from the original data set. The ML tree of 197 DIRs was built using iTOL (http://itol.embl.de/) [33].
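For readers unfamiliar with the p-distance, the sketch below shows the calculation for one pair of aligned sequences: the proportion of compared sites at which the two residues differ, with percent identity being its complement. It is a minimal illustration, not MEGA's implementation; it assumes the two sequences are already aligned to equal length and skips any column with a gap in either sequence (an assumed gap treatment). The toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap in either sequence are ignored (assumed treatment).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared

# Hypothetical aligned fragments: 5 comparable sites, 1 mismatch (K vs R).
d = p_distance("MKT-LV", "MRTALV")
print(f"p-distance = {d:.3f}, identity = {100 * (1 - d):.1f}%")
```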

Plant materials

I. indigotica plants were grown in the botanical garden of Second Military Medical University, Shanghai, China, and identified by Professor Hanming Zhang. Fresh roots, stems, leaves and flowers were harvested, frozen immediately in liquid nitrogen, and stored at −80°C for RNA isolation.

Preparation of RNA and cDNA

Total RNA of I. indigotica was extracted separately from the stored roots, stems, leaves and flowers using the TIANGEN TRNzol-A+ Reagent for total RNA Isolation Kit (TIANGEN BIOTECH (BEIJING) CO., LTD, Beijing, China). RNA integrity was checked on ethidium bromide-stained agarose gels, and RNA purity was determined by UV spectrometry. First-strand cDNA was reverse transcribed following the user manual of the TransScript First-Strand cDNA Synthesis SuperMix (TransGen Biotech, Beijing, China).

Real-time PCR

Real-time PCR was conducted on an ABI 7500 PCR system (Applied Biosystems, USA) using Fast SYBR® Green Master Mix (Applied Biosystems) according to the manufacturer’s instructions. Reaction mixtures contained 1.5 μL of cDNA as template, 0.5 pmol of each primer and 10 μL of 2× Fast SYBR® Green Master Mix in a final volume of 20 μL. Gene-specific primers (Additional file 6) for each IiDIR were designed with Primer Express 3.0 (Applied Biosystems). Each primer pair was checked by BLASTN searches against the I. indigotica RNA sequences to confirm that the designed primers were dirigent specific. Primer specificity (a single product of the expected length) was confirmed by analysis on a 0.8% agarose gel and by melting curve analysis. The actin gene served as the quantification control. It was the best-hit gene found in the I. indigotica transcription profiling database by BLAST using 22 A. thaliana actin genes [NM_114519.2, NM_179953.2, NM_125328.3, NM_121018.3, NM_115235.3, NM_112764.3, NM_001085300.1, NM_001036427.2, NM_180280.1, NM_103814.3, NM_129772.1, NM_180032.1, AY114679.1, AY062702.1, AY120779.1, AK230311.1, U39480.1, U39449.1, U42007.1, U41998.1, AF308778.1 and NM_112046.3]. Primers for I. indigotica actin are also listed in Additional file 6.

The program for all real-time PCR reactions was: hold at 95°C for 20 s, followed by 40 cycles of 3 s at 95°C and 30 s at 60°C. Data were analyzed using ABI 7500 SDS Real-Time PCR system software (Applied Biosystems). All PCR reactions were run with 3 technical replicates. Transcript abundance of each IiDIR gene was normalized to actin as the control and compared with root as the reference using the 2^−ΔΔCt method.
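The normalization described above (actin as the control gene, root as the reference organ) corresponds to the standard comparative Ct calculation, which can be sketched as follows. The function name and all Ct values are hypothetical, chosen only to make the arithmetic easy to follow; averaging the technical replicates before taking differences is an assumption about the order of operations.

```python
from statistics import mean

def relative_abundance(ct_gene, ct_actin, ct_gene_root, ct_actin_root):
    """Fold change of a gene in one organ relative to root.

    Each argument is a list of Ct values from technical replicates.
    """
    delta_ct_sample = mean(ct_gene) - mean(ct_actin)          # normalize to actin
    delta_ct_root = mean(ct_gene_root) - mean(ct_actin_root)  # same for root
    delta_delta_ct = delta_ct_sample - delta_ct_root          # compare with root
    return 2 ** -delta_delta_ct

# Hypothetical Ct values (three technical replicates each).
flower_gene = [22.0, 22.5, 21.5]
flower_actin = [18.0, 18.5, 17.5]
root_gene = [25.0, 25.5, 24.5]
root_actin = [18.0, 18.5, 17.5]

fold = relative_abundance(flower_gene, flower_actin, root_gene, root_actin)
print(fold)  # 8.0: three cycles earlier than in root means 2**3 = 8-fold more transcript
```

By construction, root compared with itself gives a value of 1, which is why root appears as the baseline in the organ comparison.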

Transcript abundance of IiDIRs in I. indigotica hairy roots treated with MeJA

To gain insight into IiDIR transcript abundance after induction with MeJA, the Illumina RNA-Seq data provided by Chen [18] were used. The RNA-Seq expression profile data were generated on the Illumina HiSeq™ 2000 platform and covered hairy roots of I. indigotica treated with MeJA for 0, 1, 3, 6, 12 and 24 h. The 0 h sample was used as the control to normalize the expression levels at the other time points. Finally, the heat map was constructed from the log2-transformed, normalized expression data in MultiExperiment Viewer (MeV) [34].
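The normalization to the 0 h control followed by a log2 transform can be sketched as below. This is an illustration under assumptions, not the processing actually applied to the RNA-Seq data: the expression values are invented, the first value in each series is taken to be the 0 h control, and the optional pseudocount (to avoid dividing by or taking the log of zero) is an assumed convenience that the text does not mention.

```python
import math

TIME_POINTS_H = [0, 1, 3, 6, 12, 24]

def log2_ratios(expression, pseudocount=0.0):
    """log2(treated / control) for a time series whose first value is the 0 h control."""
    control = expression[0] + pseudocount
    if control <= 0:
        raise ValueError("control expression must be positive (consider a pseudocount)")
    ratios = []
    for value in expression:
        shifted = value + pseudocount
        if shifted <= 0:
            raise ValueError("expression must be positive (consider a pseudocount)")
        ratios.append(math.log2(shifted / control))
    return ratios

# Hypothetical expression values at 0, 1, 3, 6, 12 and 24 h.
series = [10.0, 20.0, 40.0, 10.0, 5.0, 2.5]
for hours, ratio in zip(TIME_POINTS_H, log2_ratios(series)):
    print(f"{hours:>2} h: {ratio:+.1f}")
```

Positive values indicate up-regulation relative to 0 h and negative values down-regulation, so a row of such values maps directly onto one row of a heat map.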

Availability of supporting data

Sequence data from this article can be found in the GenBank data libraries under accession numbers: AhDIR1, AAZ20288.1; AsDIR1, AAY41607.1; AtDIR1, ABR46205.1; AtDIR10, AAU90058.1; AtDIR11, AAQ65106.1; AtDIR12, AEE82982.1; AtDIR13, AAP88352.1; AtDIR14, AEE82984.1; AtDIR15, AEE86966.1; AtDIR16, AAP37695.1; AtDIR17, CAB67637.1; AtDIR18, AEE83298.1; AtDIR19, AAO39937.1; AtDIR2, AAP37801.1; AtDIR20, AAU15178.1; AtDIR21, AEE34435.1; AtDIR22, AAU15153.1; AtDIR23, AAT71988.1; AtDIR24, AEE79355.1; AtDIR25, AAP49521.1; AtDIR3, AED95765.1; AtDIR5, AAQ65109.1; AtDIR6, AEE84795.1; AtDIR7, AAQ89609.1; AtDIR8, AEE75389.1; AtDIR9, AAR20779.1; AtDIRD4, AEC07124.1; FiDIR1, AAF25357.1; FiDIR2, AAF25358.1; GbDIR1, AAS73001.2; GbDIR2, AAY44415.1; HvDIR1, AAA87042.1; HvDIR2, AAA87041.1; HvDIR3, AAB72098.1; NbDIR1, BAF02555.1; OsDIR1, BAF20623.1; OsDIR10, BAF12227.1; OsDIR11, BAF22309.1; OsDIR12, BAF22310.1; OsDIR13, BAF22318.2; OsDIR14, BAC19943.1; OsDIR15, BAF22323.1; OsDIR16, BAC16397.1; OsDIR17, AAM74352.1; OsDIR18, BAD25846.1; OsDIR19, BAF13568.1; OsDIR2, BAF20624.1; OsDIR20, AAO17346.1; OsDIR21, BAB64642.1; OsDIR22, BAD52647.1; OsDIR23, BAF26452.2; OsDIR24, BAD53304.1; OsDIR25, AAM74358.1; OsDIR26, AAM74346.1; OsDIR27, BAD03849.1; OsDIR28, BAD03720.1; OsDIR29, BAD03711.1; OsDIR3, BAF20622.1; OsDIR30, BAD03854.1; OsDIR31, BAF29307.1; OsDIR32, BAF27737.1; OsDIR33, BAF27733.1; OsDIR34, AAX96293.1; OsDIR35, BAF27735.1; OsDIR36, BAF27734.1; OsDIR37, BAF29386.1; OsDIR38, BAF29514.1; OsDIR39, BAF29458.2; OsDIR4, BAF23585.1; OsDIR40, BAF29387.1; OsDIR41, BAF29454.1; OsDIR42, ABA94701.1; OsDIR43, BAH95407.1; OsDIR44, BAB89759.1; OsDIR46, BAH92863.1; OsDIR47, BAF27863.1; OsDIR48, BAF27866.1; OsDIR49, BAF27867.1; OsDIR5, BAF26451.1; OsDIR50, BAF23524.2; OsDIR51, BAD89460.1; OsDIR52, AAX96290.1; OsDIR53, ABA93522.1; OsDIR54, AAX96314.1; OsDIR6, BAF22196.1; OsDIR7, BAF22195.2; OsDIR8, BAB89617.1; OsDIR9, BAF20620.1; PDIR1, ABD52112.1; PDIR10, ABD52121.1; PDIR11, ABD52122.1; PDIR12, ABD52123.1; 
PDIR13, ABD52124.1; PDIR14, ABD52125.1; PDIR15, ABD52126.1; PDIR16, ABD52127.1; PDIR17, ABD52128.1; PDIR18, ABD52129.1; PDIR19, ABD52130.1; PDIR2, ABD52113.1; PDIR20, ABR27716.1; PDIR21, ABR27717.1; PDIR22, ABR27718.1; PDIR23, ABR27719.1; PDIR24, ABR27720.1; PDIR25, ABR27721.1; PDIR26, ABR27722.1; PDIR27, ABR27723.1; PDIR28, ABR27724.1; PDIR29, ABR27725.1; PDIR3, ABD52114.1; PDIR30, ABR27726.1; PDIR31, ABR27727.1; PDIR32, ABR27728.1; PDIR33, ABR27729.1; PDIR34, ABR27730.1; PDIR35, ABR27731.1; PDIR4, ABD52115.1; PDIR5, ABD52116.1; PDIR6, ABD52117.1; PDIR7, ABD52118.1; PDIR8, ABD52119.1; PDIR9, ABD52120.1; PpDIR1, AAK38666.1; PsDIR1, AAD25355.1; PsDIR2, AAB18669.1; SbDIR1, AAM94289.1; SbDIR2, ABI24164.1; SiDIR1, AAT11124.1; SoDIR1, AAR00251.1; SoDIR2, CAF25234.1; SoDIR3, AAV50047.1; TaDIR1, AAC49284.1; TaDIR2, AAM46813.1; TaDIR3, BAA32786.3; TaDIR4, AAR20919.1; TanDIR1, ABE73781.1; ThDIR1, AAF25367.1; ThDIR2, AAF25368.1; TpDIR, AAF25364.1; TpDIR1, AAF25359.1; TpDIR2, AAF25360.1; TpDIR3, AAF25361.1; TpDIR4, AAF25362.1; TpDIR5, AAF25363.1; TpDIR7, AAF25365.1; TpDIR8, AAF25366.1; TpDIR9, AAL92120.1; and ZmDIR1, AAF71261.2.

References

  1. Committee NP: Chinese Pharmacopoeia. 2010, Beijing: China Medical Science Press, 191-

  2. Tietao D: Discussion on treatment of SARS by TCM. Tianjin J Traditional Chin Med. 2003, 3: 001-

  3. Li B: Active Ingredients and Quality Evaluation of Isatis Indigotica. 2003, Shanghai: Second Military Medical University

  4. Saleem M, Kim HJ, Ali MS, Lee YS: An update on bioactive plant lignans. Nat Prod Rep. 2005, 22 (6): 696-716. 10.1039/b514045p.

  5. Adlercreutz H, Mazur W: Phyto-oestrogens and Western diseases. Ann Med. 1997, 29 (2): 95-120. 10.3109/07853899709113696.

  6. Arts IC, van de Putte B, Hollman PC: Catechin contents of foods commonly consumed in The Netherlands. 1. Fruits, vegetables, staple foods, and processed foods. J Agric Food Chem. 2000, 48 (5): 1746-1751. 10.1021/jf000025h.

  7. Adlercreutz H, Mousavi Y, Clark J, Höckerstedt K, Hämäläinen E, Wähälä K, Mäkelä T, Hase T: Dietary phytoestrogens and cancer: in vitro and in vivo studies. J Steroid Biochem Mol Biol. 1992, 41 (3): 331-337.

  8. Raffaelli B, Hoikkala A, Leppälä E, Wähälä K: Enterolignans. J Chromatogr B. 2002, 777 (1): 29-43.

  9. Arts IC, Hollman PC: Polyphenols and disease risk in epidemiologic studies. Am J Clin Nutr. 2005, 81 (1): 317S-325S.

  10. Burlat V, Kwon M, Davin LB, Lewis NG: Dirigent proteins and dirigent sites in lignifying tissues. Phytochemistry. 2001, 57 (6): 883-897. 10.1016/S0031-9422(01)00117-0.

  11. Pickel B, Pfannstiel J, Steudle A, Lehmann A, Gerken U, Pleiss J, Schaller A: A model of dirigent proteins derived from structural and functional similarities with allene oxide cyclase and lipocalins. FEBS J. 2012, 279 (11): 1980-1993. 10.1111/j.1742-4658.2012.08580.x.

  12. Davin LB, Wang H-B, Crowell AL, Bedgar DL, Martin DM, Sarkanen S, Lewis NG: Stereoselective bimolecular phenoxy radical coupling by an auxiliary (dirigent) protein without an active center. Science. 1997, 275 (5298): 362-367. 10.1126/science.275.5298.362.

  13. Halls SC, Lewis NG: Secondary and quaternary structures of the (+)-pinoresinol-forming dirigent protein. Biochemistry. 2002, 41 (30): 9455-9461. 10.1021/bi0259709.

  14. Halls SC, Davin LB, Kramer DM, Lewis NG: Kinetic study of coniferyl alcohol radical binding to the (+)-pinoresinol forming dirigent protein. Biochemistry. 2004, 43 (9): 2587-2595. 10.1021/bi035959o.

  15. Davin LB, Lewis NG: Dirigent proteins and dirigent sites explain the mystery of specificity of radical precursor coupling in lignan and lignin biosynthesis. Plant Physiol. 2000, 123 (2): 453-462. 10.1104/pp.123.2.453.

  16. Ralph S, Park J-Y, Bohlmann J, Mansfield SD: Dirigent proteins in conifer defense: gene discovery, phylogeny, and differential wound- and insect-induced expression of a family of DIR and DIR-like genes in spruce (Picea spp.). Plant Mol Biol. 2006, 60 (1): 21-40. 10.1007/s11103-005-2226-y.

  17. Ralph SG, Jancsik S, Bohlmann J: Dirigent proteins in conifer defense II: extended gene discovery, phylogeny, and constitutive and stress-induced gene expression in spruce (Picea spp.). Phytochemistry. 2007, 68 (14): 1975-1991. 10.1016/j.phytochem.2007.04.042.

  18. Chen J, Dong X, Li Q, Zhou X, Gao S, Chen R, Sun L, Zhang L, Chen W: Biosynthesis of the active compounds of Isatis indigotica based on transcriptome sequencing and metabolites profiling. BMC Genomics. 2013, 14 (1): 857. 10.1186/1471-2164-14-857.

  19. Thamil Arasan SK, Park J-I, Ahmed NU, Jung H-J, Hur Y, Kang K-K, Lim Y-P, Nou I-S: Characterization and expression analysis of dirigent family genes related to stresses in Brassica. Plant Physiol Biochem. 2013, 67: 144-153.

  20. Letunic I, Doerks T, Bork P: SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Res. 2012, 40 (D1): D302-D305. 10.1093/nar/gkr931.

  21. Emanuelsson O, Nielsen H, Brunak S, Von Heijne G: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol. 2000, 300 (4): 1005-1016. 10.1006/jmbi.2000.3903.

  22. Horton P, Park K-J, Obayashi T, Fujita N, Harada H, Adams-Collier C, Nakai K: WoLF PSORT: protein localization predictor. Nucleic Acids Res. 2007, 35 (suppl 2): W585-W587.

  23. Gupta R, Jung E, Brunak S: Prediction of N-glycosylation sites in human proteins. In preparation. 2004.

  24. Gasteiger E, Hoogland C, Gattiker A, Duvaud S, Wilkins MR, Appel RD, Bairoch A: Protein identification and analysis tools on the ExPASy server. The proteomics protocols handbook. Edited by: Walker JM. 2005, Totowa, N.J: Humana Press, 571-607.

  25. Petersen B, Petersen TN, Andersen P, Nielsen M, Lundegaard C: A generic method for assignment of reliability scores applied to solvent accessibility predictions. BMC Struct Biol. 2009, 9 (1): 51. 10.1186/1472-6807-9-51.

  26. Kelley LA, Sternberg MJ: Protein structure prediction on the Web: a case study using the Phyre server. Nat Protoc. 2009, 4 (3): 363-371. 10.1038/nprot.2009.2.

  27. Pickel B, Constantin MA, Pfannstiel J, Conrad J, Beifuss U, Schaller A: An enantiocomplementary dirigent protein for the enantioselective laccase-catalyzed oxidative coupling of phenols. Angew Chem Int Ed. 2010, 49 (1): 202-204. 10.1002/anie.200904622.

  28. Davin LB, Lewis NG: Dirigent phenoxy radical coupling: advances and challenges. Curr Opin Biotechnol. 2005, 16 (4): 398-406. 10.1016/j.copbio.2005.06.010.

  29. Kim MK, Jeon J-H, Fujita M, Davin LB, Lewis NG: The western red cedar (Thuja plicata) 8-8’ DIRIGENT family displays diverse expression patterns and conserved monolignol coupling specificity. Plant Mol Biol. 2002, 49 (2): 199-214. 10.1023/A:1014940930703.

  30. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997, 25 (24): 4876-4882. 10.1093/nar/25.24.4876.

  31. Kim MK, Jeon J-H, Davin LB, Lewis NG: Monolignol radical–radical coupling networks in western red cedar and Arabidopsis and their evolutionary implications. Phytochemistry. 2002, 61 (3): 311-322. 10.1016/S0031-9422(02)00261-3.

  32. Gang DR, Costa MA, Fujita M, Dinkova-Kostova AT, Wang H-B, Burlat V, Martin W, Sarkanen S, Davin LB, Lewis NG: Regiochemical control of monolignol radical coupling: a new paradigm for lignin and lignan biosynthesis. Chem Biol. 1999, 6 (3): 143-151. 10.1016/S1074-5521(99)89006-1.

  33. Letunic I, Bork P: Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics. 2007, 23 (1): 127-128. 10.1093/bioinformatics/btl529.

  34. Saeed A, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan M, Sturn A, Snuffin M, Rezantsev A, Popov D, Ryltsov A, Kostukovich E, Borisovsky I, Liu Z, Vinsavich A, Trush V, Quackenbush J: TM4: a free, open-source system for microarray data management and analysis. Biotechniques. 2003, 34 (2): 374-378.

Acknowledgements

The authors gratefully acknowledge Prof. Geoffrey A. Cordell (University of Illinois and University of Florida, and President of Natural Products Inc.) for proofreading the manuscript. This work was financially supported by the Natural Science Foundation of China [31100221, 81325024 and 81303160].

Author information

Corresponding authors

Correspondence to Lei Zhang or Wansheng Chen.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

QL conceived the study, participated in data mining and data analysis, and drafted the manuscript. JC provided the Isatis indigotica Fort. transcription profiling database. YX initiated the project. PD prepared the figures. LZ participated in the design of the study. WC helped to conceive the study and participated in its design and coordination. All authors read and approved the final manuscript.

Electronic supplementary material

Additional file 1: Sequences of 19 IiDIRs. (TXT 14 KB)

Additional file 2: Homology analysis of IiDIRs. ¹Disease resistance-responsive family protein. ²Disease resistance-responsive (dirigent-like protein) family protein. ³Defense response. ⁴Lignan biosynthetic process. (DOC 182 KB)

Additional file 3: Gene characteristics of DIRs from I. indigotica. (DOC 256 KB)

Additional file 4: IiDIR secondary structure predictions. The secondary structures of the IiDIRs were predicted with NetSurfP (http://www.cbs.dtu.dk/services/NetSurfP/) using the whole amino acid sequences. The probability calculated for each of the three types of secondary structure is shown (dashed, α-helix; solid, β-strand; dotted, coil) against the residue number of the IiDIR sequences. (DOCX 101 KB)

Additional file 5: Maximum Likelihood (ML) phylogenetic trees of 19 IiDIRs. The values on the branches are bootstrap proportions, i.e. the percentage of 1000 repetitions of the analysis in which that particular branching was obtained. The lengths of branches are proportional to evolutionary distances between sequences. -ln = 3970.33, model: WAG + F. (TIFF 640 KB)

Additional file 6: Primer sequences used for real-time PCR. (DOCX 30 KB)

Additional file 7: Ct values of the IiDIRs from quantitative real-time PCR. Ct values are presented as mean ± SEM, n = 3. (DOCX 30 KB)

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Li, Q., Chen, J., Xiao, Y. et al. The dirigent multigene family in Isatis indigotica: gene discovery and differential transcript abundance. BMC Genomics 15, 388 (2014). https://doi.org/10.1186/1471-2164-15-388

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/1471-2164-15-388

Keywords