- Research article
- Open Access
Evolution and comparative genomics of the most common Trichoderma species
- Christian P. Kubicek†1, 2,
- Andrei S. Steindorff†3, 4,
- Komal Chenthamara1,
- Gelsomina Manganiello4, 5,
- Bernard Henrissat6, 7, 8,
- Jian Zhang9,
- Feng Cai9,
- Alexey G. Kopchinskiy1,
- Eva M. Kubicek2,
- Alan Kuo4,
- Riccardo Baroncelli10,
- Sabrina Sarrocco11,
- Eliane Ferreira Noronha3,
- Giovanni Vannacci10,
- Qirong Shen9,
- Igor V. Grigoriev4, 12 and
- Irina S. Druzhinina1, 9
© The Author(s). 2019
- Received: 5 January 2019
- Accepted: 9 April 2019
- Published: 12 June 2019
The growing importance of the ubiquitous fungal genus Trichoderma (Hypocreales, Ascomycota) requires understanding of its biology and evolution. Many Trichoderma species are used as biofertilizers and biofungicides and T. reesei is the model organism for industrial production of cellulolytic enzymes. In addition, some highly opportunistic species devastate mushroom farms and can become pathogens of humans. A comparative analysis of the first three whole genomes revealed mycoparasitism as the innate feature of Trichoderma. However, the evolution of these traits is not yet understood.
We selected the 12 most commonly occurring Trichoderma species and studied the evolution of their genome sequences. Trichoderma evolved around the time of the Cretaceous-Palaeogene extinction event 66 (±15) mya, but the formation of extant sections (Longibrachiatum, Trichoderma) or clades (Harzianum/Virens) happened in the Oligocene. The evolution of the Harzianum clade and section Trichoderma was accompanied by significant gene gain, but the ancestor of section Longibrachiatum experienced rapid gene loss. The highest number of genes gained encoded ankyrins, HET domain proteins and transcription factors. We also identified the Trichoderma core genome, completely curated its annotation, investigated several gene families in detail and compared the results to those of other fungi. Eighty percent of those genes for which a function could be predicted were also found in other fungi, but only 67% of those without a predictable function.
Our study presents a time-scaled pattern of genome evolution in 12 Trichoderma species from three phylogenetically distant clades/sections and a comprehensive analysis of their genes. The data offer insights into the evolution of a mycoparasite towards a generalist.
- Ankyrin domains
- Core genome
- Environmental opportunism
- Gene gain
- Gene loss
The Sordariomycetes, one of the largest classes in the Division Ascomycota, display a wide range of nutritional strategies including saprotrophy and biotrophic interactions with bacteria, plants, animals, fungi or other organisms. Within them, the highest number of known genera is found in the order Hypocreales, which comprises half of the whole-genome sequenced species of Sordariomycetes (Nov. 2017, NCBI Taxonomy Browser). Molecular data suggest that the ancestors of the Hypocreales evolved some 170–200 Mya as fungi associated with plants either as parasites or saprotrophs. The diversification into extant taxa was accompanied by several intra- and interkingdom host shifts involving fungi, higher plants, and animals. Among them, parasites of animals likely appeared first in the Jurassic period, and specialized entomoparasitic families developed during the Cretaceous period, thereby following the diversification of herbivorous insects and angiosperms.
Mycoparasitic fungi can be found in species from several fungal taxa, but only the Hypocreales contain exclusively fungicolous genera, i.e. Hypomyces, Escovopsis, and Trichoderma. The ancestor of these mycoparasitic fungi likely evolved at the same time as the entomoparasites, but the time and events of Trichoderma evolution are not known.
Among these fungicolous fungal genera, Trichoderma is the largest taxon with many ubiquitously distributed species. Detailed ecological and biogeographic surveys of Trichoderma [6–9] revealed that species of this genus are most frequently found on the fruiting bodies of other fungi and the dead wood colonized by them. While mycoparasitism in Hypomyces is frequently species-specific and restricted to fruiting body forming Basidiomycota, the genus Trichoderma is unique in that many of its species can also parasitize Ascomycetes and even phylogenetically close species.
It is not known how generalism evolved from the phytosaprotrophic background of the Hypocreales. Chaverri and Samuels compared a phylogenetic tree of the genus Trichoderma with the habitats from which the individual species had been isolated and concluded that the evolution of the genus involved several interkingdom host jumps and that preference for a special habitat was gained or lost multiple times. It has been argued that the versatility of Trichoderma’s nutritional strategies can be described by the expansion of the spectrum of hosts and substrates due to enrichment of its genome by laterally transferred genes required for feeding on plant biomass.
The hypothesis of this work was that a comparative genomics of those species of Trichoderma which are most frequently sampled (and therefore must be most successful generalists) and an analysis of their pattern of gene evolution would reveal the evolutionary events that shaped the nutritional expansions and environmental generalism. In addition, identification of the gene inventory of the Trichoderma core genome (i.e. the genes that are present in all species) and its intersection with genomes of other fungi would reveal the specific genomic features of these industrially-relevant fungi.
Although the sequences of several Trichoderma genomes have already been published [11, 16–24], detailed genome-wide analyses have been published for only three of them (T. reesei, T. virens and T. atroviride [11, 16, 25–27]). To test the hypothesis raised above, we have analysed the evolution and the gene inventory of the genomes from 13 Trichoderma isolates that represent 12 species with a worldwide distribution and are members of three major infrageneric groups.
Selection of the most common Trichoderma species
To reveal the most frequently sampled species in the genus Trichoderma, we have first calculated the number of nucleotide sequences deposited for Trichoderma spp. in NCBI GenBank (see Methods). There is today general agreement that the new Trichoderma spp. can only be defined by at least three or more gDNA sequences while the analysis of usually two DNA barcode fragments is required for the species identification [7–9]. The number of gene sequences in NCBI per each species may therefore roughly correspond to the number of isolates detected for this species and thus approximate the frequency of the general species occurrence. This analysis revealed (Additional file 1) that most species (80%) were relatively rare as they were represented by < 50 gene sequences each, whereas 35 species (12% of the total number of species) were represented by more than 100 nucleotide sequences each. Of these, 84% of nucleotide sequences were attributed to a small group of common species: T. harzianum sensu lato (also deposited as T. lixii or Hypocrea lixii) was responsible for 32% (9532 sequences) of total sequences. This was followed by T. asperellum, T. atroviride, T. longibrachiatum, T. gamsii, and T. virens that were represented by > 1000 sequences each and therefore also frequent.
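The frequency estimate described above amounts to a tally of deposited sequences per species. The following is a minimal sketch, assuming the per-species counts have already been retrieved from GenBank (retrieval is not shown). The function names and the "intermediate" label are hypothetical and introduced only for illustration; the thresholds of 50 and 100 sequences and the two figures used (9532 and 29,911) are taken from the text.

```python
def sequence_share(count: int, total: int) -> float:
    """Fraction of all deposited sequences attributed to one species."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


def frequency_class(count: int) -> str:
    """Classify a species by its number of deposited sequences.

    Thresholds follow the text: fewer than 50 sequences is 'rare',
    more than 100 is 'common'. 'intermediate' is a label introduced
    here for illustration only.
    """
    if count < 50:
        return "rare"
    if count > 100:
        return "common"
    return "intermediate"


# Figures quoted in the article: 9532 T. harzianum sensu lato sequences
# out of 29,911 Trichoderma sequences in total (see Methods).
share = sequence_share(9532, 29911)
print(f"T. harzianum s.l.: {share:.1%} of sequences, {frequency_class(9532)}")
```

The share computed this way rounds to the 32% reported for T. harzianum sensu lato.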
General comparison of the genomes of twelve Trichoderma species
Based on the above analysis, we compared the genomes from 12 Trichoderma species: T. reesei, T. longibrachiatum, T. citrinoviride, T. parareesei from section Longibrachiatum (SL), T. harzianum (the ex-type strain CBS226.95 marked with “T” throughout the manuscript, and strain TR274), T. guizhouense, T. afroharzianum, T. virens from Harzianum and Virens clades (HV), and T. atroviride, T. gamsii, T. asperellum, and T. hamatum from section Trichoderma (ST). The relation between the species is shown in Fig. 2. The species concept of T. harzianum has recently been revised and it is not known what percentage of the newly defined species would account for “T. harzianum” entries in GenBank. We therefore included T. guizhouense and T. afroharzianum, two species with worldwide distribution [28, 29], and two strains of T. harzianum (one from Northern Europe and one from Brazil) in this study. T. parareesei and T. gamsii were included because they are sibling species of T. reesei and T. atroviride, respectively (Fig. 2).
Properties of the Trichoderma genomes and gene distribution (table; recoverable column headings: genome size in Mbp, orthologs and paralogs)
Evolution of the twelve Trichoderma species
Trichoderma gene inventory
Distribution of Trichoderma genes in sections, clades and species (table; recoverable labels: all clusters with at least one gene from Trichoderma, at least two species from each clade, the clade only)
PFAM group members with more than 500 genes in the 13 Trichoderma isolates
genes per cluster
Zn2Cys6 transcriptional regulators
PF00561, 07859, 02230
zinc-dependent alcohol dehydrogenases
AAA+ ATPases
cytochrome P450 monooxygenases
vegetative heteroincompatibility (HET) proteins
PF06985, 07217, 17108
amino acid permeases
PF00583, 00797, 13302, 13523
NmrA-like proteins, NAD-binding negative regulators of GATA-binding proteins
DnaJ molecular chaperone
RRM_1 RNA binding proteins
The Trichoderma core genome
A similar search for the functionally uncharacterized proteins revealed that 1331 of them were shared with all other fungi. The number of those shared only between some orders or families suggests a phylogenetic relationship: 211 of them were present only in the Hypocreaceae, and 177 in all Sordariomycetes (Fig. 5).
We conclude from these data that 80.7% of the genes encoding functionally predictable proteins and 67.4% of the genes encoding functionally not predictable proteins in Trichoderma were already present in the ancestor of Eurotiomycetes and Sordariomycetes and are therefore at least 250 million years old.
Comparing the intraspecific genome differences between the two isolates of T. harzianum showed that 1699 genes of T. harzianumT (12%) were absent from the other strain, and 1419 genes present in the latter (10.1%) were absent from the type strain. Most of these genes encoded orphan proteins for the species, and a function could only be predicted for 158 and 160 genes in T. harzianumT and T. harzianum TR274, respectively. Their properties are described in Additional file 9.
We also compared the genomes of T. longibrachiatum and T. citrinoviride, the two species that are more frequently encountered as opportunistic pathogens of immunocompromised humans, and identified 94 genes that were only present in these two species but absent from all others and could therefore be linked to their pathogenicity (Additional file 10).
A total of 105 genes of the core genome were present in all 12 Trichoderma species but not found in any other fungus. They thus fulfil the criterion of “genus-specific orphans” and we will use the term “orphans” for them throughout the manuscript. No function could be predicted for any of these genes.
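The definitions used in this section (core genome: genes present in all species; genus-specific orphans: core genes not found in any other fungus) can be expressed as set operations. The sketch below is illustrative only: it assumes gene families have already been clustered into orthologous groups, and the species labels and family identifiers are hypothetical toy data, not results from this study.

```python
def core_families(families_by_species: dict[str, set[str]]) -> set[str]:
    """Gene families present in every species (the 'core genome')."""
    if not families_by_species:
        return set()
    return set.intersection(*families_by_species.values())


def genus_specific_orphans(core: set[str], outgroup_families: set[str]) -> set[str]:
    """Core families that are not found in any fungus outside the genus."""
    return core - outgroup_families


# Hypothetical toy data: three species and one pooled outgroup.
families = {
    "species_A": {"fam1", "fam2", "fam3", "fam4"},
    "species_B": {"fam1", "fam2", "fam3", "fam5"},
    "species_C": {"fam1", "fam2", "fam3"},
}
other_fungi = {"fam1", "fam2", "fam9"}

core = core_families(families)
orphans = genus_specific_orphans(core, other_fungi)
print(sorted(core), sorted(orphans))
```

In practice the outcome depends strongly on how orthologous groups are delimited and on the similarity threshold used to decide that a gene is "not found" elsewhere.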
Gene expansion and contraction during evolution of Trichoderma species
While these data show that the origin of the genus Trichoderma and two of its clades/sections (HV, ST) underwent strong gene expansion whereas SL exhibits significant gene contraction, a deeper look into the gene evolution at the level of individual species revealed a mosaic of gain and loss events (Fig. 6a and b). Exceptions were T. longibrachiatum, which shows only gene losses (but these data must be viewed with caution because of the higher incompleteness of its genome; see above), and T. citrinoviride, which displays only gains. These data suggest that the extant taxa of Trichoderma are reshaping their genomes at an increased rate, which is particularly reflected in T. harzianum because the two isolates of this species differed remarkably in their gene loss and gain.
The principal component analysis revealed three different strategies in gene gain and loss that are characteristic for each section or clade (Fig. 6c). As all the tested species are nutritionally versatile, common and cosmopolitan, this pattern of group-specific evolution points to the importance of the core genome as the basis for the generalism.
Since the evolution of the Trichoderma genomes from their ancestor from 120 (±21) to 66 (±15) mya occurred entirely by gene expansion (no gene losses revealed by the CAFE analysis, Fig. 6a, b), we wondered whether this was due to a small genome in its putative ancestor. We therefore extended the CAFE analysis to all available Hypocreales genomes. Unfortunately, at the 99% probability used for Trichoderma, this analysis yielded no data, which is probably due to the insufficient number of genomes that are currently available for predictions over such a long evolutionary interval. Reducing the probability level to 95%, however, revealed that the evolution after the split from the entomoparasite branch (184.6 ± 8 mya; see Fig. 3) and the obligate mycoparasite Escovopsis weberi (119.8 ± 21 mya) was accompanied by a total of 23 gene losses and only a single gain (Additional file 11). The ancestors of the genus Trichoderma may therefore have indeed been subject to a significant genome contraction.
The Trichoderma genomes reveal the potential for heterothallic sexual reproduction
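To make the bookkeeping behind gain and loss counts concrete, the sketch below tallies changes in gene family size between an ancestral node and a descendant. It is not the CAFE method: CAFE infers ancestral family sizes with a probabilistic model, whereas this example simply assumes that those sizes are already given. The function name, the family names and all numbers are hypothetical.

```python
def tally_gain_loss(ancestor: dict[str, int], descendant: dict[str, int]) -> tuple[int, int]:
    """Return (genes gained, genes lost) summed over all gene families.

    Families missing from one of the two dictionaries count as size 0.
    """
    gained = 0
    lost = 0
    for family in ancestor.keys() | descendant.keys():
        delta = descendant.get(family, 0) - ancestor.get(family, 0)
        if delta > 0:
            gained += delta
        elif delta < 0:
            lost += -delta
    return gained, lost


# Hypothetical family sizes at an ancestral node and in one descendant.
ancestral_sizes = {"ankyrin": 10, "HET": 5, "P450": 8}
descendant_sizes = {"ankyrin": 14, "HET": 5, "P450": 6, "MFS": 2}

gained, lost = tally_gain_loss(ancestral_sizes, descendant_sizes)
print(f"gained {gained} genes, lost {lost} genes")
```

A lineage that shows only losses or only gains, as reported above for individual species, would yield a zero in one of the two positions.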
Mating type genes in Trichoderma
Sensing of a potential mating partner is a prerequisite for sexual reproduction and is fulfilled by the pheromone system. The genes involved in this process were found in all Trichoderma spp. and are given in Additional file 12.
Major aspects of Trichoderma metabolism
Carbon metabolism of Trichoderma has so far mainly been studied in T. reesei and with respect to the catabolism of hemicellulose and pectin monomers [39, 47, 48]. We have therefore manually annotated all genes of the core genome that are putatively involved in carbon metabolism. The majority of these genes have already been described in detail for T. reesei, T. atroviride and T. virens, and we refrain from repeating these data here. Yet we detected some novel features, such as the presence of an extracellular glucose oxidase, D-xylulose-5-phosphate/D-fructose-6-phosphate ketolases, enzymes for D-erythroascorbic acid biosynthesis, and a glutathione-linked methanol degradation pathway. These findings are described in some detail in Additional file 13.
Extracellular polymer hydrolysis
Apart from polysaccharides, proteins hydrolyzed by various proteases provide a major nutritional source for fungi. Some protease families are also important for the digestion of proteins secreted by competing organisms [50–52] or hosts. We screened the secretome of the 12 Trichoderma genomes for proteases using the MEROPS database (see Methods for details). This demonstrated the presence of A1 aspartyl proteases, G1 eqolisins (previously termed “pepstatin-insensitive aspartyl proteases”), C13 legumain-type cysteine proteases, eight metalloprotease families (InhA-like peptidases, M6; carboxypeptidases, M14; glutamate carboxypeptidases, M20; methionine aminopeptidases, M24; aminopeptidase Y, M28; deuterolysin, M35 and fungalysin, M36), and six families of serine proteases (S1 chymotrypsins, S8 subtilisins, S10 carboxypeptidases, S28 and S51 dipeptidases, and S53 sedolisins) in Trichoderma (Additional file 15). Aspartyl proteases, subtilisins, sedolisins, and aminopeptidase Y were present in the highest numbers of isoenzymes. Family S10 was particularly abundant in HV, and S53 in HV and ST. In summary, however, the number of Trichoderma proteases is comparable to that of many other fungi [50–52], and we found no protease family that was specifically expanded or contracted in Trichoderma. Proteases have been speculated to be a component allowing niche differentiation between the ascomycetes and the basidiomycetes, particularly towards adaptation to pathogenicity by the former. However, our data suggest that the primeval proteolytic arsenal of Trichoderma was sufficient for the acquisition of the mycoparasitic lifestyle and its more recent expansion towards generalism.
Secondary metabolites (SM) are an intrinsic feature of most Pezizomycotina, because they participate in cellular signalling, competition, pathogenicity, and metal ion uptake. Trichoderma too has been shown to be a prolific producer of SMs [54, 55]. Unfortunately, the genes encoding these SMs and even the species identity of the SM-producing isolates are in most cases unknown. We identified 10–25 polyketide synthase (PKS), 12–34 non-ribosomal polypeptide synthetase (NRPS), and 6–14 terpenoid synthase (TS) encoding genes in the 12 species (see Additional files 16 and 17), of which 6 PKS, 10 NRPS and 3 TS genes were present in the core genome.
In contrast to PKS, NRPS and TS, Trichoderma seems not to synthesize alkaloids, as we could not find the genes encoding the precursor dimethylallyl tryptophan synthases (DMATS) in any of the studied genomes.
Small cysteine-rich secreted proteins
Number and types of small secreted cysteine-rich proteins in Trichoderma
Trichoderma orphan genes
The Trichoderma core genome contained 105 orphan genes (vide supra; Additional file 19). While they comprised only 1.5% of the genes in the core genome, orphan genes restricted to sections/clades or even single species were much more abundant (on average 17.4, 13.0 and 10.1% in SL, ST and HV, respectively), and even more so within the pool of species-specific genes (see also Additional file 3).
To analyse the evolution of the orphans, we measured the selection pressure acting on them by calculating the ratio of non-synonymous to synonymous substitution rates (dN/dS) for 53 orphan genes that are present in the Trichoderma core genome and whose nucleotide sequences could be unambiguously aligned. A dN/dS = 1 would indicate neutral evolution and dN/dS < 1 can be interpreted as evidence for purifying selection. The data obtained (Additional file 20) show that most of these genes are under strong purifying selection. An assessment of dN/dS for the clade-specific orphans was not possible, because their nucleotide sequences were too polymorphic to be aligned.
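The interpretation rule stated above can be written as a small helper. This is a sketch under stated assumptions: dN and dS are taken as already estimated from a codon alignment (the estimation itself is not shown), the function name is hypothetical, and the tolerance around 1 is an arbitrary illustrative choice. The "positive" label for dN/dS > 1 follows the standard reading of the ratio and is not a category reported in this section.

```python
import math


def classify_selection(dn: float, ds: float, tolerance: float = 0.05) -> str:
    """Interpret a dN/dS ratio.

    dn, ds    : non-synonymous and synonymous substitution rates
    tolerance : illustrative width of the band around 1 treated as neutral
    """
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    omega = dn / ds
    if math.isclose(omega, 1.0, abs_tol=tolerance):
        return "neutral"
    return "purifying" if omega < 1.0 else "positive"


# Hypothetical values for illustration only.
print(classify_selection(0.02, 0.40))  # ratio well below 1
print(classify_selection(0.30, 0.30))  # ratio equal to 1
```

A fixed tolerance is a simplification; real analyses usually rely on statistical tests rather than a cutoff to decide whether a ratio differs from 1.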
This study is based on the genome sequences of 12 of the most common Trichoderma species. Although this number represents only a few percent of Trichoderma spp. described today, the selected species are members of three phylogenetically distant sections and clades, and the results therefore enable a broader insight into the genus. Also, these species were most frequent in our own studies of soil or rhizosphere sampled in different geographic regions such as the Canary Islands, Sardinia, Colombia, Egypt, China, Israel, South-East Asia, Siberia and many others [6–8, 13] and may therefore be called cosmopolitan. Because several of the twelve species that were selected for this study are used as bioeffectors in biocontrol products against plant pathogenic fungi, stimulate plant growth and immunity, are opportunistic pathogens of immunocompromised humans and are causative agents of the green mold disease on mushroom farms [6, 13], they can be considered as environmental opportunists in a broad sense. Although species in each of the sections and clades have unique morphological features, their overall ecological features are similar: they are mycoparasites, can feed on cellulolytic material, and can establish themselves in soil and colonize the rhizosphere. This may suggest that these species maintained the “opportunistic” features from a common ancestor, which may be reflected in the core genome.
We have therefore investigated the evolution of the selected 12 species and the resulting changes in their gene inventory. Although all the genomes were still incomplete, the small predicted percentage of missing genes (2–5% for all species except T. longibrachiatum) makes it probable that we have identified all gene families that are relevant for the interpretations and conclusions in this paper. We particularly emphasize that the differences in gene numbers that we considered relevant were in most cases several-fold higher than the number of putatively missed genes.
Our results reveal that the mycoparasitic Hypocreales diversified between 100 and 140 mya, the ancestor of Trichoderma evolved around the time of the K-Pg event, and the formation of the three infrageneric groups studied (ST, SL and HV) occurred 40–45 million years after the K-Pg event. The uncertainty in chronological dating makes it impossible to decide whether the genus Trichoderma arose before or after K-Pg. However, we have recently proposed that the genus Trichoderma obtained most of the genes encoding plant cell wall degrading CAZymes required for phytosaprotrophic growth by lateral gene transfer that likely took place before the diversification into infrageneric groups. The most likely interpretation of these data is therefore that Trichoderma was one of the fungal genera that participated in the strong burst in fungal populations that fed on the decaying biomass of the plants killed by the K-Pg event. Whether or not this increase in the number of fungi stimulated mycoparasitism can only be speculated, but clearly a successful antagonism and the ability to kill its competitor may have aided Trichoderma in establishing a high population density on decaying plant biomass. Moreover, the ability to endoparasitise closely related species (up to adelphoparasitism) could favor host/parasite DNA exchanges and further contribute to the formation of the unique core genome of Trichoderma.
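The statement that the dating cannot place the origin of the genus before or after the K-Pg event follows from the width of the age estimate. A minimal sketch of that check, using the estimates quoted in this article (66 ± 15 mya and 119.8 ± 21 mya) and assuming an age of roughly 66 mya for the K-Pg boundary; the function and variable names are hypothetical:

```python
def age_interval(estimate_mya: float, error_mya: float) -> tuple[float, float]:
    """Return (youngest, oldest) bounds of a node age given as estimate ± error."""
    return (estimate_mya - error_mya, estimate_mya + error_mya)


def spans_event(interval: tuple[float, float], event_mya: float) -> bool:
    """True if the event falls inside the interval, so that the order of
    the node and the event cannot be resolved from the dating alone."""
    youngest, oldest = interval
    return youngest <= event_mya <= oldest


K_PG_MYA = 66.0  # approximate age of the K-Pg boundary (assumption stated above)

trichoderma_origin = age_interval(66.0, 15.0)  # origin of the genus
escovopsis_split = age_interval(119.8, 21.0)   # split from Escovopsis weberi

print(spans_event(trichoderma_origin, K_PG_MYA))
print(spans_event(escovopsis_split, K_PG_MYA))
```

The first interval contains the event while the second lies entirely before it, which is why only the origin of the genus itself remains ambiguous with respect to K-Pg.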
Despite the standard deviation in the dating of fungal phylogenies, our data strongly suggest that the evolution of the three Trichoderma sections/clades investigated in this study occurred after the K-Pg event. The origin of extant species in the three sections/clades occurred in the early Oligocene (20–30 mya), a phase characterized by cooler seasons and a significant extinction of the invertebrate marine fauna. It is intriguing that this split led to an increased rate of gene gain and genome expansion in HV, whereas the formation of SL was accompanied by a significant gene loss. Kelkar and Ochman reported that Pezizomycotina genomes in the size range of 25 to 75 Mb (which includes all Trichoderma spp. investigated in this study) exhibit a positive correlation between decreased genome size and increased genetic drift, and vice versa. At first glance, this observation may not be applicable to genome contraction in SL, because it concerned genes from nearly all functional categories and thus was not specifically directed to support a certain trait. Alternatively, our results could be explained by the streamlining hypothesis, which considers selection for a more economical lifestyle as the major driving force for genome reduction. According to this model, the presence or absence of multiple genes for the same function may produce only a small effect on the performance of the organism and thus have only little benefit for the cell. Sun and Blanchard considered that this scenario would most likely occur in relatively stable environments where competition for nutrients is severe, and where a smaller genome has the ecological advantage of spending less energy for growth and development. We speculate that HV and ST – but not SL – used this alternative for further ecological success.
One of the hypotheses for this work was that gene families that were gained during Trichoderma evolution and are more abundant in Trichoderma than in other related fungi could give further insights into how this genus became an environmental opportunist. Gene families that were gained in highest number by Trichoderma were those encoding proteins with an ankyrin repeat, proteins with a HET domain and MFS transporters. In addition, protein families that were present in higher numbers than in other Sordariomycetes were PNP_UDP_1 nucleotide phosphorylases and NmrA-like transcriptional regulators.
The ankyrin repeats – tandemly repeated modules of about 33 amino acids that form two α-helices separated by a loop – are among the most common protein-protein interaction motifs known. They occur in a high number of proteins mainly from eukaryotes and have functions in cell cycle regulation, mitochondrial enzymes, cytoskeleton interactions, signal transduction and stress resistance [71, 72]. So far, proteins with ankyrin repeats have not been systematically characterized from Pezizomycotina, but an expansion of proteins containing ankyrin repeats has been reported for the insect endosymbiotic bacterium Wolbachia. Ankyrins have therefore been suggested to play an important role in endosymbiosis of this bacterium. The higher number of proteins with this protein-protein interaction module in Trichoderma than in other fungi (with Nectria being the only exception) may suggest that its signalling and metabolic processes are more tightly coordinated than in other fungi, which could ultimately result in enhanced fitness in its habitat.
Another group of proteins that made up a significant portion of the genes gained by HV and ST are the fungal HET (heterokaryon incompatibility) proteins. They have already received considerable attention from fungal geneticists because of their role as key players in recognition and response to non-self during cell fusion, which allows different individuals of the same species to maintain integrity and individuality [74–76]. HET proteins that contain an N-terminal HET effector domain, a central GTP binding site and a C-terminus consisting of highly conserved WD40 tandem repeats have been defined as the HNWD protein family. Lamacchia et al. recognized that proteins of this family have similarity to pathogen-recognition receptors in plants and animals and proposed that these genes might also have a function in the recognition of and response to other pathogenic species [78, 79]. With respect to Trichoderma, we extend this hypothesis by speculating that they could also play a role in recognition of mycoparasitic hosts, which is a challenging objective for further studies.
Apart from these two striking examples, the expansion of genes encoding NmrA-like proteins (which function as repressors of GATA-type transcription factors) and Zn2Cys6 transcriptional regulators in HV is also of interest, because we did not notice an expansion of protein kinase families. We therefore assume that speciation in this clade is accompanied by a diversification and fine tuning of transcriptional regulation, whereas regulation at the posttranscriptional level occurs mainly by the canonical signalling pathways in a similar way as in other fungi.
Based on the analysis of T. reesei, T. virens and T. atroviride it was previously concluded that the genus has only a small arsenal of secondary metabolite synthases. The present comparison shows that this is only true for the PKS and NRPS in species of SL. Compared with the Aspergillus spp., which are considered as being particularly rich in secondary metabolites, T. harzianum has a higher number of NRPS (twenty-nine). The number of PKS in T. virens is within the average of those present in Aspergilli. In the case of terpenoid synthases, Trichoderma contains 12 to 17 genes and therefore clearly outnumbers the Aspergillus spp. that have only 2 to 10. However, most of these secondary metabolite synthases, especially those for terpenoids, have not yet been characterized. The relation between the many secondary metabolites reported in Trichoderma and the genes responsible for their synthesis is therefore not known, which defines yet one more intriguing field for further studies.
We have also annotated the complete CAZome of all 12 Trichoderma species, which revealed the presence of some proteins like the GH4 α-glucosidases or the AA11 chitin monooxygenases that have not yet been described to occur in Trichoderma. In addition, we detected that Trichoderma possesses a rich arsenal of carbohydrate-binding domains, which occur as fusions to GHs, CEs or AAs, but also as individual secreted proteins. The CBM1 cellulose-binding domain and the CBM50/LysM chitin/peptidoglycan binding domains have already been described in detail [81, 82], but we also found a high number of additional CBMs that putatively bind to starch, fructans and hemicelluloses. It therefore appears that Trichoderma makes significant use of these domains, and this could result in faster and more competitive degradation of the respective polymers. In this regard it is also of interest that HV possesses GH18 group C chitinases that contain both CBM18 and CBM50/LysM chitin binding domains, which have not yet been reported elsewhere. The possible differences in binding of CBM18 and CBM50/LysM to chitin are not known, however, which makes speculation about the advantage of their arrangement in GH18 group C chitinases of HV Trichoderma difficult.
Finally, a striking feature in all Trichoderma genomes was the high number of orphan genes, of which only a very small number is also present in the core genome. The origin of orphan genes has been postulated to be either the consequence of gene duplication events and rearrangement processes followed by fast divergence, or of de novo evolution out of non-coding genomic regions. Our data showed that, in the case of T. reesei, only a fifth of the orphan genes occurred in clusters that could be indicative of gene duplications, and only a very small portion of orphans (clustered and non-clustered) occurred near the telomeres, a frequent area for gene duplications. Our data therefore do not support gene duplication as the major mechanism for the emergence of orphan genes. The question whether the Trichoderma orphans originate de novo (see above) cannot be answered by our data. Published transcriptome data from T. reesei and T. virens [84, 85] show that about 40% of the orphan genes are indeed expressed, and therefore represent protogenes. Our data suggest that the Trichoderma species-specific orphan genes evolve so fast that their sequences diverge beyond recognition, as already discussed for insects. Their biological merit, if any, needs further investigation to be understood, however.
This paper highlights the evolution of twelve Trichoderma species that are most frequently observed in nature and which belong to three different Trichoderma sections/clades, and documents the gene inventory of the core genome and the individual species. The data reveal a high genomic diversity both at the section and clade level and at the species level, which is reflected by the fact that only 50–75% of the genes are conserved in all twelve species. The high polymorphism in ankyrin and HET genes, but also in genes encoding transcription factors and enzymes for carbohydrate and secondary metabolism, illustrates that Trichoderma belongs to those genera of fungi which constantly re-shape their genome for fast responses and successful competition in potentially novel habitats. These properties are exactly what one would also expect from an environmental opportunist and generalist.
The data presented in this paper will likely become a starting point for mining Trichoderma genomes for enzymes or secondary metabolites, and for selection of candidate genes for manipulating strains towards desired behaviour in their application. Sequencing and annotation of genomes of species outside the currently investigated clades will be facilitated by the curated identification of the proteins encoded by the core Trichoderma genome. This will likely lead to the detection of further features not seen in species from sections Longibrachiatum and Trichoderma and from the Harzianum/Virens clades.
Finally, our data raise the genus Trichoderma to the level of the few fungal taxa for which genome sequences of several different species are available, such as Aspergillus and Fusarium, and which strongly facilitated studies on various aspects of the molecular physiology of these fungi. Our data for Trichoderma now offer such a basis as well.
In silico screening for most common Trichoderma species
An in silico screening for the most common Trichoderma species whose nucleotide sequences are deposited in GenBank yielded 29,911 sequences for 292 species (April 2018). Sequences collected for undefined species (“cf. Trichoderma” or “Trichoderma sp.”), for poorly characterized species (i.e. those represented by fewer than three nucleotide sequences), or arising from whole genome sequencing projects were excluded. T. reesei is a special case, because most of its sequences represent genes of only a single isolate (QM6a and its mutants), which is related to its industrial application. The total number of sequences from individual T. reesei isolates is estimated to be 30, which is rather small. T. reesei was nevertheless included in this study because its genome sequence and annotation were already available and were considered a good basis for comparison with the more abundant species of section Longibrachiatum.
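The exclusion rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a species label plus a flag marking sequences from whole genome projects) and the function name `select_species` are hypothetical and are not taken from the study, which does not describe its screening scripts.

```python
from collections import Counter


def select_species(records, min_sequences=3):
    """Count sequences per species after applying the exclusion rules.

    records: iterable of (species_label, from_wgs_project) tuples. This
    layout is an assumption made for the sketch, not the study's format.
    """
    counts = Counter()
    for label, from_wgs_project in records:
        tokens = label.split()
        # Skip sequences that arise from whole genome sequencing projects.
        if from_wgs_project:
            continue
        # Skip undefined species such as "cf. Trichoderma" or "Trichoderma sp.".
        if "cf." in tokens or "sp." in tokens:
            continue
        counts[" ".join(tokens)] += 1
    # Drop poorly characterized species (fewer than min_sequences sequences).
    return {species: n for species, n in counts.items() if n >= min_sequences}


if __name__ == "__main__":
    example = [
        ("Trichoderma harzianum", False),
        ("Trichoderma harzianum", False),
        ("Trichoderma harzianum", False),
        ("Trichoderma sp.", False),
        ("Trichoderma reesei", True),
    ]
    print(select_species(example))
```

Ranking the returned dictionary by count would then give a list of the most frequently sampled species; how ties or synonyms were handled in the study is not stated, so the sketch leaves that open.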
T. harzianum TR274 was isolated from soil in the southeast of Brazil. The genome was sequenced with paired-end 2 × 250 bp reads using MiSeq technology (Illumina™) and assembled with ALLPATHS-LG using a maximum coverage of 120×. The genome was annotated using the MycoCosm annotation pipeline, and all data generated are available at the MycoCosm portal (https://genome.jgi.doe.gov/mycocosm/home).
Re-annotation of the T. hamatum GD12 genome
The T. hamatum genome is available in the public domain only in the form of assembled nucleotide scaffolds (accession number ANCB00000000.2). We performed structural annotation using the MAKER genome annotation pipeline v2.31.8 with the gene predictor Augustus (http://bioinf.uni-greifswald.de/augustus/) trained with gene models from Fusarium graminearum. All proteins and transcripts from the Trichoderma spp. analyzed in this study were used as gene model support. For functional annotation of the translated proteins of T. hamatum GD12, we ran InterProScan 5 (http://www.ebi.ac.uk/interpro/), stand-alone version 55, with the following embedded programs: SignalP 4.1, PFAM v.29, InterPro and GeneOntology (http://www.geneontology.org/).
Other fungal genomes analysed
The Ascomycota that were used in this study in comparison to Trichoderma, their habitats, taxonomic position and published genome sequences are given in Additional file 21.
Annotation of the Trichoderma proteomes
We first searched all 13 Trichoderma genome databases for orthologs in the T. reesei QM6a and RUT C-30 genomes by reciprocal blastp, using a threshold of < E-35 (in a series of trials with different E-value thresholds, this value retrieved the highest percentage of hits that were confirmed by reciprocal blastp). Data obtained for T. reesei QM6a and RUT C-30 were combined and pruned to contain individual genes only once. The BLAST servers of the Joint Genome Institute were used for most Trichoderma spp. A local blastp for T. parareesei and T. guizhouense was established at the server of the Institute of Chemical, Environmental and Bioscience Engineering, TU Wien. For T. gamsii, T. afroharzianum, and T. hamatum no individual BLAST server was available, and their predicted proteomes were therefore re-assessed by blastp on the NCBI BLAST server. The proteins predicted in this way were cross-checked against Pfam v. 29, using a TimeLogic DeCypher machine and a threshold of < E-35, and against InterPro.
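The reciprocal blastp criterion can be illustrated with a minimal Python sketch. It assumes that the hits of each search direction have already been parsed into (query, subject, E-value) tuples; the function names and the input layout are hypothetical, and the tie-breaking used in the study is not described, so the sketch simply keeps the first hit with the lowest E-value.

```python
def best_hits(hits, max_evalue=1e-35):
    """Return query -> best subject, keeping only hits with E-value < max_evalue.

    hits: iterable of (query_id, subject_id, evalue) tuples (assumed layout).
    """
    best = {}
    for query, subject, evalue in hits:
        if evalue >= max_evalue:
            continue  # hit does not pass the E-value threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse, max_evalue=1e-35):
    """Return the set of (a, b) pairs that are each other's best hit."""
    fwd = best_hits(forward, max_evalue)
    rev = best_hits(reverse, max_evalue)
    return {(a, b) for a, b in fwd.items() if rev.get(b) == a}


if __name__ == "__main__":
    # Identifiers below are invented for illustration only.
    forward = [("a1", "b1", 1e-80), ("a1", "b2", 1e-40), ("a2", "b2", 1e-50)]
    reverse = [("b1", "a1", 1e-78), ("b2", "a1", 1e-41)]
    print(reciprocal_best_hits(forward, reverse))
```

Raising or lowering `max_evalue` in such a sketch is one way to repeat the kind of threshold trial mentioned above, by counting how many forward hits are confirmed in the reverse direction.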
Conserved domains in proteins were further verified by blastp against NCBI’s conserved domain database (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi), using a threshold of < E-05. Putative localization of proteins was analyzed using SignalP (for secreted proteins; http://www.cbs.dtu.dk/services/SignalP/), TargetP (for possible mitochondrial location; http://www.cbs.dtu.dk/services/TargetP/) and TMHMM (for prediction of transmembrane helices in proteins; http://www.cbs.dtu.dk/services/TMHMM/). In all three methods, only hits with p < 0.05 were used.
In addition, we performed OrthoMCL clustering with an inflation parameter of 1.5 on protein sequences from 26 predicted full proteomes (13 Trichoderma spp. and 13 Hypocreales and Sordariomycete outgroups). A protein was considered specific to a subset of organisms if it was found in at least all but one of the organisms of the subset, but not in any organism outside the subset.
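The subset-specificity rule stated above translates directly into a small predicate. The following Python sketch is an illustration only; the function name `is_subset_specific` and the use of plain sets of organism labels are assumptions, not part of the published workflow.

```python
def is_subset_specific(cluster_organisms, subset):
    """Apply the 'all but one, and none outside' rule to one ortholog cluster.

    cluster_organisms: organisms that contribute a protein to the cluster.
    subset: organisms that define the group of interest (e.g. one clade).
    """
    members = set(cluster_organisms)
    group = set(subset)
    # Any member outside the subset disqualifies the cluster.
    if members - group:
        return False
    # An empty cluster is never specific; otherwise at most one organism
    # of the subset may be missing.
    return bool(members) and len(members) >= len(group) - 1


if __name__ == "__main__":
    clade = {"sp1", "sp2", "sp3", "sp4"}  # placeholder organism labels
    print(is_subset_specific({"sp1", "sp2", "sp3"}, clade))
    print(is_subset_specific({"sp1", "sp2", "outgroup"}, clade))
```

Tolerating one missing organism makes the rule robust against a single incomplete assembly or missed gene model, at the cost of occasionally labelling a family as specific when it has truly been lost in one member.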
Identification of specific protein families
Annotation of the genes encoding carbohydrate active enzymes (CAZymes) in the 13 Trichoderma genomes was performed using the Carbohydrate-Active Enzyme database and CAZy nomenclature (http://www.cazy.org/). Each protein model from a genome was compared, using the sequence similarity search tool BLAST, to a collection of protein modules corresponding to catalytic and carbohydrate-binding modules derived from CAZy. Individual hits were then compared with HMMER to models corresponding to each CAZy family to allow an assignment of each identified protein.
Proteases were identified by analysing the proteomes of the 13 strains against the MEROPS database (https://www.ebi.ac.uk/merops/), and the corresponding nomenclature was used to specify them.
Identification of PKS, NRPS and terpenoid synthases was performed with antiSMASH and SMURF (http://www.jcvi.org/smurf). Potential orthologs of PKS and NRPS genes in different Trichoderma spp. were determined by phylogenetic analysis, using the KS domain (PKS) and the adenylation domain (NRPS). The Maximum Likelihood method, based on the Poisson correction model, was used to infer the evolutionary history. Branches corresponding to partitions with a bootstrap coefficient of < 50% (1000 replicates) are collapsed.
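Collapsing weakly supported branches, as described above, amounts to splicing the children of a low-support internal node into its parent. The Python sketch below shows the idea on a toy tree stored as nested dictionaries; this data structure and the function name are assumptions for illustration, since the trees in the study were handled inside phylogenetic software rather than by custom code.

```python
def collapse_low_support(node, min_support=50.0):
    """Return a copy of the tree with internal nodes below min_support removed.

    node: dict with optional keys "name", "support" (bootstrap percentage)
    and "children" (list of nodes). This layout is hypothetical.
    """
    new_children = []
    for child in node.get("children", []):
        child = collapse_low_support(child, min_support)
        support = child.get("support")
        is_internal = bool(child.get("children"))
        if is_internal and support is not None and support < min_support:
            # Splice the grandchildren into the current node (a polytomy).
            new_children.extend(child["children"])
        else:
            new_children.append(child)
    result = dict(node)
    if new_children:
        result["children"] = new_children
    return result


if __name__ == "__main__":
    tree = {"children": [
        {"name": "A"},
        {"support": 30.0, "children": [{"name": "B"}, {"name": "C"}]},
        {"support": 90.0, "children": [{"name": "D"}, {"name": "E"}]},
    ]}
    collapsed = collapse_low_support(tree)
    print([child.get("name") for child in collapsed["children"]])
```

The function returns a new tree and leaves the input untouched, so the same bootstrap tree can be collapsed at several thresholds for comparison.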
To identify SSCPs, the proteomes of the 13 Trichoderma strains were first filtered with Microsoft Excel for proteins that are fewer than 300 amino acids long and contain ≥5% cysteines, and the detected candidates were then subjected to SignalP analysis. Among this subset of proteins, hydrophobins were visually identified by the presence of 8 cysteines, of which C2/C3 and C6/C7 occurred as pairs. Ceratoplatanins were identified by the presence of 4 cysteines and by blastp against the NCBI database. The remaining proteins were considered uncharacterized SSCPs.
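The first filtering step (size and cysteine content) was done in a spreadsheet in the study, but the same two criteria can be expressed as a short Python function. The sketch below is an assumed re-implementation for illustration; the function name and the handling of a trailing stop symbol are not taken from the paper, and the subsequent SignalP step is not covered.

```python
def is_sscp_candidate(sequence, max_length=300, min_cys_fraction=0.05):
    """Check the two pre-filter criteria for small cysteine-rich proteins.

    A candidate is shorter than max_length residues and has a cysteine
    fraction of at least min_cys_fraction.
    """
    # Normalise case and drop a trailing stop symbol if present (assumption).
    residues = sequence.strip().upper().rstrip("*")
    if not residues or len(residues) >= max_length:
        return False
    return residues.count("C") / len(residues) >= min_cys_fraction


if __name__ == "__main__":
    # Artificial sequence used only to exercise the function.
    toy_protein = "M" + "C" * 8 + "A" * 91
    print(is_sscp_candidate(toy_protein))
```

Proteins passing this filter would still need a signal peptide prediction before being counted as SSCPs, as described in the text.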
Analysis of genome completeness
To assess the completeness of the genomes, we conducted a BUSCO v2 (Benchmarking Universal Single-Copy Orthologs) search of our genomes for orthologues of each of 3725 Sordariomycete orthologous genes.
Generation of a time-scaled phylogeny of the Hypocreaceae
We estimated the phylogeny of the 27 Ascomycota species in our analysis using the protein sequences of 638 orthologs present in single copy in all species, identified using Proteinortho5. For each set of orthologous proteins, we produced multiple alignments using MAFFT with the auto settings and identified conserved alignment blocks using Gblocks v0.19b. The final concatenated alignment used for phylogenetic reconstruction consisted of 259,738 amino acid positions. Clade ages were estimated using the tool CladeAge described in Matschiner et al. Four ancestral nodes were used for the time calibration: the common ancestral node of the order Hypocreales was calibrated for a central 95% range of 190–196 Mya, the common ancestral node of the families Hypocreaceae, Ophiocordycipitaceae and Clavicipitaceae for a central 95% range of 162–168 Mya, the common ancestral node of the Clavicipitaceae crown group for a central 95% range of 114–120 Mya, and the common ancestral node of the Nectriaceae crown group for a central 95% range of 122–128 Mya. Species within these clades were forced to form a monophyletic group to constrain the tree topology. The best amino acid substitution model was selected using ProtTest 3 [171] based on the BIC criterion. MCMC analyses were carried out in BEAST v2.4.0 with a chain length of 20,000,000, sampling every 1000 generations, using the JTT + I + G + F model, and a lognormal relaxed clock was used to determine the clade ages. The combined logs of the analyses for each dataset were examined using Tracer v1.6 to confirm that the effective sample size was above 200 for the estimated parameters. In TreeAnnotator v2.4.0 (in the BEAST package), the first 25% of all trees were discarded, 0.9 was used as the posterior probability limit, and node heights were estimated using mean heights in order to obtain the maximum clade credibility tree.
The final tree with node ages and an automatic reverse scale axis was visualized using FigTree v1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/). Approximate 95% confidence intervals were obtained by selecting “Height Highest Probable Density of 95%” for node bars in FigTree to show the age in the chronogram.
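As a quick check on the chain settings reported above, the number of sampled trees and the size of the discarded portion can be worked out with a few lines of Python. This is a back-of-the-envelope sketch, not part of the published pipeline; the function name is hypothetical, and the count ignores the initial state that some programs also write to the log.

```python
def mcmc_sample_counts(chain_length, sample_every, burnin_fraction):
    """Return (total, discarded, retained) sample counts for one MCMC run."""
    total = chain_length // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


if __name__ == "__main__":
    # Settings quoted in the text: 20,000,000 generations, sampling every
    # 1000 generations, first 25% of trees discarded.
    total, discarded, retained = mcmc_sample_counts(20_000_000, 1000, 0.25)
    print(total, discarded, retained)  # prints: 20000 5000 15000
```

Under these assumptions about 15,000 trees remain to build the maximum clade credibility tree.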
Analysis of Trichoderma phylogeny
The nucleotide sequence of a fragment of rpb2 (the RNA polymerase II encoding gene) was retrieved from NCBI GenBank for 196 species of Trichoderma and aligned. A total of 808 nucleotides were then used for Bayesian analysis. Two independent MCMC runs were performed with 10 million generations, sampling every 100 generations; the first 800 trees were discarded. An earlier version of this tree, which does not refer to the abundance of species, has been published.
Analysis of protein family evolution
The evolution of protein family size (expansion or contraction) was analyzed with CAFE (using the OrthoMCL table with an e-value ≤1e-20 and an inflation parameter of 1.5) with a p-value of 0.01, applying a stochastic model of gene birth and death.
Analysis of dN and dS
We estimated non-synonymous (dN) and synonymous (dS) nucleotide substitutions using PAML with model M0 in pairwise mode, run with custom shell scripts, and calculated the average dN/dS.
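The averaging step can be sketched as follows, assuming the pairwise dN and dS estimates have already been parsed from the PAML output into (dN, dS) tuples. The text does not state whether the average was taken over per-pair ratios or as a ratio of averages, nor how pairs with dS = 0 were treated; the sketch below uses the mean of per-pair ratios and skips pairs with dS = 0, and both choices are assumptions.

```python
def mean_dn_ds(pairs):
    """Average dN/dS over pairwise estimates.

    pairs: iterable of (dN, dS) tuples. Pairs with dS == 0 are skipped
    because the ratio is undefined for them (assumed handling).
    """
    ratios = [dn / ds for dn, ds in pairs if ds > 0]
    if not ratios:
        raise ValueError("no pair with dS > 0; dN/dS is undefined")
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Invented values, used only to show the call.
    estimates = [(0.10, 0.50), (0.20, 0.40), (0.05, 0.0)]
    print(round(mean_dn_ds(estimates), 3))
```

Values well below 1 would indicate purifying selection on average, although a genome-wide mean hides variation among individual genes.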
Estimates of evolutionary divergence between protein sequences
Analyses were conducted using the JTT matrix-based model, and the rate variation among sites was modeled with a gamma distribution (shape parameter = 4). The analysis involved 27 species, the same as those used to build the time-scaled phylogenetic tree. All positions containing gaps and missing data were eliminated. There was a total of 380,905 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
Genome assemblies and annotations are available at the JGI fungal genome portal MycoCosm and at DDBJ/EMBL/GenBank under the following accessions: T. reesei, PRJNA225530; T. parareesei, LFMI00000000; T. longibrachiatum, MBDJ00000000; T. citrinoviride, MBDI00000000; T. harzianum CBS226.95, MBGI00000000; T. harzianum TR274, NQLC00000000; T. guizhouense, LVVK00000000; T. afroharzianum, JOKZ00000000; T. virens, PRJNA264113; T. atroviride, PRJNA164112; T. gamsii, JPDN00000000; T. asperellum, MBGH00000000; T. hamatum, ANCB00000000. The revised protein sequences and annotations of the T. reesei and T. hamatum genomes are included in the paper (Additional files 3 and 4).
The authors are grateful to Dr. David Studholme, University of Exeter, for providing the full MAKER annotation of the T. hamatum genome.
The work performed by the Nanjing Agricultural University, China, was supported by the National Key Research and Development Program of China (2017YFD 0200806). This study was supported by grants of the Austrian Science Foundation (FWF) to ISD (P25613-B20) and CPK (I-1249). AS was supported by scholarships provided by CAPES foundation (99999.001745/2014–00) and CNPq (141180/2012–9). AS and EFN were supported by CNPq grant (482697/2011–3). The work conducted by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, was supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. B.H. gratefully acknowledges funding from IDEX Aix Marseille (Grant Microbio-E, 2015–2017).
Availability of data and materials
The raw data for this analysis were the 13 Trichoderma genome databases, which can be downloaded from the sources cited above. The reference genomes of other fungi that we used were downloaded from MycoCosm and the NCBI database (http://www.ncbi.nlm.nih.gov/).
AS, GM, KC, BH, JZ, FC, EMK, AGK, AK, RB, SS and IVG carried out in silico experiments and analysed data. CPK, AS, IVG and ISD conceived and designed the study, carried out the data analysis, interpretation, and discussion, and wrote the manuscript with comments from BH, GM, GV, EFN, and QS. CPK, AS, KC and ISD completed the supplements. CPK and ISD prepared the figures. All authors read and approved the final manuscript.
Ethics approval and consent to participate
No ethical approval was needed for this study.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Zhang N, Castlebury LA, Miller AN, Huhndorf SM, Schoch CL, Seifert KA, et al. An overview of the systematics of the Sordariomycetes based on a four-gene phylogeny. Mycologia. 2006;98(6):1076–87.
- Kirk PM, Cannon PF, Minter DW, Stalpers JA. Dictionary of the Fungi. 10th ed. Wallingford: CAB International; 2008. p. 332.
- Sung G-H, Poinar GO, Spatafora JW. The oldest fossil evidence of animal parasitism by fungi supports a cretaceous diversification of fungal-arthropod symbioses. Mol Phylogenet Evol. 2008;49:495–502.
- Spatafora JW, Sung G-H, Sung J-M, Hywel-Jones NL, White JF. Phylogenetic evidence for an animal pathogen origin of ergot and the grass endophytes. Mol Ecol. 2007;16:1701–11.
- Chenthamara K, Druzhinina IS. Ecological genomics of mycotrophic fungi. In: Druzhinina IS, Kubicek CP, editors. The Mycota: environmental and microbial relationships. 3rd ed; 2013. p. 215–45.
- Druzhinina IS, Kubicek CP. Ecological genomics of Trichoderma. In: Martin F, editor. The ecological genomics of fungi. UK: Wiley; 2014. p. 89–116.
- Atanasova L, Druzhinina IS, Jaklitsch WM. Two hundred Trichoderma species recognized on the basis of molecular phylogeny. In: Mukherjee P, Horwitz BA, Singh US, Mukherjee M, Schmoll M, editors. Trichoderma: biology and applications. UK: CABI International; 2013. p. 10–42.
- Jaklitsch WM, Voglmayr H. Biodiversity of Trichoderma (Hypocreaceae) in southern Europe and Macaronesia. Stud Mycol. 2015;80:1–87.
- Jaklitsch WM. European species of Hypocrea part I. the green-spored species. Stud Mycol. 2009;63:1–91.
- Poldmaa K. Three species of Hypomyces growing on basidiomata of Stereaceae. Mycologia. 2003;95:921–33.
- Druzhinina IS, Chenthamara K, Zhang J, Atanasova L, Yang D, Miao Y, et al. Massive lateral transfer of genes for lignocellulolytic enzymes to the mycoparasitic fungal genus Trichoderma from its herbivore hosts. PLoS Genet. 2018; ms accepted for publication.
- Kubicek CP, Herrera-Estrella A, Seidl-Seiboth V, Martinez DA, Druzhinina IS, Thon M, et al. Comparative genome sequence analysis underscores mycoparasitism as the ancestral life style of Trichoderma. Genome Biol. 2011;12:R40.
- Druzhinina IS, Seidl-Seiboth V, Herrera-Estrella A, Horwitz BA, Kenerley CM, Monte E, et al. Trichoderma: the genomics of opportunistic success. Nat Rev Microbiol. 2011;9(10):749–59.
- Hermosa R, Rubio MB, Cardoza RE, Nicolás C, Monte E, Gutiérrez S. The contribution of Trichoderma to balancing the costs of plant growth and defense. Int Microbiol. 2013;16(2):69–80.
- Chaverri P, Samuels GJ. Evolution of habitat preference and nutrition mode in a cosmopolitan fungal genus with evidence of interkingdom host jumps and major shifts in ecology. Evolution. 2013;67(10):2823–37.
- Martinez D, Berka RM, Henrissat B, Saloheimo M, Arvas M, Baker SE, et al. Genome sequencing and analysis of the biomass-degrading fungus Trichoderma reesei (syn. Hypocrea jecorina). Nat Biotechnol. 2008;26:553–60.
- Studholme DJ, Harris B, Le Cocq K, Winsbury R, Perera V, Ryder L, et al. Investigating the beneficial traits of Trichoderma hamatum GD12 for sustainable agriculture-insights from genomics. Front Plant Sci. 2013;4:258.
- Xie BB, Qin QL, Shi M, Chen LL, Shu YL, Luo Y, et al. Comparative genomics provide insights into evolution of Trichoderma nutrition style. Genome Biol Evol. 2014;6(2):379–90.
- Yang D, Pomraning K, Kopchinskiy A, Karimi Aghcheh R, Atanasova L, Chenthamara K, et al. Genome Sequence and Annotation of Trichoderma parareesei, the Ancestor of the Cellulase Producer Trichoderma reesei. Genome Announc. 2015;3(4).
- Shi-Kunne X, Seidl MF, Faino L, Thomma BP. Draft Genome Sequence of a Strain of Cosmopolitan Fungus Trichoderma atroviride. Genome Announc. 2015;3(3):e00287–15.
- Kuo HC, Wang TY, Chen PP, Chen RS, Chen TY. Genome sequence of Trichoderma virens FT-333 from tropical marine climate. FEMS Microbiol Lett. 2015;362(7).
- Baroncelli R, Piaggeschi G, Fiorini L, Bertolini E, Zapparata A, Pè ME, et al. Draft Whole-Genome Sequence of the Biocontrol Agent Trichoderma harzianum T6776. Genome Announc. 2015;3(3).
- Baroncelli R, Zapparata A, Piaggeschi G, Sarrocco S, Vannacci G. Draft Whole-Genome Sequence of Trichoderma gamsii T6085, a Promising Biocontrol Agent of Fusarium Head Blight on Wheat. Genome Announc. 2016;4(1).
- Compant S, Gerbore J, Antonielli L, Brutel A, Schmoll M. Draft genome sequence of the root-colonizing fungus Trichoderma harzianum B97. Genome Announc. 2017;5(23).
- Schmoll M, Dattenböck C, Carreras-Villaseñor N, Mendoza-Mendoza A, Tisch D, Alemán MI, et al. The genomes of three uneven siblings: footprints of the lifestyles of three Trichoderma species. Microbiol Mol Biol Rev. 2016;80(1):205–327.
- Druzhinina IS, Kopchinskiy AG, Kubicek EM, Kubicek CP. A complete annotation of the chromosomes of the cellulase producer Trichoderma reesei provides insights in gene clusters, their expression and reveals genes required for fitness. Biotechnol Biofuels. 2016;9:75.
- Li WC, Huang CH, Chen CL, Chuang YC, Tung SY, Wang TF. Trichoderma reesei complete genome sequence, repeat-induced point mutation, and partitioning of CAZyme gene clusters. Biotechnol Biofuels. 2017;10:170.
- Chaverri P, Branco-Rocha F, Jaklitsch W, Gazis R, Degenkolb T, Samuels GJ. Systematics of the Trichoderma harzianum species complex and the re-identification of commercial biocontrol strains. Mycologia. 2015;107(3):558–90.
- Druzhinina IS, Kubicek CP, Komoń-Zelazowska M, Mulaw TB, Bissett J. The Trichoderma harzianum demon: complex speciation history resulting in coexistence of hypothetical biological species, recent agamospecies and numerous relict lineages. BMC Evol Biol. 2010;10:94.
- Steindorff AS, Ramada MH, Coelho AS, Miller RN, Pappas GJ Jr, Ulhoa CJ, et al. Identification of mycoparasitism-related genes against the phytopathogen Sclerotinia sclerotiorum through transcriptome and expression profile analysis in Trichoderma harzianum. BMC Genomics. 2014;15:204.
- Grigoriev IV, Nikitin R, Haridas S, Kuo A, Ohm R, Otillar R, et al. MycoCosm portal: gearing up for 1000 fungal genomes. Nucleic Acids Res. 2014;42(1):D699–704.
- Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31(19):3210–2.
- Keller G, Sahni A, Bajpai S. Deccan volcanism, the KT mass extinction and dinosaurs. J Biosci. 2009;34:709–28.
- Stammers DK, Ren J, Leslie K, Nichols CE, Lamb HK, Cocklin S, et al. The structure of the negative transcriptional regulator NmrA reveals a structural superfamily which includes the short-chain dehydrogenase/reductases. EMBO J. 2001;20(23):6619–26.
- Koonin EV, Fedorova ND, Jackson JD, Jacobs AR, Krylov DM, Makarova KS, et al. A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes. Genome Biol. 2004;5(2):R7.
- Sutherland IW. Polysaccharide lyases. FEMS Microbiol Rev. 1995;16(4):323–47.
- Aranda-Martinez A, Lenfant N, Escudero N, Zavala-Gonzalez EA, Henrissat B, Lopez-Llorca LV. CAZyme content of Pochonia chlamydosporia reflects that chitin and chitosan modification are involved in nematode parasitism. Environ Microbiol. 2016;18(11):4200–15.
- Khan A, Mathelier A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics. 2017;18:287.
- Seiboth B, Metz B. Fungal arabinan and L-arabinose metabolism. Appl Microbiol Biotechnol. 2011;89(6):1665–73.
- Kuhls K, Lieckfeldt E, Börner T, Guého E. Molecular reidentification of human pathogenic Trichoderma isolates as Trichoderma longibrachiatum and Trichoderma citrinoviride. Med Mycol. 1999;37:25–33.
- de Bie T, Cristianini N, Demuth JP, Hahn MW. CAFE: a computational tool for the study of gene family evolution. Bioinformatics. 2006;22:1269–71.
- Kuhls K, Lieckfeldt E, Samuels GJ, Kovacs W, Meyer W, Petrini O, et al. Molecular evidence that the asexual industrial fungus Trichoderma reesei is a clonal derivative of the ascomycete Hypocrea jecorina. Proc Natl Acad Sci U S A. 1996;93(15):7755–60.
- Chaverri P, Samuels GJ, Stewart EL. Hypocrea virens sp. nov., the teleomorph of Trichoderma virens. Mycologia. 2001;93:1113–24.
- Dodd SL, Lieckfeldt E, Samuels GJ. Hypocrea atroviridis sp. nov., the teleomorph of Trichoderma atroviride. Mycologia. 2003;95(1):27–40.
- Druzhinina IS, Komoń-Zelazowska M, Atanasova L, Seidl V, Kubicek CP. Evolution and ecophysiology of the industrial producer Hypocrea jecorina (anamorph Trichoderma reesei) and a new sympatric agamospecies related to it. PLoS One. 2010;5(2):e9191.
- Nieuwenhuis BP, Aanen DK. Sexual selection in fungi. J Evol Biol. 2012;25(12):2397–411.
- Mojzita D, Herold S, Metz B, Seiboth B, Richard P. L-xylo-3-hexulose reductase is the missing link in the oxidoreductive pathway for D-galactose catabolism in filamentous fungi. J Biol Chem. 2012;287(31):26010–8.
- Druzhinina IS, Kubicek CP. Familiar stranger: ecological genomics of the model saprotroph and industrial enzyme producer Trichoderma reesei breaks the stereotypes. Adv Appl Microbiol. 2016;95:69–147.
- Lombard V, Golaconda Ramulu H, Drula E, Coutinho PM, Henrissat B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 2014;42(Database issue):D490–5.
- Li J, Gu F, Wu R, Yang J, Zhang KQ. Phylogenomic evolutionary surveys of subtilase superfamily genes in fungi. Sci Rep. 2017;7:45456.
- Li J, Zhang KQ. Independent expansion of zincin metalloproteinases in Onygenales fungi may be associated with their pathogenicity. PLoS One. 2014;9:e90225.
- Muszewska A, Stepniewska-Dziubinska MM, Steczkiewicz K, Pawlowska J, Dziedzic A, Ginalski K. Fungal lifestyle reflected in serine protease repertoire. Sci Rep. 2017;7:9147.
- Macheleidt J, Mattern DJ, Fischer J, Netzker T, Weber J, Schroeckh V, et al. Regulation and role of fungal secondary metabolites. Annu Rev Genet. 2016;50:371–92.
- Mukherjee PK, Horwitz BA, Kenerley CM. Secondary metabolism in Trichoderma--a genomic perspective. Microbiology. 2012;158(1):35–45.
- Sivasithamparam K, Ghisalberti EL. Secondary Metabolism in Trichoderma and Gliocladium. In: Kubicek CP, Harman GE, editors. Trichoderma and Gliocladium. Vol. 1. Basic Biology, Taxonomy and Genetics. London: Taylor and Francis Ltd; 1998. p. 139–91.
- Yu X, Li S-M. Prenyltransferases of the dimethylallyltryptophan synthase superfamily. Methods Enzymol. 2012;516:259–78.
- Stergiopoulos I, de Wit PJ. Fungal effector proteins. Annu Rev Phytopathol. 2009;47:233–63.
- Djonović S, Pozo MJ, Dangott LJ, Howell CR, Kenerley CM. Sm1, a proteinaceous elicitor secreted by the biocontrol fungus Trichoderma virens induces plant defense responses and systemic resistance. Mol Plant-Microbe Interact. 2006;19(8):838–53.
- Gaderer R, Bonazza K, Seidl-Seiboth V. Cerato-platanins: a fungal protein family with intriguing properties and application potential. Appl Microbiol Biotechnol. 2014;98(11):4795–803.
- Gaderer R, Lamdan NL, Frischmann A, Sulyok M, Krska R, Horwitz BA, Seidl-Seiboth V. Sm2, a paralog of the Trichoderma cerato-platanin elicitor Sm1, is also highly important for plant protection conferred by the fungal-root interaction of Trichoderma with maize. BMC Microbiol. 2015;15:2.
- Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, et al. NCBI's conserved domain database. Nucleic Acids Res. 2015;43(Database issue):D222–6.
- Kasuga T, Mannhaupt G, Glass NL. Relationship between phylogenetic distribution and genomic features in Neurospora crassa. PLoS One. 2009;4(4):e5286.
- Scherf A, Figueiredo LM, Freitas-Junior LH. Plasmodium telomeres: a pathogen's perspective. Curr Opin Microbiol. 2001;4:409–14.
- Wortman JR, Fedorova N, Crabtree J, Joardar V, Maiti R, et al. Whole genome comparison of the A. fumigatus family. Med Mycol. 2006;44:S3–7.
- Tajima F, Nei M. Estimation of evolutionary distance between nucleotide sequences. Mol Biol Evol. 1984;1:269–85.
- Vajda V, McLoughlin S. Fungal proliferation at the cretaceous-tertiary boundary. Science. 2004;303(5663):1489.
- Ivany LC, Patterson WP, Lohmann KC. Cooler winters as a possible cause of mass extinctions at the Eocene/Oligocene boundary. Nature. 2000;407(6806):887–90.
- Kelkar YD, Ochman H. Causes and consequences of genome expansion in fungi. Genome Biol Evol. 2012;4(1):13–23.
- Dufresne A, Garczarek L, Partensky F. Accelerated evolution associated with genome reduction in a free-living prokaryote. Genome Biol. 2005;6:R14.
- Sun Z, Blanchard JL. Strong genome-wide selection early in the evolution of Prochlorococcus resulted in a reduced genome through the loss of a large number of small effect genes. PLoS One. 2014;9(3):e88837.
- Mosavi LK, Cammett TJ, Desrosiers DC, Peng ZY. The ankyrin repeat as molecular architecture for protein recognition. Protein Sci. 2004;13(6):1435–48.
- Li J, Mahajan A, Tsai MD. Ankyrin repeat: a unique motif mediating protein-protein interactions. Biochemistry. 2006;45(51):15168–78.
- Fenn K, Blaxter M. Wolbachia genomes: revealing the biology of parasitism and mutualism. Trends Parasitol. 2006;22(2):60–5.
- Saupe SJ. Molecular genetics of heterokaryon incompatibility in filamentous ascomycetes. Microbiol Mol Biol Rev. 2000;64:489–502.
- Saupe SJ, Clavé C, Sabourin M, Bégueret J. Characterization of hch, the Podospora anserina homolog of the het-c heterokaryon incompatibility gene of Neurospora crassa. Curr Genet. 2000;38(1):39–47.
- Wu J, Glass NL. Identification of specificity determinants and generation of alleles with novel specificity at the het-c heterokaryon incompatibility locus of Neurospora crassa. Mol Cell Biol. 2001;21(4):1045–57.
- Chevanne D, Bastiaans E, Debets A, Saupe SJ, Clavé C, Paoletti M. Identification of the het-r vegetative incompatibility gene of Podospora anserina as a member of the fast evolving HNWD gene family. Curr Genet. 2009;55:93–102.
- Lamacchia M, Dyrka W, Breton A, Saupe SJ, Paoletti M. Overlapping Podospora anserina transcriptional responses to bacterial and fungal non self indicate a multilayered innate immune response. Front Microbiol. 2016;7:471.
- Paoletti M, Saupe SJ. Fungal incompatibility: evolutionary origin in pathogen defense? Bioessays. 2009;31:1201–10.
- de Vries RP, Riley R, Wiebenga A, Aguilar-Osorio G, Amillis S, Uchima CA, et al. Comparative genomics reveals high biological diversity and specific adaptations in the industrially and medically important fungal genus Aspergillus. Genome Biol. 2017;18(1):28.
- Akcapinar GB, Kappel L, Sezerman OU, Seidl-Seiboth V. Molecular diversity of LysM carbohydrate-binding motifs in fungi. Curr Genet. 2015;61(2):103–13.
- Shoseyov O, Shani Z, Levy I. Carbohydrate binding modules: biochemical properties and novel applications. Microbiol Mol Biol Rev. 2006;70(2):283–95.
- Tautz D, Domazet-Lošo T. The evolutionary origin of orphan genes. Nat Rev Genet. 2011;12(10):692–702.
- Bischof R, Fourtis L, Limbeck A, Gamauf C, Seiboth B, Kubicek CP. Comparative analysis of the Trichoderma reesei transcriptome during growth on the cellulase inducing substrates wheat straw and lactose. Biotechnol Biofuels. 2013;6(1):127.
- Morán-Diez ME, Trushina N, Lamdan NL, Rosenfelder L, Mukherjee PK, Kenerley CM, et al. Host-specific transcriptomic pattern of Trichoderma virens during interaction with maize or tomato roots. BMC Genomics. 2015;16:8.
- Carvunis AR, Rolland T, Wapinski I, Calderwood MA, Yildirim MA, Simonis N, et al. Proto-genes and de novo gene birth. Nature. 2012;487(7407):370–4.
- Wissler L, Gadau J, Simola DF, Helmkampf M, Bornberg-Bauer E. Mechanisms and dynamics of orphan gene emergence in insect genomes. Genome Biol Evol. 2013;5(2):439–55.
- Goldman N. Variance to mean ratio, R(t), for Poisson processes on phylogenetic trees. Mol Phylogenet Evol. 1994;3:230–9.
- Campbell MS, Holt C, Moore B, Yandell M. Genome Annotation and Curation Using MAKER and MAKER-P. Curr Protoc Bioinformatics. 2014;48:4.11.1–39.
- Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods. 2011;8:785–6.
- Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44(D1):D279–85.
- Finn RD, Attwood TK, Babbitt PC, Bateman A, Bork P, Bridge AJ, et al. InterPro in 2017 — beyond protein family and domain annotations. Nucleic Acids Res. 2017;45(D1):D190–9.
- Li L, Stoeckert CJ, Roos DS. OrthoMCL: identification of Ortholog groups for eukaryotic genomes. Genome Res. 2003;13(9):2178–89.
- Medema MH, Blin K, Cimermancic P, de Jager V, Zakrzewski P, Fischbach MA, et al. AntiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Res. 2011;39(Web Server issue):W339–46.
- Lechner M, Findeiss S, Steiner L, Marz M, Stadler PF, Prohaska SJ. Proteinortho: detection of (co-)orthologs in large-scale analysis. BMC Bioinformatics. 2011;12:124.
- Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.
- Talavera G, Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007;56:564–77.
- Bouckaert R, Heled J, Kühnert D, Vaughan T, Wu CH, Xie D, Suchard MA, Rambaut A, Drummond AJ. BEAST 2: a software platform for Bayesian evolutionary analysis. PLoS Comput Biol. 2014;10(4):e1003537.
- Matschiner M, Musilová Z, Barth JM, Starostová Z, Salzburger W, Steel M, Bouckaert R. Bayesian phylogenetic estimation of clade ages supports trans-Atlantic dispersal of cichlid fishes. Syst Biol. 2017;66(1):3–22.
- Yang E, Xu L, Yang Y, Zhang X, Xiang M, Wang C, An Z, Liu X. Origin and evolution of carnivorism in the Ascomycota (fungi). Proc Natl Acad Sci. 2012;109(27):10960–5. https://doi.org/10.1073/pnas.1120915109.
- Kumar S, Stecher G, Tamura K. MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33:1870–4.