| Gene family (a) | T. c. marinkellei: size in assembly (b) | T. c. marinkellei: % short reads (c) | T. c. cruzi Sylvio X10: size in assembly (b) | T. c. cruzi Sylvio X10: % short reads (c) | SE (d) |
|---|---|---|---|---|---|
| DGF | 2,129,983 (6.22 %) | 3.433 | 1,265,650 (3.28 %) | 1.324 | Tcm |
| TS | 2,109,163 (6.16 %) | 6.291 | 2,953,602 (7.65 %) | 6.298 | Tcc X10 |
| MASP | 540,360 (1.58 %) | 1.317 | 727,537 (1.88 %) | 1.434 | Tcc X10 |
| RHS | 521,665 (1.52 %) | 2.234 | 1,314,589 (3.41 %) | 2.915 | Tcc X10 |
| GP63 | 452,732 (1.32 %) | 1.229 | 514,422 (1.33 %) | 0.898 | Tcm |
| TcMUC mucin | 273,890 (0.80 %) | 0.557 | 334,544 (0.87 %) | 0.515 | Tcc X10 |
| ABC | 37,490 (0.11 %) | 0.124 | 42,072 (0.11 %) | 0.162 | Tcc X10 |
| RBP | 25,946 (0.08 %) | 0.080 | 26,732 (0.07 %) | 0.074 | Tcc X10 |

- a Gene family abbreviations: DGF=Dispersed Gene Family, TS=trans-sialidase, MASP=Mucin-associated surface protein, GP63=Surface protease, RHS=Retrotransposon Hot Spot protein, ABC=ABC Transporter, RBP=RNA Binding Protein.
- b The total number of base pairs assigned to this gene family in the assembly. Sequences were identified using RepeatMasker with a repeat library of coding sequences from the Tcc CLBR genome. These numbers include partial coding sequences. The number in parentheses is the percentage of the total assembly size.
- c The percentage of short reads that mapped to these features.
- d SE=Significantly Enriched. Indicates which of the two genomes contained significantly more of this gene family. Significance was determined from an empirical distribution of read depth differences from homologous regions of Tcm and Tcc X10, corrected for genome size. The empirical distribution was used to calculate a p-value.
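
Footnote d describes deriving a p-value from an empirical distribution of read depth differences. The source does not give the exact procedure, so the following is only a minimal sketch of one common way to compute a two-sided empirical p-value. The function name `empirical_p_value`, the add-one correction, the two-sided test, and the example numbers are all assumptions for illustration, and the inputs are assumed to be already corrected for genome size.

```python
def empirical_p_value(null_diffs, observed_diff):
    """Two-sided empirical p-value (illustrative sketch, not the authors' code).

    null_diffs: read depth differences from homologous regions, assumed
                to be already corrected for genome size.
    observed_diff: corrected read depth difference for one gene family.
    """
    null_diffs = list(null_diffs)
    # Count background differences at least as extreme as the observed one.
    n_extreme = sum(1 for d in null_diffs if abs(d) >= abs(observed_diff))
    # Add-one correction (an assumption) keeps the p-value above zero.
    return (n_extreme + 1) / (len(null_diffs) + 1)


# Hypothetical background differences; these values are NOT from the paper.
background = [-0.4, -0.2, -0.1, 0.0, 0.1, 0.1, 0.3, 0.5, -0.3]
p = empirical_p_value(background, observed_diff=2.1)
print(f"empirical p-value: {p:.3f}")
```

With a background of this size the smallest attainable p-value is 1/(n+1), so the resolution of such a test depends on how many homologous regions contribute to the empirical distribution.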