- Research article
- Open Access
Genome-wide next-generation DNA and RNA sequencing reveals a mutation that perturbs splicing of the phosphatidylinositol glycan anchor biosynthesis class H gene (PIGH) and causes arthrogryposis in Belgian Blue cattle
- Arnaud Sartelet†1,
- Wanbo Li†1,
- Eric Pailhoux2,
- Christophe Richard2,
- Nico Tamma1,
- Latifa Karim1, 3,
- Corinne Fasquelle1,
- Tom Druet1,
- Wouter Coppieters1, 3,
- Michel Georges1 and
- Carole Charlier1Email author
© Sartelet et al.; licensee BioMed Central. 2015
- Received: 10 November 2014
- Accepted: 13 April 2015
- Published: 18 April 2015
Cattle populations are characterized by regular outburst of genetic defects as a result of the extensive use of elite sires. The causative genes and mutations can nowadays be rapidly identified by means of genome-wide association studies combined with next generation DNA sequencing, provided that the causative mutations are conventional loss-of-function variants. We show in this work how the combined use of next generation DNA and RNA sequencing allows for the rapid identification of otherwise difficult to identify splice-site variants.
We report the use of haplotype-based association mapping to identify a locus on bovine chromosome 10 that underlies autosomal recessive arthrogryposis in Belgian Blue Cattle. We identify 31 candidate mutations by resequencing the genome of four cases and 15 controls at ~10-fold depth. By analyzing RNA-Seq data from a carrier fetus, we observe skipping of the second exon of the PIGH gene, which we confirm by RT-PCR to be fully penetrant in tissues from affected calves. We identify - amongst the 31 candidate variants - a C-to-G transversion in the first intron of the PIGH gene (c211-10C > G) that is predicted to affect its acceptor splice-site. The resulting PIGH protein is likely to be non-functional as it lacks essential domains, and hence to cause arthrogryposis.
This work illustrates how the growing arsenal of genome exploration tools continues to accelerate the identification of an even broader range of disease causing mutations, therefore improving the management and control of genetic defects in livestock.
- Arthrogryposis syndrome
- PIGH gene
- Splice-site mutation
- Glycosylphosphatidyl inositol deficiency
- Belgian Blue Cattle breed
The extensive use of elite sires exacerbated by the large-scale exploitation of artificial insemination in cattle breeding causes important reductions in effective population size and the common spread of loss-of-function variants. This in turn is responsible for the periodic outburst of genetic defects that cause considerable economic loss and welfare issues. With the development of genome-wide SNP arrays for all livestock species, it has become possible to rapidly map the underlying locus by means of autozygosity mapping to intervals that typically span 2 to 5 megabases thereby proving the inherited nature and mode of inheritance of the corresponding condition (f.i. ). With the advent of targeted or whole-genome next generation sequencing (NGS), it is becoming increasingly facile to identify the causative mutation, needed to develop accurate diagnostic tests, provided that the mutation is a frame-shift, nonsense, canonical splice-site, or severe missense variant. In other cases, the causative mutation may remain elusive for a considerably longer time. We show in this work how the combined use of DNA and RNA NGS data, may accelerate the discovery of an otherwise elusive, novel class of causative mutations.
Arthrogryposis emerges as a new genetic defect in Belgian Blue Cattle
A haplotype-based GWAS maps the culprit locus to a 2.2 Mb interval on bovine chromosome 10
Resequencing the whole genome of four cases identifies 31 candidate causative mutations
The corresponding interval is collinear with a 2.4 Mb segment of human chromosome 14 (chr14: 66,474,214-68,918,385 bp) and encompasses 22 annotated genes. As none of these would be obvious candidates, we generated paired-end libraries for four cases and re-sequenced them to average 2.5-fold depth (for a combined total coverage of ~10 fold) on a Illumina GAIIX instrument as described . Equivalent genomic sequences of 15 healthy Belgian Blues that did not carry the incriminated haplotype were used as controls. Sequence reads were mapped on the BosTau6/UMD3 reference genome using BWA , and variants called with SAMTools . We detected 2,968 variants with quality score > 100 mapping to the 2.2 Mb interval. As expected, the four cases appeared homozygous for the 21 Beadchip SNPs defining the associated haplotype. Filtering against known bovine dbSNP SNPs and variants observed in at least one of the 15 controls, and demanding homozygosity for the four cases, left us with only 31 candidate variants (Additional file 1). Yet, none of these would alter coding sequences (missense or nonsense), or map within three base pairs from an exon-intron junction.
RNASeq reveals skipping of PIGH exon 2 and pinpoints an intronic mutation as the likely causative variant
Interestingly, one of the 31 candidate mutations identified by whole genome sequencing is a C-to-G transversion located in the first intron of PIGH, 10-bp upstream of the exon 2 junction ″211-10C > G) (Figure 3B). The corresponding residue is located in the consensus polypyrimidine track defining canonical “GT-AG” type acceptor splice-sites (f.i. ), and is conserved in 26 of the 27 sequenced mammals (Additional file 3). We developed a 5′exonuclease assay to interrogate the c211-10C > G variant and genotyped 25 cases, 21 parents and > 10,000 healthy Belgian Blue animals. All cases were homozygous “GG”, and all parents “CG” as expected. Six percent of the healthy Belgian Blues were carriers, while none were homozygous “GG” (p = 0.00263).
The clinical spectrum of arthrogryposis is compatible with severe glycosylphosphatidyl inositol (GPI) deficiency
The PIGH protein is ubiquitously expressed. It is one of seven highly conserved (from yeast to human) subunits of the complex catalyzing the first step out of eleven in the biosynthesis of the glycophosphatidyl inositol (GPI) anchor, a complex C-terminal posttranslational modification concerning > 150 proteins or ~0.5% of cellular proteins (http://www.uniprot.org/uniprot) (f.i. [9-11]). Within the GPI-GlcNAc transferase complex (GPI-GnT), PIGH is anchored in the membrane of the endoplasmic reticulum (ER) by two trans-membrane helices (residues 38–55 and 60–78), connected by a short ER intra-luminal loop (56–59), and bounded by a short N-terminal (1–37) and a long C-terminal cytoplasmic tail (79–188) . Deleting exon 2 generates a protein missing amino-acids 61–130, therefore unlikely to be properly anchored in the ER membrane and to be functional (Additional file 4). In culture, cells with mutations in PIGH do not display any GPI-anchored protein (GPI-AP) surface expression (f.i. ). In animals, all known GPI-anchor related defects characterized by a complete lack of GPI-anchored proteins are lethal for the embryo. It is thought that this embryonic-lethal phenotype is due to the fact that several cell-to-cell adhesion molecules required for normal development and acting at specific developmental stages are GPI-anchored (reviewed in ). The additional file 5 lists the known GPI-AP and their associated knock-out phenotype in mice. Taken together, our results strongly suggest that the observed PIGH c211-10C > G mutation causes arthrogryposis in Belgian Blue Cattle.
We herein describe the identification of an intronic mutation that disrupts PIGH function by causing the skipping of exon 2, thereby causing a severe and lethal form of arthrogryposis in homozygous animals. We mapped the corresponding locus by performing a haplotype-based GWAS, an approach that has proven successful in many instances (f.i. ). If the causative mutation is either a stop-gain, frameshift, splice-site variant within 2-bp from an exon-intron boundary, or a severe missense variant, its identification is often relatively straightforward. In our experience, this situation occurs approximately fifty percent of the time, which is in agreement with the data reported by . In the other cases, the identification of the causative mutation may be considerably more arduous. In the present study, it was the concomitant generation of RNASeq data from tissue of a carrier animal that provided the clues for the identification of the causative mutation. The detection of exon skipping in an essential gene that was not observed in non-carrier animals pointed towards one of 31 genetically defined mutations (located in the immediate vicinity of the skipped exon) as being the likely causative mutation. It would have been very difficult to predict the effect of this variant, located at 10-bp from the exon-intron junction, on the splicing reaction without the information on the corresponding transcripts. Our work therefore illustrates the value of the combined availability of DNA and RNASeq data to rapidly identify a broader panel of disease causing mutations.
Although we consider the supporting evidence to be very strong, we haven’t formally proven that skipping of exon 2 of the PIGH gene abrogates its function and interferes with GPI-anchor biosynthesis. In human and mice, functional validation of GPI biosynthesis pathway’s defects relies on the detection of incomplete or absent cell surface expression of CD59, a well-characterized GPI-anchored protein (f.i. ). This quantification is typically performed by FACS on fresh erythrocytes. Unfortunately, homozygous mutant calves are dead at birth, precluding the collection of fresh blood. Producing and collecting homozygous mutant fetuses by mating carriers was beyond the scope of this study.
To the best of our knowledge, this the first report of a naturally occurring loss-of-function mutation in the PIGH gene in animals. A genetic test interrogating the PIGH c211-10C > G variant has been developed and offered to breeders since 2011. Its widespread use has led to the rapid elimination of this syndrome from the Belgian Blue population.
Blood samples were collected from sires, cows and calves, by trained veterinarians following standard procedures and relevant Belgian national guidelines. For bovine fetuses production and collection, experiments reported in this work are in agreement with the ethical guidelines of the French National Institute for Agricultural Research (INRA). Fetuses were produced by artificial insemination of a wild-type Belgian Blue female with semen of a Belgian Blue confirmed-carrier male, 7-days old embryos recovered, transferred to recipient females in an INRA experimental farm (France), then collected at 60 days post-fertilization at the INRA slaughterhouse (France). The protocol (N°: 12/046) was approved by the local ethical committee (COMETHEA) and Eric Pailhoux is the recipient of an official authorization for animal experimentation (N°: B91-649).
SNP array Genotyping
Genomic DNA of cases was extracted from 350 μl of blood using the MagAttract DNA Blood Midi M48 Kit (Qiagen). Genomic DNA of controls was extracted from frozen semen using the MagAttract Mini M48 Kit (Qiagen). The 15 cases of the initial genome scan were genotyped using a custom-made 50 K SNP array . The 275 control sires were genotyped with the BovineHD BeadChip (Illumina). SNP genotyping was conducted using standard procedures at the GIGA genomics core facility.
RNAseq data generation and analysis
As part of a companion project, bovine embryos were produced by directed mating between a sire carrier for the arthrogryposis haplotype and a wild-type cow. A panel of tissues - including pituitary, cerebral cortex and liver - from two embryos diagnosed as carrier based on a haplotype test, were collected at 60 days post fertilization (dpf); total RNA was extracted and cDNA librairies were produced using the TrueSeq mRNA kit from Illumina, following manufacturer’s instructions. Equal amounts of each indexed library were combined and sequenced 2X100bp on one lane of a HighSeq2000 instrument. Transcriptomes were analyzed using the RNASeq tool kit TopHat and Cufflinks . Mapped RNASeq reads for the carrier embryo were visually evaluated in IGV (Integrative Genome Browser) .
Mutation validation at the mRNA level in case and control calves
Total RNA was extracted from kidney and muscle of one homozygous case and one unaffected unrelated individual using Trizol (Invitrogen) following manufacturer’s instructions. The obtained total RNA was treated with TurboDNaseI (Ambion) and cDNA was synthesized using Superscript™III First Strand Synthesis System for RT-PCR (Invitrogen). A 593 bp PIGH cDNA fragment was PCR amplified using a specific primers respectively located in exon 1 (PIGH_UP : 5′-TCT CTT TGC GCT CGC TCA CC-3′) and exon 4 (PIGH_DN: 5′-GAT CCA CCA CAT CCA TAC TGG-3′). Amplicons were directly sequenced using the Big Dye terminator cycle sequencing kit (Applied Biosystems, Foster City, CA). Electrophoresis of purified sequencing reactions was performed on an ABI PRISM 3730 DNA analyzer (PE Applied Biosystems, Foster City, CA). Sequence traces were aligned and compared to bovine cDNA sequence using the Phred/Phrap/Consed package (http://www.phrap.org/phredphrapconsed.html).
5′ exonuclease diagnostic assay of the PIGH c211-10C > G splice acceptor site mutation
A 5′ exonuclease assay was developed to genotype the PIGH c211-10C > G mutation, using 5′- ATG GCA GCA GAG AGG ATC ATG -3′ and 5′- GGA GTT GAC TTA TTA ACC AGC AGA GA -3′ as PCR primers, and 5′-TGT CTG GCT [C]TC TCT TC-3′ (wild type C allele) and 5′-TCT GGC T[G]T CTC TTC -3′ (mutant G allele) as probes (Taqman, Applied Biosystems, Fosters City, CA). Reactions were carried out on an ABI7900HT instrument (Applied Biosystems, Fosters City, CA) using standard procedures.
Availability of supporting data
All the variations of the IBD disease haplotype have been submitted to dbSNP (http://0-www.ncbi.nlm.nih.gov.brum.beds.ac.uk/SNP/). They will be publicly available with the next dbSNP Build (B145), planned for fall, 2015. Additional files 6 and 7 provide respectively the list of “ss” numbers, genomic positions (Bos taurus assembly: BosTau6/UMD3) and the list of corresponding sequence variations.
We are grateful to the breeders and practitioners for their collaboration and to the Walloon Breeding Association (AWE) and the Belgian Blue Beef Herd-Book (HBBBB) for pedigree data. We also thank all the members of INRA experimental units, especially Alexandre Neveux, Valérie Gelin, Céleste Le Bourhis, Nicolas Stadler, Johann Kergozien, Gilles Gomot, Jean-Philippe Dubois for cow management (breeding, ultrasound scanning, and euthanasia). TD and CC are respectively Research Associate and Senior Research Associate of the Fonds National de la Recherche Scientifique (Belgium). This work was funded by grants from the Walloon Ministry of Agriculture (Rilouke), the Belgian Science Policy Organization (SSTC Genefunc PAI) and the University of Liège.
- Charlier C, Coppieters W, Rollin F, Desmecht D, Agerholm JS, Cambisano N, et al. Highly effective SNP-based association mapping and management of recessive defects in livestock. Nat Genet. 2008;40:449–54.View ArticlePubMedGoogle Scholar
- Zhang Z, Guillaume F, Sartelet A, Charlier C, Georges M, Farnir F, et al. Ancestral haplotype-based association mapping with generalized linear mixed models accounting for stratification. Bioinformatics. 2012;28:2467–73.View ArticlePubMedGoogle Scholar
- Sartelet A, Stauber T, Coppieters W, Ludwig CF, Fasquelle C, Druet T, et al. A missense mutation accelerating the gating of the lysosomal Cl-/H + −exchanger ClC-7/Ostm1 causes osteopetrosis with gingival hamartomas in cattle. Dis Mode Mech. 2014;7:119–28.View ArticleGoogle Scholar
- Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.View ArticlePubMed CentralPubMedGoogle Scholar
- Li, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–9.View ArticlePubMed CentralPubMedGoogle Scholar
- Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7:562–78.View ArticlePubMed CentralPubMedGoogle Scholar
- Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, et al. Integrative genomics viewer. Nat Biotechnol. 2011;29:24–6.View ArticlePubMed CentralPubMedGoogle Scholar
- Burset M, Seledtsov IA, Solovyev VV. SpliceDB: database of canonical and non-canonical mammalian splice sites. Nucleic Acids Res. 2001;29:255–9.View ArticlePubMed CentralPubMedGoogle Scholar
- Almeida A, Layton M, Karadimitris A. Inherited glycosylphosphatidyl inositol deficiency: a treatable CDG. Biochim Biophys Acta. 2009;1792:874–80.View ArticlePubMedGoogle Scholar
- Mayor S, Riezman H. Sorting GPI-anchored proteins. Nat Rev Mol Cell Biol. 2004;5:110–20.View ArticlePubMedGoogle Scholar
- Freeze HH. Understanding human glycosylation disorders: biochemistry leads the charge. J Biol Chem. 2013;288:6936–45.View ArticlePubMed CentralPubMedGoogle Scholar
- Watanabe R, Kinoshita T, Masaki R, Yamamoto A, Takeda J, Inoue N. PIG-A and PIG-H, which participate in glycosylphosphatidylinositol anchor biosynthesis, form a protein complex in the endoplasmic reticulum. J Biol Chem. 1996;271:26868–75.View ArticlePubMedGoogle Scholar
- Kamitani T, Chang HM, Rollins C, Waneck GL, Yeh ET. Correction of the class H defect in glycosylphosphatidylinositol anchor biosynthesis in Ltk- cells by a human cDNA clone. J Biol Chem. 1993;268:20733–6.PubMedGoogle Scholar
- Nicholas FW, Hobbs M. Mutation discovery for Mendelian traits in non-laboratory animals: a review of achievements up to 2012. Anim Genet. 2014;45:157–70.View ArticlePubMed CentralPubMedGoogle Scholar
- Sonnhammer ELL, von Heijne G, and Krogh A. A hidden Markov model for predicting transmembrane helices in protein sequences. In J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen, editors, Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology, pages 175–182, Menlo Park, CA, 1998. AAAI Press. (http://www.cbs.dtu.dk/services/TMHMM/).
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.