- Methodology article
- Open Access
Genome-wide scans using archived neonatal dried blood spot samples
BMC Genomics volume 10, Article number: 297 (2009)
Identification of disease susceptibility genes requires access to DNA from numerous well-characterised subjects. Archived residual dried blood spot samples from national newborn screening programs may provide DNA from entire populations, and medical registries may provide the corresponding clinical information. The amount of DNA available in these samples is, however, rarely sufficient for reliable genome-wide scans, and whole-genome amplification may thus be necessary. This study assesses the quality of DNA obtained from different amplification protocols by evaluating the fidelity and robustness of the genotyping of 610,000 single nucleotide polymorphisms, using the Illumina Infinium HD Human610-Quad BeadChip. Whole-genome amplified DNA from 24 neonatal dried blood spot samples stored for 15 to 25 years was tested, and high-quality genomic DNA from 8 of the same individuals was used as reference.
Using 3.2 mm disks from dried blood spot samples, the optimal DNA-extraction and amplification protocol resulted in call-rates between 99.15% and 99.73% (mean 99.56%, N = 16), and conflicts with reference DNA in only three per 10,000 genotype calls.
Whole-genome amplified DNA from archived neonatal dried blood spot samples can be used for reliable genome-wide scans and is a cost-efficient alternative to collecting new samples.
Studies of genetic influence in complex disorders usually require extensive genome explorations of large cohorts. A major bottleneck, however, is access to DNA from well-characterised patients and healthy controls. This may be circumvented by use of archived residual blood samples from newborn screening programs, which in several countries engage the entire population. The blood is usually collected by heel-prick and applied to special filter paper, a proven robust and convenient medium for transport and storage. Storage policies on residual neonatal dried blood spot samples (DBSS) vary internationally, but several countries store residuals in repositories for later research purposes [2–8]. Stored DBSS combined with relevant clinical information from medical registries thus constitute an ideal resource for large studies. This set-up has the advantage of representing the entire population under a certain age and of avoiding practically any kind of selection. In addition, substantial costs may be saved.
The Danish Neonatal Screening Biobank (DNSB) contains nearly 2 million DBSS from virtually every Dane born after 1982. It has recently been updated to meet the new general guidelines for the establishment and operation of biobanks. Access to samples for research requires approval from the Scientific Ethical Committee System, the Data Protection Agency and the DNSB Steering Committee. In Denmark, all citizens have a unique person-identifying number that is used across all public registration systems, including the DNSB. Denmark also operates a well-established public health care system offering treatment to all citizens. Together this makes it possible to study the "entire country as a cohort" and makes the DNSB an ideal resource for studying common and complex genetic diseases in Caucasians. The major challenge using the DBSS for such studies is, however, the small amount of blood available. In theory, the amount of genomic DNA (gDNA) that can be extracted from a 3.2 mm punch of a DBSS is about 60 ng. In general, only one or two 3.2 mm punches per DBSS can be reserved for each project, which is scarcely enough to genotype multiple single nucleotide polymorphisms (SNP). This problem may be overcome by whole-genome amplification (WGA) of the gDNA. Previous studies have used whole-genome amplified DNA (wgaDNA) for genotyping, and with fair success, but in most cases the number of polymorphisms that can be tested has been limited [11–18].
In this study we investigate whether a proper combination of DNA-extraction and WGA procedures can produce wgaDNA samples suitable for a 610,000-SNP genome-wide scan using the Illumina Infinium HD Human610-Quad BeadChip. Neonatal DBSS stored for 15 to 25 years in the DNSB are employed, and high-quality gDNA samples recently obtained from the same individuals are used as reference. Two different WGA methods are tested: the multiple displacement amplification (MDA) method (the REPLI-g kit), which produces relatively long wgaDNA fragments of > 10 kb, and the OmniPlex method (the GPlex2 and GPlex4 kits), which produces fragments approximately 500 bp long. We also test the effect of using either one or three 3.2 mm disks and of extracting proteins from the disks before the DNA-extraction. Finally, the robustness of the selected approaches is evaluated.
The investigation comprised 24 subjects born between 1982 and 1992, who all had their residual neonatal DBSS stored at -24°C in the DNSB. Four subjects were informed volunteers and 20 were from a genetic study on schizophrenia (ethical approval number: 20020020; data protection agency number: 2002-41-2059).
DNA-extraction and WGA methods
Reference gDNA was purified from venous blood samples from the four volunteers and from four subjects from the schizophrenia study using the Maxwell 16 automatic system and the Maxwell® 16 Blood DNA Purification Kit (Promega). Neonatal DBSS from the eight participants were retrieved from the DNSB, and DNA was extracted from one or three DBSS disks, 3.2 mm in diameter, using the Extract-N-Amp Blood PCR Kit (ENA) (extraction volume: 200 μL) (Sigma-Aldrich) or the QIAamp DNA Blood Micro Kit (QIA) (extraction volume: 75 μL) (Qiagen). The DNA extracts were amplified using the REPLI-g kit (Qiagen), the GenomePlex® Complete WGA Kit (GPlex2, Sigma-Aldrich), or the GenomePlex® Single Cell Whole Genome Amplification Kit (GPlex4, Sigma-Aldrich). All procedures were performed according to the manufacturer's instructions. Prior to DNA-extraction, a subset of disks was extracted for proteins as described by Skogstrand et al. 2005. Please consult Additional file 1 for the set-up. Furthermore, two DBSS disks from 16 other subjects were extracted for proteins before DNA-extraction using the ENA kit, and the DNA extracts were amplified using the REPLI-g and the GPlex4 kits. DNA was quantified using Quant-iT™ PicoGreen® dsDNA Reagent (Molecular Probes, Invitrogen).
The gDNA and wgaDNA samples were genotyped on the Illumina Infinium HD Human610-Quad BeadChip (Illumina) according to the manufacturer's instructions, with the exception that 240 ng of wgaDNA starting material was used instead of the prescribed 200 ng. The BeadChips were scanned using the BeadStation 500GX (Illumina) with a high-density upgrade and an AutoLoader (Illumina). The BeadStudio v.3 software (Illumina Corp.) was used for calculating call- and conflict-rates. In the first part of the study all calls were made using the reference Human610-Quadv1B cluster file from Illumina, which is constructed from gDNA. In the second part of the study two cluster files, each constructed from 16 wgaDNA preparations made by the REPLI-g and GPlex4 kits (tailored cluster files specific for each WGA method), were also used to analyse the respective samples. Conflict-rates were estimated by comparing the wgaDNA samples with their respective reference gDNA samples. The percentage of conflicts introduced by allelic dropout (e.g. AB to AA) was estimated by re-coding the Illumina data into variables that allow comparison using STATA v.9.0.
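The call-rate, conflict-rate and allelic-dropout figures described above can be illustrated with a minimal Python sketch. It is not the BeadStudio or STATA procedure used in the study. It assumes genotypes are encoded as two-letter strings ("AA", "AB", "BB") with `None` for a no-call, that only SNPs called in both samples are compared, and that a conflict counts as allelic dropout when a heterozygous reference genotype appears as a homozygote for one of its alleles. The function names and the encoding are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Genotype = Optional[str]  # "AA", "AB", "BB", or None for a no-call (assumed encoding)


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of SNPs that received a genotype call."""
    return sum(c is not None for c in calls) / len(calls)


def conflict_summary(wga: Sequence[Genotype],
                     ref: Sequence[Genotype]) -> Tuple[float, float]:
    """Compare wgaDNA calls with reference gDNA calls for the same SNPs.

    Returns (conflict_rate, dropout_fraction):
      conflict_rate    = conflicts / SNPs called in both samples
      dropout_fraction = conflicts where a heterozygous reference is called
                         homozygous for one of its alleles / all conflicts
    """
    compared = conflicts = dropouts = 0
    for w, r in zip(wga, ref):
        if w is None or r is None:
            continue  # a no-call in either sample cannot be compared
        compared += 1
        if w != r:
            conflicts += 1
            ref_is_het = r[0] != r[1]
            wga_is_hom = w[0] == w[1]
            if ref_is_het and wga_is_hom and w[0] in r:
                dropouts += 1  # e.g. AB -> AA or AB -> BB
    conflict_rate = conflicts / compared if compared else 0.0
    dropout_fraction = dropouts / conflicts if conflicts else 0.0
    return conflict_rate, dropout_fraction
```

Whether the study's conflict-rate used all SNPs or only jointly called SNPs as the denominator is not stated in the text; the sketch picks the latter as an assumption.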
The genotyping performance of the different wgaDNA preparations is seen in Additional file 1. The ENA DNA-extraction combined with the REPLI-g kit WGA featured the highest call-rates (99.30–99.51%) and the lowest conflict-rates (0.02–0.03%).
Combining REPLI-g WGA with QIA DNA-extraction was less successful and the results were highly variable. The genotyping performance using wgaDNA made by the two OmniPlex method kits, GPlex2 and GPlex4, was independent of the DNA-extraction method, with GPlex4 showing consistently higher call rates than GPlex2 (Wilcoxon's paired test, p < 0.001). The reference gDNA call-rates were 99.8–99.9%. Almost all conflicts between results from the wgaDNA preparations and the reference gDNA were due to an allelic dropout (data not shown). Notably, extraction and amplification procedures that produced high call-rates displayed low conflict-rates with the reference gDNA and vice versa, which indicates that genome-wide scans on wgaDNA are reliable when the call-rates are high [Additional file 1]. This was partially confirmed by calculating the correlation coefficients between the call- and conflict-rates of the three WGA kits [Additional file 1]. It made no significant difference whether one or three DBSS disks were used for extraction. No systematic differences in genotyping performance were related to sample age.
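The inverse relation between call-rates and conflict-rates can be checked with a correlation coefficient. The text does not say which coefficient was calculated, so the sketch below uses Pearson's r as an assumption, written in plain Python. The example values in the usage note are invented for illustration and are not taken from Additional file 1.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

With hypothetical call-rates such as `[99.5, 99.3, 98.0, 95.0]` and conflict-rates such as `[0.02, 0.03, 0.30, 1.10]`, the function returns a negative value, which is the pattern the paragraph describes.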
Based on the results displayed in Additional file 1, the combinations of DNA extraction by the ENA kit and WGA by the REPLI-g and GPlex4 kits were selected for further evaluation. For this, 16 new subjects were employed. After DNA-extraction, WGA and subsequent genome-wide scans (GWS), the results were analysed using both a standard Human610-Quadv1B Cluster, provided by Illumina, and WGA kit specific tailored cluster files. The rationale for the tailored cluster files is demonstrated in Figure 1. Generally, the wgaDNA samples cluster nicely, but not always in the area defined by the Illumina Human610-Quadv1B cluster file, which is based on gDNA samples. By creating tailored WGA-specific cluster files and using these for analysis, the genotype call-rates of both set-ups (REPLI-g and GPlex4) improved significantly (Wilcoxon paired test, p < 0.001) as seen in Table 1. Comparison of the call-rates indicated that the REPLI-g samples had a significantly higher call-rate than the GPlex4 samples (Wilcoxon's paired test, p < 0.001). Comparison of the amount of wgaDNA amplified by each kit revealed no significant difference (Wilcoxon's paired test, p > 0.050).
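The paired comparisons above rely on Wilcoxon's paired (signed-rank) test. As a rough illustration of how such a p-value arises for a small number of pairs, the following Python sketch computes an exact two-sided p-value by enumerating the distribution of the positive rank sum. It is an assumption that an exact test is appropriate here; the article does not state which software or variant (exact or normal approximation) was used. Zero differences are dropped and tied absolute differences receive mid-ranks. Ties are detected by exact equality, so the inputs should be rounded consistently beforehand.

```python
from typing import Sequence, Tuple


def wilcoxon_signed_rank_exact(x: Sequence[float],
                               y: Sequence[float]) -> Tuple[float, float]:
    """Return (W_plus, two-sided exact p-value) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Mid-ranks of |d|, stored doubled so that tied ranks stay integers.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks2 = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = i + j + 2  # 2 * mean of ranks (i+1)..(j+1)
        i = j + 1

    w_plus2 = sum(r for r, d in zip(ranks2, diffs) if d > 0)
    total2 = sum(ranks2)

    # counts[s] = number of sign assignments whose doubled positive rank sum is s
    counts = [0] * (total2 + 1)
    counts[0] = 1
    for r in ranks2:
        for s in range(total2, r - 1, -1):
            counts[s] += counts[s - r]

    # Two-sided: outcomes at least as far from the null mean (total2 / 2).
    deviation = abs(2 * w_plus2 - total2)
    extreme = sum(c for s, c in enumerate(counts)
                  if abs(2 * s - total2) >= deviation)
    return w_plus2 / 2, extreme / 2 ** n
```

For 16 pairs in which every difference has the same sign and no two differences tie, the exact two-sided p-value is 2 / 2^16, about 0.00003, which is consistent with a reported p < 0.001 for a 16-sample comparison.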
We demonstrate that wgaDNA made from 3.2 mm disks of DBSS that have been stored at -24°C for more than 20 years is well suited for reliable genotyping of 610,000 SNPs, with call-rates comparable to those obtained using gDNA. The accuracy of genotype calls using wgaDNA from stored DBSS has been of some concern. The issue has been addressed several times, using both low and medium throughput genotyping platforms, and overall with good success [11, 12, 14–18]. In this study we took the usage of DBSS one step further by conducting GWS. Moreover, the accuracy of genotype calls from wgaDNA was assessed by comparing the results with those from high-quality reference gDNA from the same individuals. Initially, we tested two commercial DNA-extraction procedures, three WGA procedures, the effect of the number of 3.2 mm disks used, and the effect of protein extraction prior to the gDNA extraction. The efficiency and reliability of the GWS were highly dependent on the DNA-extraction and WGA methods employed. Interestingly, call- and conflict-rates were inversely related, indicating that a genome scan of wgaDNA is highly reliable when the call-rates are close to 100%. However, because only a few samples were available to calculate the correlation coefficient, we cannot clearly define a cutoff for the call-rate that would ensure reliable genotyping. In general, the OmniPlex method performed more consistently than the MDA method, producing nearly the same call- and conflict-rates independently of the other variables tested. Of the two OmniPlex-based kits, the GPlex4 kit performed best, showing high call-rates and low error-rates. The MDA method performed excellently with the ENA extraction kit and poorly with the QIA extraction kit. In general, it appeared unimportant whether one or three DBSS disks were used for extraction.
This was surprising since the amount of input gDNA for the WGA reactions is supposed to be critical, and in our set-up it was often below the lower limit of 10 ng that is recommended by the manufacturer. Moreover, the preceding protein extraction of the disks did not impair the genotyping of the produced wgaDNA, which is in accordance with similar observations from our laboratory.
Because the investigation focuses on different combinations of wgaDNA preparation, it suffers from the weakness that the number of samples in each group is limited. In addition, only samples from the DNSB were used.
The combinations of ENA DNA-extraction with either the REPLI-g or the GPlex4 WGA kit were selected to see whether the procedures were robust enough for GWA studies employing numerous samples. Both set-ups produced wgaDNA from 16 DBSS stored for 15 to 25 years that performed well, with consistently high call-rates. Corresponding reference gDNA samples were not available. Notably, when genotypes of wgaDNA preparations were called with the BeadStudio software, the clusters for some loci were nice and tight but did not fit well into the standard cluster positions. This is because the BeadStudio software calls the genotypes of a given locus by comparing the observed values with the expected values defined by the Human610-Quadv1B cluster file, which is based on gDNA samples [22, 23]. In such cases, data fit and call-rates can be improved by adjusting the cluster positions to match the observed data. Cluster files tailored for the OmniPlex and MDA methods were hence created from the samples available, and the call-rates were significantly improved for both wgaDNA preparations. They were in fact comparable to call-rates obtained using high-quality gDNA, indicating that the approach is robust.
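The effect of tailored cluster positions can be illustrated with a deliberately simplified toy model in Python. This is not the BeadStudio calling algorithm, which is not described in the text. The sketch assumes that each genotype at a locus has one expected position on a single 0 to 1 axis, that a sample is called only if it lies within a fixed tolerance of the nearest expected position, and that a tailored file is obtained by moving each position to the mean of the observations nearest to it. All names, positions and the tolerance are hypothetical.

```python
from typing import Dict, List, Optional


def call_genotype(value: float, centres: Dict[str, float],
                  tol: float) -> Optional[str]:
    """Return the genotype with the nearest centre, or None if farther than tol."""
    genotype = min(centres, key=lambda g: abs(value - centres[g]))
    return genotype if abs(value - centres[genotype]) <= tol else None


def locus_call_rate(values: List[float], centres: Dict[str, float],
                    tol: float) -> float:
    """Fraction of samples at one locus that receive a call."""
    calls = [call_genotype(v, centres, tol) for v in values]
    return sum(c is not None for c in calls) / len(calls)


def tailor_centres(values: List[float],
                   centres: Dict[str, float]) -> Dict[str, float]:
    """Move each centre to the mean of the observations nearest to it.

    A centre with no nearby observations keeps its original position.
    """
    groups: Dict[str, List[float]] = {g: [] for g in centres}
    for v in values:
        nearest = min(centres, key=lambda g: abs(v - centres[g]))
        groups[nearest].append(v)
    return {g: (sum(vs) / len(vs) if vs else centres[g])
            for g, vs in groups.items()}


# Hypothetical locus: tight wgaDNA clusters that are shifted away from the
# positions expected for gDNA.
standard = {"AA": 0.05, "AB": 0.50, "BB": 0.95}
observed = [0.14, 0.15, 0.16, 0.62, 0.63, 0.64, 0.95, 0.96]
tailored = tailor_centres(observed, standard)
```

In this invented example only the two samples near the standard BB position are called with the standard centres, whereas all eight samples are called once the centres have been moved to the observed clusters.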
Eighteen WGA reactions, each producing ~5 μg of wgaDNA, can be made per ENA DNA extraction. As the Illumina Infinium HD Human610-Quad BeadChip uses 240 ng of wgaDNA, one WGA reaction is enough to run 20 chips. Thus one to three 3.2 mm disks from a DBSS are sufficient for repeated GWS as well as fine-mapping genotyping, if required. We have briefly tested the performance of the two wgaDNA preparations on the Affymetrix platform and found that wgaDNA produced by the OmniPlex method was unsuitable, whereas wgaDNA produced by the MDA method gave results comparable to those obtained on the Illumina platform. In addition to being used for GWS, DBSS can also be used for multiplex protein measurements, quantitative RNA microarrays detecting up to 3000 genes, and quantitative DNA methylation analysis.
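The yield arithmetic in this paragraph can be written out as a short Python calculation. The three input figures come from the text (about 5 μg per WGA reaction, 240 ng per BeadChip, 18 reactions per extraction); the function name is hypothetical, and the result is an upper bound that ignores pipetting losses and material used for quantification.

```python
def chips_supported(yield_ng_per_reaction: int = 5_000,
                    input_ng_per_chip: int = 240,
                    reactions_per_extraction: int = 18) -> tuple:
    """Return (chips per WGA reaction, chips per DNA extraction)."""
    # Only whole chips count, so use integer division: 5000 // 240 = 20.
    per_reaction = yield_ng_per_reaction // input_ng_per_chip
    per_extraction = per_reaction * reactions_per_extraction
    return per_reaction, per_extraction


per_reaction, per_extraction = chips_supported()
print(per_reaction, per_extraction)  # 20 360
```

Under these figures a single extraction could in principle supply material for up to 360 chips, which supports the statement that one to three disks suffice for repeated scans.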
The results demonstrate that residual DBSS from neonatal screening that have been stored for several years in biobanks can be used for GWS and hence for large genome-wide association studies. Using DBSS instead of collecting new samples may, in a cost-efficient way, reveal important correlations between genotypes, environment and human diseases. Both the OmniPlex and the MDA method performed excellently in combination with the ENA extraction, and we recommend testing which of the two WGA procedures is most suitable for a given task.
Mei JV, Alexander JR, Adam BW, Hannon WH: Use of filter paper for the collection and analysis of human whole blood specimens. J Nutr. 2001, 131 (5): 1631S-1636S.
Aoki K: Newborn screening in Japan. Southeast Asian J Trop Med Public Health. 2003, 34 (Suppl 3): 80-
de Carvalho TM, dos Santos HP, dos Santos IC, Vargas PR, Pedrosa J: Newborn screening: a national public health programme in Brazil. J Inherit Metab Dis. 2007, 30 (4): 615-10.1007/s10545-007-0650-7.
Olney RS, Moore CA, Ojodu JA, Lindegren ML, Hannon WH: Storage and use of residual dried blood spots from state newborn screening programs. J Pediatr. 2006, 148 (5): 618-622. 10.1016/j.jpeds.2005.12.053.
Therrell BL, Adams J: Newborn screening in North America. J Inherit Metab Dis. 2007, 30 (4): 447-465. 10.1007/s10545-007-0690-z.
Therrell BL, Hannon WH, Pass KA, Lorey F, Brokopp C, Eckman J, Glass M, Heidenreich R, Kinney S, Kling S, et al: Guidelines for the retention, storage, and use of residual dried blood spot samples after newborn screening analysis: statement of the Council of Regional Networks for Genetic Services. Biochem Mol Med. 1996, 57 (2): 116-124. 10.1006/bmme.1996.0017.
Webster D: Newborn screening in Australia and New Zealand. Southeast Asian J Trop Med Public Health. 2003, 34 (Suppl 3): 69-70.
Wilcken B, Wiley V: Newborn screening. Pathology. 2008, 40 (2): 104-115. 10.1080/00313020701813743.
Norgaard-Pedersen B, Hougaard DM: Storage policies and use of the Danish Newborn Screening Biobank. J Inherit Metab Dis. 2007, 30 (4): 530-536. 10.1007/s10545-007-0631-x.
Frank L: Epidemiology. When an entire country is a cohort. Science. 2000, 287 (5462): 2398-2399. 10.1126/science.287.5462.2398.
Hannelius U, Lindgren CM, Melen E, Malmberg A, von Dobeln U, Kere J: Phenylketonuria screening registry as a resource for population genetic studies. J Med Genet. 2005, 42 (10): e60-10.1136/jmg.2005.032987.
Catsburg A, Zwet van der WC, Morre SA, Ouburg S, Vandenbroucke-Grauls CM, Savelkoul PH: Analysis of multiple single nucleotide polymorphisms (SNP) on DNA traces from plasma and dried blood samples. J Immunol Methods. 2007, 321 (1–2): 135-141. 10.1016/j.jim.2007.01.015.
Lovmar L, Fredriksson M, Liljedahl U, Sigurdsson S, Syvanen AC: Quantitative evaluation by minisequencing and microarrays reveals accurate multiplexed SNP genotyping of whole genome amplified DNA. Nucleic Acids Res. 2003, 31 (21): e129-10.1093/nar/gng129.
Park JW, Beaty TH, Boyce P, Scott AF, McIntosh I: Comparing whole-genome amplification methods and sources of biological samples for single-nucleotide polymorphism genotyping. Clin Chem. 2005, 51 (8): 1520-1523. 10.1373/clinchem.2004.047076.
Sjoholm MI, Dillner J, Carlson J: Assessing quality and functionality of DNA from fresh and archival dried blood spots and recommendations for quality control guidelines. Clin Chem. 2007, 53 (8): 1401-1407. 10.1373/clinchem.2007.087510.
Sorensen KM, Jespersgaard C, Vuust J, Hougaard D, Norgaard-Pedersen B, Andersen PS: Whole genome amplification on DNA from filter paper blood spot samples: an evaluation of selected systems. Genet Test. 2007, 11 (1): 65-71. 10.1089/gte.2006.0503.
Hollegaard MV, Sorensen KM, Petersen HK, Arnardottir MB, Norgaard-Pedersen B, Thorsen P, Hougaard DM: Whole genome amplification and genetic analysis after extraction of proteins from dried blood spots. Clin Chem. 2007, 53 (6): 1161-1162. 10.1373/clinchem.2006.082313.
Hollegaard MV, Grove J, Thorsen P, Norgaard-Pedersen B, Hougaard DM: High-throughput genotyping on archived dried blood spot samples. Genet Test Mol Biomarkers. 2009, 13 (2): 173-179. 10.1089/gtmb.2008.0073.
Dean FB, Hosono S, Fang L, Wu X, Faruqi AF, Bray-Ward P, Sun Z, Zong Q, Du Y, Du J, et al: Comprehensive human genome amplification using multiple displacement amplification. Proc Natl Acad Sci USA. 2002, 99 (8): 5261-5266. 10.1073/pnas.082089499.
Barker DL, Hansen MS, Faruqi AF, Giannola D, Irsula OR, Lasken RS, Latterich M, Makarov V, Oliphant A, Pinter JH, et al: Two methods of whole-genome amplification enable accurate genotyping across a 2320-SNP linkage panel. Genome Res. 2004, 14 (5): 901-907. 10.1101/gr.1949704.
Skogstrand K, Thorsen P, Norgaard-Pedersen B, Schendel DE, Sorensen LC, Hougaard DM: Simultaneous measurement of 25 inflammatory markers and neurotrophins in neonatal dried blood spots by immunoassay with xMAP technology. Clin Chem. 2005, 51 (10): 1854-1866. 10.1373/clinchem.2005.052241.
Infinium Genotyping Data Analysis Illumina TechNote. [http://www.illumina.com/downloads/GTDataAnalysis_TechNote.pdf]
Infinium HD DNA Analysis BeadChips Illumina TechNote. [http://www.illumina.com/downloads/InfiniumHD_DataSheet.pdf]
Haak PT, Busik JV, Kort EJ, Tikhonenko M, Paneth N, Resau JH: Archived Unfrozen Neonatal Blood Spots Are Amenable to Quantitative Gene Expression Analysis. Neonatology. 2008, 95 (3): 210-216. 10.1159/000155652.
Wong N, Morley R, Saffery R, Craig J: Archived Guthrie blood spots as a novel source for quantitative DNA methylation analysis. Biotechniques. 2008, 45 (4): 423-424. 10.2144/000112945.
The study was supported by grants from the Danish Strategic Research Council (2101-07-0059), the Stanley Medical Research Institute, the Danish Medical Research Council (271-06-0368), and the Novo Nordisk Foundation. None of the sponsors participated in any part of this manuscript or study.
MVH prepared the DNA samples, participated in the design of the study and did the major part of the data analysis and the drafting of the manuscript. JG conducted the Illumina GWS and participated in the data analysis. AB, MN, BNP, TFO, PMB, CW, OM, MD and PT initiated the study and participated in its design and the interpretation of results. DMH initiated the study and participated in its design, the interpretation of results and the drafting of the manuscript. All authors were involved in critically revising the manuscript and approved the final version.
Electronic supplementary material
Additional file 1: Performance of different wgaDNA preparations. 1 Year the DBSS was stored. 2 Kit used for DNA-extraction, QIA-QIAamp DNA Blood Micro Kit; ENA-Extract-N-Amp Blood PCR Kit. 3 Extraction of proteins from DBSS prior to extraction of DNA. 4 Number of 3.2 mm disks used. 5 DNA (ng) utilized per WGA reaction. 6 GWS genotypes call-rate (percent). 7 Rate of conflicts (percent) between genotype results on wgaDNA and reference gDNA. 8 wgaDNA (μg) produced per reaction. (PDF 63 KB)
Cite this article
Hollegaard, M.V., Grauholm, J., Børglum, A. et al. Genome-wide scans using archived neonatal dried blood spot samples. BMC Genomics 10, 297 (2009). https://0-doi-org.brum.beds.ac.uk/10.1186/1471-2164-10-297
- Single Nucleotide Polymorphism
- Newborn Screening Program
- Cluster File
- gDNA Sample
- Input gDNA