
Table 1 Summary statistics for KhoeSan exomes

From: Exome capture from saliva produces high quality genomic and metagenomic data

  

| Pilot | Sample | Total reads (a) | Unmapped reads | % unmapped reads | % PCR duplicates | % mapped on target | Median target coverage (b) | % of variants covered (c) | Autosomal SNVs | Autosomal singletons | Non-ref. concordance (d) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pilot 1 | SA006 | 69,272,282 | 9,122,731 | 13.2% | 54.2% | 63.5% | 12 | 94.9% | 25,225 | 657 | 0.9897 |
| | SA008 | 113,888,276 | 2,143,408 | 1.9% | 19.8% | 78.2% | 73 | 99.5% | 26,408 | 955 | 0.9947 |
| | SA011 | 78,006,472 | 1,664,959 | 2.1% | 33.7% | 77.4% | 40 | 99.0% | 26,365 | 67 | NA |
| | SA012 | 67,209,032 | 1,353,187 | 2.0% | 20.5% | 75.7% | 42 | 99.3% | 26,722 | 86 | NA |
| | SA035 | 85,142,498 | 5,812,851 | 6.8% | 78.0% | 79.4% | 10 | 92.1% | 24,692 | 1,726 | 0.9884 |
| | SA051 | 76,076,464 | 3,102,819 | 4.1% | 27.8% | 76.5% | 37 | 98.8% | 27,674 | 1,239 | NA |
| | SA052 | 60,375,472 | 1,247,951 | 2.1% | 12.9% | 78.2% | 41 | 98.8% | 27,779 | 755 | 0.9968 |
| | SA054 | 62,358,148 | 1,959,032 | 3.1% | 27.9% | 73.9% | 31 | 99.3% | 28,024 | 817 | 0.9956 |
| Pilot 1 mean | | 76,541,081 | 3,300,867 | 4.4% | 34.4% | 75.4% | 35.75 | 97.7% | 26,611 | 788 (e) | 0.9930 |
| Pilot 2 | SA1000 | 77,069,730 | 8,387,491 | 10.9% | 9.5% | 57.3% | 44 | 98.4% | 27,921 | 2,483 | 0.9915 |
| | SA1001 | 85,479,934 | 3,551,500 | 4.2% | 11.4% | 74.2% | 67 | 98.7% | 27,694 | 2,318 | 0.9939 |
| | SA1002 | 92,542,846 | 4,674,919 | 5.1% | 15.5% | 70.1% | 65 | 98.8% | 27,886 | 3,286 | 0.9941 |
| | SA1006 | 83,545,692 | 4,002,665 | 4.8% | 18.1% | 74.5% | 59 | 98.4% | 27,446 | 2,442 | 0.9927 |
| | SA1010 | 87,939,484 | 4,445,502 | 5.1% | 14.5% | 71.0% | 62 | 98.6% | 27,295 | 1,782 | 0.9935 |
| | SA1011 | 82,377,158 | 7,810,714 | 9.5% | 11.6% | 49.2% | 40 | 98.5% | 27,484 | 2,717 | 0.9887 |
| | SA1025 | 81,405,650 | 2,498,412 | 3.1% | 10.0% | 87.8% | 63 | 99.3% | 28,696 | 2,676 | 0.9934 |
| Pilot 2 mean | | 84,337,213 | 5,053,029 | 6.1% | 12.9% | 69.2% | 57.14 | 98.7% | 27,775 | 2,529 | 0.9925 |


  1. (a) Total number of DNA fragments, including mapped, unmapped and duplicate reads.
  2. (b) Limited to non-duplicate reads on autosomes, as calculated by GATK UnifiedGenotyper.
  3. (c) Limited to XX autosomal SNPs identified at the 99% VQSR threshold.
  4. (d) Concordance at heterozygous and homozygous non-reference positions, compared with Illumina OmniExpress or 550K.v2 SNP arrays.
  5. (e) The average number of singletons is lower because Pilot 1 includes closely related individuals. See Additional file 1: Table S1 for individual data.
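
The derived columns in Table 1 can be checked directly from the raw counts. The following is a minimal sketch, not the authors' pipeline: it assumes that "% unmapped reads" is unmapped reads divided by total reads, and that the "mean" rows are simple arithmetic means over the samples in each pilot. The names `PILOT1`, `pct_unmapped` and `summarize` are hypothetical and introduced only for illustration; the numbers are transcribed from the Pilot 1 rows.

```python
from statistics import mean

# Values transcribed from the Pilot 1 rows of Table 1.
# sample -> (total reads, unmapped reads, median target coverage)
PILOT1 = {
    "SA006": (69_272_282, 9_122_731, 12),
    "SA008": (113_888_276, 2_143_408, 73),
    "SA011": (78_006_472, 1_664_959, 40),
    "SA012": (67_209_032, 1_353_187, 42),
    "SA035": (85_142_498, 5_812_851, 10),
    "SA051": (76_076_464, 3_102_819, 37),
    "SA052": (60_375_472, 1_247_951, 41),
    "SA054": (62_358_148, 1_959_032, 31),
}


def pct_unmapped(total_reads, unmapped_reads):
    """Percentage of reads that failed to map (assumed: unmapped / total)."""
    return 100 * unmapped_reads / total_reads


def summarize(rows):
    """Per-pilot means, assuming plain arithmetic means across samples."""
    pcts = [pct_unmapped(total, unmapped) for total, unmapped, _ in rows.values()]
    coverages = [coverage for _, _, coverage in rows.values()]
    return {
        # Mean of the per-sample percentages, not the ratio of the mean counts.
        "mean_pct_unmapped": round(mean(pcts), 1),
        "mean_median_coverage": mean(coverages),
    }


if __name__ == "__main__":
    total, unmapped, _ = PILOT1["SA006"]
    print(round(pct_unmapped(total, unmapped), 1))  # 13.2, as in the SA006 row
    print(summarize(PILOT1))
```

Under these assumptions the sketch reproduces the reported Pilot 1 values of 4.4% unmapped reads and a mean median target coverage of 35.75. The percentage mean matches the table only when it is taken over the per-sample percentages; dividing the mean unmapped count by the mean total count gives a slightly lower figure.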