
Table 2 Overview of SNP discovery and genotype calling using three different callers

From: Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms

 

| | GATK_v.2.5-0 | | | CLC_v.5.0.1 | | | SAMtools_v.0.1.19 | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | Pop_SK | Pop_WA | Overall | Pop_SK | Pop_WA | Overall | Pop_SK | Pop_WA | Overall |
| No. of SNPs | 34257 | 40248 | 57396 | 34788 | 55585 | 75364 | 14494 | 14903 | 24103 |
| No. of private SNPs | 17148 | 23139 | 40287 | 19779 | 40576 | 60355 | 9200 | 9609 | 18809 |
| % singletons | 7.68 | 10.83 | 12.18 | 11.53 | 27.47 | 25.59 | 14.63 | 21.66 | 22.19 |
| Median site heterozygosity^a | 0.267 | 0.250 | / | 0.236 | 0.200 | / | 0.266 | 0.231 | / |
| Median coverage per individual | 93× | 70× | 82× | 66× | 29× | 48× | 66× | 19× | 27× |

 

| | GATK-CLC intersect | | | SAMtools-GATK intersect | | | SAMtools-CLC intersect | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | Pop_SK | Pop_WA | Overall | Pop_SK | Pop_WA | Overall | Pop_SK | Pop_WA | Overall |
| No. of SNPs | 21475 | 24936 | 37085 | 11325 | 12350 | 18933 | 9861 | 11310 | 17163 |
| No. of private SNPs | 12149 | 15610 | 27759 | 6583 | 7608 | 14191 | 5853 | 7302 | 13155 |
| % singletons | 9.91 | 17.98 | 12.82 | 9.99 | 20.53 | 19.37 | 10.54 | 23.08 | 21.60 |
| Median site heterozygosity^a | 0.250 | 0.222 | / | 0.286 | 0.231 | / | 0.266 | 0.222 | / |
| Median coverage per individual^b | 107× (65) | 81× (27) | 96× (37) | 55× (98) | 18× (98) | 20× (99) | 69× (76) | 19× (35) | 26× (46) |

We required all SNPs to have a genotype call passing all stringent quality filters in a minimum of eight individuals per population (population-based filtering). The intersect datasets contain exclusively concordant genotype calls between the designated SNP callers. Pop_SK: South Kinabatangan population; Pop_WA: West Alas population.

^a Based on the sites being polymorphic within the population.

^b Coverage values of intersect datasets are taken from the first-named SNP caller. The coverage values of the second-named caller are given in brackets.
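
The population-based filter and the concordant intersect described in the table notes can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the data layout (site ID mapped to per-individual genotype strings, with `None` for a missing or filtered call) and the function names `population_filter` and `concordant_intersect` are hypothetical. The sketch also assumes that a discordant call is set to missing rather than the whole site being dropped, which the notes do not specify.

```python
from typing import Dict, List, Optional

# Hypothetical layout: site id -> {individual -> genotype string, or None if
# the call is missing or failed the quality filters}.
Genotypes = Dict[str, Optional[str]]
CallSet = Dict[str, Genotypes]


def count_called(genotypes: Genotypes, members: List[str]) -> int:
    """Number of individuals in `members` with a non-missing genotype call."""
    return sum(1 for ind in members if genotypes.get(ind) is not None)


def population_filter(calls: CallSet,
                      populations: Dict[str, List[str]],
                      min_individuals: int = 8) -> CallSet:
    """Keep sites with at least `min_individuals` calls in every population."""
    return {
        site: gts
        for site, gts in calls.items()
        if all(count_called(gts, members) >= min_individuals
               for members in populations.values())
    }


def concordant_intersect(first: CallSet, second: CallSet) -> CallSet:
    """Keep sites present in both call sets; retain a genotype only where both
    callers agree (assumption: discordant calls become missing).
    Genotype strings are assumed to be normalised, e.g. 'A/G' and never 'G/A'."""
    result: CallSet = {}
    for site in first.keys() & second.keys():
        a, b = first[site], second[site]
        result[site] = {
            ind: (gt if gt is not None and gt == b.get(ind) else None)
            for ind, gt in a.items()
        }
    return result
```

Under these assumptions, an intersect dataset would be obtained by calling `concordant_intersect` on two filtered call sets and then applying `population_filter` again, since removing discordant calls can push a site below the eight-individual threshold.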