Table 4 Evaluation of assemblies of the simulated dataset (200 × 150 bp, 1% error) and datasets D2 and D3 with CloudBrush, Contrail, and Velvet

From: A de novo next generation genomic sequence assembler based on string graph and MapReduce cloud computing framework

| Dataset | Assembler | # of contigs¹ | N50 | Largest contig size | Precision | Recall | # of valid contigs¹ | # of invalid contigs¹ | Runtime (sec) |
|---|---|---|---|---|---|---|---|---|---|
| 200 × 150 bp, 1% error | CloudBrush | 229 | 112531 | 327245 | 99.20% | 96.00% | 152 | 77 | 10616 |
| | Contrail | 2540 | 7554 | 36335 | 90.12% | 95.92% | 957 | 1583 | 15823 |
| | Velvet | 209 | 78642 | 327101 | 99.63% | 98.10% | 168 | 41 | 1317 |
| D2 dataset | CloudBrush | 361 | 52961 | 156592 | 98.10% | 98.15% | 230 | 131 | 8622 |
| | Contrail | 300 | 43609 | 124089 | 98.47% | 96.98% | 250 | 50 | 7200 |
| | Velvet | 189 | 71764 | 174184 | 93.60% | 92.20% | 164 | 25 | 927 |
| D3 dataset | CloudBrush | 37064 | 8880 | 114585 | 93.65% | 92.41% | 24603 | 10387 | 48603 |
| | Contrail | 31870 | 8274 | 105244 | 96.99% | 90.89% | 25236 | 6116 | 44619 |
| | Velvet | 23565 | 10847 | 106863 | 95.55% | 89.01% | 20187 | 2838 | 13963 |

¹ Contigs with lengths > 200 bp are counted.
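
For readers unfamiliar with the N50 column, the following is a minimal sketch of how such a value is commonly computed. It assumes the widely used definition (the length of the contig at which the running total, taken over contigs sorted from longest to shortest, first reaches half of the total assembled bases) and applies the footnote's 200 bp cutoff. The table does not state the exact definition or tooling used for these figures, so the function name `n50`, the `min_length` parameter, and the example lengths are all hypothetical.

```python
def n50(contig_lengths, min_length=200):
    """Return the N50 of the contigs longer than min_length (0 if none remain).

    Assumption: the common N50 definition; the source table does not
    spell out how its N50 values were calculated.
    """
    # Keep only contigs above the cutoff, longest first.
    kept = sorted((n for n in contig_lengths if n > min_length), reverse=True)
    half_total = sum(kept) / 2

    # Walk down the sorted lengths until half of all bases are covered.
    running = 0
    for length in kept:
        running += length
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths, not taken from the assemblies in the table.
example = [150, 300, 400, 500, 800, 1000]
print(n50(example))  # → 800
```

In this example the 150 bp contig is dropped by the cutoff, the remaining contigs total 3000 bp, and the two longest (1000 + 800) are the first to cover at least 1500 bp, so the N50 is 800.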