Table 1 Effect of genome quality on annotation efficacy

From: Gene fragmentation in bacterial draft genomes: extent, consequences and mitigation

| Uncorrected | | Pfam^a | COG^a | KEGG^a |
|---|---|---|---|---|
| Including PP-C42 | % partial ORF fragments vs. % all ORFs annotated | r = -0.854, P < 0.001 | r = -0.551, P = 0.012 | r = 0.586, P = 0.007 |
| | Mean ORF length vs. % all ORFs annotated | r = 0.785, P < 0.001 | r = 0.526, P = 0.017 | r = -0.403, P = 0.078 |
| Excluding PP-C42 | % partial ORF fragments vs. % all ORFs annotated | r = -0.421, P = 0.073 | r = -0.019, P = 0.939 | r = 0.415, P = 0.078 |
| | Mean ORF length vs. % all ORFs annotated | r = 0.406, P = 0.084 | r = 0.157, P = 0.520 | r = -0.016, P = 0.949 |

| Corrected using matched partial ORF sets | | Pfam | COG | KEGG |
|---|---|---|---|---|
| Including PP-C42 | % partial ORF fragments vs. % all ORFs annotated | r = -0.861, P < 0.001 | r = -0.595, P = 0.007 | r = 0.469, P = 0.050 |
| | Mean ORF length vs. % all ORFs annotated | r = 0.787, P < 0.001 | r = 0.563, P = 0.012 | r = -0.284, P = 0.253 |
| Excluding PP-C42 | % partial ORF fragments vs. % all ORFs annotated | r = -0.338, P = 0.170 | r = 0.027, P = 0.915 | r = 0.350, P = 0.168 |
| | Mean ORF length vs. % all ORFs annotated | r = 0.378, P = 0.122 | r = 0.155, P = 0.538 | r = 0.052, P = 0.842 |

^a Pearson correlations between annotation frequency and genome quality, where genome quality is represented by the percentage of predicted ORFs that are partial sequences and by mean ORF length. Complete genomes are excluded in all cases; including them has essentially no effect.
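
The correlations in the table can be illustrated with a minimal sketch of how a Pearson r is computed, and of how a single outlying genome can dominate it, which is the contrast the "Including PP-C42" and "Excluding PP-C42" rows draw. The source does not say which software was used, so this is not the authors' procedure. The function names and every number below are hypothetical and are not taken from the paper. The P value is left out because it needs the t distribution's CDF; a library routine such as `scipy.stats.pearsonr` returns both r and P.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom.

    Undefined when |r| == 1 (division by zero).
    """
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical toy data (NOT from the paper): one value per draft genome.
# The last genome is a deliberately extreme outlier.
pct_partial_orfs = [2.0, 3.0, 4.0, 5.0, 6.0, 40.0]
pct_orfs_annotated = [80.0, 82.0, 79.0, 81.0, 80.0, 55.0]

r_all = pearson_r(pct_partial_orfs, pct_orfs_annotated)
r_trimmed = pearson_r(pct_partial_orfs[:-1], pct_orfs_annotated[:-1])

print(f"with outlier:    r = {r_all:.3f}")
print(f"without outlier: r = {r_trimmed:.3f}")
```

With these made-up values the correlation is strongly negative when the outlier is included and weak when it is removed, which is the qualitative pattern to look for when comparing the two halves of each table.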