
Table 3. Effect of sequence redundancy on algorithm cross-validation performance. HMMs were constructed using the 186 bootstrap sequences used to train HMM266, then tested for accuracy and coverage against non-overlapping positive and negative test sets. "Max. sequence similarity" is the maximum number of matching amino acid positions allowed between any two sequences in a given row, whether within or between the test and training sets. Jack-knife (leave-one-out) testing for each row was performed against the training set described in that row.

From: Predicting N-terminal myristoylation sites in plant proteins

| Model name | Max. sequence similarity | Number of training seqs. | Number of positive test seqs. | Number of negative test seqs. | Accuracy, (TP+TN)/TOTAL | Coverage, TP/(TP+FN) | Jack-knife detection |
|---|---|---|---|---|---|---|---|
| HMM186B | 24/25 residues (96%) | 186 | 80 | 185 | 96.6% | 96.3% | 98.4% |
| HMM162B | 20/25 residues (80%) | 162 | 53 | 128 | 96.1% | 92.5% | 96.9% |
| HMM151B | 15/25 residues (60%) | 151 | 42 | 102 | 98.6% | 95.2% | 96.7% |
| HMM127B | 10/25 residues (40%) | 127 | 25 | 94 | 97.5% | 96.0% | 96.1% |
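
The accuracy and coverage definitions in the column headers, and the position-match similarity limit described in the caption, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the greedy filtering order is an assumption (the source does not say how redundant sequences were removed), and the example counts are back-calculated from the HMM186B row's percentages and test-set sizes rather than reported in the source.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all test sequences classified correctly: (TP + TN) / TOTAL."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no test sequences")
    return (tp + tn) / total


def coverage(tp, fn):
    """Fraction of positive test sequences detected: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no positive test sequences")
    return tp / (tp + fn)


def position_matches(a, b):
    """Count identical residues at the same position in two equal-length windows."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


def filter_redundant(seqs, max_matches):
    """Greedy filter (an assumed procedure): keep a sequence only if it shares
    at most max_matches identical positions with every sequence kept so far."""
    kept = []
    for s in seqs:
        if all(position_matches(s, k) <= max_matches for k in kept):
            kept.append(s)
    return kept


# Illustrative counts consistent with the HMM186B row
# (80 positive and 185 negative test sequences); not reported in the source.
acc = accuracy(tp=77, tn=179, fp=6, fn=3)
cov = coverage(tp=77, fn=3)
```

With a 25-residue window, `filter_redundant(seqs, 20)` would correspond to the "20/25 residues (80%)" row under these assumptions.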