
Table 9 Performance of N-signal vs N-signal-free protein binary classification on automatically collected orthologs

From: Plus ça change – evolutionary sequence divergence predicts protein subcellular localization signals

| Yeast dataset | Mean accuracy | Mean AUC | Mean MCC |
| --- | --- | --- | --- |
| J48 | 71.47±5.00 | 0.67±0.07 | 0.36±0.12 |
| SVM | 75.35±3.49 | 0.71±0.04 | 0.44±0.08 |
| The majority class fraction | 65.23% | N/A | N/A |
| Human dataset | | | |
| J48 | 69.32±4.10 | 0.72±0.07 | 0.43±0.09 |
| SVM | 72.28±5.95 | 0.72±0.06 | 0.43±0.12 |
| The majority class fraction | 62.41% | N/A | N/A |
| Plant dataset | | | |
| J48 | 79.41±6.03 | 0.75±0.06 | 0.55±0.13 |
| SVM | 83.47±4.01 | 0.79±0.04 | 0.64±0.09 |
| The majority class fraction | 63.60% | N/A | N/A |

  1. Three classification performance measures, obtained using only divergence features, are shown for discriminating N-signal-containing from N-signal-free proteins on automatically collected orthologs. AUC denotes the area under the ROC curve, and MCC denotes the Matthews correlation coefficient. For each measure, the mean and standard deviation over the 5 cross-validation folds are shown.
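
As a rough illustration of how the three measures in the table and their "mean±standard deviation" summaries can be computed, the following is a minimal Python sketch. It is not the authors' code. The function names and the per-fold confusion counts are hypothetical and are not taken from the paper. The source does not say whether the reported standard deviation is the sample or the population form; the sketch assumes the sample form.

```python
import math
from statistics import mean, stdev


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified proteins."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive scores higher
    than a randomly chosen negative; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(values):
    """Mean and sample standard deviation (n - 1) across folds (assumed form)."""
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical (tp, tn, fp, fn) counts for 5 folds; illustrative only.
    folds = [(30, 50, 10, 10), (28, 52, 8, 12), (32, 48, 12, 8),
             (29, 51, 9, 11), (31, 49, 11, 9)]
    acc_mean, acc_sd = summarize([100 * accuracy(*f) for f in folds])
    mcc_mean, mcc_sd = summarize([mcc(*f) for f in folds])
    print(f"accuracy: {acc_mean:.2f}±{acc_sd:.2f}")
    print(f"MCC: {mcc_mean:.2f}±{mcc_sd:.2f}")
```

The majority class fraction rows correspond to the accuracy of a trivial classifier that always predicts the larger class, which is why AUC and MCC are listed as N/A there; a classifier is informative only to the extent that its accuracy exceeds that baseline.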