Features | Data source | Description | Bins |
---|---|---|---|
Amino acid frequency | Protein sequences from IPI [27] | PNAC considers the relative proportion of leucine, isoleucine, lysine and serine residues | 5 bins for each distinct amino acid considered |
Targeting motifs | | The predicted presence of signal peptides, transmembrane domains (TMDs), mitochondrial targeting peptides and nucleolar localisation sequences (NoLSs) | 9 bins detailed in the Methods |
Gene co-expression | GDS596 from the Gene Expression Omnibus [42] | The average Pearson correlation of expression between the query protein and proteins in the nucleolar-cytoplasmic training group using expression profiles from 79 physiologically normal tissues [35] | 5 bins |
GO | EBI Gene Ontology (GO) annotations [36] for human | Biological process and molecular function Gene Ontology (GO) annotations for the query protein are compared to those of the training set proteins | 4 bins |
Subcellular localisation of interactors | HPRD [31], UniProt [30], IntAct [39] and PIPs [37, 38] subcellular localisation annotations and/or protein interactions | A nucleolar proximity score is calculated for all the interactors of the query protein | 5 bins |
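
The gene co-expression feature in the table can be illustrated with a minimal sketch: it averages the Pearson correlation between a query expression profile and each profile in a training group, then discretises the result. The function names `mean_pearson` and `to_bin` are hypothetical, and the equal-width bin boundaries over [-1, 1] are an assumption made for illustration only; the table states that 5 bins are used but does not give their boundaries.

```python
import numpy as np

def mean_pearson(query, training):
    """Average Pearson correlation between a query profile and each training profile.

    query:    1-D sequence of expression values (e.g. one value per tissue).
    training: 2-D array-like, one row per training-group protein.
    Profiles are assumed to be non-constant; a constant profile has zero
    variance and would cause a division by zero.
    """
    query = np.asarray(query, dtype=float)
    training = np.asarray(training, dtype=float)
    # Centre each profile on its own mean across tissues.
    q = query - query.mean()
    t = training - training.mean(axis=1, keepdims=True)
    # Pearson r is the dot product of centred profiles over the product of their norms.
    r = (t @ q) / (np.linalg.norm(t, axis=1) * np.linalg.norm(q))
    return float(r.mean())

def to_bin(score, n_bins=5, low=-1.0, high=1.0):
    """Map a score to a bin index in 0..n_bins-1.

    ASSUMPTION: equal-width bins over [low, high]; the real boundaries
    are not specified in the table.
    """
    inner_edges = np.linspace(low, high, n_bins + 1)[1:-1]
    return int(np.digitize(score, inner_edges))

# Toy example with 4 "tissues" instead of 79.
query = [1.0, 2.0, 3.0, 4.0]
training = [
    [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with the query
    [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with the query
]
score = mean_pearson(query, training)
feature_bin = to_bin(score)
```

In this toy case the two correlations cancel, so the average is close to zero and falls in the middle bin under the assumed boundaries.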