Figure 2 | BMC Genomics

Figure 2

From: A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related

Summary of the procedure for creating a set of predictor attributes based on GO terms. First, a list of gene IDs is used to download from UniProt the GO terms annotated for each gene. Next, the GO term definitions are used to keep only the biological process (BP) terms for each gene, and then to find the ancestors of those terms in the GO hierarchy. (The notation "anc(term1)" denotes the set of all ancestors of term1, "anc1(term1)" denotes the first ancestor of term1, etc.) After those ancestor GO terms are added to each gene's list of GO terms, the dataset is transformed into a fixed-length vector of binary attributes per gene, one attribute per GO term, where each value indicates whether or not the gene is annotated with the corresponding GO term.
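
The steps in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the UniProt download step is omitted, and the toy hierarchy, the placeholder term IDs ("GO:A", "GO:B", ...), the gene names, and the helper names `ancestors` and `build_binary_dataset` are all hypothetical, chosen only to show BP filtering, ancestor expansion, and binarization.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy GO hierarchy: term -> (namespace, direct parents).
# The IDs are placeholders, not real GO identifiers.
GO_TERMS: Dict[str, Tuple[str, List[str]]] = {
    "GO:A": ("biological_process", []),
    "GO:B": ("biological_process", ["GO:A"]),
    "GO:C": ("biological_process", ["GO:B"]),
    "GO:D": ("biological_process", ["GO:A"]),
    "GO:M": ("molecular_function", []),
}

# Hypothetical annotations, standing in for the terms downloaded per gene.
GENE_ANNOTATIONS: Dict[str, List[str]] = {
    "gene1": ["GO:C", "GO:M"],
    "gene2": ["GO:D"],
}


def ancestors(term: str, hierarchy: Dict[str, Tuple[str, List[str]]]) -> Set[str]:
    """Return anc(term): every ancestor reachable through parent links."""
    seen: Set[str] = set()
    stack = list(hierarchy[term][1])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(hierarchy[parent][1])
    return seen


def build_binary_dataset(
    annotations: Dict[str, List[str]],
    hierarchy: Dict[str, Tuple[str, List[str]]],
) -> Tuple[List[str], Dict[str, List[int]]]:
    """Keep BP terms, add their ancestors, and emit one 0/1 vector per gene."""
    expanded: Dict[str, Set[str]] = {}
    for gene, terms in annotations.items():
        # Keep only biological process (BP) terms.
        bp_terms = {t for t in terms if hierarchy[t][0] == "biological_process"}
        # Add all ancestors of each retained term.
        full = set(bp_terms)
        for t in bp_terms:
            full |= ancestors(t, hierarchy)
        expanded[gene] = full
    # Fixed-length attribute list: every GO term seen for any gene, sorted.
    columns = sorted(set().union(*expanded.values()))
    rows = {
        gene: [1 if c in terms else 0 for c in columns]
        for gene, terms in expanded.items()
    }
    return columns, rows


columns, rows = build_binary_dataset(GENE_ANNOTATIONS, GO_TERMS)
print(columns)
for gene, vector in rows.items():
    print(gene, vector)
```

In this toy example the molecular function term "GO:M" is dropped, gene1 gains the ancestors "GO:B" and "GO:A" of its annotated term "GO:C", and both genes end up with vectors of the same length over the shared attribute list.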
