Figure 1 | BMC Genomics

Figure 1

From: HGCS: an online tool for prioritizing disease-causing gene variants by biological distance


Schematic representation of the generation, data structure and workflow of the HGCS. (1) Extraction of all human direct protein-protein binding interactions and the corresponding confidence scores from STRING. (2) Inversion of confidence scores to give direct biological distance metrics and generation of a genome-wide human weighted network. (3) Generation, for each human gene, of a gene-specific connectome: the set of all other human genes ranked according to their biological proximity to the specific gene. (4) Generation of a MySQL table from all human gene-specific connectomes. (5) Extraction, from Ensembl BioMart, of all human protein IDs, gene IDs, and their corresponding conventional and full names. (6,7) Generation of a MySQL table of all alternative gene names for each human gene. (8,9) Establishment of the full set of query gene names by resolving missing genes through their alternative gene name aliases, followed by extraction of the target genes from the connectomes of the core genes. (10) Sorting of the target genes according to user-defined metrics, by relatedness to any of the core genes, or separated by core gene. The screen output can then be downloaded as a tab-separated text file.
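
Steps (2) and (3) can be illustrated with a minimal Python sketch. It is not the HGCS implementation: it assumes that the inversion is a simple reciprocal of the confidence score, and that biological proximity between two genes is the shortest-path distance through the weighted network. The function names `build_network` and `connectome`, the gene labels, and the toy scores are all hypothetical.

```python
import heapq


def build_network(interactions):
    """Build an undirected weighted network from (gene_a, gene_b, confidence) rows.

    Assumption: direct distance = 1 / confidence, with confidence > 0.
    """
    network = {}
    for gene_a, gene_b, confidence in interactions:
        distance = 1.0 / confidence
        network.setdefault(gene_a, {})[gene_b] = distance
        network.setdefault(gene_b, {})[gene_a] = distance
    return network


def connectome(network, core_gene):
    """Rank all reachable genes by shortest-path distance to core_gene (Dijkstra)."""
    best = {core_gene: 0.0}
    queue = [(0.0, core_gene)]
    while queue:
        dist, gene = heapq.heappop(queue)
        if dist > best.get(gene, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbour, weight in network.get(gene, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    # Closest genes first; the core gene itself is excluded.
    ranked = sorted((d, g) for g, d in best.items() if g != core_gene)
    return [(g, d) for d, g in ranked]


# Hypothetical toy interactions, not real STRING data.
toy_interactions = [
    ("GENE_A", "GENE_B", 0.5),
    ("GENE_B", "GENE_C", 0.25),
    ("GENE_A", "GENE_C", 0.125),
]
toy_network = build_network(toy_interactions)
print(connectome(toy_network, "GENE_A"))
```

In this toy network the indirect route from GENE_A to GENE_C through GENE_B (2 + 4 = 6) is shorter than the direct low-confidence link (distance 8), so a ranking based on shortest paths can differ from a ranking based on direct interaction scores alone.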
