3.10.11. Sample PhoRank Gene Ranking
This algorithm ranks genes based on their relevance to user-specified phenotypes as defined by the HPO and GO biomedical ontologies. There are two versions of the algorithm: PhoRank Clinical and PhoRank Research.
PhoRank Clinical is based on an algorithm published by Masino et al. that orders genes by the semantic similarity between the phenotypes associated with each gene and those associated with the patient. This method works well when applied to individuals presenting with disease phenotypes that have established gene associations.
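The idea of ordering genes by semantic similarity can be illustrated with a small sketch. This is not the published algorithm or the PhoRank implementation: the toy ontology, the gene names, and the Jaccard overlap of ancestor sets used as the similarity measure are all assumptions chosen to keep the example short.

```python
# Toy ontology (hypothetical terms): each term maps to its parent terms.
PARENTS = {
    "root": set(),
    "neuro": {"root"},
    "cardiac": {"root"},
    "seizure": {"neuro"},
    "ataxia": {"neuro"},
    "cardiomyopathy": {"cardiac"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(a, b):
    """Jaccard overlap of ancestor sets (a stand-in similarity measure)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def gene_similarity(gene_terms, patient_terms):
    """Average, over patient terms, of the best-matching gene term."""
    total = sum(
        max(term_similarity(p, g) for g in gene_terms) for p in patient_terms
    )
    return total / len(patient_terms)


def rank_genes(gene_to_terms, patient_terms):
    """Order genes from most to least similar to the patient phenotypes."""
    scores = {
        gene: gene_similarity(terms, patient_terms)
        for gene, terms in gene_to_terms.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical gene-to-phenotype annotations.
GENE_TERMS = {
    "GENE_A": {"seizure"},
    "GENE_B": {"ataxia"},
    "GENE_C": {"cardiomyopathy"},
}

ranking = rank_genes(GENE_TERMS, ["seizure"])
print(ranking)
```

In this toy case, a gene annotated with the patient's exact phenotype ranks first, a gene annotated with a sibling term ranks second, and a gene from an unrelated branch ranks last.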
PhoRank Research is modeled on the Phevor algorithm. Phevor assigns scores to ontology terms based on their proximity to the user-specified phenotypes. Nodes that are connected to a search term, either directly or through a shared gene relationship, are called seed nodes and are assigned an initial score of one. The algorithm propagates this score information through the ontologies, so that genes with high scores are more closely related to the specified phenotypes, while genes with low scores have little or no relation to the phenotypes.
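The seed-and-propagate idea described above can be sketched roughly as follows. The graph, the gene names, the `decay` factor applied at each edge, and the rule that a gene's score is the sum of its term scores are illustrative assumptions, not details taken from Phevor or PhoRank.

```python
from collections import deque

# Toy undirected ontology graph (hypothetical terms).
EDGES = [
    ("root", "neuro"),
    ("root", "cardiac"),
    ("neuro", "seizure"),
    ("neuro", "ataxia"),
    ("cardiac", "cardiomyopathy"),
]


def build_graph(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def propagate(graph, seeds, decay=0.5):
    """Give each seed a score of one and spread it outward.

    Each edge crossed multiplies the contribution by `decay` (an assumed
    constant), and contributions from different seeds are summed.
    """
    scores = {node: 0.0 for node in graph}
    for seed in seeds:
        distance = {seed: 0}
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            scores[node] += decay ** distance[node]
            for neighbor in graph[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
    return scores


def score_genes(term_scores, gene_to_terms):
    """Score a gene by summing the scores of its annotated terms."""
    return {
        gene: sum(term_scores[t] for t in terms)
        for gene, terms in gene_to_terms.items()
    }


graph = build_graph(EDGES)
term_scores = propagate(graph, seeds=["seizure"])
gene_scores = score_genes(
    term_scores,
    {
        "GENE_A": ["seizure"],
        "GENE_B": ["ataxia"],
        "GENE_C": ["cardiomyopathy"],
    },
)
print(gene_scores)
```

Genes annotated with terms close to the seed receive high scores, and genes whose terms sit in a distant branch receive scores near zero.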
PhoRank Research excels at finding gene associations in individuals with atypical disease presentations. However, it is much slower than PhoRank Clinical and produces poorer gene rankings when applied to phenotypes with well-established gene associations.
We have modified the Phevor algorithm by assigning initial scores to seed nodes based on their similarity to the initial search terms. We have also modified Phevor’s propagation mechanism so that the score propagated from one node to another is weighted by the similarity of the two nodes. These modifications increase the scores of more specific nodes that are highly related to the search terms, while decreasing the scores of more general nodes with many neighbors.
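A minimal sketch of similarity-weighted propagation is shown below. It assumes a toy ontology, uses Jaccard overlap of ancestor sets as a stand-in similarity measure, and keeps the best score reaching each node; none of these choices are confirmed details of the PhoRank implementation.

```python
import heapq

# Toy ontology (hypothetical terms): each term maps to its parent terms.
PARENTS = {
    "root": [],
    "neuro": ["root"],
    "cardiac": ["root"],
    "seizure": ["neuro"],
    "focal_seizure": ["seizure"],
    "ataxia": ["neuro"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def similarity(a, b):
    """Jaccard overlap of ancestor sets (an assumed similarity measure)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def neighbors(term):
    """Parents and children of a term."""
    children = [c for c, ps in PARENTS.items() if term in ps]
    return PARENTS[term] + children


def weighted_propagate(search_terms, seeds):
    """Propagate scores, weighting each step by node-to-node similarity."""
    # Seed scores come from similarity to the closest search term
    # rather than a flat score of one.
    best = {s: max(similarity(s, q) for q in search_terms) for s in seeds}
    heap = [(-score, node) for node, score in best.items()]
    heapq.heapify(heap)
    done = set()
    while heap:
        neg_score, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        for nxt in neighbors(node):
            # The score passed along an edge is scaled by the similarity
            # of the two nodes it connects.
            passed = -neg_score * similarity(node, nxt)
            if passed > best.get(nxt, 0.0):
                best[nxt] = passed
                heapq.heappush(heap, (-passed, nxt))
    return best


scores = weighted_propagate(["focal_seizure"], ["focal_seizure", "seizure"])
print(scores)
```

In this toy example the specific term `ataxia` ends up with a higher score than the general term `root`, even though both are the same number of edges from the nearest seed.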
PhoRank requires that variants first be annotated and classified using a gene annotation source.
When you run PhoRank, you will be prompted to select the Gene Names field to use for gene ranking.
After clicking OK, you will be prompted to enter a comma-delimited list of HPO phenotype terms for each sample. Optionally, the list of available phenotypes can be extended to include OMIM-provided syndromes and phenotypes. The OMIM content add-on is required for this feature (see OMIM for further details).
The algorithm produces the following fields:
Gene Rank: Percentile rank of the specific gene for each sample.
Gene Score: The score of the gene computed by the ontology propagation algorithm for each sample.
Path: A shortest path from the gene to one of the specified phenotypes (there may be many paths to the phenotypes), for each sample.
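To make the Gene Rank and Path fields concrete, the sketch below computes a percentile rank from a set of gene scores and finds one shortest path with a breadth-first search. The percentile convention (percent of genes scoring less than or equal to the gene), the graph, and all gene and term names are assumptions for illustration; the exact definitions used by PhoRank are not stated here.

```python
from collections import deque


def percentile_ranks(scores):
    """Percent of genes whose score is less than or equal to each gene's."""
    values = list(scores.values())
    n = len(values)
    return {
        gene: 100.0 * sum(v <= s for v in values) / n
        for gene, s in scores.items()
    }


def shortest_path(graph, start, targets):
    """Breadth-first search; return one shortest path to any target."""
    targets = set(targets)
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in targets:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # No target is reachable from the start node.


# Hypothetical gene-and-term graph as an adjacency map.
GRAPH = {
    "GENE_B": ["ataxia"],
    "ataxia": ["GENE_B", "neuro"],
    "neuro": ["ataxia", "seizure"],
    "seizure": ["neuro"],
}

ranks = percentile_ranks(
    {"GENE_A": 1.0, "GENE_B": 0.25, "GENE_C": 0.0625, "GENE_D": 0.0625}
)
path = shortest_path(GRAPH, "GENE_B", ["seizure"])
print(ranks)
print(path)
```

Because several equally short paths may exist, a search like this returns whichever one it encounters first.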