3.11.8. CNV PhoRank Gene Ranking
This algorithm ranks CNVs and genes based on their relevance to user-specified phenotypes as defined by the HPO and GO biomedical ontologies. There are two versions of the algorithm: PhoRank Clinical and PhoRank Research.
PhoRank Clinical is based on an algorithm published by Masino et al. that orders genes by the semantic similarity between the phenotypes associated with each gene and those associated with the patient. This method works well when applied to individuals presenting with disease phenotypes that have established gene associations.
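The idea of ordering genes by phenotype similarity can be illustrated with a minimal sketch. It is not the published method or the PhoRank implementation: the toy hierarchy, the term and gene names, and the Jaccard overlap of ancestor sets used as the similarity measure are all assumptions chosen for brevity.

```python
# Toy is-a hierarchy: child term -> parent terms.
# Hypothetical labels, not real HPO identifiers.
PARENTS = {
    "root": [],
    "neuro": ["root"],
    "seizure": ["neuro"],
    "focal_seizure": ["seizure"],
    "cardiac": ["root"],
    "arrhythmia": ["cardiac"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(a, b):
    """Jaccard overlap of ancestor sets (an assumed stand-in measure)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def gene_score(patient_terms, gene_terms):
    """Average, over patient terms, of the best match among the gene's terms."""
    best = [max(term_similarity(p, g) for g in gene_terms) for p in patient_terms]
    return sum(best) / len(best)


def rank_genes(patient_terms, gene_to_terms):
    """Order genes from most to least similar to the patient phenotypes."""
    scores = {g: gene_score(patient_terms, t) for g, t in gene_to_terms.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical gene-to-phenotype annotations.
GENE_TERMS = {
    "GENE_A": ["seizure"],
    "GENE_B": ["arrhythmia"],
    "GENE_C": ["neuro"],
}
ranking = rank_genes(["focal_seizure"], GENE_TERMS)
```

In this toy example the gene annotated with the term closest to the patient phenotype is ranked first, and the gene annotated only in an unrelated branch is ranked last.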
PhoRank Research is modeled on the Phevor algorithm. Phevor assigns scores to ontology terms based on their proximity to the user-specified phenotypes. Nodes that are connected to a search term, either directly or through a shared gene relationship, are called seed nodes and are assigned an initial score of one. The algorithm propagates this score information through the ontologies, so that genes with high scores are more closely related to the specified phenotypes, while genes with low scores have little or no relation to the phenotypes.
PhoRank Research excels at finding gene associations in individuals with atypical disease presentations, but it is much slower than PhoRank Clinical and produces weaker gene rankings when applied to phenotypes with well-established gene associations.
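The seed-and-propagate behavior described above can be sketched as follows. This is an illustration only: the graph, the gene annotations, and the `decay` factor applied per edge are hypothetical, and the sketch does not reproduce the actual Phevor propagation rule.

```python
from collections import deque

# Toy ontology graph: node -> neighboring nodes (hypothetical names).
EDGES = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
}

# Hypothetical gene annotations: gene -> ontology nodes.
GENE_NODES = {
    "GENE_X": ["A"],
    "GENE_Y": ["D"],
    "GENE_Z": ["C", "D"],
}


def propagate(seeds, edges, decay=0.5):
    """Give each seed a score of one and spread it outward.

    Every node accumulates decay ** distance from each seed, so nodes
    close to the seeds end up with higher scores. The decay value is an
    assumption made for this sketch.
    """
    scores = {node: 0.0 for node in edges}
    for seed in seeds:
        dist = {seed: 0}
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            scores[node] += decay ** dist[node]
            for neighbor in edges[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
    return scores


def score_genes(node_scores, gene_nodes):
    """Sum the scores of the nodes annotated to each gene."""
    return {g: sum(node_scores[n] for n in nodes) for g, nodes in gene_nodes.items()}


node_scores = propagate(["A"], EDGES)
gene_scores = score_genes(node_scores, GENE_NODES)
```

With node A as the only seed, the gene annotated directly to A receives the highest score, and the gene annotated only to the most distant node receives the lowest.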
We have modified the Phevor algorithm by assigning initial scores to seed nodes based on their similarity to the initial search terms. We have also modified Phevor’s propagation mechanism so that the score propagated from one node to another is weighted by the similarity of the two nodes. These modifications increase the scores of more specific nodes that are highly related to the search terms, while decreasing the scores of more general nodes with many neighbors.
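A rough sketch of these two modifications is shown below. The node names, the similarity values, and the choice to keep the best score reaching each node are assumptions made for illustration; the exact weighting used by PhoRank Research is not specified here.

```python
# Hypothetical pairwise similarities in [0, 1] between adjacent nodes.
SIMILARITY = {
    ("specific", "mid"): 0.8,
    ("mid", "general"): 0.3,
    ("general", "other"): 0.3,
}


def sim(a, b):
    """Symmetric lookup of the similarity between two adjacent nodes."""
    return SIMILARITY.get((a, b), SIMILARITY.get((b, a), 0.0))


def neighbors(node):
    """Nodes that share an edge with the given node."""
    return [b if a == node else a for a, b in SIMILARITY if node in (a, b)]


def weighted_propagate(seed_scores):
    """Spread scores outward, weighting each hop by node similarity.

    seed_scores maps each seed node to its initial score, which would be
    its similarity to the search terms rather than a fixed value of one.
    Each node keeps the highest score that reaches it.
    """
    scores = dict(seed_scores)
    changed = True
    while changed:
        changed = False
        for node, score in list(scores.items()):
            for neighbor in neighbors(node):
                candidate = score * sim(node, neighbor)
                if candidate > scores.get(neighbor, 0.0):
                    scores[neighbor] = candidate
                    changed = True
    return scores


# The seed starts at 0.9 (its assumed similarity to the search term).
scores = weighted_propagate({"specific": 0.9})
```

In this sketch, the score falls off slowly across the high-similarity edge and sharply across the low-similarity edges, so the general node and everything beyond it receive much lower scores than the specific nodes near the seed.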
Before running this algorithm, CNVs must first be called from coverage statistics. The called CNVs must then be annotated using a gene annotation source.
After running PhoRank, you will be prompted to select a Gene Names field to be used for gene ranking.
After clicking OK, you will be prompted to enter a comma-delimited list of HPO phenotype terms and a name for the phenotype. Optionally, the list of available phenotypes can be extended to include OMIM-provided syndromes and phenotypes. This feature requires the OMIM content add-on (see OMIM for further details).
Sum of Scores: Sum of percentile ranks for each gene.
Max Score: Percentile rank of the highest scoring gene.
Gene Name: Names of the top five ranking genes.
Gene Rank: Percentile rank of the top five ranking genes.
Gene Score: PhoRank score of the top five ranking genes.
Path: Paths for the top five ranking genes.
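How the per-CNV fields listed above relate to the per-gene scores can be sketched as follows. The percentile-rank definition (the fraction of genes scoring at or below a gene), the function names, and the gene names are assumptions. The Path field is omitted because its construction is not described here.

```python
def percentile_ranks(gene_scores):
    """Map each gene to the fraction of genes scoring at or below it.

    This is one common percentile-rank definition, assumed for this sketch.
    """
    values = list(gene_scores.values())
    total = len(values)
    return {
        gene: sum(v <= score for v in values) / total
        for gene, score in gene_scores.items()
    }


def summarize_cnv(cnv_genes, gene_scores, ranks, top_n=5):
    """Build the per-CNV summary fields from the genes a CNV overlaps."""
    present = [g for g in cnv_genes if g in ranks]
    top = sorted(present, key=lambda g: ranks[g], reverse=True)[:top_n]
    return {
        "Sum of Scores": sum(ranks[g] for g in present),
        "Max Score": max((ranks[g] for g in present), default=0.0),
        "Gene Name": top,
        "Gene Rank": [ranks[g] for g in top],
        "Gene Score": [gene_scores[g] for g in top],
    }


# Hypothetical PhoRank scores for four genes.
example_scores = {"G1": 0.2, "G2": 0.9, "G3": 0.5, "G4": 0.1}
example_ranks = percentile_ranks(example_scores)

# A CNV overlapping two ranked genes and one gene without a score.
summary = summarize_cnv(["G1", "G2", "G9"], example_scores, example_ranks)
```

Genes without a score, such as `G9` in this example, are simply skipped in the sketch. How the software treats such genes is not stated here.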