Study: Network Representation Learning Could Help Identify MS-Related Genes

May 17, 2020

Identifying the particular genes at play in multiple sclerosis (MS) could lead to a better understanding of the disease, and possibly better therapies for it. A new paper proposes network representation learning as the best method to identify those genes.

New research suggests deep learning algorithms and computational methods could help scientists better understand the mechanisms at play in multiple sclerosis (MS).

Scientists know a lot about MS, but one question that has yet to be answered is which specific genes are related to the disease. In a new study, published in the journal Frontiers in Genetics,¹ investigators suggest a type of computational network analysis might be the best way to pinpoint the disease-related genes of MS.

MS disrupts a patient’s myelin and axons, leading to inflammation of the brain and spinal cord. And while some evidence points to certain genes that may play a role in MS,² the unknowns still outweigh the knowns, according to corresponding author Haijie Liu, PhD, of Capital Medical University and Tianjin Medical University General Hospital, both in China.

Liu and colleagues say the discovery of MS’s disease-related genes could have major implications for how scientists understand the disease and how clinicians eventually treat patients who have it.

“Identifying such genes will effectively contribute to discovering the inner molecular mechanisms of MS as a disease and will help researchers learn more about MS,” they write. “Thus, it is essential and of importance to develop a novel algorithm to identify the disease-related genes of MS rapidly and effectively.”

The newer algorithmic approach would be better than more traditional approaches, Liu and colleagues say, because traditional methods have relied upon “guilt by association” guesswork.

“Specifically, genes associated with the same or similar diseases usually have a higher probability of sharing the same topological structure or similar neighbors as others in the gene interaction networks,” Liu and colleagues write. “Thus, based on this guilt-by-association hypothesis, the core of predicting disease-related genes is calculating the distance or similarity between candidate genes and disease-related genes effectively and correctly.”
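
To make the guilt-by-association idea concrete, here is a minimal sketch in Python. It is not the authors' method: the gene names and the network are made up, and the Jaccard overlap of interaction partners is assumed here as one simple similarity measure among many. Each candidate gene is scored by how closely its neighbors match those of a known disease gene.

```python
# Toy gene interaction network; every gene name here is a made-up placeholder.
NETWORK = {
    "gene_a": {"gene_b", "gene_c", "gene_d"},
    "gene_b": {"gene_a", "gene_c", "gene_d"},
    "gene_c": {"gene_a", "gene_b", "gene_d"},
    "gene_d": {"gene_a", "gene_b", "gene_c", "gene_e"},
    "gene_e": {"gene_d", "gene_f", "gene_g"},
    "gene_f": {"gene_e", "gene_g"},
    "gene_g": {"gene_e", "gene_f"},
}


def jaccard(a, b):
    """Fraction of interaction partners that two genes share (0 to 1)."""
    union = NETWORK[a] | NETWORK[b]
    return len(NETWORK[a] & NETWORK[b]) / len(union) if union else 0.0


def rank_candidates(known, candidates):
    """Score each candidate by its best similarity to any known disease gene."""
    scores = {c: max(jaccard(c, k) for k in known) for c in candidates}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Assumption for illustration: gene_a and gene_b are the known disease genes.
ranking = rank_candidates(
    known=["gene_a", "gene_b"],
    candidates=["gene_c", "gene_d", "gene_f"],
)
print(ranking)
```

In this toy network, gene_c shares the most neighbors with the known genes and ranks first, while gene_f, which sits in a different part of the network, ranks last.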

However, calculating those distances is not straightforward, which makes it difficult to identify disease-related genes accurately. Instead, the authors suggest network representation learning (NRL) methods would be the best fit for identifying MS genes. NRL, which has been used in a variety of disciplines, learns a compact numerical representation of each node in a network; the authors pair it with deep learning to predict MS-related genes.
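
As a rough illustration of what it means to turn a gene's position in a network into numbers, the sketch below describes each gene by where a short random walk starting at that gene is likely to end up. This is a hand-computed stand-in, not a learned representation and not one of the algorithms used in the study; the five-gene network and its names are invented for the example.

```python
# Toy interaction network as an edge list; gene names are placeholders.
EDGES = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5")]
GENES = sorted({gene for edge in EDGES for gene in edge})
INDEX = {gene: i for i, gene in enumerate(GENES)}


def transition_matrix():
    """Row i holds the one-step random-walk probabilities out of gene i."""
    n = len(GENES)
    adj = [[0.0] * n for _ in range(n)]
    for a, b in EDGES:
        adj[INDEX[a]][INDEX[b]] = 1.0
        adj[INDEX[b]][INDEX[a]] = 1.0
    # Every gene in EDGES has at least one neighbor, so no row sums to zero.
    return [[value / sum(row) for value in row] for row in adj]


def matmul(x, y):
    """Plain-Python matrix product of two square matrices."""
    n = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def embed(steps=2):
    """Describe each gene by where a walk of `steps` steps from it ends up."""
    p = transition_matrix()
    m = p
    for _ in range(steps - 1):
        m = matmul(m, p)
    return {gene: m[INDEX[gene]] for gene in GENES}


vectors = embed(steps=2)
for gene, vector in vectors.items():
    print(gene, [round(v, 3) for v in vector])
```

Genes that sit in similar parts of the network end up with similar vectors, which is the property that makes such representations useful as input to a classifier.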

In the study, the team used 3 classical NRL algorithms to learn the topological information in the protein-protein interaction network, and then used a stacked autoencoder to extract low-dimensional features from what those algorithms learned. The final step was to use a support vector machine (SVM) to predict disease-related genes of MS.
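
The final classification step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-dimensional feature values are invented stand-ins for the autoencoder's output, and a linear SVM trained by subgradient descent on the hinge loss is assumed because it fits in a few lines. The article does not say which kernel or solver the study used, and in practice a library implementation such as scikit-learn's SVC would be the usual choice.

```python
# Hypothetical 2-D features standing in for the autoencoder's output.
# Labels: +1 = known disease gene, -1 = gene assumed to be unrelated.
TRAIN = [
    ((2.0, 2.0), 1), ((3.0, 2.0), 1), ((2.0, 3.0), 1),
    ((-2.0, -2.0), -1), ((-3.0, -2.0), -1), ((-2.0, -3.0), -1),
]


def train_linear_svm(data, lr=0.1, reg=0.01, epochs=100):
    """Fit weights w and bias b by subgradient descent on the hinge loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Inside the margin or misclassified: move the boundary.
                w = [wi + lr * (y * xi - reg * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Classified with room to spare: only shrink the weights.
                w = [wi - lr * reg * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 (predicted disease-related) or -1 (predicted unrelated)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


w, b = train_linear_svm(TRAIN)

# Two made-up candidate genes, one near each group of training examples.
candidate_labels = [predict(w, b, x) for x in [(2.5, 2.5), (-2.5, -2.5)]]
print(candidate_labels)
```

The first candidate lands on the disease-gene side of the learned boundary and the second on the other side.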

The authors compared their proposed NRL-based method to existing algorithms for identifying disease-related genes.

“The experimental results show the superior performance of the NRL-based algorithms,” the authors conclude. “Moreover, the proposed NRL-based algorithms are scalable and robust enough to be applied to many other tasks of disease-related gene prediction.”

The new method is only experimental at this phase, and would need to be validated through further research. Still, the authors say that their initial findings suggest the NRL-based system could significantly outperform existing methods. Given the potential for the identification of disease-related genes to lead to breakthroughs in our understanding of MS, the authors argue their approach should be prioritized.

References:

  1. Liu H, Guan J, Li H, Bao Z, Wang Q, Luo X and Xue H (2020) Predicting the Disease Genes of Multiple Sclerosis Based on Network Representation Learning. Front. Genet. 11:328. doi:10.3389/fgene.2020.00328
  2. Pinero J, Bravo A, Queralt-Rosinach N, Gutierrez-Sacristan A, Deu-Pons J, Centeno E, Garcia-Garcia J, et al. (2017) DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res. 45:D833-D839. doi:10.1093/nar/gkw943