Approved research

Polygenic Risk Score and Clinical Information Integration for Disease Prediction

Weill Cornell Medical College

Lay summary

Comparing two people's genomes at the same location in the DNA may reveal that the first person has an A, whereas the second person has a G. If the first person is taller than the second, we might infer that having the A base makes you taller. By comparing thousands of people, we can estimate more precisely how much the A contributes to a difference in height. By looking at every place in the genome where there is variation, we can add up these contributions to height into one total value, called a polygenic risk score. The higher the score, the taller you are likely to be.
However, many other factors contribute to differences in height, such as diet and susceptibility to disease. To make the best prediction, we plan to incorporate all of these related features into one model. Using machine learning, we can find non-obvious interactions between these features and improve the prediction. Even with this comprehensive approach, there will likely be some people who have a high score yet are still short. To investigate why, we plan to carry out standard analyses comparing each feature to the per-person errors, along with quantile regression, which isolates trends within one risk level of the model. Lastly, to make these scores more useful, we will use Mendelian randomization, which effectively mimics a randomized controlled trial and helps determine causal links between the score and clinical factors.
Together, this work should help doctors understand which patients are at greatest risk of a disease and should therefore be prescribed a different medication or be checked on more regularly. Similar models already exist and are just beginning to be used by doctors, yet none of them include the multitude of features proposed here.
Each of these added features, such as occupation and medication history, is already taken into consideration on its own, so it seems reasonable that combining them would produce more powerful predictions and therefore better patient care.
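
The "adding up" step described in the summary can be illustrated with a minimal Python sketch. It is not the project's actual method: the variant names, effect sizes, and genotypes below are hypothetical values invented for illustration, and it assumes the common formulation in which each variant's estimated effect is multiplied by the number of copies (0, 1, or 2) of the effect allele a person carries.

```python
# Hypothetical per-variant effect sizes (e.g. cm of height per copy of the
# effect allele). These numbers are made up for illustration only.
effect_sizes = {"variant_1": 0.5, "variant_2": -0.25, "variant_3": 1.0}


def polygenic_score(allele_counts, effect_sizes):
    """Sum (effect size x copies of the effect allele) over all variants.

    allele_counts maps a variant name to 0, 1, or 2 copies; variants
    missing from allele_counts are treated as 0 copies.
    """
    return sum(
        effect_sizes[variant] * allele_counts.get(variant, 0)
        for variant in effect_sizes
    )


# Two hypothetical people with different genotypes at the same locations.
person_a = {"variant_1": 2, "variant_2": 0, "variant_3": 1}
person_b = {"variant_1": 0, "variant_2": 2, "variant_3": 0}

score_a = polygenic_score(person_a, effect_sizes)
score_b = polygenic_score(person_b, effect_sizes)

print(score_a, score_b)
```

In this toy example person A receives the higher score, so the sketch would predict that person A is likely to be taller. A real score would sum over many thousands of variants, with effect sizes estimated from large cohorts.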