Approved research

Statistical Methods for Large Scale Genetic Studies

Principal Investigator: Professor Chiara Sabatti
Approved Research ID: 27837
Approval date: August 14th 2017

Lay summary

Our goal is to develop new data analysis methods that are well suited to discovering the many genetic signals that influence traits of medical relevance. We aim to increase the sensitivity of current tools by accounting for the known complexity of these traits: it is likely that many different genetic variants contribute to them, possibly interacting with each other, and our models capitalize on this. At the same time, we want to minimize the number of false positive results, which are unfortunately quite likely when one searches for associations among as many possibilities as those in genome-wide studies of multiple traits. The UK Biobank data has one of the largest sample sizes in genetics, and new data analysis methods are needed to take full advantage of it.

Approaches with increased sensitivity and specificity in genetic association studies will facilitate the identification of the biological pathways perturbed in diseases. They will allow us to zoom in more precisely on the important biology, identifying relevant genes even when their effects are small, while avoiding false leads. This knowledge is important for risk assessment, therapy choices, and drug development.

We will use the UK Biobank data to identify the concrete challenges presented by the analysis of large datasets and to test the performance of the methods that we will develop, relying both on simulations and on comparative data analysis. We will use the genotype data to generate artificial traits with known genetic architecture and evaluate the performance of different methods in recovering it. We will also use measured traits to understand what type of genetic architecture is likely to be important for medically relevant phenotypes. Because our focus is on the development of methods applicable to large samples, taking advantage of the more detailed information they contain, we are interested in working with the full cohort.
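
As an illustration of the simulation strategy described above (artificial traits with a known genetic architecture, then a check of how well a method recovers it), the following is a minimal sketch and not the investigators' method. Everything in it is an assumption made for illustration: the genotypes are randomly generated stand-ins rather than UK Biobank data, the function names and parameter values are hypothetical, and the method being evaluated is a basic per-variant correlation test combined with the Benjamini-Hochberg procedure.

```python
import math
import random


def simulate_genotypes(n, p, maf, rng):
    """Stand-in genotypes: minor-allele counts (0, 1 or 2) for n people, p variants."""
    return [[(rng.random() < maf) + (rng.random() < maf) for _ in range(p)]
            for _ in range(n)]


def simulate_trait(genotypes, causal, effect, rng):
    """Artificial trait: additive effect of the causal variants plus unit-variance noise."""
    return [sum(effect * row[j] for j in causal) + rng.gauss(0.0, 1.0)
            for row in genotypes]


def marginal_p_value(x, y):
    """Two-sided p-value for the correlation of x and y (Fisher z approximation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 1.0  # no variation, so no evidence of association
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = max(min(sxy / math.sqrt(sxx * syy), 0.999999), -0.999999)
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals, q):
    """Indices rejected by the Benjamini-Hochberg procedure at target level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda j: pvals[j])
    k = 0
    for rank, j in enumerate(order, start=1):
        if pvals[j] <= q * rank / m:
            k = rank  # largest rank whose p-value passes its threshold
    return set(order[:k])


def run_simulation(seed, n=500, p=200, n_causal=10, effect=0.5, maf=0.3, q=0.05):
    """Simulate one data set and score the discoveries against the known truth."""
    rng = random.Random(seed)
    genotypes = simulate_genotypes(n, p, maf, rng)
    causal = set(rng.sample(range(p), n_causal))
    trait = simulate_trait(genotypes, causal, effect, rng)
    pvals = [marginal_p_value([row[j] for row in genotypes], trait)
             for j in range(p)]
    discoveries = benjamini_hochberg(pvals, q)
    false_hits = len(discoveries - causal)
    return {
        "n_discoveries": len(discoveries),
        # Fraction of discoveries that are not truly causal.
        "false_discovery_proportion": false_hits / max(len(discoveries), 1),
        # Fraction of causal variants that were found.
        "power": len(discoveries & causal) / n_causal,
    }


result = run_simulation(seed=0)
print(result)
```

Because the causal variants are chosen by the simulation, the false discovery proportion and the power can be computed exactly, which is not possible with measured traits. Repeating the run over many seeds and swapping in alternative methods for `marginal_p_value` and `benjamini_hochberg` would give the kind of side-by-side comparison the summary describes.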