The ability to predict disease risk and identify individuals at high risk of developing a given disease is fundamental to modern healthcare, since it enables preventive measures and personalized treatments. Polygenic scores (PGS) have received attention for their promise to improve clinical prediction models. Recently, electronic health records (EHR) have also been shown to enhance prediction accuracy. However, the accuracy of both PGS and EHR in clinical prediction models is affected by individual-level genetic, environmental and diagnostic heterogeneity, which can lead to racial, gender, and ancestry-based biases. It is important to understand and measure the impact and severity of these types of heterogeneity in order to develop more inclusive, accurate and robust prediction models. These models need to be evaluated and replicated across cohorts and in individuals of different genetic ancestries.
The proposed PhD project intends to address this need by evaluating the impact of these heterogeneities on the predictive performance of PGS, EHR and informed family history (FH) within and across cohorts and ancestries. It will do so by studying the effect of genetic and environmental heterogeneity on prediction accuracy for numerous health outcomes, characterizing differences in EHR across populations, and developing more robust prediction models that incorporate EHR, PGS and FH.
This PhD project aims to contribute to psychiatric epidemiology and psychiatric genetics by providing insight into how accurately prediction models perform across ancestries and cohorts. It intends to deepen understanding of the impact of genetic and environmental heterogeneity on the predictive performance of PGS, informed FH and EHR, and may serve as a guide for future research on the development of clinical prediction models.