ENHANCING POLYGENIC PREDICTION WITH AN AGNOSTIC MULTI-PGS METHOD THAT LEVERAGES HUNDREDS OF POLYGENIC SCORES
Published in: | EUROPEAN NEUROPSYCHOPHARMACOLOGY 2023-10, Vol.75, p.S30-S31 |
---|---|
Format: | Article |
Language: | English |
Summary: | The prediction accuracy of a polygenic score (PGS) is largely determined by the size of the training sample. Although training samples remain limited for psychiatric disorders, these disorders are genetically correlated with multiple behavioral and physical phenotypes. These mostly quantitative phenotypes are much more accessible, and genome-wide association studies (GWAS) of them now include millions of samples. Stand-alone PGS can now be generated from publicly accessible GWAS summary statistics with PGS methods that do not require a validation sample, such as LDpred2-auto.
Several existing methods, including MTAG and wMT-SBLUP, use genetically correlated phenotypes to increase prediction accuracy and have been applied to psychiatric disorders. These methods require pre-selecting the included phenotypes based on prior estimates of their genetic correlation with the outcome of interest. Here we show the results of a new method, multi-PGS, that does not require pre-specifying genetically correlated phenotypes but instead relies on an agnostic PGS library based on “all” publicly available GWAS summary statistics. We explore diverse applications of multi-PGS for psychiatric disorders using the iPSYCH data.
In practice, a library of 937 PGS was generated with LDpred2-auto from publicly available GWAS summary statistics resources (GWAS Catalog, GWAS ATLAS, PGC). The PGS library, together with the covariates sex, birth year and 20 principal components (PCs), was then used as the set of predictors in multivariable models. We used both penalized regression models (lasso) and gradient boosted trees (XGBoost), and assessed the out-of-sample prediction accuracy of the resulting risk prediction models.
First, we applied our multi-PGS strategy to predict ADHD, affective disorder, anorexia nervosa, autism, bipolar disorder and schizophrenia in iPSYCH. All multi-PGS models increased both R² and logOR, with R² increasing 4-fold on average and up to 9-fold for ADHD and autism. Prediction accuracy was also higher than with wMT-SBLUP. Interestingly, multiple PGS for the same phenotype were selected in the final model. For example, three different depression-related PGS (self-reported, medically diagnosed and broad depression) were included in the affective disorder multi-PGS. This indicates that non-overlapping signals from multiple GWAS of similar phenotypes can be combined to increase prediction accuracy.
Next, we explored further the c |
---|---|
ISSN: | 0924-977X 1873-7862 |
DOI: | 10.1016/j.euroneuro.2023.08.065 |
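
The modelling step described in the summary (a library of PGS used jointly as predictors in a lasso model, evaluated out of sample) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the data are simulated rather than taken from iPSYCH, the library has 20 scores instead of 937, covariates are omitted, and the helper names (`simulate`, `fit_lasso_logistic`, `auc`) and the penalty value are hypothetical. The L1-penalized logistic regression is fitted with plain proximal gradient descent so that the example needs only the Python standard library; a real analysis would more likely use an established lasso or XGBoost implementation.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, n_pgs, n_causal, effect, seed):
    """Toy data: standardized scores; only the first n_causal affect the outcome."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_pgs)]
        eta = effect * sum(row[:n_causal])
        X.append(row)
        y.append(1 if rng.random() < sigmoid(eta) else 0)
    return X, y


def linear_predictor(b0, beta, row):
    """Intercept plus weighted sum of scores: the combined 'multi-PGS' value."""
    return b0 + sum(b * x for b, x in zip(beta, row))


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=150):
    """L1-penalized logistic regression via proximal gradient descent (ISTA)."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals of the current fit: predicted probability minus outcome.
        resid = [sigmoid(linear_predictor(b0, beta, row)) - yi
                 for row, yi in zip(X, y)]
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n
                for j in range(p)]
        b0 -= step * sum(resid) / n  # the intercept is not penalized
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return b0, beta


def auc(scores, labels):
    """Area under the ROC curve as the share of correctly ordered case/control pairs."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Simulate a small "PGS library", then split into training and test individuals.
X, y = simulate(n=600, n_pgs=20, n_causal=3, effect=1.0, seed=1)
X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

# Fit on the training set only; lam=0.05 is an arbitrary illustrative penalty.
b0, beta = fit_lasso_logistic(X_train, y_train, lam=0.05)
selected = [j for j, b in enumerate(beta) if b != 0.0]

# Out-of-sample accuracy of the combined score.
test_scores = [linear_predictor(b0, beta, row) for row in X_test]
test_auc = auc(test_scores, y_test)
print("selected scores:", selected)
print("out-of-sample AUC:", round(test_auc, 3))
```

In this sketch the penalty plays the role of the agnostic selection step: scores that carry no signal for the outcome tend to be shrunk to exactly zero, while several informative scores can be retained together. In practice the penalty would be chosen by cross-validation within the training set, and covariates such as sex, birth year and PCs would enter as additional columns.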