
Genetic association models are robust to common population kinship estimation biases


Bibliographic Details
Published in: Genetics (Austin), 2023-05, Vol. 224 (1)
Main Authors: Hou, Zhuoran; Ochoa, Alejandro
Format: Article
Language: English
Description
Summary: Common genetic association models for structured populations, including principal component analysis (PCA) and linear mixed-effects models (LMMs), model the correlation structure between individuals using population kinship matrices, also known as genetic relatedness matrices. However, the most common kinship estimators can have severe biases that were only recently determined. Here we characterize the effect of these kinship biases on genetic association. We employ a large simulated admixed family and genotypes from the 1000 Genomes Project, both with simulated traits, to evaluate key kinship estimators. Remarkably, we find practically invariant association statistics for kinship matrices of different bias types (matching all other features). We then prove using statistical theory and linear algebra that LMM association tests are invariant to these kinship biases, and PCA approximately so. Our proof shows that the intercept and relatedness effect coefficients compensate for the kinship bias, an argument that extends to generalized linear models. As a corollary, association testing is also invariant to changing the reference ancestral population of the kinship matrix. Lastly, we observed that all kinship estimators, except for popkin ratio-of-means, can give improper non-positive semidefinite matrices, which can be problematic although some LMMs handle them surprisingly well, and condition numbers can be used to choose kinship estimators. Overall, we find that existing association studies are robust to kinship estimation bias, and our calculations may help improve association methods by taking advantage of this unexpected robustness, as well as help determine the effects of kinship bias in related problems.

The most popular genetic association models for structured populations use kinship matrices to model population structure; however, the most common kinship estimator is biased. Here, Hou and Ochoa characterize the effect of kinship bias on genetic association and discover that there is no effect. They prove, theoretically and empirically, how kinship bias is compensated for by the regression intercept and report novel findings regarding variant weighing and power, as well as non-positive semidefinite estimates and their effect on numerical accuracy.
ISSN: 1943-2631, 0016-6731
DOI: 10.1093/genetics/iyad030
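
The summary's claim that the intercept and the relatedness effect compensate for kinship bias can be illustrated numerically. The following is a minimal sketch, not the authors' code or proof. It assumes one specific bias, a change of reference ancestral population written as K' = (K - c J) / (1 - c) with J the all-ones matrix, and it assumes known (fixed) variance components rather than fitted ones. All names (`gls_fit`, `K_shifted`, `s2`, `se2`) and all simulated data are hypothetical. Under these assumptions the rescaling by 1 / (1 - c) is absorbed by the relatedness variance component, and the constant shift lies in the span of the intercept column, so the generalized least squares estimate and standard error of the genotype effect should be unchanged.

```python
import numpy as np


def gls_fit(X, y, V):
    """Generalized least squares: coefficient estimates and standard errors."""
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    cov = np.linalg.inv(X.T @ Vinv_X)  # (X' V^-1 X)^-1
    beta = cov @ (X.T @ Vinv_y)
    return beta, np.sqrt(np.diag(cov))


rng = np.random.default_rng(0)
n, m = 50, 200  # individuals, latent dimensions used to build a toy kinship

# Toy positive semidefinite "kinship" matrices (illustrative only).
Z = rng.standard_normal((n, m))
G = Z @ Z.T / m
c = 0.1  # assumed shift between the two reference populations
J = np.ones((n, n))
K = c * J + (1 - c) * G            # kinship relative to the older reference
K_shifted = (K - c * J) / (1 - c)  # same individuals, more recent reference

# Design matrix: intercept plus one simulated genotype; arbitrary trait.
x = rng.binomial(2, 0.3, size=n).astype(float)
X = np.column_stack([np.ones(n), x])
y = rng.standard_normal(n)

# Fixed variance components; the relatedness component absorbs the (1 - c) scale.
s2, se2 = 1.0, 0.5
V = 2 * s2 * K + se2 * np.eye(n)
V_shifted = 2 * s2 * (1 - c) * K_shifted + se2 * np.eye(n)

beta, se = gls_fit(X, y, V)
beta_s, se_s = gls_fit(X, y, V_shifted)

print("genotype effect:", beta[1], beta_s[1])
print("genotype SE:    ", se[1], se_s[1])
print("intercept SE:   ", se[0], se_s[0])
```

In this sketch only the intercept's standard error differs between the two fits, which is where the constant kinship shift ends up. A full LMM would estimate the variance components from the data, which this example does not attempt.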