Machine Learning overview for biogeographical ancestry prediction - a PLS-DA approach
Published in: | Forensic science international. Genetics supplement series 2022-12, Vol.8, p.306-307 |
---|---|
Main Authors: | , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Biogeographical ancestry (BGA) of a trace or person/skeleton refers to the biologically determined component of ethnicity, which as a whole is composed of biological and cultural elements. Nowadays, many people are interested in researching their genealogy, and the ability to distinguish biogeographic information about populations and subgroups using DNA analysis plays an essential role in various fields, such as forensics. For example, inferring the biogeographic origin of perpetrators or victims in unsolved cases is advantageous for investigative and intelligence purposes when reference profiles of perpetrators or database matches are not available for comparison. Current approaches to BGA estimation from SNP data are generally based on PCA and the STRUCTURE software. The present study provides an alternative method that combines multivariate data analysis and machine learning strategies to assess the BGA discriminatory power for unknown samples using various commercial panels. Using datasets from the 1000 Genomes Project, the Simons Genome Diversity Project, and the Human Genome Diversity Project, which include African, American, Asian, European, and Oceanian individuals, Partial Least Squares-Discriminant Analysis (PLS-DA) and XGBoost were applied and their discriminatory power was compared. |
---|---|
ISSN: | 1875-1768 1875-175X |
DOI: | 10.1016/j.fsigss.2022.10.071 |