Machine Learning overview for biogeographical ancestry prediction - a PLS-DA approach

Bibliographic Details
Published in: Forensic science international. Genetics supplement series, 2022-12, Vol. 8, p. 306-307
Main Authors: Alladio, Eugenio; Poggiali, Brando; Cosenza, Giulia; Cisana, Selena; Omedei, Monica; Garofano, Paolo; Pilli, Elena
Format: Article
Language: English
Description
Summary: The biogeographical ancestry (BGA) of a trace, person, or skeleton is the biologically determined component of ethnicity, which as a whole comprises both biological and cultural elements. Many people are now interested in researching their genealogy, and the ability to distinguish biogeographic information about populations and subgroups through DNA analysis plays an essential role in several fields, including forensics. For example, when no reference profiles of perpetrators or database matches are available for comparison, inferring the biogeographic origin of perpetrators or victims in unsolved cases is valuable for investigative and intelligence purposes. Current approaches to BGA estimation from SNP data are generally based on PCA and the STRUCTURE software. The present study provides an alternative method that combines multivariate data analysis and Machine Learning strategies to assess the BGA discriminatory power of various commercial panels on unknown samples. Partial Least Squares-Discriminant Analysis (PLS-DA) and XGBoost were applied to datasets from the 1000 Genomes Project, the Simons Genome Diversity Project, and the Human Genome Diversity Project, which include African, American, Asian, European, and Oceanic individuals, and the discriminatory power of the two techniques was compared.
ISSN: 1875-1768; 1875-175X
DOI: 10.1016/j.fsigss.2022.10.071