Distributing human leukocyte antigen (HLA) database in histocompatibility: a shift in HLA data governance

Bibliographic Details
Published in: Exploration of Immunology, 2022-11, Vol. 2 (6), p. 749-759
Main Authors: Sayadi, Sirine; Douillard, Venceslas; Vince, Nicolas; Südholt, Mario; Gourraud, Pierre-Antoine
Format: Article
Language: English
Description
Summary: Aim: Human leukocyte antigen (HLA) population genetics has historically relied on centralized data resources. HLA genetics databases typically provide access to allele and haplotype frequencies and to genotype-format information. Among many resources, the Allele Frequency Net Database (AFND) is a typical centralized repository that allows users to research and analyze immune gene frequencies in populations around the world. With the massive increase in medical data and the strengthening of data governance laws, a distributed and secure alternative to the historical centralized model in population genetics has become important. This paper presents a new model of HLA population genetic resources: a distributed version of HLA databases. It allows users to perform the same research and analysis with other remote sites without sharing their original data, while monitoring data access. Methods: The new version uses the Master/Worker distributed model and offers distributed algorithms for calculating allelic frequencies and haplotypic frequencies and for individual genotypic calculations. The new model was evaluated on Grid’5000, a distributed testbed for experiment-driven research, and achieved good accuracy and execution time compared with the original centralized scheme used by researchers. Results: The results show that distributed algorithms applied to HLA population genetics resources enable usage control and enforcement of the security framework of the data-owning institution. The distributed approach gives the same results as the centralized one for all counting methods in population immunogenetics. With the same frequency estimates, it yields a much quicker computation time in many cases, in particular for large samples. Conclusions: Distributing previously centralized resources is an interesting perspective that enables better control of data sharing.
ISSN: 2768-6655
DOI: 10.37349/ei.2022.00080
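
The summary describes a Master/Worker scheme in which remote sites contribute to allele frequency estimates without sharing their original data. The following is a minimal sketch of that idea, not the authors' implementation: it assumes direct allele counting at a single locus, and the function names and sample genotypes are hypothetical. Each worker returns only its local allele counts, and the master merges them into global frequencies.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Two alleles observed at one HLA locus for one individual (illustrative type).
Genotype = Tuple[str, str]


def local_allele_counts(genotypes: Iterable[Genotype]) -> Counter:
    """Worker step (hypothetical name): count alleles held at one site.

    Only these aggregated counts leave the site, never the genotypes.
    """
    counts: Counter = Counter()
    for allele_1, allele_2 in genotypes:
        counts[allele_1] += 1
        counts[allele_2] += 1
    return counts


def merge_frequencies(per_site_counts: Iterable[Counter]) -> Dict[str, float]:
    """Master step (hypothetical name): sum site counts, then normalize."""
    total: Counter = Counter()
    for site_counts in per_site_counts:
        total.update(site_counts)
    n_alleles = sum(total.values())
    if n_alleles == 0:
        return {}
    return {allele: count / n_alleles for allele, count in total.items()}


# Made-up genotypes held by two separate sites.
site_a = [("A*01:01", "A*02:01"), ("A*02:01", "A*02:01")]
site_b = [("A*01:01", "A*03:01")]

# Distributed estimate: each site counts locally, the master merges.
distributed = merge_frequencies(
    [local_allele_counts(site_a), local_allele_counts(site_b)]
)

# Centralized estimate on the pooled data, for comparison only.
centralized = merge_frequencies([local_allele_counts(site_a + site_b)])
```

Because direct counting is additive across sites, the merged estimate matches the pooled one in this sketch. Haplotype frequency estimation generally needs an iterative procedure and is not covered here.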