
Data pre-processing for analyzing microbiome data – A mini review

The human microbiome is an emerging research frontier due to its profound impacts on health. High-throughput microbiome sequencing enables studying microbial communities but suffers from analytical challenges. In particular, the lack of dedicated preprocessing methods to improve data quality impedes...

Bibliographic Details
Published in: Computational and structural biotechnology journal 2023-01, Vol.21, p.4804-4815
Main Authors: Zhou, Ruwen; Ng, Siu Kin; Sung, Joseph Jao Yiu; Goh, Wilson Wen Bin; Wong, Sunny Hei
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c522t-2a277d09e83d7a3ee1fd1d8d736b77f847881ef2c9049eadf2aec13d43687f1b3
cites cdi_FETCH-LOGICAL-c522t-2a277d09e83d7a3ee1fd1d8d736b77f847881ef2c9049eadf2aec13d43687f1b3
container_end_page 4815
container_issue
container_start_page 4804
container_title Computational and structural biotechnology journal
container_volume 21
creator Zhou, Ruwen
Ng, Siu Kin
Sung, Joseph Jao Yiu
Goh, Wilson Wen Bin
Wong, Sunny Hei
description The human microbiome is an emerging research frontier due to its profound impacts on health. High-throughput microbiome sequencing enables studying microbial communities but suffers from analytical challenges. In particular, the lack of dedicated preprocessing methods to improve data quality impedes effective minimization of biases prior to downstream analysis. This review aims to address this gap by providing a comprehensive overview of preprocessing techniques relevant to microbiome research. We outline a typical workflow for microbiome data analysis. Preprocessing methods discussed include quality filtering, batch effect correction, imputation of missing values, normalization, and data transformation. We highlight strengths and limitations of each technique to serve as a practical guide for researchers and identify areas needing further methodological development. Establishing robust, standardized preprocessing will be essential for drawing valid biological conclusions from microbiome studies.
doi_str_mv 10.1016/j.csbj.2023.10.001
format article
fullrecord <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_455954e4c90b418396db2c5ed9be092e</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S2001037023003574</els_id><doaj_id>oai_doaj_org_article_455954e4c90b418396db2c5ed9be092e</doaj_id><sourcerecordid>2878019858</sourcerecordid><originalsourceid>FETCH-LOGICAL-c522t-2a277d09e83d7a3ee1fd1d8d736b77f847881ef2c9049eadf2aec13d43687f1b3</originalsourceid><addsrcrecordid>eNp9kc1O3DAUhS1U1EEDL9AFyrKbDP5JYltCqtCUlpGQ2LRry7FvBkdJPLUzVNNV34E37JPUYQANG7y59vG5n617EPpE8IJgUl20CxPrdkExZUlYYEyO0AlNJceM4w8H-xk6i7HFaQlSSYY_ohnjoiCM4RO0_KpHnW0C5JvgDcTohnXW-JDpQXe7P9Opdyb42vkeMjuZ__19zK6SOrgswIOD36fouNFdhLPnOkc_v13_WN7kt3ffV8ur29yUlI451ZRziyUIZrlmAKSxxArLWVVz3oiCC0GgoUbiQoK2DdVgCLMFqwRvSM3maLXnWq9btQmu12GnvHbqSfBhrXQYnelAFWUpywKKxKoLIpisbE1NCVbWgCWFxPqyZ222dQ_WwDAG3b2Bvr0Z3L1a-wdFcFnJxE6Ez8-E4H9tIY6qd9FA1-kB_DYqKrjARIpSJCvdW9MgYwzQvL5DsJrSVK2a0lRTmpOWoktN54c_fG15yS4ZLvcGSDNPOQQVjYPBgHUBzJiG4t7j_wfRcbEX</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2878019858</pqid></control><display><type>article</type><title>Data pre-processing for analyzing microbiome data – A mini review</title><source>ScienceDirect Additional Titles</source><source>PubMed Central</source><creator>Zhou, Ruwen ; Ng, Siu Kin ; Sung, Joseph Jao Yiu ; Goh, Wilson Wen Bin ; Wong, Sunny Hei</creator><creatorcontrib>Zhou, Ruwen ; Ng, Siu Kin ; Sung, Joseph Jao Yiu ; Goh, Wilson Wen Bin ; Wong, Sunny Hei</creatorcontrib><description>The human microbiome is an emerging research frontier due to its profound impacts on health. High-throughput microbiome sequencing enables studying microbial communities but suffers from analytical challenges. In particular, the lack of dedicated preprocessing methods to improve data quality impedes effective minimization of biases prior to downstream analysis. 
This review aims to address this gap by providing a comprehensive overview of preprocessing techniques relevant to microbiome research. We outline a typical workflow for microbiome data analysis. Preprocessing methods discussed include quality filtering, batch effect correction, imputation of missing values, normalization, and data transformation. We highlight strengths and limitations of each technique to serve as a practical guide for researchers and identify areas needing further methodological development. Establishing robust, standardized preprocessing will be essential for drawing valid biological conclusions from microbiome studies.</description><identifier>ISSN: 2001-0370</identifier><identifier>EISSN: 2001-0370</identifier><identifier>DOI: 10.1016/j.csbj.2023.10.001</identifier><identifier>PMID: 37841330</identifier><language>eng</language><publisher>Netherlands: Elsevier B.V</publisher><subject>16S rRNA Sequencing ; Batch Effect ; Data Preprocessing ; Microbiome Data ; Mini-Review ; Normalization</subject><ispartof>Computational and structural biotechnology journal, 2023-01, Vol.21, p.4804-4815</ispartof><rights>2023 The Authors</rights><rights>2023 The Authors.</rights><rights>2023 The Authors 
2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c522t-2a277d09e83d7a3ee1fd1d8d736b77f847881ef2c9049eadf2aec13d43687f1b3</citedby><cites>FETCH-LOGICAL-c522t-2a277d09e83d7a3ee1fd1d8d736b77f847881ef2c9049eadf2aec13d43687f1b3</cites><orcidid>0000-0002-7299-9509</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10569954/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S2001037023003574$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,3549,27924,27925,45780,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37841330$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhou, Ruwen</creatorcontrib><creatorcontrib>Ng, Siu Kin</creatorcontrib><creatorcontrib>Sung, Joseph Jao Yiu</creatorcontrib><creatorcontrib>Goh, Wilson Wen Bin</creatorcontrib><creatorcontrib>Wong, Sunny Hei</creatorcontrib><title>Data pre-processing for analyzing microbiome data – A mini review</title><title>Computational and structural biotechnology journal</title><addtitle>Comput Struct Biotechnol J</addtitle><description>The human microbiome is an emerging research frontier due to its profound impacts on health. High-throughput microbiome sequencing enables studying microbial communities but suffers from analytical challenges. In particular, the lack of dedicated preprocessing methods to improve data quality impedes effective minimization of biases prior to downstream analysis. This review aims to address this gap by providing a comprehensive overview of preprocessing techniques relevant to microbiome research. 
We outline a typical workflow for microbiome data analysis. Preprocessing methods discussed include quality filtering, batch effect correction, imputation of missing values, normalization, and data transformation. We highlight strengths and limitations of each technique to serve as a practical guide for researchers and identify areas needing further methodological development. Establishing robust, standardized preprocessing will be essential for drawing valid biological conclusions from microbiome studies.</description><subject>16S rRNA Sequencing</subject><subject>Batch Effect</subject><subject>Data Preprocessing</subject><subject>Microbiome Data</subject><subject>Mini-Review</subject><subject>Normalization</subject><issn>2001-0370</issn><issn>2001-0370</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNp9kc1O3DAUhS1U1EEDL9AFyrKbDP5JYltCqtCUlpGQ2LRry7FvBkdJPLUzVNNV34E37JPUYQANG7y59vG5n617EPpE8IJgUl20CxPrdkExZUlYYEyO0AlNJceM4w8H-xk6i7HFaQlSSYY_ohnjoiCM4RO0_KpHnW0C5JvgDcTohnXW-JDpQXe7P9Opdyb42vkeMjuZ__19zK6SOrgswIOD36fouNFdhLPnOkc_v13_WN7kt3ffV8ur29yUlI451ZRziyUIZrlmAKSxxArLWVVz3oiCC0GgoUbiQoK2DdVgCLMFqwRvSM3maLXnWq9btQmu12GnvHbqSfBhrXQYnelAFWUpywKKxKoLIpisbE1NCVbWgCWFxPqyZ222dQ_WwDAG3b2Bvr0Z3L1a-wdFcFnJxE6Ez8-E4H9tIY6qd9FA1-kB_DYqKrjARIpSJCvdW9MgYwzQvL5DsJrSVK2a0lRTmpOWoktN54c_fG15yS4ZLvcGSDNPOQQVjYPBgHUBzJiG4t7j_wfRcbEX</recordid><startdate>20230101</startdate><enddate>20230101</enddate><creator>Zhou, Ruwen</creator><creator>Ng, Siu Kin</creator><creator>Sung, Joseph Jao Yiu</creator><creator>Goh, Wilson Wen Bin</creator><creator>Wong, Sunny Hei</creator><general>Elsevier B.V</general><general>Research Network of Computational and Structural 
Biotechnology</general><general>Elsevier</general><scope>6I.</scope><scope>AAFTH</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-7299-9509</orcidid></search><sort><creationdate>20230101</creationdate><title>Data pre-processing for analyzing microbiome data – A mini review</title><author>Zhou, Ruwen ; Ng, Siu Kin ; Sung, Joseph Jao Yiu ; Goh, Wilson Wen Bin ; Wong, Sunny Hei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c522t-2a277d09e83d7a3ee1fd1d8d736b77f847881ef2c9049eadf2aec13d43687f1b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>16S rRNA Sequencing</topic><topic>Batch Effect</topic><topic>Data Preprocessing</topic><topic>Microbiome Data</topic><topic>Mini-Review</topic><topic>Normalization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhou, Ruwen</creatorcontrib><creatorcontrib>Ng, Siu Kin</creatorcontrib><creatorcontrib>Sung, Joseph Jao Yiu</creatorcontrib><creatorcontrib>Goh, Wilson Wen Bin</creatorcontrib><creatorcontrib>Wong, Sunny Hei</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ: Directory of Open Access Journals</collection><jtitle>Computational and structural biotechnology journal</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhou, Ruwen</au><au>Ng, Siu Kin</au><au>Sung, Joseph Jao Yiu</au><au>Goh, Wilson Wen Bin</au><au>Wong, Sunny Hei</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Data 
pre-processing for analyzing microbiome data – A mini review</atitle><jtitle>Computational and structural biotechnology journal</jtitle><addtitle>Comput Struct Biotechnol J</addtitle><date>2023-01-01</date><risdate>2023</risdate><volume>21</volume><spage>4804</spage><epage>4815</epage><pages>4804-4815</pages><issn>2001-0370</issn><eissn>2001-0370</eissn><abstract>The human microbiome is an emerging research frontier due to its profound impacts on health. High-throughput microbiome sequencing enables studying microbial communities but suffers from analytical challenges. In particular, the lack of dedicated preprocessing methods to improve data quality impedes effective minimization of biases prior to downstream analysis. This review aims to address this gap by providing a comprehensive overview of preprocessing techniques relevant to microbiome research. We outline a typical workflow for microbiome data analysis. Preprocessing methods discussed include quality filtering, batch effect correction, imputation of missing values, normalization, and data transformation. We highlight strengths and limitations of each technique to serve as a practical guide for researchers and identify areas needing further methodological development. Establishing robust, standardized preprocessing will be essential for drawing valid biological conclusions from microbiome studies.</abstract><cop>Netherlands</cop><pub>Elsevier B.V</pub><pmid>37841330</pmid><doi>10.1016/j.csbj.2023.10.001</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0002-7299-9509</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2001-0370
ispartof Computational and structural biotechnology journal, 2023-01, Vol.21, p.4804-4815
issn 2001-0370
2001-0370
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_455954e4c90b418396db2c5ed9be092e
source ScienceDirect Additional Titles; PubMed Central
subjects 16S rRNA Sequencing
Batch Effect
Data Preprocessing
Microbiome Data
Mini-Review
Normalization
title Data pre-processing for analyzing microbiome data – A mini review
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T16%3A58%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Data%20pre-processing%20for%20analyzing%20microbiome%20data%20%E2%80%93%20A%20mini%20review&rft.jtitle=Computational%20and%20structural%20biotechnology%20journal&rft.au=Zhou,%20Ruwen&rft.date=2023-01-01&rft.volume=21&rft.spage=4804&rft.epage=4815&rft.pages=4804-4815&rft.issn=2001-0370&rft.eissn=2001-0370&rft_id=info:doi/10.1016/j.csbj.2023.10.001&rft_dat=%3Cproquest_doaj_%3E2878019858%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c522t-2a277d09e83d7a3ee1fd1d8d736b77f847881ef2c9049eadf2aec13d43687f1b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2878019858&rft_id=info:pmid/37841330&rfr_iscdi=true