RACS: rapid analysis of ChIP-Seq data for contig based genomes
Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that is difficult t...
Published in: | BMC bioinformatics 2019-10, Vol.20 (1), p.533-533, Article 533 |
---|---|
Main Authors: | Saettone, Alejandro; Ponce, Marcelo; Nabeel-Shah, Syed; Fillingham, Jeffrey |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3 |
---|---|
cites | cdi_FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3 |
container_end_page | 533 |
container_issue | 1 |
container_start_page | 533 |
container_title | BMC bioinformatics |
container_volume | 20 |
creator | Saettone, Alejandro Ponce, Marcelo Nabeel-Shah, Syed Fillingham, Jeffrey |
description | Chromatin immunoprecipitation coupled to next-generation sequencing (ChIP-Seq) is a widely used molecular method for investigating the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that are difficult to process and analyze, particularly for organisms with contig-based sequenced genomes, which typically have minimal annotation of their genes beyond coordinates predicted primarily by gene-finding programs. Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult, and as such standardized analysis pipelines are lacking.
We present a one-stop computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), that uses traditional High-Performance Computing (HPC) techniques together with open source tools to process and analyze raw ChIP-Seq data. RACS is open source and available from either of the following repositories: https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS . RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation, where it can aid protein function discovery. To test the performance and efficiency of RACS, we analyzed previously published ChIP-Seq data from the model organism Tetrahymena thermophila, which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated in the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation.
The RACS computational pipeline presented in this report is an efficient and reliable tool for analyzing genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequences. Because RACS separates the detected read accumulations into genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression. |
doi_str_mv | 10.1186/s12859-019-3100-2 |
format | article |
fullrecord | <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_eceafd6ca4fc4ac191dac36ec20fd518</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A604551071</galeid><doaj_id>oai_doaj_org_article_eceafd6ca4fc4ac191dac36ec20fd518</doaj_id><sourcerecordid>A604551071</sourcerecordid><originalsourceid>FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3</originalsourceid><addsrcrecordid>eNptkk1v1DAQhi0EomXhB3BBkbjQQ4qd2I7NodJqVWClSqAunK2JP1JX2XhrZxH99zikVI2EfLA1fub1eOZF6C3B54QI_jGRSjBZYiLLmmBcVs_QKaENKSuC2fMn5xP0KqVbjEkjMHuJTmrCORWyOkUX1-vN7lMR4eBNAQP098mnIrhic7P9Xu7sXWFghMKFWOgwjL4rWkjWFJ0dwt6m1-iFgz7ZNw_7Cv38fPlj87W8-vZlu1lflZpxPpaSC9Y6SWraMo2blmoNHFMi6kq2lBuBJTBNwQJnROfqjJNUaEFbo1ktdb1C21nXBLhVh-j3EO9VAK_-BkLsFMTR694qqy04wzVQlxU1kcSArrnVFXaG5SdX6GLWOhzbvTXaDmOEfiG6vBn8jerCL8UFyVU1WeDDg0AMd0ebRrX3Sdu-h8GGY1JVHkZDKJY8o-9ntINcmh9cyIp6wtU6N4CxiczU-X-ovIzd-9x263yOLxLOFgnTaOzvsYNjSmq7u16yZGZ1DClF6x5_SrCabKRmG6lsIzXZKNe_Qu-etugx459v6j_HHb_m</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2310714096</pqid></control><display><type>article</type><title>RACS: rapid analysis of ChIP-Seq data for contig based genomes</title><source>PMC (PubMed Central)</source><source>Publicly Available Content (ProQuest)</source><creator>Saettone, Alejandro ; Ponce, Marcelo ; Nabeel-Shah, Syed ; Fillingham, Jeffrey</creator><creatorcontrib>Saettone, Alejandro ; Ponce, Marcelo ; Nabeel-Shah, Syed ; Fillingham, Jeffrey</creatorcontrib><description>Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. 
ChIP-Seq generates large quantities of data that is difficult to process and analyze, particularly for organisms with a contig-based sequenced genomes that typically have minimal annotation on their associated set of genes other than their associated coordinates primarily predicted by gene finding programs. Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult and as such standardized analysis pipelines are lacking.
We present a one-stop computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), that utilizes traditional High-Performance Computing (HPC) techniques in association with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from any of the following repositories https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS . RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation to aid protein function discovery.To test the performance and efficiency of RACS, we analyzed ChIP-Seq data previously published in a model organism Tetrahymena thermophila which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation.
The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequence. Because RACS segregates the found read accumulations between genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/s12859-019-3100-2</identifier><identifier>PMID: 31664892</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Analysis ; Bioinformatics pipeline ; Chromatin ; Chromatin immunoprecipitation ; Chromatin Immunoprecipitation Sequencing ; Chromosome Mapping ; Computational biology ; DNA ; DNA sequencing ; Gene expression ; Genes ; Genome ; Genomes ; Genomics ; Genomics - methods ; High-performance computing ; Humans ; Information management ; Methodology ; Methods ; Molecular Sequence Annotation ; Next generation sequencing ; Proteins ; Sequence Analysis, DNA ; Tetrahymena thermophila</subject><ispartof>BMC bioinformatics, 2019-10, Vol.20 (1), p.533-533, Article 533</ispartof><rights>COPYRIGHT 2019 BioMed Central Ltd.</rights><rights>The Author(s) 
2019</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3</citedby><cites>FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3</cites><orcidid>0000-0002-9000-1458</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819487/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819487/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,37013,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31664892$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Saettone, Alejandro</creatorcontrib><creatorcontrib>Ponce, Marcelo</creatorcontrib><creatorcontrib>Nabeel-Shah, Syed</creatorcontrib><creatorcontrib>Fillingham, Jeffrey</creatorcontrib><title>RACS: rapid analysis of ChIP-Seq data for contig based genomes</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that is difficult to process and analyze, particularly for organisms with a contig-based sequenced genomes that typically have minimal annotation on their associated set of genes other than their associated coordinates primarily predicted by gene finding programs. 
Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult and as such standardized analysis pipelines are lacking.
We present a one-stop computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), that utilizes traditional High-Performance Computing (HPC) techniques in association with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from any of the following repositories https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS . RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation to aid protein function discovery.To test the performance and efficiency of RACS, we analyzed ChIP-Seq data previously published in a model organism Tetrahymena thermophila which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation.
The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequence. Because RACS segregates the found read accumulations between genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression.</description><subject>Analysis</subject><subject>Bioinformatics pipeline</subject><subject>Chromatin</subject><subject>Chromatin immunoprecipitation</subject><subject>Chromatin Immunoprecipitation Sequencing</subject><subject>Chromosome Mapping</subject><subject>Computational biology</subject><subject>DNA</subject><subject>DNA sequencing</subject><subject>Gene expression</subject><subject>Genes</subject><subject>Genome</subject><subject>Genomes</subject><subject>Genomics</subject><subject>Genomics - methods</subject><subject>High-performance computing</subject><subject>Humans</subject><subject>Information management</subject><subject>Methodology</subject><subject>Methods</subject><subject>Molecular Sequence Annotation</subject><subject>Next generation sequencing</subject><subject>Proteins</subject><subject>Sequence Analysis, DNA</subject><subject>Tetrahymena 
thermophila</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNptkk1v1DAQhi0EomXhB3BBkbjQQ4qd2I7NodJqVWClSqAunK2JP1JX2XhrZxH99zikVI2EfLA1fub1eOZF6C3B54QI_jGRSjBZYiLLmmBcVs_QKaENKSuC2fMn5xP0KqVbjEkjMHuJTmrCORWyOkUX1-vN7lMR4eBNAQP098mnIrhic7P9Xu7sXWFghMKFWOgwjL4rWkjWFJ0dwt6m1-iFgz7ZNw_7Cv38fPlj87W8-vZlu1lflZpxPpaSC9Y6SWraMo2blmoNHFMi6kq2lBuBJTBNwQJnROfqjJNUaEFbo1ktdb1C21nXBLhVh-j3EO9VAK_-BkLsFMTR694qqy04wzVQlxU1kcSArrnVFXaG5SdX6GLWOhzbvTXaDmOEfiG6vBn8jerCL8UFyVU1WeDDg0AMd0ebRrX3Sdu-h8GGY1JVHkZDKJY8o-9ntINcmh9cyIp6wtU6N4CxiczU-X-ovIzd-9x263yOLxLOFgnTaOzvsYNjSmq7u16yZGZ1DClF6x5_SrCabKRmG6lsIzXZKNe_Qu-etugx459v6j_HHb_m</recordid><startdate>20191029</startdate><enddate>20191029</enddate><creator>Saettone, Alejandro</creator><creator>Ponce, Marcelo</creator><creator>Nabeel-Shah, Syed</creator><creator>Fillingham, Jeffrey</creator><general>BioMed Central Ltd</general><general>BioMed Central</general><general>BMC</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-9000-1458</orcidid></search><sort><creationdate>20191029</creationdate><title>RACS: rapid analysis of ChIP-Seq data for contig based genomes</title><author>Saettone, Alejandro ; Ponce, Marcelo ; Nabeel-Shah, Syed ; Fillingham, Jeffrey</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Analysis</topic><topic>Bioinformatics pipeline</topic><topic>Chromatin</topic><topic>Chromatin 
immunoprecipitation</topic><topic>Chromatin Immunoprecipitation Sequencing</topic><topic>Chromosome Mapping</topic><topic>Computational biology</topic><topic>DNA</topic><topic>DNA sequencing</topic><topic>Gene expression</topic><topic>Genes</topic><topic>Genome</topic><topic>Genomes</topic><topic>Genomics</topic><topic>Genomics - methods</topic><topic>High-performance computing</topic><topic>Humans</topic><topic>Information management</topic><topic>Methodology</topic><topic>Methods</topic><topic>Molecular Sequence Annotation</topic><topic>Next generation sequencing</topic><topic>Proteins</topic><topic>Sequence Analysis, DNA</topic><topic>Tetrahymena thermophila</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Saettone, Alejandro</creatorcontrib><creatorcontrib>Ponce, Marcelo</creatorcontrib><creatorcontrib>Nabeel-Shah, Syed</creatorcontrib><creatorcontrib>Fillingham, Jeffrey</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Saettone, Alejandro</au><au>Ponce, Marcelo</au><au>Nabeel-Shah, Syed</au><au>Fillingham, Jeffrey</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>RACS: rapid analysis of ChIP-Seq data for contig based genomes</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC 
Bioinformatics</addtitle><date>2019-10-29</date><risdate>2019</risdate><volume>20</volume><issue>1</issue><spage>533</spage><epage>533</epage><pages>533-533</pages><artnum>533</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that is difficult to process and analyze, particularly for organisms with a contig-based sequenced genomes that typically have minimal annotation on their associated set of genes other than their associated coordinates primarily predicted by gene finding programs. Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult and as such standardized analysis pipelines are lacking.
We present a one-stop computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), that utilizes traditional High-Performance Computing (HPC) techniques in association with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from any of the following repositories https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS . RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation to aid protein function discovery.To test the performance and efficiency of RACS, we analyzed ChIP-Seq data previously published in a model organism Tetrahymena thermophila which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation.
The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequence. Because RACS segregates the found read accumulations between genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>31664892</pmid><doi>10.1186/s12859-019-3100-2</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0002-9000-1458</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1471-2105 |
ispartof | BMC bioinformatics, 2019-10, Vol.20 (1), p.533-533, Article 533 |
issn | 1471-2105 1471-2105 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_eceafd6ca4fc4ac191dac36ec20fd518 |
source | PMC (PubMed Central); Publicly Available Content (ProQuest) |
subjects | Analysis Bioinformatics pipeline Chromatin Chromatin immunoprecipitation Chromatin Immunoprecipitation Sequencing Chromosome Mapping Computational biology DNA DNA sequencing Gene expression Genes Genome Genomes Genomics Genomics - methods High-performance computing Humans Information management Methodology Methods Molecular Sequence Annotation Next generation sequencing Proteins Sequence Analysis, DNA Tetrahymena thermophila |
title | RACS: rapid analysis of ChIP-Seq data for contig based genomes |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T16%3A59%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=RACS:%20rapid%20analysis%20of%20ChIP-Seq%20data%20for%20contig%20based%20genomes&rft.jtitle=BMC%20bioinformatics&rft.au=Saettone,%20Alejandro&rft.date=2019-10-29&rft.volume=20&rft.issue=1&rft.spage=533&rft.epage=533&rft.pages=533-533&rft.artnum=533&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/s12859-019-3100-2&rft_dat=%3Cgale_doaj_%3EA604551071%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2310714096&rft_id=info:pmid/31664892&rft_galeid=A604551071&rfr_iscdi=true |