RACS: rapid analysis of ChIP-Seq data for contig based genomes

Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that is difficult t...

Bibliographic Details
Published in: BMC bioinformatics 2019-10, Vol.20 (1), p.533-533, Article 533
Main Authors: Saettone, Alejandro, Ponce, Marcelo, Nabeel-Shah, Syed, Fillingham, Jeffrey
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3
cites cdi_FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3
container_end_page 533
container_issue 1
container_start_page 533
container_title BMC bioinformatics
container_volume 20
creator Saettone, Alejandro
Ponce, Marcelo
Nabeel-Shah, Syed
Fillingham, Jeffrey
description Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that are difficult to process and analyze, particularly for organisms with contig-based sequenced genomes that typically have minimal annotation on their associated set of genes other than coordinates predicted primarily by gene-finding programs. Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult, and as such, standardized analysis pipelines are lacking. We present a one-stop computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), that utilizes traditional High-Performance Computing (HPC) techniques in association with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from either of the following repositories: https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS . RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation, to aid protein function discovery. To test the performance and efficiency of RACS, we analyzed previously published ChIP-Seq data from the model organism Tetrahymena thermophila, which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation. The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequence. Because RACS segregates the found read accumulations between genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression.
doi_str_mv 10.1186/s12859-019-3100-2
format article
fullrecord <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_eceafd6ca4fc4ac191dac36ec20fd518</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A604551071</galeid><doaj_id>oai_doaj_org_article_eceafd6ca4fc4ac191dac36ec20fd518</doaj_id><sourcerecordid>A604551071</sourcerecordid><originalsourceid>FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3</originalsourceid><addsrcrecordid>eNptkk1v1DAQhi0EomXhB3BBkbjQQ4qd2I7NodJqVWClSqAunK2JP1JX2XhrZxH99zikVI2EfLA1fub1eOZF6C3B54QI_jGRSjBZYiLLmmBcVs_QKaENKSuC2fMn5xP0KqVbjEkjMHuJTmrCORWyOkUX1-vN7lMR4eBNAQP098mnIrhic7P9Xu7sXWFghMKFWOgwjL4rWkjWFJ0dwt6m1-iFgz7ZNw_7Cv38fPlj87W8-vZlu1lflZpxPpaSC9Y6SWraMo2blmoNHFMi6kq2lBuBJTBNwQJnROfqjJNUaEFbo1ktdb1C21nXBLhVh-j3EO9VAK_-BkLsFMTR694qqy04wzVQlxU1kcSArrnVFXaG5SdX6GLWOhzbvTXaDmOEfiG6vBn8jerCL8UFyVU1WeDDg0AMd0ebRrX3Sdu-h8GGY1JVHkZDKJY8o-9ntINcmh9cyIp6wtU6N4CxiczU-X-ovIzd-9x263yOLxLOFgnTaOzvsYNjSmq7u16yZGZ1DClF6x5_SrCabKRmG6lsIzXZKNe_Qu-etugx459v6j_HHb_m</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2310714096</pqid></control><display><type>article</type><title>RACS: rapid analysis of ChIP-Seq data for contig based genomes</title><source>PMC (PubMed Central)</source><source>Publicly Available Content (ProQuest)</source><creator>Saettone, Alejandro ; Ponce, Marcelo ; Nabeel-Shah, Syed ; Fillingham, Jeffrey</creator><creatorcontrib>Saettone, Alejandro ; Ponce, Marcelo ; Nabeel-Shah, Syed ; Fillingham, Jeffrey</creatorcontrib><description>Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. 
ChIP-Seq generates large quantities of data that are difficult to process and analyze, particularly for organisms with contig-based sequenced genomes that typically have minimal annotation on their associated set of genes other than coordinates predicted primarily by gene-finding programs. Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult, and as such, standardized analysis pipelines are lacking. We present a one-stop computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), that utilizes traditional High-Performance Computing (HPC) techniques in association with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from either of the following repositories: https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&amp;p=RACS . RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation, to aid protein function discovery. To test the performance and efficiency of RACS, we analyzed previously published ChIP-Seq data from the model organism Tetrahymena thermophila, which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation. The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequence. 
Because RACS segregates the found read accumulations between genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/s12859-019-3100-2</identifier><identifier>PMID: 31664892</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Analysis ; Bioinformatics pipeline ; Chromatin ; Chromatin immunoprecipitation ; Chromatin Immunoprecipitation Sequencing ; Chromosome Mapping ; Computational biology ; DNA ; DNA sequencing ; Gene expression ; Genes ; Genome ; Genomes ; Genomics ; Genomics - methods ; High-performance computing ; Humans ; Information management ; Methodology ; Methods ; Molecular Sequence Annotation ; Next generation sequencing ; Proteins ; Sequence Analysis, DNA ; Tetrahymena thermophila</subject><ispartof>BMC bioinformatics, 2019-10, Vol.20 (1), p.533-533, Article 533</ispartof><rights>COPYRIGHT 2019 BioMed Central Ltd.</rights><rights>The Author(s) 
2019</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3</citedby><cites>FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3</cites><orcidid>0000-0002-9000-1458</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819487/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819487/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,37013,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31664892$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Saettone, Alejandro</creatorcontrib><creatorcontrib>Ponce, Marcelo</creatorcontrib><creatorcontrib>Nabeel-Shah, Syed</creatorcontrib><creatorcontrib>Fillingham, Jeffrey</creatorcontrib><title>RACS: rapid analysis of ChIP-Seq data for contig based genomes</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that is difficult to process and analyze, particularly for organisms with a contig-based sequenced genomes that typically have minimal annotation on their associated set of genes other than their associated coordinates primarily predicted by gene finding programs. 
Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult, and as such, standardized analysis pipelines are lacking. We present a one-stop computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), that utilizes traditional High-Performance Computing (HPC) techniques in association with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from either of the following repositories: https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&amp;p=RACS . RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation, to aid protein function discovery. To test the performance and efficiency of RACS, we analyzed previously published ChIP-Seq data from the model organism Tetrahymena thermophila, which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation. The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequence. 
Because RACS segregates the found read accumulations between genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression.</description><subject>Analysis</subject><subject>Bioinformatics pipeline</subject><subject>Chromatin</subject><subject>Chromatin immunoprecipitation</subject><subject>Chromatin Immunoprecipitation Sequencing</subject><subject>Chromosome Mapping</subject><subject>Computational biology</subject><subject>DNA</subject><subject>DNA sequencing</subject><subject>Gene expression</subject><subject>Genes</subject><subject>Genome</subject><subject>Genomes</subject><subject>Genomics</subject><subject>Genomics - methods</subject><subject>High-performance computing</subject><subject>Humans</subject><subject>Information management</subject><subject>Methodology</subject><subject>Methods</subject><subject>Molecular Sequence Annotation</subject><subject>Next generation sequencing</subject><subject>Proteins</subject><subject>Sequence Analysis, DNA</subject><subject>Tetrahymena thermophila</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNptkk1v1DAQhi0EomXhB3BBkbjQQ4qd2I7NodJqVWClSqAunK2JP1JX2XhrZxH99zikVI2EfLA1fub1eOZF6C3B54QI_jGRSjBZYiLLmmBcVs_QKaENKSuC2fMn5xP0KqVbjEkjMHuJTmrCORWyOkUX1-vN7lMR4eBNAQP098mnIrhic7P9Xu7sXWFghMKFWOgwjL4rWkjWFJ0dwt6m1-iFgz7ZNw_7Cv38fPlj87W8-vZlu1lflZpxPpaSC9Y6SWraMo2blmoNHFMi6kq2lBuBJTBNwQJnROfqjJNUaEFbo1ktdb1C21nXBLhVh-j3EO9VAK_-BkLsFMTR694qqy04wzVQlxU1kcSArrnVFXaG5SdX6GLWOhzbvTXaDmOEfiG6vBn8jerCL8UFyVU1WeDDg0AMd0ebRrX3Sdu-h8GGY1JVHkZDKJY8o-9ntINcmh9cyIp6wtU6N4CxiczU-X-ovIzd-9x263yOLxLOFgnTaOzvsYNjSmq7u16yZGZ1DClF6x5_SrCabKRmG6lsIzXZKNe_Qu-etugx459v6j_HHb_m</recordid><startdate>20191029</startdate><enddate>20191029</enddate><creator>Saettone, Alejandro</creator><creator>Ponce, 
Marcelo</creator><creator>Nabeel-Shah, Syed</creator><creator>Fillingham, Jeffrey</creator><general>BioMed Central Ltd</general><general>BioMed Central</general><general>BMC</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-9000-1458</orcidid></search><sort><creationdate>20191029</creationdate><title>RACS: rapid analysis of ChIP-Seq data for contig based genomes</title><author>Saettone, Alejandro ; Ponce, Marcelo ; Nabeel-Shah, Syed ; Fillingham, Jeffrey</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Analysis</topic><topic>Bioinformatics pipeline</topic><topic>Chromatin</topic><topic>Chromatin immunoprecipitation</topic><topic>Chromatin Immunoprecipitation Sequencing</topic><topic>Chromosome Mapping</topic><topic>Computational biology</topic><topic>DNA</topic><topic>DNA sequencing</topic><topic>Gene expression</topic><topic>Genes</topic><topic>Genome</topic><topic>Genomes</topic><topic>Genomics</topic><topic>Genomics - methods</topic><topic>High-performance computing</topic><topic>Humans</topic><topic>Information management</topic><topic>Methodology</topic><topic>Methods</topic><topic>Molecular Sequence Annotation</topic><topic>Next generation sequencing</topic><topic>Proteins</topic><topic>Sequence Analysis, DNA</topic><topic>Tetrahymena thermophila</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Saettone, Alejandro</creatorcontrib><creatorcontrib>Ponce, Marcelo</creatorcontrib><creatorcontrib>Nabeel-Shah, Syed</creatorcontrib><creatorcontrib>Fillingham, 
Jeffrey</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Saettone, Alejandro</au><au>Ponce, Marcelo</au><au>Nabeel-Shah, Syed</au><au>Fillingham, Jeffrey</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>RACS: rapid analysis of ChIP-Seq data for contig based genomes</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2019-10-29</date><risdate>2019</risdate><volume>20</volume><issue>1</issue><spage>533</spage><epage>533</epage><pages>533-533</pages><artnum>533</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that is difficult to process and analyze, particularly for organisms with a contig-based sequenced genomes that typically have minimal annotation on their associated set of genes other than their associated coordinates primarily predicted by gene finding programs. Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult and as such standardized analysis pipelines are lacking. 
We present a one-stop computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), that utilizes traditional High-Performance Computing (HPC) techniques in association with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from either of the following repositories: https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&amp;p=RACS . RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation, to aid protein function discovery. To test the performance and efficiency of RACS, we analyzed previously published ChIP-Seq data from the model organism Tetrahymena thermophila, which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation. The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequence. Because RACS segregates the found read accumulations between genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>31664892</pmid><doi>10.1186/s12859-019-3100-2</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0002-9000-1458</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1471-2105
ispartof BMC bioinformatics, 2019-10, Vol.20 (1), p.533-533, Article 533
issn 1471-2105
1471-2105
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_eceafd6ca4fc4ac191dac36ec20fd518
source PMC (PubMed Central); Publicly Available Content (ProQuest)
subjects Analysis
Bioinformatics pipeline
Chromatin
Chromatin immunoprecipitation
Chromatin Immunoprecipitation Sequencing
Chromosome Mapping
Computational biology
DNA
DNA sequencing
Gene expression
Genes
Genome
Genomes
Genomics
Genomics - methods
High-performance computing
Humans
Information management
Methodology
Methods
Molecular Sequence Annotation
Next generation sequencing
Proteins
Sequence Analysis, DNA
Tetrahymena thermophila
title RACS: rapid analysis of ChIP-Seq data for contig based genomes
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T16%3A59%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=RACS:%20rapid%20analysis%20of%20ChIP-Seq%20data%20for%20contig%20based%20genomes&rft.jtitle=BMC%20bioinformatics&rft.au=Saettone,%20Alejandro&rft.date=2019-10-29&rft.volume=20&rft.issue=1&rft.spage=533&rft.epage=533&rft.pages=533-533&rft.artnum=533&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/s12859-019-3100-2&rft_dat=%3Cgale_doaj_%3EA604551071%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c566t-9685bf9134b5c07b4cca60418329b46d809a5c4aea651c316df948c84bdc539c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2310714096&rft_id=info:pmid/31664892&rft_galeid=A604551071&rfr_iscdi=true