
application of Markov chain analysis to oligonucleotide frequency prediction and physical mapping of Drosophila melanogaster

Here we compare several methods for predicting oligonucleotide frequencies in 691 kb of Drosophila melanogaster DNA. As in previous work on Escherichia coli and Saccharomyces cerevisiae, a relatively simple equation based on tetranucleotide frequencies can be used in predicting frequencies of higher...

Bibliographic Details
Published in: Nucleic acids research, 1993, Vol.20 (14), p.3651-3657
Main Authors: Cuticchia, A.J, Ivarie, R, Arnold, J
Format: Article
Language: English
container_end_page 3657
container_issue 14
container_start_page 3651
container_title Nucleic acids research
container_volume 20
creator Cuticchia, A.J
Ivarie, R
Arnold, J
description Here we compare several methods for predicting oligonucleotide frequencies in 691 kb of Drosophila melanogaster DNA. As in previous work on Escherichia coli and Saccharomyces cerevisiae, a relatively simple equation based on tetranucleotide frequencies can be used to predict the frequencies of higher order oligonucleotides. For example, the mean of observed/expected abundances of 4,096 hexamers was 1.07 with a sample standard deviation of 0.55. This simple predictor arises by considering each base on the sense strand of D. melanogaster to depend only on the three bases 5' to it (a 3rd order Markov chain) and is more accurate than the random predictor. This equation is useful in predicting restriction enzyme fragment sizes, in selecting restriction enzymes that cut preferentially in coding vs noncoding regions, and in selecting probes to fingerprint clones in contig mapping. Once again, this equation predicts the occurrence of higher order oligonucleotides well, supporting our hypothesis that this predictor holds in evolutionarily diverse organisms. When ranked from highest to lowest abundance, the observed frequencies of oligomers of a given length are closely tracked by the predicted abundances of a 3rd order Markov chain. Using the dependence of oligomer frequencies on base composition, we report a list of oligomers that will be useful for the completion of a cosmid physical map of D. melanogaster. Presently, the library is such that it will be possible to construct large contigs using only 30 oligonucleotide probes to fingerprint cosmids.
format article
fulltext fulltext
identifier ISSN: 0305-1048
ispartof Nucleic acids research, 1993, Vol.20 (14), p.3651-3657
issn 0305-1048
1362-4962
language eng
recordid cdi_fao_agris_US201301783955
source Open Access: PubMed Central; Oxford University Press Archive
subjects chromosome mapping
Drosophila melanogaster
markov processes
nucleotide sequences
prediction
title application of Markov chain analysis to oligonucleotide frequency prediction and physical mapping of Drosophila melanogaster
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T09%3A22%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-fao&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=application%20of%20Markov%20chain%20analysis%20to%20oligonucleotide%20frequency%20prediction%20and%20physical%20mapping%20of%20Drosophila%20melanogaster&rft.jtitle=Nucleic%20acids%20research&rft.au=Cuticchia,%20A.J&rft.date=1993&rft.volume=20&rft.issue=14&rft.spage=3651&rft.epage=3657&rft.pages=3651-3657&rft.issn=0305-1048&rft.eissn=1362-4962&rft_id=info:doi/&rft_dat=%3Cfao%3EUS201301783955%3C/fao%3E%3Cgrp_id%3Ecdi_FETCH-fao_agris_US2013017839553%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
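
The description above says that each base is taken to depend only on the three bases 5' to it (a 3rd order Markov chain), so that tetranucleotide frequencies predict the frequencies of longer oligonucleotides. The record does not reproduce the authors' equation, so the following is only a minimal sketch of a standard 3rd order Markov predictor, assuming overlapping tetramer counts taken from a single strand. The function names (`count_kmers`, `markov3_expected_frequency`, `expected_fragment_size`) are hypothetical and are not taken from the paper.

```python
from collections import Counter

BASES = "ACGT"


def count_kmers(seq, k):
    """Count overlapping k-mers along one strand of `seq`."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov3_expected_frequency(oligo, tetra):
    """Expected relative frequency of `oligo` (length >= 4) under a
    3rd order Markov chain estimated from tetramer counts `tetra`.

    Assumption (not from the source): the frequency of the first
    tetramer is multiplied by P(next base | preceding three bases)
    for every further base, with each conditional probability
    estimated as count(context + base) / sum over b of count(context + b).
    """
    if len(oligo) < 4:
        raise ValueError("oligo must be at least 4 bases long")
    total = sum(tetra.values())
    if total == 0:
        return 0.0
    # Relative frequency of the leading tetramer.
    freq = tetra[oligo[:4]] / total
    # Extend one base at a time using the three preceding bases as context.
    for i in range(1, len(oligo) - 3):
        context = oligo[i:i + 3]
        context_count = sum(tetra[context + b] for b in BASES)
        if context_count == 0:
            return 0.0
        freq *= tetra[oligo[i:i + 4]] / context_count
    return freq


def expected_fragment_size(site, tetra):
    """Rough mean restriction fragment size in bases, taken as
    1 / (predicted frequency of the recognition site)."""
    p = markov3_expected_frequency(site, tetra)
    return float("inf") if p == 0 else 1.0 / p
```

In use, one would build `tetra = count_kmers(sequence, 4)` from the sequence data, compare `markov3_expected_frequency(hexamer, tetra)` with the observed hexamer frequency to obtain an observed/expected ratio of the kind quoted in the description, and call `expected_fragment_size` on a recognition site to get an approximate mean fragment length. Estimating the transition probabilities from tetramer counts alone keeps the sketch dependent on a single table, which matches the description's emphasis on tetranucleotide frequencies.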