A comparative study of patent sequence databases
Nucleic acid and protein sequence data from patent publications is available from a plurality of commercial and public sources. As the searching and analysis of this data is of crucial importance to the life sciences industry, the Patent Documentation Group’s Biotechnology Information Working Group...
Published in: | World patent information 2008-12, Vol.30 (4), p.300-308 |
---|---|
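Main Authors: | Andree, Piet Jan; Harper, Mark F.; Nauche, Stéphane; Poolman, Robert A.; Shaw, Jo; Swinkels, Joop C.; Wycherley, Sally |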
Format: | Article |
Language: | English |
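Subjects: | Biosequences; CAplus; EBI Fasta; EMBL-Bank; Full text databases; GenBank; GENESEQ; Patent Documentation Group; Patent sequences; PCTGEN; PDG; REGISTRY; Searching; Sequence databases; Sequence searching |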
cited_by | cdi_FETCH-LOGICAL-c420t-65e90d66a27f22976b462b8c0e735fc3379348d17380cf7d4d42605b4bcb07d73 |
---|---|
cites | cdi_FETCH-LOGICAL-c420t-65e90d66a27f22976b462b8c0e735fc3379348d17380cf7d4d42605b4bcb07d73 |
container_end_page | 308 |
container_issue | 4 |
container_start_page | 300 |
container_title | World patent information |
container_volume | 30 |
creator | Andree, Piet Jan Harper, Mark F. Nauche, Stéphane Poolman, Robert A. Shaw, Jo Swinkels, Joop C. Wycherley, Sally |
description | Nucleic acid and protein sequence data from patent publications is available from a plurality of commercial and public sources. As the searching and analysis of this data is of crucial importance to the life sciences industry, the Patent Documentation Group’s Biotechnology Information Working Group conducted a study to critically compare and evaluate patent sequence databases for data content. A series of sequences were searched to find similar sequences from several well known sources: GENESEQ™, CAS REGISTRY/CAplus℠, PCTGEN, NCBI GenBank®, EMBL-Bank and the EBI Fasta databases. The study highlights some differences between GENESEQ™ and REGISTRY/CAplus℠ results within the context of indexing policy and patent coverage. In comparison to the proprietary databases, the authors have identified important deficiencies in the content of the public databanks. This paper also discusses database timeliness and the choice of algorithm as potential reasons for missing data. |
doi_str_mv | 10.1016/j.wpi.2008.04.005 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_57723403</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0172219008000471</els_id><sourcerecordid>57723403</sourcerecordid><originalsourceid>FETCH-LOGICAL-c420t-65e90d66a27f22976b462b8c0e735fc3379348d17380cf7d4d42605b4bcb07d73</originalsourceid><addsrcrecordid>eNp9kMtOwzAQRS0EEuXxAeyyYpcwfiROxKqqeKoSG5DYWY49Ea6aJthuq_49botYsrgzkmfu1fgQckOhoECru0WxHV3BAOoCRAFQnpAJraXIqwY-T8kEqGQ5ow2ck4sQFgBU1NBMCEwzM_Sj9jq6DWYhru0uG7ps1BFXMQv4vcaVwczqqFsdMFyRs04vA17_9kvy8fjwPnvO529PL7PpPDeCQcyrEhuwVaWZ7BhrZNWKirW1AZS87AznsuGitlTyGkwnrbCCVVC2ojUtSCv5Jbk95o5-SDeEqHoXDC6XeoXDOqhSSsYF8LRIj4vGDyF47NToXa_9TlFQezZqoRIbtWejQKjEJnlejx6PI5o_AyJuB5--rjaKaw6p7JIOTq5dkkgaD7M0TK9fsU9h98cwTDg2Dr0Kxu2hWefRRGUH988pPzdjhFE</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>57723403</pqid></control><display><type>article</type><title>A comparative study of patent sequence databases</title><source>Library &amp; Information Science Abstracts (LISA)</source><source>ScienceDirect Freedom Collection</source><creator>Andree, Piet Jan ; Harper, Mark F. ; Nauche, Stéphane ; Poolman, Robert A. ; Shaw, Jo ; Swinkels, Joop C. ; Wycherley, Sally</creator><creatorcontrib>Andree, Piet Jan ; Harper, Mark F. ; Nauche, Stéphane ; Poolman, Robert A. ; Shaw, Jo ; Swinkels, Joop C. ; Wycherley, Sally</creatorcontrib><description>Nucleic acid and protein sequence data from patent publications is available from a plurality of commercial and public sources. As the searching and analysis of this data is of crucial importance to the life sciences industry, the Patent Documentation Group’s Biotechnology Information Working Group conducted a study to critically compare and evaluate patent sequence databases for data content. A series of sequences were searched to find similar sequences from several well known sources: GENESEQ™, CAS REGISTRY/CAplus℠, PCTGEN, NCBI GenBank®, EMBL-Bank and the EBI Fasta databases. The study highlights some differences between GENESEQ™ and REGISTRY/CAplus℠ results within the context of indexing policy and patent coverage. In comparison to the proprietary databases, the authors have identified important deficiencies in the content of the public databanks. This paper also discusses database timeliness and the choice of algorithm as potential reasons for missing data.</description><identifier>ISSN: 0172-2190</identifier><identifier>EISSN: 1874-690X</identifier><identifier>DOI: 10.1016/j.wpi.2008.04.005</identifier><language>eng</language><publisher>Elsevier Ltd</publisher><subject>Biosequences ; CAplus ; EBI Fasta ; EMBL-Bank ; Full text databases ; GenBank ; GENESEQ ; Patent Documentation Group ; Patent sequences ; PCTGEN ; PDG ; REGISTRY ; Searching ; Sequence databases ; Sequence databases Sequence searching Biosequences Patent sequences GENESEQ REGISTRY CAplus PCTGEN GenBank EMBL-Bank EBI Fasta PDG Patent Documentation Group ; Sequence searching</subject><ispartof>World patent information, 2008-12, Vol.30 (4), p.300-308</ispartof><rights>2008 Elsevier Ltd</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c420t-65e90d66a27f22976b462b8c0e735fc3379348d17380cf7d4d42605b4bcb07d73</citedby><cites>FETCH-LOGICAL-c420t-65e90d66a27f22976b462b8c0e735fc3379348d17380cf7d4d42605b4bcb07d73</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,777,781,27905,27906,34117</link.rule.ids><backlink>$$Uhttp://econpapers.repec.org/article/eeeworpat/v_3a30_3ay_3a2008_3ai_3a4_3ap_3a300-308.htm$$DView record in RePEc$$Hfree_for_read</backlink></links><search><creatorcontrib>Andree, Piet Jan</creatorcontrib><creatorcontrib>Harper, Mark F.</creatorcontrib><creatorcontrib>Nauche, Stéphane</creatorcontrib><creatorcontrib>Poolman, Robert A.</creatorcontrib><creatorcontrib>Shaw, Jo</creatorcontrib><creatorcontrib>Swinkels, Joop C.</creatorcontrib><creatorcontrib>Wycherley, Sally</creatorcontrib><title>A comparative study of patent sequence databases</title><title>World patent information</title><description>Nucleic acid and protein sequence data from patent publications is available from a plurality of commercial and public sources. As the searching and analysis of this data is of crucial importance to the life sciences industry, the Patent Documentation Group’s Biotechnology Information Working Group conducted a study to critically compare and evaluate patent sequence databases for data content. A series of sequences were searched to find similar sequences from several well known sources: GENESEQ™, CAS REGISTRY/CAplus℠, PCTGEN, NCBI GenBank®, EMBL-Bank and the EBI Fasta databases. The study highlights some differences between GENESEQ™ and REGISTRY/CAplus℠ results within the context of indexing policy and patent coverage. In comparison to the proprietary databases, the authors have identified important deficiencies in the content of the public databanks. This paper also discusses database timeliness and the choice of algorithm as potential reasons for missing data.</description><subject>Biosequences</subject><subject>CAplus</subject><subject>EBI Fasta</subject><subject>EMBL-Bank</subject><subject>Full text databases</subject><subject>GenBank</subject><subject>GENESEQ</subject><subject>Patent Documentation Group</subject><subject>Patent sequences</subject><subject>PCTGEN</subject><subject>PDG</subject><subject>REGISTRY</subject><subject>Searching</subject><subject>Sequence databases</subject><subject>Sequence databases Sequence searching Biosequences Patent sequences GENESEQ REGISTRY CAplus PCTGEN GenBank EMBL-Bank EBI Fasta PDG Patent Documentation Group</subject><subject>Sequence searching</subject><issn>0172-2190</issn><issn>1874-690X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2008</creationdate><recordtype>article</recordtype><sourceid>F2A</sourceid><recordid>eNp9kMtOwzAQRS0EEuXxAeyyYpcwfiROxKqqeKoSG5DYWY49Ea6aJthuq_49botYsrgzkmfu1fgQckOhoECru0WxHV3BAOoCRAFQnpAJraXIqwY-T8kEqGQ5ow2ck4sQFgBU1NBMCEwzM_Sj9jq6DWYhru0uG7ps1BFXMQv4vcaVwczqqFsdMFyRs04vA17_9kvy8fjwPnvO529PL7PpPDeCQcyrEhuwVaWZ7BhrZNWKirW1AZS87AznsuGitlTyGkwnrbCCVVC2ojUtSCv5Jbk95o5-SDeEqHoXDC6XeoXDOqhSSsYF8LRIj4vGDyF47NToXa_9TlFQezZqoRIbtWejQKjEJnlejx6PI5o_AyJuB5--rjaKaw6p7JIOTq5dkkgaD7M0TK9fsU9h98cwTDg2Dr0Kxu2hWefRRGUH988pPzdjhFE</recordid><startdate>20081201</startdate><enddate>20081201</enddate><creator>Andree, Piet Jan</creator><creator>Harper, Mark F.</creator><creator>Nauche, Stéphane</creator><creator>Poolman, Robert A.</creator><creator>Shaw, Jo</creator><creator>Swinkels, Joop C.</creator><creator>Wycherley, Sally</creator><general>Elsevier Ltd</general><general>Elsevier</general><scope>DKI</scope><scope>X2L</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>E3H</scope><scope>F2A</scope></search><sort><creationdate>20081201</creationdate><title>A comparative study of patent sequence databases</title><author>Andree, Piet Jan ; Harper, Mark F. ; Nauche, Stéphane ; Poolman, Robert A. ; Shaw, Jo ; Swinkels, Joop C. ; Wycherley, Sally</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c420t-65e90d66a27f22976b462b8c0e735fc3379348d17380cf7d4d42605b4bcb07d73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Biosequences</topic><topic>CAplus</topic><topic>EBI Fasta</topic><topic>EMBL-Bank</topic><topic>Full text databases</topic><topic>GenBank</topic><topic>GENESEQ</topic><topic>Patent Documentation Group</topic><topic>Patent sequences</topic><topic>PCTGEN</topic><topic>PDG</topic><topic>REGISTRY</topic><topic>Searching</topic><topic>Sequence databases</topic><topic>Sequence databases Sequence searching Biosequences Patent sequences GENESEQ REGISTRY CAplus PCTGEN GenBank EMBL-Bank EBI Fasta PDG Patent Documentation Group</topic><topic>Sequence searching</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Andree, Piet Jan</creatorcontrib><creatorcontrib>Harper, Mark F.</creatorcontrib><creatorcontrib>Nauche, Stéphane</creatorcontrib><creatorcontrib>Poolman, Robert A.</creatorcontrib><creatorcontrib>Shaw, Jo</creatorcontrib><creatorcontrib>Swinkels, Joop C.</creatorcontrib><creatorcontrib>Wycherley, Sally</creatorcontrib><collection>RePEc IDEAS</collection><collection>RePEc</collection><collection>CrossRef</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><jtitle>World patent information</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Andree, Piet Jan</au><au>Harper, Mark F.</au><au>Nauche, Stéphane</au><au>Poolman, Robert A.</au><au>Shaw, Jo</au><au>Swinkels, Joop C.</au><au>Wycherley, Sally</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A comparative study of patent sequence databases</atitle><jtitle>World patent information</jtitle><date>2008-12-01</date><risdate>2008</risdate><volume>30</volume><issue>4</issue><spage>300</spage><epage>308</epage><pages>300-308</pages><issn>0172-2190</issn><eissn>1874-690X</eissn><abstract>Nucleic acid and protein sequence data from patent publications is available from a plurality of commercial and public sources. As the searching and analysis of this data is of crucial importance to the life sciences industry, the Patent Documentation Group’s Biotechnology Information Working Group conducted a study to critically compare and evaluate patent sequence databases for data content. A series of sequences were searched to find similar sequences from several well known sources: GENESEQ™, CAS REGISTRY/CAplus℠, PCTGEN, NCBI GenBank®, EMBL-Bank and the EBI Fasta databases. The study highlights some differences between GENESEQ™ and REGISTRY/CAplus℠ results within the context of indexing policy and patent coverage. In comparison to the proprietary databases, the authors have identified important deficiencies in the content of the public databanks. This paper also discusses database timeliness and the choice of algorithm as potential reasons for missing data.</abstract><pub>Elsevier Ltd</pub><doi>10.1016/j.wpi.2008.04.005</doi><tpages>9</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0172-2190 |
ispartof | World patent information, 2008-12, Vol.30 (4), p.300-308 |
issn | 0172-2190 1874-690X |
language | eng |
recordid | cdi_proquest_miscellaneous_57723403 |
source | Library & Information Science Abstracts (LISA); ScienceDirect Freedom Collection |
subjects | Biosequences; CAplus; EBI Fasta; EMBL-Bank; Full text databases; GenBank; GENESEQ; Patent Documentation Group; Patent sequences; PCTGEN; PDG; REGISTRY; Searching; Sequence databases; Sequence searching |
title | A comparative study of patent sequence databases |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T23%3A40%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20comparative%20study%20of%20patent%20sequence%20databases&rft.jtitle=World%20patent%20information&rft.au=Andree,%20Piet%20Jan&rft.date=2008-12-01&rft.volume=30&rft.issue=4&rft.spage=300&rft.epage=308&rft.pages=300-308&rft.issn=0172-2190&rft.eissn=1874-690X&rft_id=info:doi/10.1016/j.wpi.2008.04.005&rft_dat=%3Cproquest_cross%3E57723403%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c420t-65e90d66a27f22976b462b8c0e735fc3379348d17380cf7d4d42605b4bcb07d73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=57723403&rft_id=info:pmid/&rfr_iscdi=true |