A comparative study of patent sequence databases
Published in: | World patent information 2008-12, Vol.30 (4), p.300-308 |
---|---|
Main Authors: | , , , , , , |
Format: | Article |
Language: | English |
Summary: | Nucleic acid and protein sequence data from patent publications is available from a plurality of commercial and public sources. As the searching and analysis of this data is of crucial importance to the life sciences industry, the Patent Documentation Group’s Biotechnology Information Working Group conducted a study to critically compare and evaluate patent sequence databases for data content. A series of sequences were searched to find similar sequences from several well-known sources: GENESEQ™, CAS REGISTRY/CAplus℠, PCTGEN, NCBI GenBank®, EMBL-Bank and the EBI Fasta databases. The study highlights some differences between GENESEQ™ and REGISTRY/CAplus℠ results within the context of indexing policy and patent coverage. In comparison to the proprietary databases, the authors have identified important deficiencies in the content of the public databanks. This paper also discusses database timeliness and the choice of algorithm as potential reasons for missing data. |
---|---|
ISSN: | 0172-2190, 1874-690X |
DOI: | 10.1016/j.wpi.2008.04.005 |