
PROF_PAT 1.3: Updated database of patterns used to detect local similarities

Bibliographic Details
Published in: Bioinformatics 2000-04, Vol.16 (4), p.358-366
Main Authors: Bachinsky, A. G.; Frolov, A. S.; Naumochkin, A. N.; Nizolenko, L. Ph.; Yarigin, A. A.
Format: Article
Language: English
Description
Summary:
Motivation: When analysing novel protein sequences, it is now essential to extend search strategies to include a range of ‘secondary’ databases. Pattern databases have become vital tools for identifying distant relationships between sequences, and hence for predicting protein function and structure. The main drawback of such methods is the relatively small representation of proteins in the trial samples at the time of their construction. A negative result from comparing an amino acid sequence with such a databank therefore forces the researcher to search for similarities in the original protein banks. We developed a database of patterns constructed for groups of related proteins, with maximum representation of SWISS-PROT amino acid sequences in the groups.
Results: Software tools and a new method have been designed to construct patterns of protein families. Using this method, a new version of the databank of protein family patterns, PROF_PAT 1.3, was produced. The bank is based on SWISS-PROT (r1.38) and TrEMBL (r1.11) and contains patterns of more than 13 000 groups of related proteins in a format similar to that of PROSITE. The pattern motifs selected were those with the lowest probability of being found in random sequences. A flexible, fast search program accompanies the bank. The researcher can specify a similarity matrix (PAM, BLOSUM or another type) and set variable levels of similarity, permitting search strategies that range from exact matches to increasing levels of ‘fuzziness’.
Availability: The Internet address for comparing sequences with the bank is: http://wwwmgs.bionet.nsc.ru/mgs/programs/prof_pat/. The local version of the bank and search programs (approximately 50 Mb) is available via ftp: ftp://ftp.bionet.nsc.ru/pub/biology/vector/prof_pat/ and ftp://ftp.ebi.ac.uk/pub/databases/prof_pat/. Alternatively, amino acid sequences can be e-mailed to bachin@vector.nsc.ru for comparison with PROF_PAT 1.3.
Contact: bachin@vector.nsc.ru
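Illustration: The Summary says the bank stores patterns in a format similar to that of PROSITE and that searches can range from exact matches to fuzzier ones. The record does not give the PROF_PAT file format or the search algorithm, so the following is only a minimal sketch of the exact-match end of that range. It assumes the standard PROSITE pattern syntax. The function names `prosite_to_regex` and `find_matches`, the toy pattern and the toy sequence are all hypothetical, and matrix-based fuzzy matching is not covered.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Assumes standard PROSITE syntax (not the actual PROF_PAT format):
    elements separated by '-', 'x' for any residue, [..] for allowed
    residues, {..} for forbidden residues, (n) or (n,m) for repeats,
    '<' and '>' for N- and C-terminal anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]

        # Split the element into its core and an optional repeat count.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)

        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # forbidden residues
        elif not core.startswith("["):
            core = re.escape(core)          # a single literal residue

        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_matches(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched segment) for every hit, overlaps included."""
    # The lookahead lets finditer report hits that overlap each other.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


toy_pattern = "[AC]-x-V-x(4)-{ED}"   # hypothetical pattern, not taken from PROF_PAT
toy_sequence = "MKAGVLLLLKQ"         # hypothetical sequence
print(prosite_to_regex(toy_pattern))
print(find_matches(toy_sequence, toy_pattern))
```

Translating to a regular expression keeps the sketch short. A search of the kind the Summary describes, scored with a PAM or BLOSUM matrix against a similarity threshold, would need a position-by-position scoring scheme instead.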
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/16.4.358