
T3_MM: a Markov model effectively classifies bacterial type III secretion signals

Bibliographic Details
Published in: PLoS ONE, 2013-03, Vol. 8 (3), p. e58173
Main Authors: Wang, Yejun; Sun, Ming'an; Bao, Hongxia; White, Aaron P
Format: Article
Language: English
Description
Summary: Type III secretion systems (T3SSs) play important roles in the interaction between gram-negative bacteria and their hosts. T3SSs function by translocating a group of bacterial effector proteins into the host cytoplasm. The details of the type III secretion process have yet to be clarified. This research compared the amino acid composition within the N-terminal 100 amino acids of type III secretion (T3S) signal sequences and non-T3S proteins, specifically asking whether each residue exerts a constraint on the residues found at adjacent positions. We used these comparisons to build a statistical model that quantitatively describes and effectively distinguishes T3S effectors. In this study, amino acid composition (Aac) probability profiles, conditional on the sequentially preceding position and the amino acid found there, were compared between the N-terminal sequences of T3S and non-T3S proteins. The profiles are generally different. A Markov model, T3_MM, was consequently designed to calculate the total Aac conditional probability difference, i.e., the likelihood ratio of a sequence being a T3S or a non-T3S protein. With T3_MM, known T3S and non-T3S proteins were found to closely approximate two distinct normal distributions. The model could distinguish validated T3S and non-T3S proteins with a 5-fold cross-validation sensitivity of 83.9% at a specificity of 90.3%. Compared with other T3S protein prediction models, T3_MM was also shown to be more robust, more accurate, simpler, and more statistically quantitative. The high effectiveness of T3_MM also points to an overall Aac difference between the N-termini of T3S and non-T3S proteins, and to a constraint on Aac exerted by the preceding position and its amino acid. An R package for T3_MM is freely downloadable from: http://biocomputer.bio.cuhk.edu.hk/softwares/T3_MM. T3_MM web server: http://biocomputer.bio.cuhk.edu.hk/T3DB/T3_MM.php.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0058173
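
The likelihood-ratio scoring described in the Summary can be illustrated with a minimal Python sketch. This is not the T3_MM R package and does not reproduce its parameters. It assumes a plain first-order Markov chain whose transition probabilities do not depend on position, add-one smoothing, and a decision threshold of 0; the record does not confirm any of these choices. The function names and the toy sequences are hypothetical and exist only for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_transitions(sequences, pseudocount=1.0):
    """Estimate P(current residue | preceding residue) from sequences.

    Assumption: one position-independent transition table with add-one
    smoothing, which is simpler than what the abstract describes.
    """
    counts = {a: {b: pseudocount for b in AMINO_ACIDS} for a in AMINO_ACIDS}
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            if prev in counts and cur in counts[prev]:
                counts[prev][cur] += 1.0
    probs = {}
    for prev, row in counts.items():
        total = sum(row.values())
        probs[prev] = {cur: n / total for cur, n in row.items()}
    return probs


def log_likelihood_ratio(seq, pos_model, neg_model):
    """Sum of log P_pos(cur | prev) / P_neg(cur | prev) over adjacent pairs.

    Pairs containing a non-standard residue are skipped.
    """
    score = 0.0
    for prev, cur in zip(seq, seq[1:]):
        if prev in pos_model and cur in pos_model[prev]:
            score += math.log(pos_model[prev][cur] / neg_model[prev][cur])
    return score


def sensitivity_specificity(pos_scores, neg_scores, threshold=0.0):
    """Fraction of positives above and negatives at or below the threshold."""
    true_pos = sum(s > threshold for s in pos_scores)
    true_neg = sum(s <= threshold for s in neg_scores)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)


# Hypothetical toy N-terminal fragments, not real effector sequences.
toy_positive = ["MSSSLSSNTS", "MSNSSTSLSS", "MTSSSNSLSP"]
toy_negative = ["MKKLLAALAL", "MKALKLLAAG", "MAKKLALLKA"]

pos_model = train_transitions(toy_positive)
neg_model = train_transitions(toy_negative)

pos_scores = [log_likelihood_ratio(s, pos_model, neg_model) for s in toy_positive]
neg_scores = [log_likelihood_ratio(s, pos_model, neg_model) for s in toy_negative]
sens, spec = sensitivity_specificity(pos_scores, neg_scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A positive score means the sequence is better explained by the model trained on the positive set, a negative score by the model trained on the negative set. The sensitivity and specificity printed here are computed on the training sequences themselves, so they say nothing about generalization; the figures quoted in the Summary come from 5-fold cross-validation on validated proteins.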