
Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features


Bibliographic Details
Published in: Applied biochemistry and biotechnology 2013-07, Vol.170 (6), p.1263-1281
Main Authors: Banerjee, Amit Kumar, Ravi, Vadlamani, Murty, U. S. N., Sengupta, Neelava, Karuna, Batepatti
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73
cites cdi_FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73
container_end_page 1281
container_issue 6
container_start_page 1263
container_title Applied biochemistry and biotechnology
container_volume 170
creator Banerjee, Amit Kumar
Ravi, Vadlamani
Murty, U. S. N.
Sengupta, Neelava
Karuna, Batepatti
description Standard molecular experimental methodologies and mathematical procedures often fail to answer many phylogeny- and classification-related questions. Modern artificial intelligence-based techniques, such as radial basis functions, genetic algorithms, artificial neural networks, and support vector machines, have ample potential in this regard. Relying on a large number of essential parameters, as opposed to a single molecular parameter, aids robustness, reliability, and accuracy. This study was conducted with a dataset of computed protein physicochemical properties belonging to 20 different bacterial genera. A total of 57 sequential and structural parameters derived from protein sequences were considered for the initial classification. Feature selection-based techniques were employed to find the most important features influencing the dataset. Various amino acids, hydrophobicity, relative sulfur percentage, and codon number were selected as important parameters during the study. Comparative analyses were performed using the RapidMiner data mining platform. The support vector machine proved to be the best method, with a maximum accuracy of more than 91%.
doi_str_mv 10.1007/s12010-013-0268-1
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1443377890</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1443377890</sourcerecordid><originalsourceid>FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73</originalsourceid><addsrcrecordid>eNqFkVFrFDEUhUNR2nX1B_RFAiL4MvYmmUySx7paLRQs2D4P2czNmjKb2SYzgv_ejLvaIhSfwuV-5-QeDiGnDN4zAHWWGQcGFTBRAW90xY7IgklpymTYM7IArkTFuTYn5EXOdwCMa6mOyQkXjVQG-IL057tdH5wdwxDp4OllHLHvwwbjSG_QfY_hfsJM_ZDoqrc5B_8I_mDdiClYeptD3NDrNIwYIv2GRRMdVh_L8gd29ALtOCXML8lzb_uMrw7vktxefLpZfamuvn6-XJ1fVa4WcqzQmkb42gvnrQWwyncSUXdG1txK06ABwWqNa13rddeAkwad0J3rJEhjlViSd3vfXRrm88d2G7IruWzEYcotq2shlNLF57-oUKIcJQ0v6Jt_0LthSrEEmSkugLPfhmxPuTTknNC3uxS2Nv1sGbRza-2-tba01s6tFfGSvD44T-stdn8Vf2oqwNsDYLOzvU82upAfOCWbxvA5ON9zuaziBtOjE5_8_RdQA68P</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1372302190</pqid></control><display><type>article</type><title>Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features</title><source>Springer Nature</source><creator>Banerjee, Amit Kumar ; Ravi, Vadlamani ; Murty, U. S. N. ; Sengupta, Neelava ; Karuna, Batepatti</creator><creatorcontrib>Banerjee, Amit Kumar ; Ravi, Vadlamani ; Murty, U. S. N. ; Sengupta, Neelava ; Karuna, Batepatti</creatorcontrib><description>Standard molecular experimental methodologies and mathematical procedures often fail to answer many phylogeny and classification related issues. Modern artificial intelligent-based techniques, such as radial basis function, genetic algorithm, artificial neural network, and support vector machines are of ample potential in this regard. Reliance on a large number of essential parameters will aid in enhanced robustness, reliability, and better accuracy as opposed to single molecular parameter. 
This study was conducted with dataset of computed protein physicochemical properties belonging to 20 different bacterial genera. A total of 57 sequential and structural parameters derived from protein sequences were considered for the initial classification. Feature selection based techniques were employed to find out the most important features influencing the dataset. Various amino acids, hydrophobicity, relative sulfur percentage, and codon number were selected as important parameters during the study. Comparative analyses were performed applying RapidMiner data mining platform. Support vector machine proved to be the best method with maximum accuracy of more than 91 %.</description><identifier>ISSN: 0273-2289</identifier><identifier>EISSN: 1559-0291</identifier><identifier>DOI: 10.1007/s12010-013-0268-1</identifier><identifier>PMID: 23657902</identifier><identifier>CODEN: ABIBDL</identifier><language>eng</language><publisher>New York: Springer New York</publisher><subject>Algorithms ; Amino Acid Sequence ; Amino acids ; Artificial Intelligence ; Bacteria ; Bacteria - classification ; Bacteria - enzymology ; Bacterial Typing Techniques - methods ; Biochemistry ; Biological and medical sciences ; Biotechnology ; Chemistry ; Chemistry and Materials Science ; Classification ; Data mining ; Experimental methods ; Fundamental and applied biological sciences. 
Psychology ; Histidine Kinase ; Molecular biology ; Molecular Sequence Data ; Phylogeny ; Physicochemical properties ; Protein Kinases - chemistry ; Protein Kinases - metabolism ; Proteins ; Sequence Analysis, Protein - methods ; Sulfur</subject><ispartof>Applied biochemistry and biotechnology, 2013-07, Vol.170 (6), p.1263-1281</ispartof><rights>Springer Science+Business Media New York 2013</rights><rights>2014 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73</citedby><cites>FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27922,27923</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=27566927$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23657902$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Banerjee, Amit Kumar</creatorcontrib><creatorcontrib>Ravi, Vadlamani</creatorcontrib><creatorcontrib>Murty, U. S. N.</creatorcontrib><creatorcontrib>Sengupta, Neelava</creatorcontrib><creatorcontrib>Karuna, Batepatti</creatorcontrib><title>Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features</title><title>Applied biochemistry and biotechnology</title><addtitle>Appl Biochem Biotechnol</addtitle><addtitle>Appl Biochem Biotechnol</addtitle><description>Standard molecular experimental methodologies and mathematical procedures often fail to answer many phylogeny and classification related issues. 
Modern artificial intelligent-based techniques, such as radial basis function, genetic algorithm, artificial neural network, and support vector machines are of ample potential in this regard. Reliance on a large number of essential parameters will aid in enhanced robustness, reliability, and better accuracy as opposed to single molecular parameter. This study was conducted with dataset of computed protein physicochemical properties belonging to 20 different bacterial genera. A total of 57 sequential and structural parameters derived from protein sequences were considered for the initial classification. Feature selection based techniques were employed to find out the most important features influencing the dataset. Various amino acids, hydrophobicity, relative sulfur percentage, and codon number were selected as important parameters during the study. Comparative analyses were performed applying RapidMiner data mining platform. Support vector machine proved to be the best method with maximum accuracy of more than 91 %.</description><subject>Algorithms</subject><subject>Amino Acid Sequence</subject><subject>Amino acids</subject><subject>Artificial Intelligence</subject><subject>Bacteria</subject><subject>Bacteria - classification</subject><subject>Bacteria - enzymology</subject><subject>Bacterial Typing Techniques - methods</subject><subject>Biochemistry</subject><subject>Biological and medical sciences</subject><subject>Biotechnology</subject><subject>Chemistry</subject><subject>Chemistry and Materials Science</subject><subject>Classification</subject><subject>Data mining</subject><subject>Experimental methods</subject><subject>Fundamental and applied biological sciences. 
Psychology</subject><subject>Histidine Kinase</subject><subject>Molecular biology</subject><subject>Molecular Sequence Data</subject><subject>Phylogeny</subject><subject>Physicochemical properties</subject><subject>Protein Kinases - chemistry</subject><subject>Protein Kinases - metabolism</subject><subject>Proteins</subject><subject>Sequence Analysis, Protein - methods</subject><subject>Sulfur</subject><issn>0273-2289</issn><issn>1559-0291</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><recordid>eNqFkVFrFDEUhUNR2nX1B_RFAiL4MvYmmUySx7paLRQs2D4P2czNmjKb2SYzgv_ejLvaIhSfwuV-5-QeDiGnDN4zAHWWGQcGFTBRAW90xY7IgklpymTYM7IArkTFuTYn5EXOdwCMa6mOyQkXjVQG-IL057tdH5wdwxDp4OllHLHvwwbjSG_QfY_hfsJM_ZDoqrc5B_8I_mDdiClYeptD3NDrNIwYIv2GRRMdVh_L8gd29ALtOCXML8lzb_uMrw7vktxefLpZfamuvn6-XJ1fVa4WcqzQmkb42gvnrQWwyncSUXdG1txK06ABwWqNa13rddeAkwad0J3rJEhjlViSd3vfXRrm88d2G7IruWzEYcotq2shlNLF57-oUKIcJQ0v6Jt_0LthSrEEmSkugLPfhmxPuTTknNC3uxS2Nv1sGbRza-2-tba01s6tFfGSvD44T-stdn8Vf2oqwNsDYLOzvU82upAfOCWbxvA5ON9zuaziBtOjE5_8_RdQA68P</recordid><startdate>20130701</startdate><enddate>20130701</enddate><creator>Banerjee, Amit Kumar</creator><creator>Ravi, Vadlamani</creator><creator>Murty, U. S. 
N.</creator><creator>Sengupta, Neelava</creator><creator>Karuna, Batepatti</creator><general>Springer New York</general><general>Springer</general><general>Springer Nature B.V</general><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7ST</scope><scope>7T7</scope><scope>7TM</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>88I</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>C1K</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M2P</scope><scope>M7P</scope><scope>P64</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>RC3</scope><scope>SOI</scope><scope>7X8</scope><scope>7QL</scope><scope>7QO</scope></search><sort><creationdate>20130701</creationdate><title>Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features</title><author>Banerjee, Amit Kumar ; Ravi, Vadlamani ; Murty, U. S. N. 
; Sengupta, Neelava ; Karuna, Batepatti</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Algorithms</topic><topic>Amino Acid Sequence</topic><topic>Amino acids</topic><topic>Artificial Intelligence</topic><topic>Bacteria</topic><topic>Bacteria - classification</topic><topic>Bacteria - enzymology</topic><topic>Bacterial Typing Techniques - methods</topic><topic>Biochemistry</topic><topic>Biological and medical sciences</topic><topic>Biotechnology</topic><topic>Chemistry</topic><topic>Chemistry and Materials Science</topic><topic>Classification</topic><topic>Data mining</topic><topic>Experimental methods</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>Histidine Kinase</topic><topic>Molecular biology</topic><topic>Molecular Sequence Data</topic><topic>Phylogeny</topic><topic>Physicochemical properties</topic><topic>Protein Kinases - chemistry</topic><topic>Protein Kinases - metabolism</topic><topic>Proteins</topic><topic>Sequence Analysis, Protein - methods</topic><topic>Sulfur</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Banerjee, Amit Kumar</creatorcontrib><creatorcontrib>Ravi, Vadlamani</creatorcontrib><creatorcontrib>Murty, U. S. 
N.</creatorcontrib><creatorcontrib>Sengupta, Neelava</creatorcontrib><creatorcontrib>Karuna, Batepatti</creatorcontrib><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Environment Abstracts</collection><collection>Industrial and Applied Microbiology Abstracts (Microbiology A)</collection><collection>Nucleic Acids Abstracts</collection><collection>ProQuest Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>ProQuest Natural Science Collection</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health 
Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>PML(ProQuest Medical Library)</collection><collection>Science Database</collection><collection>Biological Science Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>Environment Abstracts</collection><collection>MEDLINE - Academic</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><jtitle>Applied biochemistry and biotechnology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Banerjee, Amit Kumar</au><au>Ravi, Vadlamani</au><au>Murty, U. S. 
N.</au><au>Sengupta, Neelava</au><au>Karuna, Batepatti</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features</atitle><jtitle>Applied biochemistry and biotechnology</jtitle><stitle>Appl Biochem Biotechnol</stitle><addtitle>Appl Biochem Biotechnol</addtitle><date>2013-07-01</date><risdate>2013</risdate><volume>170</volume><issue>6</issue><spage>1263</spage><epage>1281</epage><pages>1263-1281</pages><issn>0273-2289</issn><eissn>1559-0291</eissn><coden>ABIBDL</coden><abstract>Standard molecular experimental methodologies and mathematical procedures often fail to answer many phylogeny and classification related issues. Modern artificial intelligent-based techniques, such as radial basis function, genetic algorithm, artificial neural network, and support vector machines are of ample potential in this regard. Reliance on a large number of essential parameters will aid in enhanced robustness, reliability, and better accuracy as opposed to single molecular parameter. This study was conducted with dataset of computed protein physicochemical properties belonging to 20 different bacterial genera. A total of 57 sequential and structural parameters derived from protein sequences were considered for the initial classification. Feature selection based techniques were employed to find out the most important features influencing the dataset. Various amino acids, hydrophobicity, relative sulfur percentage, and codon number were selected as important parameters during the study. Comparative analyses were performed applying RapidMiner data mining platform. Support vector machine proved to be the best method with maximum accuracy of more than 91 %.</abstract><cop>New York</cop><pub>Springer New York</pub><pmid>23657902</pmid><doi>10.1007/s12010-013-0268-1</doi><tpages>19</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0273-2289
ispartof Applied biochemistry and biotechnology, 2013-07, Vol.170 (6), p.1263-1281
issn 0273-2289
1559-0291
language eng
recordid cdi_proquest_miscellaneous_1443377890
source Springer Nature
subjects Algorithms
Amino Acid Sequence
Amino acids
Artificial Intelligence
Bacteria
Bacteria - classification
Bacteria - enzymology
Bacterial Typing Techniques - methods
Biochemistry
Biological and medical sciences
Biotechnology
Chemistry
Chemistry and Materials Science
Classification
Data mining
Experimental methods
Fundamental and applied biological sciences. Psychology
Histidine Kinase
Molecular biology
Molecular Sequence Data
Phylogeny
Physicochemical properties
Protein Kinases - chemistry
Protein Kinases - metabolism
Proteins
Sequence Analysis, Protein - methods
Sulfur
title Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T00%3A24%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Application%20of%20Intelligent%20Techniques%20for%20Classification%20of%20Bacteria%20Using%20Protein%20Sequence-Derived%20Features&rft.jtitle=Applied%20biochemistry%20and%20biotechnology&rft.au=Banerjee,%20Amit%20Kumar&rft.date=2013-07-01&rft.volume=170&rft.issue=6&rft.spage=1263&rft.epage=1281&rft.pages=1263-1281&rft.issn=0273-2289&rft.eissn=1559-0291&rft.coden=ABIBDL&rft_id=info:doi/10.1007/s12010-013-0268-1&rft_dat=%3Cproquest_cross%3E1443377890%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1372302190&rft_id=info:pmid/23657902&rfr_iscdi=true