Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features
Standard molecular experimental methodologies and mathematical procedures often fail to answer many phylogeny- and classification-related issues. Modern artificial intelligence-based techniques, such as radial basis function, genetic algorithm, artificial neural network, and support vector machines ar...
Published in: | Applied biochemistry and biotechnology 2013-07, Vol.170 (6), p.1263-1281 |
---|---|
Main Authors: | Banerjee, Amit Kumar ; Ravi, Vadlamani ; Murty, U. S. N. ; Sengupta, Neelava ; Karuna, Batepatti |
Format: | Article |
Language: | English |
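Subjects: | Algorithms ; Amino Acid Sequence ; Amino acids ; Artificial Intelligence ; Bacteria ; Bacteria - classification ; Bacteria - enzymology ; Bacterial Typing Techniques - methods ; Biochemistry ; Biological and medical sciences ; Biotechnology ; Chemistry ; Chemistry and Materials Science ; Classification ; Data mining ; Experimental methods ; Fundamental and applied biological sciences. Psychology ; Histidine Kinase ; Molecular biology ; Molecular Sequence Data ; Phylogeny ; Physicochemical properties ; Protein Kinases - chemistry ; Protein Kinases - metabolism ; Proteins ; Sequence Analysis, Protein - methods ; Sulfur |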
cited_by | cdi_FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73 |
---|---|
cites | cdi_FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73 |
container_end_page | 1281 |
container_issue | 6 |
container_start_page | 1263 |
container_title | Applied biochemistry and biotechnology |
container_volume | 170 |
creator | Banerjee, Amit Kumar ; Ravi, Vadlamani ; Murty, U. S. N. ; Sengupta, Neelava ; Karuna, Batepatti |
description | Standard molecular experimental methodologies and mathematical procedures often fail to answer many phylogeny- and classification-related issues. Modern artificial intelligence-based techniques, such as radial basis function, genetic algorithm, artificial neural network, and support vector machines, have ample potential in this regard. Relying on a large number of essential parameters, as opposed to a single molecular parameter, should improve robustness, reliability, and accuracy. This study was conducted with a dataset of computed protein physicochemical properties belonging to 20 different bacterial genera. A total of 57 sequential and structural parameters derived from protein sequences were considered for the initial classification. Feature selection techniques were employed to identify the most important features influencing the dataset. Various amino acids, hydrophobicity, relative sulfur percentage, and codon number were selected as important parameters during the study. Comparative analyses were performed using the RapidMiner data mining platform. The support vector machine proved to be the best method, with a maximum accuracy of more than 91%. |
doi_str_mv | 10.1007/s12010-013-0268-1 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1443377890</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1443377890</sourcerecordid><originalsourceid>FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73</originalsourceid><addsrcrecordid>eNqFkVFrFDEUhUNR2nX1B_RFAiL4MvYmmUySx7paLRQs2D4P2czNmjKb2SYzgv_ejLvaIhSfwuV-5-QeDiGnDN4zAHWWGQcGFTBRAW90xY7IgklpymTYM7IArkTFuTYn5EXOdwCMa6mOyQkXjVQG-IL057tdH5wdwxDp4OllHLHvwwbjSG_QfY_hfsJM_ZDoqrc5B_8I_mDdiClYeptD3NDrNIwYIv2GRRMdVh_L8gd29ALtOCXML8lzb_uMrw7vktxefLpZfamuvn6-XJ1fVa4WcqzQmkb42gvnrQWwyncSUXdG1txK06ABwWqNa13rddeAkwad0J3rJEhjlViSd3vfXRrm88d2G7IruWzEYcotq2shlNLF57-oUKIcJQ0v6Jt_0LthSrEEmSkugLPfhmxPuTTknNC3uxS2Nv1sGbRza-2-tba01s6tFfGSvD44T-stdn8Vf2oqwNsDYLOzvU82upAfOCWbxvA5ON9zuaziBtOjE5_8_RdQA68P</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1372302190</pqid></control><display><type>article</type><title>Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features</title><source>Springer Nature</source><creator>Banerjee, Amit Kumar ; Ravi, Vadlamani ; Murty, U. S. N. ; Sengupta, Neelava ; Karuna, Batepatti</creator><creatorcontrib>Banerjee, Amit Kumar ; Ravi, Vadlamani ; Murty, U. S. N. ; Sengupta, Neelava ; Karuna, Batepatti</creatorcontrib><description>Standard molecular experimental methodologies and mathematical procedures often fail to answer many phylogeny and classification related issues. Modern artificial intelligent-based techniques, such as radial basis function, genetic algorithm, artificial neural network, and support vector machines are of ample potential in this regard. Reliance on a large number of essential parameters will aid in enhanced robustness, reliability, and better accuracy as opposed to single molecular parameter. 
This study was conducted with dataset of computed protein physicochemical properties belonging to 20 different bacterial genera. A total of 57 sequential and structural parameters derived from protein sequences were considered for the initial classification. Feature selection based techniques were employed to find out the most important features influencing the dataset. Various amino acids, hydrophobicity, relative sulfur percentage, and codon number were selected as important parameters during the study. Comparative analyses were performed applying RapidMiner data mining platform. Support vector machine proved to be the best method with maximum accuracy of more than 91 %.</description><identifier>ISSN: 0273-2289</identifier><identifier>EISSN: 1559-0291</identifier><identifier>DOI: 10.1007/s12010-013-0268-1</identifier><identifier>PMID: 23657902</identifier><identifier>CODEN: ABIBDL</identifier><language>eng</language><publisher>New York: Springer New York</publisher><subject>Algorithms ; Amino Acid Sequence ; Amino acids ; Artificial Intelligence ; Bacteria ; Bacteria - classification ; Bacteria - enzymology ; Bacterial Typing Techniques - methods ; Biochemistry ; Biological and medical sciences ; Biotechnology ; Chemistry ; Chemistry and Materials Science ; Classification ; Data mining ; Experimental methods ; Fundamental and applied biological sciences. 
Psychology ; Histidine Kinase ; Molecular biology ; Molecular Sequence Data ; Phylogeny ; Physicochemical properties ; Protein Kinases - chemistry ; Protein Kinases - metabolism ; Proteins ; Sequence Analysis, Protein - methods ; Sulfur</subject><ispartof>Applied biochemistry and biotechnology, 2013-07, Vol.170 (6), p.1263-1281</ispartof><rights>Springer Science+Business Media New York 2013</rights><rights>2014 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73</citedby><cites>FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27922,27923</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=27566927$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23657902$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Banerjee, Amit Kumar</creatorcontrib><creatorcontrib>Ravi, Vadlamani</creatorcontrib><creatorcontrib>Murty, U. S. N.</creatorcontrib><creatorcontrib>Sengupta, Neelava</creatorcontrib><creatorcontrib>Karuna, Batepatti</creatorcontrib><title>Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features</title><title>Applied biochemistry and biotechnology</title><addtitle>Appl Biochem Biotechnol</addtitle><addtitle>Appl Biochem Biotechnol</addtitle><description>Standard molecular experimental methodologies and mathematical procedures often fail to answer many phylogeny and classification related issues. 
Modern artificial intelligent-based techniques, such as radial basis function, genetic algorithm, artificial neural network, and support vector machines are of ample potential in this regard. Reliance on a large number of essential parameters will aid in enhanced robustness, reliability, and better accuracy as opposed to single molecular parameter. This study was conducted with dataset of computed protein physicochemical properties belonging to 20 different bacterial genera. A total of 57 sequential and structural parameters derived from protein sequences were considered for the initial classification. Feature selection based techniques were employed to find out the most important features influencing the dataset. Various amino acids, hydrophobicity, relative sulfur percentage, and codon number were selected as important parameters during the study. Comparative analyses were performed applying RapidMiner data mining platform. Support vector machine proved to be the best method with maximum accuracy of more than 91 %.</description><subject>Algorithms</subject><subject>Amino Acid Sequence</subject><subject>Amino acids</subject><subject>Artificial Intelligence</subject><subject>Bacteria</subject><subject>Bacteria - classification</subject><subject>Bacteria - enzymology</subject><subject>Bacterial Typing Techniques - methods</subject><subject>Biochemistry</subject><subject>Biological and medical sciences</subject><subject>Biotechnology</subject><subject>Chemistry</subject><subject>Chemistry and Materials Science</subject><subject>Classification</subject><subject>Data mining</subject><subject>Experimental methods</subject><subject>Fundamental and applied biological sciences. 
Psychology</subject><subject>Histidine Kinase</subject><subject>Molecular biology</subject><subject>Molecular Sequence Data</subject><subject>Phylogeny</subject><subject>Physicochemical properties</subject><subject>Protein Kinases - chemistry</subject><subject>Protein Kinases - metabolism</subject><subject>Proteins</subject><subject>Sequence Analysis, Protein - methods</subject><subject>Sulfur</subject><issn>0273-2289</issn><issn>1559-0291</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><recordid>eNqFkVFrFDEUhUNR2nX1B_RFAiL4MvYmmUySx7paLRQs2D4P2czNmjKb2SYzgv_ejLvaIhSfwuV-5-QeDiGnDN4zAHWWGQcGFTBRAW90xY7IgklpymTYM7IArkTFuTYn5EXOdwCMa6mOyQkXjVQG-IL057tdH5wdwxDp4OllHLHvwwbjSG_QfY_hfsJM_ZDoqrc5B_8I_mDdiClYeptD3NDrNIwYIv2GRRMdVh_L8gd29ALtOCXML8lzb_uMrw7vktxefLpZfamuvn6-XJ1fVa4WcqzQmkb42gvnrQWwyncSUXdG1txK06ABwWqNa13rddeAkwad0J3rJEhjlViSd3vfXRrm88d2G7IruWzEYcotq2shlNLF57-oUKIcJQ0v6Jt_0LthSrEEmSkugLPfhmxPuTTknNC3uxS2Nv1sGbRza-2-tba01s6tFfGSvD44T-stdn8Vf2oqwNsDYLOzvU82upAfOCWbxvA5ON9zuaziBtOjE5_8_RdQA68P</recordid><startdate>20130701</startdate><enddate>20130701</enddate><creator>Banerjee, Amit Kumar</creator><creator>Ravi, Vadlamani</creator><creator>Murty, U. S. 
N.</creator><creator>Sengupta, Neelava</creator><creator>Karuna, Batepatti</creator><general>Springer New York</general><general>Springer</general><general>Springer Nature B.V</general><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7ST</scope><scope>7T7</scope><scope>7TM</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>88I</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>C1K</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M2P</scope><scope>M7P</scope><scope>P64</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>RC3</scope><scope>SOI</scope><scope>7X8</scope><scope>7QL</scope><scope>7QO</scope></search><sort><creationdate>20130701</creationdate><title>Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features</title><author>Banerjee, Amit Kumar ; Ravi, Vadlamani ; Murty, U. S. N. 
; Sengupta, Neelava ; Karuna, Batepatti</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Algorithms</topic><topic>Amino Acid Sequence</topic><topic>Amino acids</topic><topic>Artificial Intelligence</topic><topic>Bacteria</topic><topic>Bacteria - classification</topic><topic>Bacteria - enzymology</topic><topic>Bacterial Typing Techniques - methods</topic><topic>Biochemistry</topic><topic>Biological and medical sciences</topic><topic>Biotechnology</topic><topic>Chemistry</topic><topic>Chemistry and Materials Science</topic><topic>Classification</topic><topic>Data mining</topic><topic>Experimental methods</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>Histidine Kinase</topic><topic>Molecular biology</topic><topic>Molecular Sequence Data</topic><topic>Phylogeny</topic><topic>Physicochemical properties</topic><topic>Protein Kinases - chemistry</topic><topic>Protein Kinases - metabolism</topic><topic>Proteins</topic><topic>Sequence Analysis, Protein - methods</topic><topic>Sulfur</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Banerjee, Amit Kumar</creatorcontrib><creatorcontrib>Ravi, Vadlamani</creatorcontrib><creatorcontrib>Murty, U. S. 
N.</creatorcontrib><creatorcontrib>Sengupta, Neelava</creatorcontrib><creatorcontrib>Karuna, Batepatti</creatorcontrib><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Environment Abstracts</collection><collection>Industrial and Applied Microbiology Abstracts (Microbiology A)</collection><collection>Nucleic Acids Abstracts</collection><collection>ProQuest Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>ProQuest Natural Science Collection</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health 
Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>PML(ProQuest Medical Library)</collection><collection>Science Database</collection><collection>Biological Science Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>Environment Abstracts</collection><collection>MEDLINE - Academic</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><jtitle>Applied biochemistry and biotechnology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Banerjee, Amit Kumar</au><au>Ravi, Vadlamani</au><au>Murty, U. S. 
N.</au><au>Sengupta, Neelava</au><au>Karuna, Batepatti</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features</atitle><jtitle>Applied biochemistry and biotechnology</jtitle><stitle>Appl Biochem Biotechnol</stitle><addtitle>Appl Biochem Biotechnol</addtitle><date>2013-07-01</date><risdate>2013</risdate><volume>170</volume><issue>6</issue><spage>1263</spage><epage>1281</epage><pages>1263-1281</pages><issn>0273-2289</issn><eissn>1559-0291</eissn><coden>ABIBDL</coden><abstract>Standard molecular experimental methodologies and mathematical procedures often fail to answer many phylogeny and classification related issues. Modern artificial intelligent-based techniques, such as radial basis function, genetic algorithm, artificial neural network, and support vector machines are of ample potential in this regard. Reliance on a large number of essential parameters will aid in enhanced robustness, reliability, and better accuracy as opposed to single molecular parameter. This study was conducted with dataset of computed protein physicochemical properties belonging to 20 different bacterial genera. A total of 57 sequential and structural parameters derived from protein sequences were considered for the initial classification. Feature selection based techniques were employed to find out the most important features influencing the dataset. Various amino acids, hydrophobicity, relative sulfur percentage, and codon number were selected as important parameters during the study. Comparative analyses were performed applying RapidMiner data mining platform. Support vector machine proved to be the best method with maximum accuracy of more than 91 %.</abstract><cop>New York</cop><pub>Springer New York</pub><pmid>23657902</pmid><doi>10.1007/s12010-013-0268-1</doi><tpages>19</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0273-2289 |
ispartof | Applied biochemistry and biotechnology, 2013-07, Vol.170 (6), p.1263-1281 |
issn | 0273-2289 ; 1559-0291 |
language | eng |
recordid | cdi_proquest_miscellaneous_1443377890 |
source | Springer Nature |
subjects | Algorithms ; Amino Acid Sequence ; Amino acids ; Artificial Intelligence ; Bacteria ; Bacteria - classification ; Bacteria - enzymology ; Bacterial Typing Techniques - methods ; Biochemistry ; Biological and medical sciences ; Biotechnology ; Chemistry ; Chemistry and Materials Science ; Classification ; Data mining ; Experimental methods ; Fundamental and applied biological sciences. Psychology ; Histidine Kinase ; Molecular biology ; Molecular Sequence Data ; Phylogeny ; Physicochemical properties ; Protein Kinases - chemistry ; Protein Kinases - metabolism ; Proteins ; Sequence Analysis, Protein - methods ; Sulfur |
title | Application of Intelligent Techniques for Classification of Bacteria Using Protein Sequence-Derived Features |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T00%3A24%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Application%20of%20Intelligent%20Techniques%20for%20Classification%20of%20Bacteria%20Using%20Protein%20Sequence-Derived%20Features&rft.jtitle=Applied%20biochemistry%20and%20biotechnology&rft.au=Banerjee,%20Amit%20Kumar&rft.date=2013-07-01&rft.volume=170&rft.issue=6&rft.spage=1263&rft.epage=1281&rft.pages=1263-1281&rft.issn=0273-2289&rft.eissn=1559-0291&rft.coden=ABIBDL&rft_id=info:doi/10.1007/s12010-013-0268-1&rft_dat=%3Cproquest_cross%3E1443377890%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c435t-ea963f4f3cfaa00a7fd5ee8d9542a596e903148eb848bd60c59ec38dcd5059a73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1372302190&rft_id=info:pmid/23657902&rfr_iscdi=true |