IMMAN: free software for information theory-based chemometric analysis

The features and theoretical background of a new and free computational program for chemometric analysis denominated IMMAN (acronym for Information theory-based CheMoMetrics ANalysis) are presented. This is multi-platform software developed in the Java programming language, designed with a rem...

Bibliographic Details
Published in: Molecular diversity 2015-05, Vol.19 (2), p.305-319
Main Authors: Urias, Ricardo W. Pino, Barigye, Stephen J., Marrero-Ponce, Yovani, García-Jacas, César R., Valdes-Martiní, José R., Perez-Gimenez, Facundo
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c442t-b08c02cf81274b348405b13776e60e42d9886ea9d90361aacb37c5d360d7ffe33
cites cdi_FETCH-LOGICAL-c442t-b08c02cf81274b348405b13776e60e42d9886ea9d90361aacb37c5d360d7ffe33
container_end_page 319
container_issue 2
container_start_page 305
container_title Molecular diversity
container_volume 19
creator Urias, Ricardo W. Pino
Barigye, Stephen J.
Marrero-Ponce, Yovani
García-Jacas, César R.
Valdes-Martiní, José R.
Perez-Gimenez, Facundo
description The features and theoretical background of a new and free computational program for chemometric analysis denominated IMMAN (acronym for Information theory-based CheMoMetrics ANalysis) are presented. This is multi-platform software developed in the Java programming language, designed with a remarkably user-friendly graphical interface for the computation of a collection of information-theoretic functions adapted for rank-based unsupervised and supervised feature selection tasks. A total of 20 feature selection parameters are presented, with the unsupervised and supervised frameworks represented by 10 approaches in each case. Several information-theoretic parameters traditionally used as molecular descriptors (MDs) are adapted for use as unsupervised rank-based feature selection methods. On the other hand, a generalization scheme for the previously defined differential Shannon’s entropy is discussed, as well as the introduction of Jeffreys information measure for supervised feature selection. Moreover, well-known information-theoretic feature selection parameters, such as information gain, gain ratio, and symmetrical uncertainty, are incorporated into the IMMAN software (http://mobiosd-hub.com/imman-soft/), following an equal-interval discretization approach. IMMAN offers data pre-processing functionalities, such as missing values processing, dataset partitioning, and browsing. Moreover, single parameter or ensemble (multi-criteria) ranking options are provided. Consequently, this software is suitable for tasks like dimensionality reduction, feature ranking, as well as comparative diversity analysis of data matrices. Simple examples of applications performed with this program are presented. A comparative study between IMMAN and WEKA feature selection tools using the Arcene dataset was performed, demonstrating similar behavior. In addition, it is revealed that the use of IMMAN unsupervised feature selection methods improves the performance of both IMMAN and WEKA supervised algorithms. Graphical abstract: Graphic representation for Shannon’s distribution of MD calculating software.
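The supervised measures named in the abstract (information gain, gain ratio, and symmetrical uncertainty, computed after equal-interval discretization) can be illustrated with a minimal sketch. This is not IMMAN's implementation: it uses only the standard textbook definitions, and the function names, the `n_bins` default, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def equal_interval_bins(values, n_bins=10):
    """Map each value to one of n_bins equal-width intervals (hypothetical helper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature falls entirely into a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value lands exactly on the upper edge, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def conditional_entropy(labels, bins):
    """H(class | feature): entropy of the labels within each bin, weighted by bin size."""
    n = len(labels)
    groups = {}
    for b, y in zip(bins, labels):
        groups.setdefault(b, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def supervised_scores(values, labels, n_bins=10):
    """Information gain, gain ratio and symmetrical uncertainty for one feature."""
    bins = equal_interval_bins(values, n_bins)
    h_class = entropy(labels)
    h_feature = entropy(bins)
    info_gain = h_class - conditional_entropy(labels, bins)
    gain_ratio = info_gain / h_feature if h_feature > 0 else 0.0
    total = h_class + h_feature
    sym_unc = 2.0 * info_gain / total if total > 0 else 0.0
    return {
        "information_gain": info_gain,
        "gain_ratio": gain_ratio,
        "symmetrical_uncertainty": sym_unc,
    }


if __name__ == "__main__":
    # Toy data (assumed): the first feature separates the classes, the second does not.
    labels = [0, 0, 1, 1]
    print(supervised_scores([0.0, 0.1, 0.9, 1.0], labels, n_bins=2))
    print(supervised_scores([0.0, 1.0, 0.0, 1.0], labels, n_bins=2))
```

Ranking features then amounts to sorting them by one of these scores in descending order. Combining several scores into one ranking, as the ensemble option described above does, would need an aggregation rule that the abstract does not specify.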
doi_str_mv 10.1007/s11030-014-9565-z
format article
identifier EISSN: 1573-501X
identifier DOI: 10.1007/s11030-014-9565-z
identifier PMID: 25620721
publisher Cham: Springer International Publishing
rights Springer International Publishing Switzerland 2015
fulltext fulltext
identifier ISSN: 1381-1991
ispartof Molecular diversity, 2015-05, Vol.19 (2), p.305-319
issn 1381-1991
1573-501X
language eng
recordid cdi_proquest_miscellaneous_1673073442
source Springer Nature
subjects Algorithms
Biochemistry
Biomedical and Life Sciences
Full-Length Paper
Information management
Life Sciences
Metric system
Models, Theoretical
Organic Chemistry
Pharmacy
Polymer Sciences
Software
Theory
title IMMAN: free software for information theory-based chemometric analysis
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T11%3A01%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=IMMAN:%20free%20software%20for%20information%20theory-based%20chemometric%20analysis&rft.jtitle=Molecular%20diversity&rft.au=Urias,%20Ricardo%20W.%20Pino&rft.date=2015-05-01&rft.volume=19&rft.issue=2&rft.spage=305&rft.epage=319&rft.pages=305-319&rft.issn=1381-1991&rft.eissn=1573-501X&rft_id=info:doi/10.1007/s11030-014-9565-z&rft_dat=%3Cproquest_cross%3E3651907061%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c442t-b08c02cf81274b348405b13776e60e42d9886ea9d90361aacb37c5d360d7ffe33%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1672348408&rft_id=info:pmid/25620721&rfr_iscdi=true