Loading…

Ensemble of classifiers for handling biomedical problems

Machine learning and statistical techniques applied to gene expression data have been used to address the questions of distinguishing tumor morphology, predicting post treatment outcome, and finding molecular markers for disease. Today the classification of different morphologies, lineages and cell...

Full description

Saved in:
Bibliographic Details
Main Authors: Kotsiantis, S B, Tsagaraki, I
Format: Conference Proceeding
Language:English
Subjects:
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Machine learning and statistical techniques applied to gene expression data have been used to address the questions of distinguishing tumor morphology, predicting post treatment outcome, and finding molecular markers for disease. Today the classification of different morphologies, lineages and cell histologies can be performed successfully in many instances. The performance in predicting treatment outcome or drug response has been more limited but some of the results are quite promising. The scope of the research reported here is to investigate the efficiency of machine learning techniques in a number of different biomedical problems. To this end, a number of experiments have been conducted using representative learning algorithms. The decision of which particular method to choose is a complicated problem. A good alternative to choosing only one method is to create a hybrid forecasting system incorporating a number of possible solution methods as components (an ensemble of classifiers). For this purpose, we have implemented a hybrid decision support system that combines the representative algorithms using a stacking variant methodology and achieves better performance than any examined simple and ensemble method. In this work, we use a feature selection pre-process before the usage of the stacking. Feature subset selection is the process of identifying and removing as much irrelevant and redundant features as possible. This reduces the dimensionality of the data enabling the proposed ensemble to operate faster and more effectively.