A grid-quadtree model selection method for support vector machines
Published in: | Expert systems with applications 2020-05, Vol.146, p.113172, Article 113172 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • The quadtree does not evaluate unnecessary regions of the hyperparameter space.
• The grid-quadtree determines parameters for large data sets efficiently and quickly.
• The quadtree significantly decreases the computational time of the grid search.
• The parameters determined by the grid-quadtree provide high accuracy for the SVM.
This paper proposes a new model selection approach for the Support Vector Machine (SVM), named grid-quadtree (GQ), which integrates the quadtree technique with the grid search. The developed method is the first in the literature to apply the quadtree to SVM parameter optimization. The SVM is a machine-learning technique for pattern recognition whose performance depends on how its parameters are set. Thus, the model selection problem for SVM is an important field of study and requires expert and intelligent systems to solve it. Real classification data sets involve a huge number of instances and features, and the larger the training data set dimension, the higher the cost of a recognition system. The grid search (GS) is the most popular and the simplest method for selecting SVM parameters. However, it is time-consuming, which limits its application to large problems. With this in mind, the main idea of this research is to apply the quadtree technique to the GS to make it faster. This may lower the computational time for solving problems such as bio-identification, bank credit risk and cancer detection. Based on the asymptotic behaviors of the SVM, it was observed that the quadtree is able to avoid evaluating the full GS search space. As a consequence, the GQ evaluates fewer parameter combinations and solves the same problem much more efficiently. To assess the GQ performance, ten classification benchmark data sets were used. The results were compared with those of the traditional GS. The outcomes showed that the GQ is able to find parameters that are as good as those found by the GS while executing 78.8124% to 85.8415% fewer operations. This research indicates that adopting the quadtree substantially reduces the computational time of the original GS, making it much more efficient for high-dimensional and large data sets. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2019.113172 |
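
The summary describes replacing an exhaustive grid search with a quadtree that skips regions of the parameter space. The record gives no algorithmic detail, so the following is only a minimal sketch of generic quadtree refinement over a (log2 C, log2 gamma) grid, not the authors' GQ procedure. Everything in it is an assumption made for illustration: the names `cv_score` and `quadtree_search`, the grid bounds, the tolerance `tol`, the forced `min_depth`, and above all the synthetic score surface that stands in for cross-validated SVM accuracy.

```python
import math


def cv_score(log2_c, log2_gamma):
    """Hypothetical stand-in for cross-validated SVM accuracy.

    A real run would train an SVM with C = 2**log2_c and gamma = 2**log2_gamma.
    Here the surface is a flat 0.5 plateau with one smooth bump at (5, -7).
    """
    d2 = (log2_c - 5) ** 2 + (log2_gamma + 7) ** 2
    return 0.5 + 0.45 * math.exp(-d2 / 8.0)


def quadtree_search(score, c_lo, c_hi, g_lo, g_hi, tol=0.01, min_depth=2):
    """Refine only the cells whose sampled scores differ by more than tol."""
    cache = {}  # (log2_c, log2_gamma) -> score; each point is evaluated once

    def evaluate(c, g):
        if (c, g) not in cache:
            cache[(c, g)] = score(c, g)
        return cache[(c, g)]

    def visit(c0, c1, g0, g1, depth):
        cm, gm = (c0 + c1) // 2, (g0 + g1) // 2
        # Sample the four corners and the (integer) center of the cell.
        samples = [evaluate(c, g) for c in (c0, c1) for g in (g0, g1)]
        samples.append(evaluate(cm, gm))
        # Stop when the cell cannot be split any further.
        if c1 - c0 <= 1 and g1 - g0 <= 1:
            return
        # Stop when the cell looks homogeneous (assumed pruning rule).
        if depth >= min_depth and max(samples) - min(samples) <= tol:
            return
        # Otherwise split into four quadrants and recurse.
        for ca, cb in ((c0, cm), (cm, c1)):
            for ga, gb in ((g0, gm), (gm, g1)):
                visit(ca, cb, ga, gb, depth + 1)

    visit(c_lo, c_hi, g_lo, g_hi, 0)
    best = max(cache.items(), key=lambda kv: kv[1])
    return best, len(cache)


# Assumed search range: log2 C in [-5, 11], log2 gamma in [-13, 3].
best, n_evals = quadtree_search(cv_score, -5, 11, -13, 3)
full_grid = 17 * 17  # points an exhaustive grid search would evaluate
print(f"best (log2 C, log2 gamma) = {best[0]}, score = {best[1]:.4f}")
print(f"evaluated {n_evals} of {full_grid} grid points")
```

The corner-and-center homogeneity test is a heuristic: a narrow peak that falls between the sampled points of a flat-looking cell would be missed, which is why the sketch forces a few levels of subdivision before pruning. The savings on this toy surface say nothing about the percentages reported in the summary, which come from the authors' own method and benchmarks.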