Research on the Feature Selection Approach Based on IFDS and DPSO With Variable Thresholds in Complex Data Environments

Bibliographic Details
Published in: IEEE Access, 2020, Vol. 8, p. 58645-58659
Main Authors: Hu, Yanan; Li, Chunsheng; Zhang, Kejia; Wang, Mei; Gao, Yatian
Format: Article
Language: English
Description
Summary: The neighborhood rough model is widely used for feature selection on data that are high-dimensional, fuzzy, incomplete, or a mix of continuous and discrete attributes, and its application depends on the neighborhood threshold. In practice, a point-valued neighborhood threshold is not adaptive, which leads to low classification accuracy and high time complexity. To address these problems, a feature selection approach based on IFDS (Incomplete Fuzzy Hybrid Decision System) and DPSO (Discrete Particle Swarm Optimization) with variable thresholds is proposed. First, a neighborhood rough model capable of simultaneously processing fuzzy, hybrid, and incomplete data is established; the average reachable distance is introduced to construct the attribute neighborhood threshold set and to reduce the interference of noisy data with classification accuracy. Second, the DPSO particle fitness function is constructed from the feature subset length, the attribute significance, and the negative domain of the neighborhood, and the inertia weight computation is improved, so as to increase feature selection speed and feature subset quality. Finally, a simulation experiment is performed on real industrial production data. The results show that the method has clear advantages in classification accuracy, search speed, and the quality of the optimal feature subset.
ISSN: 2169-3536
DOI:10.1109/ACCESS.2020.2980162
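
The summary describes a discrete PSO whose particles encode feature subsets and whose fitness rewards short subsets with high discriminating power. The sketch below illustrates that general idea only and is not the authors' algorithm. It uses the standard binary PSO update with a sigmoid transfer function, a single fixed neighborhood threshold `delta` rather than the variable thresholds derived from the average reachable distance, a plain linearly decreasing inertia weight rather than the improved one, and a simplified dependency measure (the fraction of samples whose neighborhood contains a single class) in place of the paper's combination of attribute significance and negative domain. All function names, parameters, and constants (`alpha`, `delta`, the acceleration coefficients) are assumptions made for illustration.

```python
import math
import random


def neighborhood_dependency(X, y, features, delta):
    """Fraction of samples whose delta-neighborhood on `features` is label-pure."""
    if not features:
        return 0.0
    n = len(X)
    consistent = 0
    for i in range(n):
        pure = True
        for j in range(n):
            dist = math.sqrt(sum((X[i][f] - X[j][f]) ** 2 for f in features))
            if dist <= delta and y[i] != y[j]:
                pure = False
                break
        consistent += pure
    return consistent / n


def fitness(X, y, mask, delta, alpha=0.9):
    """Weighted sum of dependency and relative subset shortness (assumed form)."""
    features = [f for f, bit in enumerate(mask) if bit]
    n_feat = len(mask)
    dep = neighborhood_dependency(X, y, features, delta)
    return alpha * dep + (1 - alpha) * (n_feat - len(features)) / n_feat


def dpso_select(X, y, delta=0.2, n_particles=10, n_iter=30, seed=0):
    """Binary PSO over 0/1 feature masks; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(n_particles)]
    vel = [[0.0] * n_feat for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p, delta) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for t in range(n_iter):
        # Linearly decreasing inertia weight (an assumption, not the paper's rule).
        w = 0.9 - 0.5 * t / max(1, n_iter - 1)
        for i in range(n_particles):
            for d in range(n_feat):
                vel[i][d] = (w * vel[i][d]
                             + 2.0 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 2.0 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, vel[i][d]))  # clamp velocity
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))   # sigmoid transfer
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(X, y, pos[i], delta)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit
```

Calling `dpso_select(X, y)` on a list of numeric rows `X` with labels `y` returns a 0/1 mask over the columns together with its fitness. Attributes are assumed to be scaled to comparable ranges so that one `delta` is meaningful across them; fuzzy and missing values, which the paper's model handles, are outside the scope of this sketch.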