A new training algorithm for long short-term memory artificial neural network based on particle swarm optimization
Published in: Granular Computing (Internet), 2023-11, Vol. 8 (6), pp. 1645–1658
Format: Article
Language: English
Summary: The long short-term memory (LSTM) deep artificial neural network is the artificial neural network most commonly used in the literature to solve forecasting problems, and it is usually trained with the Adam algorithm, a derivative-based method. Derivative-based methods are adversely affected by local optima, and their training results can have large variance because of randomly chosen initial weights. In this study, a new training algorithm is proposed that is less affected by the local optimum problem and has lower variance with respect to the random selection of initial weights. The proposed training algorithm is based on particle swarm optimization (PSO), an artificial intelligence optimization method used to solve numerical optimization problems. Because PSO does not need the derivative of the objective function and searches the space with more than one candidate solution, its probability of getting stuck in a local optimum is lower than that of derivative-based algorithms. The training algorithm also incorporates a restart strategy and an early-stopping condition to mitigate overfitting. To test the proposed training algorithm, 10 time series obtained from FTSE stock exchange data sets are used. The proposed training algorithm is compared with the Adam algorithm and with other ANN types using various statistics and statistical hypothesis tests. The application results show that the proposed training algorithm improves the results of LSTM, is more successful than the Adam algorithm, and that the LSTM trained with the proposed algorithm gives superior forecasting performance compared with other ANN types.
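The abstract's core idea is that PSO treats each candidate weight vector as a particle and needs only loss evaluations, not gradients. The sketch below is hypothetical and not the authors' code: it trains a tiny linear autoregressive forecaster (standing in for the paper's LSTM, whose weights would simply make the particle vector longer) by minimizing one-step-ahead forecast MSE with a standard global-best PSO; all function names and hyperparameters are illustrative assumptions.

```python
import random

random.seed(0)

def forecast_mse(weights, series, lag=2):
    # One-step-ahead forecast error: y_t ≈ w0 + w1*y_{t-1} + w2*y_{t-2}
    err, n = 0.0, 0
    for t in range(lag, len(series)):
        pred = weights[0] + sum(weights[i + 1] * series[t - 1 - i] for i in range(lag))
        err += (pred - series[t]) ** 2
        n += 1
    return err / n

def pso_train(loss, dim, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5):
    # Each particle is a full candidate weight vector; no derivatives are used.
    pos = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # per-particle best positions
    pbest_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g] # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Velocity update: inertia + cognitive pull + social pull
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy series generated by y_t = 0.5*y_{t-1} + 0.3*y_{t-2} + small noise
series = [1.0, 0.8]
for _ in range(98):
    series.append(0.5 * series[-1] + 0.3 * series[-2] + random.gauss(0, 0.01))

best_w, best_mse = pso_train(lambda w_: forecast_mse(w_, series), dim=3)
print(best_w, best_mse)
```

The paper's restart strategy and early-stopping condition are not shown here; they would wrap `pso_train`, re-initializing the swarm when progress stalls and halting on a validation-loss criterion.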
ISSN: 2364-4966, 2364-4974
DOI: 10.1007/s41066-023-00389-8