Fast Training Methods and Their Experiments for Deep Learning CNN Models


Bibliographic Details
Main Authors: Jiang, Shanshan; Wang, Sheng-Guo
Format: Conference Proceeding
Language: English
Description
Summary: Artificial intelligence and deep learning have broad application areas, including medical diagnosis, industrial applications, data analysis, identification, system control, self-driving, and games. Deep learning methods exploit neural networks to solve complex tasks. Network training speed depends strongly on the hardware environment and the training methodology: training a Convolutional Neural Network (CNN) on a large dataset can last for weeks with a limited number of GPUs. Although cloud training can complete large networks within a short time, it is usually costly. Moreover, since a good network needs extensive parameter tuning to reach an optimal result, it must be trained multiple times with different parameter settings. The training speed and cost of large networks have therefore become a constraint on verifying algorithmic results when computing resources are limited. Thus, fast training of large networks with limited computing resources is needed and challenging, especially on large datasets. In this paper, we propose new fast training methods that accelerate CNN training at the data sampling level. We introduce a flat reduced random sampling strategy and a bottleneck reduced random sampling strategy for deep learning model training. Furthermore, a theoretical analysis of the proposed fast training methods is derived via four theorems and two corollaries, showing the properties and benefits of the new methods. The fast training methods are evaluated on three image classification datasets: CIFAR-10, CIFAR-100, and ImageNet. The experimental results show that the proposed training methods achieve a significant percentage reduction in training time at a very small accuracy cost, providing fast training for deep learning CNN models as the theoretical analysis predicts.
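
The abstract names the two sampling strategies but this record does not give their exact procedures. The minimal PyTorch sketch below illustrates one plausible reading: each epoch trains on a randomly drawn subset of the data, with a fixed fraction for the "flat" strategy and an epoch-dependent fraction standing in for the "bottleneck" variant. The function names reduced_loader and bottleneck_ratio, the keep_ratio values, and the per-epoch resampling schedule are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: the paper's exact flat/bottleneck sampling rules
# are not given in this record, so the ratios and schedule below are
# assumptions chosen for demonstration.
import random
from torch.utils.data import DataLoader, SubsetRandomSampler
from torchvision import datasets, transforms

def reduced_loader(dataset, keep_ratio, batch_size=128):
    # Draw a fresh random subset of the training set for one epoch.
    n = len(dataset)
    indices = random.sample(range(n), int(keep_ratio * n))
    return DataLoader(dataset, batch_size=batch_size,
                      sampler=SubsetRandomSampler(indices))

def bottleneck_ratio(epoch, total_epochs, low=0.3, high=1.0):
    # Hypothetical "bottleneck" schedule: train on less data in the
    # middle epochs, and on (nearly) the full set at the start and end.
    mid = total_epochs / 2
    return high - (high - low) * (1 - abs(epoch - mid) / mid)

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())

total_epochs = 10
for epoch in range(total_epochs):
    # Flat strategy: a fixed ratio, e.g. 0.5, every epoch.
    # Bottleneck strategy: an epoch-dependent ratio, as below.
    ratio = bottleneck_ratio(epoch, total_epochs)
    loader = reduced_loader(train_set, keep_ratio=ratio)
    for images, labels in loader:
        pass  # forward/backward pass of the CNN would go here
```

Resampling a fresh subset each epoch, rather than fixing one subset for all of training, keeps the expected coverage of the full dataset high across epochs, which is one plausible reason the accuracy cost of reduced sampling can stay small.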
ISSN: 2161-2927
DOI: 10.23919/CCC52363.2021.9549817