Using Supercomputer to Speed up Neural Network Training
Main Authors: |  |
---|---|
Format: | Conference Proceeding |
Language: | English; Japanese |
Online Access: | Request full text |
Summary: | Recent work in deep learning has shown that large models can dramatically improve performance. In this paper, we accelerate deep network training using many GPUs. We have developed a framework based on Caffe, called Caffe-HPC, that can utilize computing clusters with multiple GPUs to train large models. Caffe[6] provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. Caffe-HPC retains all the features of the original Caffe, so a model trained on the original Caffe can continue to be trained on Caffe-HPC; this provides a convenient solution for people who use Caffe and want to speed up training. Using an asynchronous stochastic gradient descent (SGD) optimizer, we achieved a good speedup when training a CNN model on the ILSVRC[5] 2012 dataset, and we compared the convergence of different SGD algorithms. We believe our work makes it possible to train larger networks on larger training sets in a reasonable amount of time. |
---|---|
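The asynchronous SGD scheme the summary refers to can be illustrated with a minimal, self-contained sketch: several workers read and update a shared parameter without synchronizing on each step (Hogwild!-style). This is an illustration on a toy 1-D least-squares problem, not the Caffe-HPC implementation; all names here (`make_data`, `async_sgd`) are hypothetical.

```python
import random
import threading

def make_data(n=200, true_w=3.0, seed=0):
    """Toy 1-D least-squares data: y = true_w * x."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, true_w * x) for x in xs]

def async_sgd(data, workers=4, steps=2000, lr=0.05):
    """Run several threads that update a shared weight without locking."""
    w = [0.0]  # shared parameter (one-element list so all threads see updates)

    def worker(seed):
        rng = random.Random(seed)
        for _ in range(steps):
            x, y = rng.choice(data)
            grad = 2.0 * (w[0] * x - y) * x  # gradient of (w*x - y)^2
            w[0] -= lr * grad                # unsynchronized ("hogwild") step

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w[0]

if __name__ == "__main__":
    # Occasional lost updates from races only delay convergence; on this
    # convex problem the shared weight still settles near the true value 3.0.
    print(async_sgd(make_data()))
```

The key property, which the paper exploits at cluster scale, is that workers never wait for one another: stale or conflicting gradient updates are tolerated, trading a little statistical efficiency for much higher hardware utilization.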
ISSN: | 1521-9097; 2690-5965 |
DOI: | 10.1109/ICPADS.2016.0126 |