
Adaptive activation functions accelerate convergence in deep and physics-informed neural networks

Bibliographic Details
Published in: Journal of Computational Physics, 2020-03, Vol. 404 (C), p. 109136, Article 109136
Main Authors: Jagtap, Ameya D., Kawaguchi, Kenji, Karniadakis, George Em
Format: Article
Language: English
Description
Summary:
Highlights:
• We employ adaptive activation functions in deep and physics-informed neural networks.
• The proposed method is very simple and is shown to accelerate convergence in neural networks.
• In particular, we approximate various nonlinear functions using deep neural networks.
• Physics-informed neural networks are employed to solve both forward and inverse problems of PDEs.
• We also solve standard deep learning benchmark problems and theoretically prove convergence results.

Abstract: We employ adaptive activation functions for regression in deep and physics-informed neural networks (PINNs) to approximate smooth and discontinuous functions as well as solutions of linear and nonlinear partial differential equations. In particular, we solve the nonlinear Klein-Gordon equation, which has smooth solutions, the nonlinear Burgers equation, which can admit high-gradient solutions, and the Helmholtz equation. We introduce a scalable hyper-parameter in the activation function, which can be optimized to achieve the best performance of the network, as it dynamically changes the topology of the loss function involved in the optimization process. The adaptive activation function has better learning capabilities than the traditional (fixed) activation, as it greatly improves the convergence rate, especially during early training, as well as the solution accuracy. To better understand the learning process, we plot the neural network solution in the frequency domain to examine how the network successively captures the different frequency bands present in the solution. We consider both forward problems, where the approximate solutions are obtained, and inverse problems, where parameters involved in the governing equation are identified. Our simulation results show that the proposed method is a very simple and effective approach to increase the efficiency, robustness, and accuracy of the neural network approximation of nonlinear functions as well as solutions of partial differential equations, especially for forward problems. We theoretically prove that, in the proposed method, gradient descent algorithms are not attracted to suboptimal critical points or local minima. Furthermore, the proposed adaptive activation functions are shown to accelerate the minimization of the loss values in standard deep learning benchmarks using the CIFAR-10, CIFAR-100, SVHN, MNIST, KMNIST, Fashion-MNIST, and Semeion datasets, with and without data augmentation.
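To make the core idea in the abstract concrete, the sketch below shows one way a trainable scale parameter can be embedded in the activation of each layer and optimized jointly with the weights for a simple regression task. This is a minimal illustration assuming a PyTorch setup; the layer class, the fixed scale factor n = 10, the initialization a = 0.1, and the toy target function are assumptions for exposition, not the authors' implementation.

```python
# Minimal sketch: activation tanh(n * a * z) with a trainable slope "a" per layer.
# All names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

class AdaptiveTanhLayer(nn.Module):
    def __init__(self, in_features, out_features, n=10.0):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.n = n                                # fixed scale factor (assumed value)
        self.a = nn.Parameter(torch.tensor(0.1))  # trainable slope, initialized so n * a = 1

    def forward(self, x):
        # The trainable "a" rescales the pre-activation, changing the effective
        # slope of the activation (and hence the loss landscape) during training.
        return torch.tanh(self.n * self.a * self.linear(x))

# Small fully connected network for function regression, e.g. y = sin(x).
model = nn.Sequential(
    AdaptiveTanhLayer(1, 32),
    AdaptiveTanhLayer(32, 32),
    nn.Linear(32, 1),
)

x = torch.linspace(-3.0, 3.0, 256).unsqueeze(-1)
y = torch.sin(x)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    opt.zero_grad()
    loss = torch.mean((model(x) - y) ** 2)  # mean-squared regression loss
    loss.backward()                         # gradients flow into the weights and each layer's "a"
    opt.step()
```

In a PINN setting, the same trainable slope would simply appear inside the network whose residual of the governing PDE is added to the loss; the adaptive parameters are then updated by the same gradient-based optimizer as the weights and biases.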
ISSN: 0021-9991, 1090-2716
DOI: 10.1016/j.jcp.2019.109136