
Convergence analysis of feedforward neural networks using the online gradient method with smoothing L1 regularization

Journal: International Journal of Advanced Technology and Engineering Exploration (IJATEE) (Vol.11, No. 117)

Publication Date:

Authors :

Page : 1127-1142

Keywords : Online gradient method; Smoothing function; L1 regularization; Convergence; Feedforward neural network.

Source : Download | Find it from : Google Scholar

Abstract

The online gradient method is one of the simplest and most widely used training methods for feedforward neural networks (FFNNs). However, a problem can arise with this method: the weights sometimes grow very large, leading to overfitting. Regularization is a technique used to improve generalization performance and prevent overfitting in such networks. This paper focuses on the convergence analysis of an online gradient method with L1 regularization for training FFNNs. L1 regularization promotes sparse models but complicates the convergence analysis because the absolute value function it involves is not differentiable at the origin. To address this issue, an adaptive smoothing function is introduced into the error function to replace the L1 regularization term at the origin. This approach encourages a sparser network structure by forcing weights to become smaller during training and eliminating them after training. This strategy simplifies the network structure and accelerates convergence, as demonstrated by the numerical experiments presented in this paper. Additionally, it makes it possible to prove the convergence of the proposed training method. Numerical experiments based on 4-bit and 5-bit parity problems, Gaussian and hyperbolic function approximations, and Monk and Sonar classifications are provided to validate the theoretical findings and demonstrate the superiority of the proposed algorithm.
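To make the idea concrete, the sketch below illustrates one common way such a smoothed L1 penalty can be combined with an online gradient step. It is not taken from the paper: the Huber-style smoothing function, the single sigmoid unit, and the hyperparameters (mu, lam, lr) are assumptions chosen for illustration; the paper's actual smoothing function, network architecture, and learning settings may differ.

```python
import numpy as np

def smooth_l1(w, mu=1e-3):
    """Smoothed absolute value: quadratic inside [-mu, mu], |w| outside.
    (One common choice; the paper's exact smoothing function is not specified here.)"""
    return np.where(np.abs(w) >= mu, np.abs(w), w**2 / (2 * mu) + mu / 2)

def smooth_l1_grad(w, mu=1e-3):
    """Gradient of the smoothed penalty: sign(w) outside [-mu, mu], w/mu inside."""
    return np.where(np.abs(w) >= mu, np.sign(w), w / mu)

def online_update(w, x, y, lr=0.05, lam=1e-4, mu=1e-3):
    """One online gradient step for a single sigmoid unit, minimizing
    0.5 * (sigmoid(w.x) - y)**2 + lam * sum(smooth_l1(w))."""
    z = w @ x
    out = 1.0 / (1.0 + np.exp(-z))                 # sigmoid output
    err_grad = (out - y) * out * (1.0 - out) * x   # gradient of the squared-error term
    reg_grad = lam * smooth_l1_grad(w, mu)         # gradient of the smoothed L1 term
    return w - lr * (err_grad + reg_grad)

# Toy usage (assumed setup): one pass of sample-by-sample (online) updates.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.5, size=3)
data = [(np.array([0., 0., 1.]), 0.), (np.array([0., 1., 1.]), 1.),
        (np.array([1., 0., 1.]), 1.), (np.array([1., 1., 1.]), 0.)]
for x, y in data:
    w = online_update(w, x, y)
```

Because the quadratic piece makes the penalty differentiable at the origin, the update rule is well defined for all weight values, which is what allows a convergence argument of the kind described in the abstract; the L1-like behaviour away from the origin is what drives small weights toward zero and yields a sparser network.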

Last modified: 2024-09-04 16:14:57