Stability-driven CNN training with Lyapunov-based dynamic learning rate
Tang, Dahao, Yang, Nan, Deng, Yongkun, Zhang, Yuning, Sani, Abubakar Sadiq and Yuan, Dong (2024) Stability-driven CNN training with Lyapunov-based dynamic learning rate. In: Databases Theory and Applications: 35th Australasian Database Conference, ADC 2024, Gold Coast, QLD, Australia, and Tokyo, Japan, December 16–18, 2024, Proceedings. Lecture Notes in Computer Science, 15449 . Springer, pp. 58-70. ISBN 978-9819612413 ISSN 0302-9743 (Print), 1611-3349 (Online) (doi:10.1007/978-981-96-1242-0_5)
PDF (Author's Accepted Manuscript): 48299 SANI_Stability_Driven_CNN_Training_with_Lyapunov_Based_Dynamic_Learning_Rate_(AAM)_2024.pdf (212kB). Accepted Version, restricted to repository staff only until 13 December 2025.
Abstract
In recent years, Convolutional Neural Networks (CNNs) have become a cornerstone of computer vision, but ensuring stable training remains a challenge, especially with high learning rates or large datasets, since standard optimizers such as Stochastic Gradient Descent (SGD) can suffer from oscillations and slow convergence. In this paper, we leverage control theory to propose a novel stability-driven training method: we model the CNN training process as a dynamic control system and introduce Lyapunov stability analysis, implemented with a quadratic Lyapunov function, to guide real-time learning rate adjustments, ensuring stability and faster convergence. We provide both theoretical insights and practical guidelines for implementing the learning rate adaptation. We examine the effectiveness of this approach in mitigating oscillations and improving training performance by comparing the proposed Lyapunov-stability-enhanced SGD, termed SGD-DLR (SGD with Lyapunov-based Dynamic Learning Rate), against traditional SGD with a fixed learning rate. Experiments on the CIFAR-10 and CIFAR-100 datasets demonstrate that SGD-DLR enhances both stability and performance, outperforming standard SGD. The code used for the experiments has been released on GitHub: https://github.com/DahaoTang/ADC-2024-SGD_DLR.
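The paper's exact update rule is given in the proceedings (and in the linked GitHub repository); as a rough illustration of the general idea only, the following is a minimal sketch of Lyapunov-guided learning rate adaptation. All function names, hyperparameters, and the accept/shrink rule here are our assumptions for illustration, not the authors' algorithm: we take the loss itself as a quadratic Lyapunov candidate V(w), accept a gradient step only when it decreases V (ΔV < 0), shrink the learning rate when the stability condition is violated, and cautiously grow it otherwise.

```python
import numpy as np

def sgd_dlr_sketch(loss_fn, grad_fn, w0, eta0=0.5, shrink=0.5, grow=1.05,
                   eta_min=1e-8, eta_max=1.0, steps=200):
    """Gradient descent with a Lyapunov-style dynamic learning rate.

    Candidate Lyapunov function V(w) = loss(w): a trial step is accepted
    only if it decreases V (Delta V < 0); otherwise the step is rejected
    and the learning rate is shrunk. Accepted steps gently grow the rate.
    """
    w = np.asarray(w0, dtype=float)
    eta, V = eta0, loss_fn(w)
    for _ in range(steps):
        w_trial = w - eta * grad_fn(w)
        V_trial = loss_fn(w_trial)
        if V_trial < V:                       # stability condition holds
            w, V = w_trial, V_trial
            eta = min(eta * grow, eta_max)    # cautiously speed up
        else:                                 # Delta V >= 0: unstable step
            eta = max(eta * shrink, eta_min)  # back off
    return w, V

# Ill-conditioned quadratic, loss(w) = 0.5 * w^T diag(1, 100) w, where a
# fixed learning rate of 0.5 would diverge along the steep direction.
H = np.diag([1.0, 100.0])
loss = lambda w: 0.5 * float(w @ H @ w)
grad = lambda w: H @ w

w_final, V_final = sgd_dlr_sketch(loss, grad, w0=[1.0, 1.0], eta0=0.5)
```

Because steps are accepted only when V decreases, the loss trajectory is monotonically non-increasing by construction, which is the stability guarantee a Lyapunov argument formalizes; the paper develops this for the full CNN training dynamics rather than a toy quadratic.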
| Item Type: | Conference Proceedings |
|---|---|
| Title of Proceedings: | Databases Theory and Applications: 35th Australasian Database Conference, ADC 2024, Gold Coast, QLD, Australia, and Tokyo, Japan, December 16–18, 2024, Proceedings |
| Uncontrolled Keywords: | convolutional neural networks, control theory, Lyapunov stability analysis, learning rate |
| Subjects: | Q Science > Q Science (General); Q Science > QA Mathematics; Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
| Faculty / School / Research Centre / Research Group: | Faculty of Engineering & Science; Faculty of Engineering & Science > School of Computing & Mathematical Sciences (CMS) |
| Last Modified: | 22 Jan 2025 11:06 |
| URI: | http://gala.gre.ac.uk/id/eprint/48299 |