Decoding Optimization Algorithms for Convolutional Neural Networks in Time Series Regression Tasks



Author(s)

Deep Karan Singh 1,*, Nisha Rawat 2

1. India Meteorological Department, MoES, Visakhapatnam, India

2. Meteorological Office, INS Dega, Visakhapatnam, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2023.06.04

Received: 2 Jul. 2023 / Revised: 7 Sep. 2023 / Accepted: 21 Oct. 2023 / Published: 8 Dec. 2023

Index Terms

Optimizers, Convolutional Neural Networks, Regression, Temperature Prediction, Performance Comparison

Abstract

Optimization algorithms play a vital role in training deep learning models effectively. This paper presents a comprehensive comparative analysis of optimization algorithms for Convolutional Neural Networks (CNNs) in the context of time series regression. The study focuses on the specific application of maximum temperature prediction, using a dataset of historical temperature records. The primary objective is to investigate the performance of different optimizers and evaluate their impact on the accuracy and convergence properties of the CNN model. Experiments were conducted with different optimizers, including Stochastic Gradient Descent (SGD), RMSprop, Adagrad, Adadelta, Adam, and Adamax, while keeping all other factors constant. Their performance was evaluated and compared using mean squared error (MSE), mean absolute error (MAE), root mean squared error (RMSE), R-squared (R²), mean absolute percentage error (MAPE), and explained variance score (EVS), which measure the predictive accuracy and generalization capability of the models. Learning curves were also analyzed to observe the convergence behavior of each optimizer. The experimental results reveal significant variations in convergence speed, accuracy, and robustness among the optimizers. By comprehensively evaluating and comparing these optimization algorithms, the study provides insights into their performance characteristics for time series regression with CNN models, contributing to the understanding of optimizer selection and its impact on model performance and assisting researchers and practitioners in choosing the most suitable optimization algorithm for time series regression tasks.
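The sketch below illustrates the kind of experimental setup the abstract describes; it is not the authors' code. It trains the same small 1D CNN once per optimizer on a sliding-window series and reports the listed metrics, assuming TensorFlow/Keras and scikit-learn; the window length, layer sizes, epoch count, and the synthetic temperature-like series are illustrative assumptions.

# Minimal sketch of an optimizer comparison for 1D-CNN time series regression.
# Not the authors' implementation; data and hyperparameters are placeholders.
import numpy as np
import tensorflow as tf
from sklearn.metrics import (mean_squared_error, mean_absolute_error, r2_score,
                             mean_absolute_percentage_error, explained_variance_score)

def make_windows(series, window=30):
    """Turn a 1-D series into (samples, window, 1) inputs and next-step targets."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., None], y

# Synthetic stand-in for a historical maximum-temperature record (seasonal cycle + noise).
rng = np.random.default_rng(0)
t = np.arange(5000)
series = 30.0 + 8.0 * np.sin(2 * np.pi * t / 365) + rng.normal(0.0, 1.5, t.size)

X, y = make_windows(series)
split = int(0.8 * len(X))
X_train, X_test, y_train, y_test = X[:split], X[split:], y[:split], y[split:]

optimizers = {
    "SGD": tf.keras.optimizers.SGD(),
    "RMSprop": tf.keras.optimizers.RMSprop(),
    "Adagrad": tf.keras.optimizers.Adagrad(),
    "Adadelta": tf.keras.optimizers.Adadelta(),
    "Adam": tf.keras.optimizers.Adam(),
    "Adamax": tf.keras.optimizers.Adamax(),
}

for name, opt in optimizers.items():
    # Identical architecture and data for every run, so only the update rule varies.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(X.shape[1], 1)),
        tf.keras.layers.Conv1D(32, kernel_size=3, activation="relu"),
        tf.keras.layers.MaxPooling1D(2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=opt, loss="mse")
    history = model.fit(X_train, y_train, validation_split=0.1,
                        epochs=20, batch_size=64, verbose=0)
    # history.history["loss"] / ["val_loss"] give the learning curves discussed above.

    y_pred = model.predict(X_test, verbose=0).ravel()
    mse = mean_squared_error(y_test, y_pred)
    print(f"{name:9s} MSE={mse:.3f} RMSE={np.sqrt(mse):.3f} "
          f"MAE={mean_absolute_error(y_test, y_pred):.3f} "
          f"R2={r2_score(y_test, y_pred):.3f} "
          f"MAPE={mean_absolute_percentage_error(y_test, y_pred):.3f} "
          f"EVS={explained_variance_score(y_test, y_pred):.3f}")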

Cite This Paper

Deep Karan Singh, Nisha Rawat, "Decoding Optimization Algorithms for Convolutional Neural Networks in Time Series Regression Tasks", International Journal of Information Technology and Computer Science (IJITCS), Vol.15, No.6, pp.37-49, 2023. DOI: 10.5815/ijitcs.2023.06.04
