An Optimized Architecture of Image Classification Using Convolutional Neural Network

pp. 30-39

Author(s)

Muhammad Aamir 1,*, Ziaur Rahman 1, Waheed Ahmed Abro 2, Muhammad Tahir 3, Syed Mustajar Ahmed 4

1. College of Computer Science, Sichuan University, No.24 South Section 1, Yihuan Road, Chengdu, China, 610065

2. School of Computer Science and Engineering, Southeast University, Sipailou No.2, Nanjing, China, 210096

3. School of Software Technology, Dalian University of Technology, Dalian, China, 116620

4. School of Computer Science and Electrical Engineering, Dalian University of Technology, Dalian, China, 116620

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2019.10.05

Received: 22 Jul. 2019 / Revised: 8 Aug. 2019 / Accepted: 23 Aug. 2019 / Published: 8 Oct. 2019

Index Terms

Convolutional neural network, deep learning, image classification, precision, recall.

Abstract

The convolutional neural network (CNN) is a type of deep neural network that has been widely used in visual recognition. Over the years, CNNs have gained considerable attention because of their strong capability for feature learning and image classification. However, many factors, such as the number of layers and their depth, the number of feature maps, the kernel size, and the batch size, must be analyzed to determine how they influence network performance. In this paper, we evaluate CNN performance by designing a simple architecture for image classification. We evaluate the proposed network on CIFAR-10, a widely used image dataset for detection and classification tasks. The experimental results show that the proposed network yields the best classification accuracy among the existing techniques it is compared with. This paper also aims to help researchers better understand CNN models for a variety of image classification tasks. In addition, it provides a brief introduction to CNNs and their applications in image processing, and discusses recent advances in region-based CNNs over the past few years.
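To make concrete how two of the factors named in the abstract (kernel size and number of feature maps) affect a network, the sketch below computes the parameter count and output size of a single standard convolution layer applied to a 32x32 RGB input, which is the CIFAR-10 image size. The function names and the layer settings in the loop are hypothetical and chosen only for illustration; they are not the architecture proposed in the paper.

```python
def conv2d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Learnable parameters in a standard 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def conv2d_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size (height or width) of a 2D convolution."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


if __name__ == "__main__":
    # Hypothetical settings for a first layer on 32x32 RGB images.
    for kernel_size in (3, 5):
        for feature_maps in (32, 64):
            params = conv2d_param_count(3, feature_maps, kernel_size)
            side = conv2d_output_size(32, kernel_size)
            print(
                f"kernel={kernel_size}x{kernel_size}, maps={feature_maps}: "
                f"{params} parameters, output {side}x{side}"
            )
```

Under these assumptions, doubling the number of feature maps doubles the layer's parameter count, while moving from a 3x3 to a 5x5 kernel increases the weight count by a factor of 25/9 and, without padding, shrinks the output.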

Cite This Paper

Muhammad Aamir, Ziaur Rahman, Waheed Ahmed Abro, Muhammad Tahir, Syed Mustajar Ahmed, "An Optimized Architecture of Image Classification Using Convolutional Neural Network", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol.11, No.10, pp. 30-39, 2019. DOI: 10.5815/ijigsp.2019.10.05
