IJIGSP Vol. 8, No. 3, 8 Mar. 2016
Keywords: Convolutional neural network, auto-encoders, pattern invariance, character recognition, Yoruba vowel characters
The ability of the human visual processing system to accommodate and retain a clear understanding or identification of patterns irrespective of their orientations is quite remarkable. In contrast, pattern invariance remains a common problem in intelligent recognition systems, and its importance cannot be overemphasized; indeed, one's definition of an intelligent system broadens once the large variability with which the same patterns can occur is considered. This research investigates and reviews the performance of convolutional networks, and their variant, convolutional auto-encoder networks, on recognition problems involving invariances such as translation, rotation, and scale. While various patterns could be used to examine this question, handwritten Yoruba vowel characters are used in this research. Databases of images containing patterns with the constraints of interest are collected, processed, and used to train and simulate the designed networks. We provide an extensive review of the architectures and learning paradigms of the considered networks, with particular attention to how built-in invariance is learned. Lastly, we provide a comparative analysis of the achieved error rates against back-propagation neural networks, denoising auto-encoders, stacked denoising auto-encoders, and deep belief networks.
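The abstract describes the networks only at this summary level; as a rough illustration of the kind of model discussed, the sketch below assembles a small convolutional auto-encoder in Python with Keras. The 32x32 grayscale input size, layer widths, optimizer, and loss are illustrative assumptions, not the authors' configuration.

    # Minimal convolutional auto-encoder sketch (assumed sizes, not the paper's exact network).
    from tensorflow.keras import layers, models

    def build_conv_autoencoder(input_shape=(32, 32, 1)):
        inputs = layers.Input(shape=input_shape)

        # Encoder: convolution + pooling yields feature maps that tolerate small shifts.
        x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(inputs)
        x = layers.MaxPooling2D((2, 2), padding='same')(x)
        x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
        encoded = layers.MaxPooling2D((2, 2), padding='same')(x)

        # Decoder: upsampling + convolution reconstructs the input character image.
        x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
        x = layers.UpSampling2D((2, 2))(x)
        x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(x)
        x = layers.UpSampling2D((2, 2))(x)
        decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

        model = models.Model(inputs, decoded)
        model.compile(optimizer='adam', loss='binary_crossentropy')
        return model

    # Unsupervised training reconstructs the inputs (pixel values scaled to [0, 1]);
    # the learned encoder can then feed a classifier for character recognition.
    # autoencoder = build_conv_autoencoder()
    # autoencoder.fit(x_train, x_train, epochs=20, batch_size=64)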
Oyebade K. Oyedotun, Kamil Dimililer, "Pattern Recognition: Invariance Learning in Convolutional Auto Encoder Network", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 8, No. 3, pp. 19-27, 2016. DOI: 10.5815/ijigsp.2016.03.03