IJEM Vol. 8, No. 4, 8 Jul. 2018
Pre-trained neural networks, deep learning, transfer learning, accuracy, hyperparameters, small datasets
Nowadays, artificial intelligence is advancing at high speed. Even though we are far from the moment when machines will make decisions instead of human beings, the development in some fields of artificial intelligence is astonishing. Deep neural networks are one such field, and they have been expanding rapidly in the new millennium. Their applications are wide: they are used in processing images, video, speech, audio, and text. In the last decade, researchers have devoted special attention and resources to the development of a particular kind of neural network, the convolutional neural network. These networks have been widely applied to a variety of pattern recognition problems. Convolutional neural networks have been trained on millions of images, and it is difficult to outperform the accuracies they have achieved. On the other hand, when only a small dataset is available, training such a network from scratch does not succeed. This article exploits the technique of transfer learning for classifying images from small datasets; it consists of fine-tuning a pre-trained neural network. The selection of hyperparameters in such networks is presented in detail, with the aim of maximizing classification accuracy. Finally, directions are proposed for selecting the hyperparameters and the pre-trained network suitable for transfer learning.
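The fine-tuning procedure described in the abstract can be illustrated with a short sketch. The following is a minimal example, assuming PyTorch/torchvision, a ResNet-18 backbone pre-trained on ImageNet, and a small dataset in ImageFolder layout under ./data/train; the framework, model choice, paths, and hyperparameter values (learning rate, batch size, number of epochs) are illustrative assumptions, not the setup used in the paper.

```python
# A minimal sketch of transfer learning by fine-tuning a pre-trained network.
# Assumptions: PyTorch/torchvision are available, the small dataset lives in
# ./data/train in ImageFolder layout, and ResNet-18 stands in for the
# pre-trained network; hyperparameter values below are illustrative only.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Preprocessing: resize to the input size the pre-trained network expects
# and normalize with the ImageNet statistics it was trained with.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

train_set = datasets.ImageFolder("./data/train", transform=preprocess)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

# Load a network pre-trained on ImageNet and replace its final fully
# connected layer so it outputs the small dataset's number of classes.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))

# Fine-tuning: all layers are updated, but with a small learning rate
# so the pre-trained features are only gently adjusted.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):  # number of epochs is a tunable hyperparameter
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

When the dataset is very small, an alternative is to freeze the convolutional layers (setting requires_grad = False on their parameters) and train only the replaced classifier, which reduces the risk of overfitting; choosing between full fine-tuning and partial freezing is one of the design decisions involved in transfer learning.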
Biserka Petrovska, Igor Stojanovic, Tatjana Atanasova-Pacemska, "Classification of Small Sets of Images with Pre-trained Neural Networks", International Journal of Engineering and Manufacturing (IJEM), Vol. 8, No. 4, pp. 40-55, 2018. DOI: 10.5815/ijem.2018.04.05