Big Data Optimization Techniques: A Survey

Full Text (PDF, 855KB), pp. 41-48

Author(s)

Chandrima Roy 1,*, Siddharth Swarup Rautaray 1, Manjusha Pandey 1

1. KIIT University, Bhubaneswar, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijieeb.2018.04.06

Received: 13 Nov. 2017 / Revised: 26 Dec. 2017 / Accepted: 19 Jan. 2018 / Published: 8 Jul. 2018

Index Terms

Big Data, Hadoop, Optimization, Scalability

Abstract

As the world becomes digitized, data overflows from many sources and in many formats at a speed that traditional systems cannot keep up with: they are unable to compute and analyze data of this scale. Big data tools such as Hadoop, an open-source software framework that stores and processes data in a distributed environment, are used instead. In the last few years, developing Big Data applications has become increasingly important, and many organizations now depend on knowledge extracted from huge amounts of data. However, traditional data techniques show reduced performance and accuracy, slow responsiveness, and a lack of scalability. A great deal of work has been carried out to solve these complicated Big Data problems, and as a result various types of technologies have been developed. This research work is a survey of recent optimization technologies and their applications developed for Big Data. It aims to help readers choose the right combination of Big Data technologies according to their requirements.

Cite This Paper

Chandrima Roy, Siddharth Swarup Rautaray, Manjusha Pandey, "Big Data Optimization Techniques: A Survey", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol. 10, No. 4, pp. 41-48, 2018. DOI: 10.5815/ijieeb.2018.04.06