A Roadmap Towards Big Data Opportunities, Emerging Issues and Hadoop as a Solution

Full Text (PDF, 542KB), PP.8-17

Author(s)

Rida Qayyum 1,*

1. Department of Computer Science, Government College Women University Sialkot, 51040, Pakistan

* Corresponding author.

DOI: https://doi.org/10.5815/ijeme.2020.04.02

Received: 24 May 2020 / Revised: 30 May 2020 / Accepted: 6 Jun. 2020 / Published: 8 Aug. 2020

Index Terms

Big Data, Internet of Things (IoT), Social Media, Big Data Analytics, Hadoop, HDFS, MapReduce, YARN.

Abstract

The concept of Big Data has become extensively popular owing to its vast usage in emerging technologies. Despite being complex and dynamic, the Big Data environment generates a colossal amount of data that is impossible to handle with traditional data processing applications. Nowadays, the Internet of Things (IoT) and social media platforms such as Facebook, Instagram, Twitter, WhatsApp, LinkedIn, and YouTube generate data in various formats. This creates a pressing need for technology to store and process this tremendous volume of data. This research outlines the fundamental literature required to understand the concept of Big Data, including its nature, definitions, types, and characteristics. The primary focus of the current study is to deal with two fundamental issues: storing an enormous amount of data and processing it quickly. In line with these objectives, the paper presents Hadoop as a solution and discusses the Hadoop Distributed File System (HDFS) and the MapReduce programming framework for efficient storage and processing of Big Data. Future research directions in this field are determined based on the opportunities and several emerging issues in the Big Data domain. These research directions facilitate the exploration of the domain and the development of optimal solutions to Big Data storage and processing problems. Moreover, this study contributes to the existing body of knowledge by comprehensively addressing the opportunities and emerging issues of Big Data.

Cite This Paper

Rida Qayyum, "A Roadmap Towards Big Data Opportunities, Emerging Issues and Hadoop as a Solution", International Journal of Education and Management Engineering (IJEME), Vol.10, No.4, pp.8-17, 2020. DOI: 10.5815/ijeme.2020.04.02
