IJMECS Vol. 7, No. 6, 8 Jun. 2015
Big Data, Query Processing, Cloud Computing, Distributed Storage
The influx of Big Data on the Internet has left many businesses asking how they can benefit from it and how cloud computing can make that possible. The volume of data generated each day is beyond a human's capacity to view and analyze, so there is a pressing need for data management and analytical tools that can leverage it. Companies require a blend of technologies to collect, analyze, visualize, and process large volumes of data. Big Data initiatives are driving urgent demand for algorithms to process data and are accentuating the challenge of securing data with minimal impact on existing systems. In this paper, we present several existing cloud storage systems and query processing techniques for processing large-scale data on the cloud. The paper also explores the challenges of big data management on the cloud and the related factors that motivate research in this field.
Narinder K. Seera, Vishal Jain, "Perspective of Database Services for Managing Large-Scale Data on the Cloud: A Comparative Study", International Journal of Modern Education and Computer Science (IJMECS), vol.7, no.6, pp.50-58, 2015. DOI:10.5815/ijmecs.2015.06.08