Implementing SEReleC with EGG

Author(s)

Vishwas J Raval 1,*, Padam Kumar 1

1. Electronics & Computer Engineering Department, Indian Institute of Technology Roorkee, Uttarakhand, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2012.03.02

Received: 6 Apr. 2011 / Revised: 19 Aug. 2011 / Accepted: 24 Oct. 2011 / Published: 8 Apr. 2012

Index Terms

Web crawlers, Search Engine, HyperFilter, HyperUnique, HyperClass

Abstract

The World Wide Web holds immense resources for people with all kinds of specific needs. Searching the Web with search engines such as Google, Bing, and Ask has become an extremely common way of locating information. Queries are formed either from terms or keywords entered in sequence or from short sentences. The challenge for the user is to come up with a set of search terms, keywords, or sentences that is neither too large (making the search too specific and producing many false negatives) nor too small (making the search too general and producing many false positives). No matter how the user specifies the query, the search engine retrieves, organizes, and presents millions of linked pages, many of which may not be fully useful to the user. In fact, the user cannot know which pages exactly match the query and which do not until each page is checked individually. This task is tedious drudgery, and it stems from the lack of refinement and of any meaningful classification of the search results. Providing accurate and precise results to end users has become the Holy Grail for search engines such as Google, Bing, and Ask. A number of services, such as DuckDuckGo, Yippy, and Dogpile, have appeared on the Web to provide better results to users. This research proposes the development of a meta search engine called SEReleC, which provides an interface for refining and classifying search engines' results so as to narrow them down in a sequentially linked manner, drastically reducing the number of pages.

Cite This Paper

Vishwas J Raval, Padam Kumar, "Implementing SEReleC with EGG", International Journal of Information Technology and Computer Science (IJITCS), vol. 4, no. 3, pp. 8-15, 2012. DOI: 10.5815/ijitcs.2012.03.02
