Development of Relevance Feedback System using Regression Predictive Model and TF-IDF Algorithm

Full Text (PDF, 794KB), PP.31-49

Author(s)

Stephen Akuma 1, Rahat Iqbal 2

1. Department of Mathematics and Computer Science, Benue State University, PMB 102119, Nigeria

2. Department of Computing, Coventry University, Priory Street, CV1 5FB Coventry, UK

* Corresponding author.

DOI: https://doi.org/10.5815/ijeme.2018.04.04

Received: 10 Jan. 2018 / Revised: 28 Feb. 2018 / Accepted: 20 Mar. 2018 / Published: 8 Jul. 2018

Index Terms

Recommender system, Implicit feedback system, Domain-specific retrieval, Information retrieval, Search engine

Abstract

Domain-specific retrieval systems developed for a homogeneous group of users can potentially recommend relevant web documents in less time than generic systems built for a heterogeneous group of users. Such systems are normally developed by learning from users’ past interactions with an information system, either as a group or as individuals. This paper focuses on recommending relevant web documents to a cohort of users based on their search behaviour. Simulated task situations were used to group users of the same domain. The motivation behind this work is to help a cohort of users find relevant documents that satisfy their information needs effectively. An aggregated implicit predictive model, derived by correlating implicit and explicit feedback parameters, was integrated with the traditional term frequency/inverse document frequency (tf-idf) algorithm to improve the relevance of retrieval results. The aggregated model system was evaluated in terms of recall and precision (Mean Average Precision) by comparing it with a self-designed retrieval system and a generic system. The performance of the three systems was measured based on the relevant documents returned. The results showed that the aggregated domain-specific system performed better at returning relevant documents than the other two systems.
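
The following Python sketch illustrates the general idea described above: scoring documents with tf-idf, adjusting the ranking with a relevance score predicted from implicit feedback, and evaluating rankings with Mean Average Precision. It is not the authors' implementation. The function names, the linear `blend` with its `weight` parameter, and the example implicit scores are assumptions made for illustration only; the abstract does not state how the regression model's output is combined with tf-idf. The average precision shown here also assumes that every relevant document appears somewhere in the ranked list.

```python
import math
from collections import Counter


def tfidf_scores(query, docs):
    """Score each document against the query with a basic tf-idf sum."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term]:
                # Normalised term frequency times inverse document frequency.
                score += (tf[term] / len(tokens)) * math.log(n_docs / df[term])
        scores.append(score)
    return scores


def blend(tfidf, implicit, weight=0.5):
    """Hypothetical linear blend of tf-idf and predicted-relevance scores.

    `implicit` stands in for the output of a predictive model (values in
    [0, 1]); the blending rule itself is an assumption of this sketch.
    """
    top = max(tfidf) or 1.0  # avoid dividing by zero when nothing matches
    return [(1 - weight) * (t / top) + weight * p
            for t, p in zip(tfidf, implicit)]


def average_precision(ranked_relevance):
    """Average precision for one query, given ranked 0/1 relevance labels."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant rank
    return total / hits if hits else 0.0


def mean_average_precision(runs):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in runs) / len(runs)


if __name__ == "__main__":
    docs = ["web search engine", "cooking recipes", "web design"]
    base = tfidf_scores("web search", docs)
    # Illustrative predicted-relevance values, not data from the paper.
    combined = blend(base, [0.2, 0.1, 0.9])
    order = sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)
    print([docs[i] for i in order])
    print(mean_average_precision([[1, 0, 1], [0, 1]]))
```

In this toy example the third document ranks below the first on tf-idf alone but moves to the top once the (made-up) implicit score is blended in, which is the kind of re-ranking effect a feedback-driven system aims for.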

Cite This Paper

Stephen Akuma, Rahat Iqbal, "Development of Relevance Feedback System using Regression Predictive Model and TF-IDF Algorithm", International Journal of Education and Management Engineering (IJEME), Vol.8, No.4, pp.31-49, 2018. DOI: 10.5815/ijeme.2018.04.04
