Sliding Window Based High Utility Item-Sets Mining over Data Stream Using Extended Global Utility Item-Sets Tree

Full Text (PDF, 662KB), pp. 72-83

Author(s)

P. Amaranatha Reddy 1,*, MHM Krishna Prasad 1

1. Department of CSE, UCE, JNTU Kakinada

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2022.05.06

Received: 12 Feb. 2022 / Revised: 16 Mar. 2022 / Accepted: 25 Apr. 2022 / Published: 8 Oct. 2022

Index Terms

Data mining, high utility item-sets mining, stream mining, sliding window.

Abstract

High utility item-sets mining (HUIM) is a special topic in frequent item-sets mining (FIM). It gives better insights for business growth by focusing on the utility of items in a transaction. HUIM is evolving as a powerful research area due to its vast applications in many fields. Data stream processing, meanwhile, is an interesting and challenging problem, since processing a huge amount of rapidly generated data with limited resources demands high-performance algorithms. This paper presents an innovative idea to extract the high utility item-sets (HUIs) from a dynamic data stream by applying sliding window control. Although certain algorithms exist to solve the same problem, they allow redundant processing or reprocessing of data. To overcome this, the proposed algorithm uses a trie-like structure called the Extended Global Utility Item-sets tree (EGUI-tree), which stores the mined information so that it can be retrieved instead of recomputed. An experimental study on real-world datasets showed that the EGUI-tree algorithm is faster than the state-of-the-art algorithms.
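To make the problem statement concrete, the following is a minimal Python sketch of sliding-window HUI mining done the naive way. It is not the EGUI-tree algorithm: it is a brute-force baseline that re-enumerates every item-set each time the window slides, which is the kind of reprocessing the abstract says the proposed method avoids. The transaction format (a dict mapping each item to its utility in that transaction) and the names `high_utility_itemsets`, `mine_stream`, `window_size`, and `min_util` are assumptions made for illustration, not taken from the paper.

```python
from collections import deque
from itertools import combinations


def high_utility_itemsets(window, min_util):
    """Brute-force HUIs of one window (hypothetical helper).

    Each transaction is assumed to be a dict {item: utility}, where the
    utility is already quantity * unit profit for that transaction.
    """
    totals = {}
    for transaction in window:
        items = sorted(transaction)
        # Enumerate every non-empty subset of the transaction's items.
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                utility = sum(transaction[item] for item in itemset)
                totals[itemset] = totals.get(itemset, 0) + utility
    # Keep only item-sets whose total utility reaches the threshold.
    return {s: u for s, u in totals.items() if u >= min_util}


def mine_stream(stream, window_size, min_util):
    """Yield the HUIs of each full window as transactions arrive."""
    window = deque(maxlen=window_size)  # oldest transaction drops out automatically
    for transaction in stream:
        window.append(transaction)
        if len(window) == window_size:
            # Naive step: everything in the window is recomputed from scratch.
            yield high_utility_itemsets(window, min_util)


stream = [
    {"a": 5, "b": 2},
    {"a": 5, "c": 1},
    {"b": 4, "c": 3},
]
results = list(mine_stream(stream, window_size=2, min_util=7))
print(results)
```

With a window of two transactions and a threshold of 7, the first window yields `("a",)` with utility 10 and `("a", "b")` with utility 7, and the second window yields only `("b", "c")` with utility 7. The transaction shared by both windows is enumerated twice, which illustrates the cost that caching mined information in a tree structure is meant to remove.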

Cite This Paper

P. Amaranatha Reddy, MHM Krishna Prasad, "Sliding Window Based High Utility Item-Sets Mining over Data Stream Using Extended Global Utility Item-Sets Tree", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 14, No. 5, pp. 72-83, 2022. DOI: 10.5815/ijigsp.2022.05.06
