Mining Sequential Patterns from mFUSP-Tree

Author(s)

Ashin Ara Bithi 1,*, Abu Ahmed Ferdaus 2

1. Asian University of Bangladesh, Dhaka, Bangladesh

2. University of Dhaka, Dhaka, Bangladesh

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2015.07.09

Received: 20 Sep. 2014 / Revised: 10 Jan. 2015 / Accepted: 16 Mar. 2015 / Published: 8 Jun. 2015

Index Terms

Intermediate Projected Tree, Projection Database, Sequential Pattern Mining, Frequent Pattern, Sequence Database, Tree-Based Mining

Abstract

Mining sequential patterns from sequence databases plays an important role in data mining, because it can discover associations in ordered lists of events. Mining methods based on the pattern-growth approach, such as PrefixSpan, are efficient at discovering sequential patterns, but generating a projection database for each pattern is regarded as the bottleneck of these methods. Lin (2008) first introduced a tree structure for sequential pattern mining, known as the fast updated sequential pattern tree (FUSP-tree). However, the link information stored in each node of the FUSP-tree adds complexity to this method, because the links must be updated. In this paper, we first propose a modified fast updated sequential pattern tree (called an mFUSP-tree) that stores the complete set of sequences, with only the frequent items, their frequencies, and the relations among items in each sequence, in a compact data structure. Unlike the FUSP-tree, this structure does not store in each node a link to the next node, in the following branch of the tree, that carries the same item. We then show, through a mining method, that the mFUSP-tree is sufficient to discover the complete set of frequent sequential patterns from sequence databases without generating any intermediate projected tree and without repeatedly scanning the original database during mining. Our experimental results show that the proposed mFUSP-tree mining approach performs considerably better than existing algorithms such as GSP, PrefixSpan, and FUSP-tree based mining.
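The general idea of storing only frequent items in a compact tree, without links between nodes that carry the same item, can be illustrated with a small sketch. This is not the authors' mFUSP-tree or their mining method: it assumes each event is a single item (no itemsets), uses a plain prefix tree, and the names `Node`, `build_tree`, and `pattern_support` are hypothetical. It only shows that, once the tree is built, the support of a candidate pattern can be counted from the tree alone, without rescanning the sequence database.

```python
from collections import Counter


class Node:
    """Prefix-tree node: an item, a count, and children only (no same-item links)."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0        # number of sequences whose filtered prefix reaches this node
        self.children = {}    # item -> Node


def build_tree(sequences, min_support):
    """Build a prefix tree over the sequences, keeping only frequent items."""
    # Support of an item = number of sequences that contain it at least once.
    support = Counter(item for seq in sequences for item in set(seq))
    frequent = {item for item, c in support.items() if c >= min_support}

    root = Node()
    for seq in sequences:
        node = root
        for item in seq:
            if item not in frequent:
                continue  # infrequent items are not stored in the tree
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root, frequent


def pattern_support(root, pattern):
    """Count sequences containing `pattern` as a subsequence, using only the tree."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    total = 0
    stack = [(root, 0)]  # (node, number of pattern items matched so far)
    while stack:
        node, matched = stack.pop()
        for child in node.children.values():
            m = matched + 1 if child.item == pattern[matched] else matched
            if m == len(pattern):
                # Every sequence passing through this node contains the pattern.
                total += child.count
            else:
                stack.append((child, m))
    return total
```

For example, with the sequences `a d b c`, `a c b`, `a b`, and `b c` and a minimum support of 2, item `d` is dropped, and `pattern_support(root, ["a", "b"])` counts the three sequences in which `a` is later followed by `b`.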

Cite This Paper

Ashin Ara Bithi, Abu Ahmed Ferdaus, "Mining Sequential Patterns from mFUSP-Tree", International Journal of Information Technology and Computer Science (IJITCS), vol. 7, no. 7, pp. 77-89, 2015. DOI: 10.5815/ijitcs.2015.07.09

References

[1] R. Agrawal and R. Srikant, “Mining sequential patterns,” in ICDE, P. S. Yu and A. L. P. Chen, Eds. IEEE Computer Society, 1995, pp. 3-14. http://doi.ieeecomputersociety.org/10.1109/ICDE.1995.380415

[2] R. Srikant and R. Agrawal, “Mining sequential patterns: Generalizations and performance improvements,” in EDBT, ser. Lecture Notes in Computer Science, P. M. G. Apers, M. Bouzeghoub, and G. Gardarin, Eds., vol. 1057. Springer, 1996, pp. 3-17. http://dx.doi.org/10.1007/BFb0014140

[3] J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M. Hsu, “PrefixSpan: Mining sequential patterns by prefix-projected growth,” in ICDE, D. Georgakopoulos and A. Buchmann, Eds. IEEE Computer Society, 2001, pp. 215-224. http://doi.ieeecomputersociety.org/10.1109/ICDE.2001.914830

[4] J. Han, J. Pei, and Y. Yin, “Mining frequent patterns without candidate generation,” in SIGMOD Conference, W. Chen, J. F. Naughton, and P. A. Bernstein, Eds. ACM, 2000, pp. 1-12. http://doi.acm.org/10.1145/342009.335372

[5] A. A. Bithi, M. Akhter, and A. A. Ferdaus, “Tree Based Sequential Pattern Mining,” IRACST - International Journal of Computer Science and Information Technology & Security (IJCSITS), ISSN: 2249-9555, vol. 2, no. 6, December 2012. http://www.ijcsits.org/papers/vol2no6 2012/25vol2no6.pdf

[6] C.-W. Lin, T.-P. Hong, W.-H. Lu, and W.-Y. Lin, “An incremental FUSP-tree maintenance algorithm,” in ISDA, J.-S. Pan, A. Abraham, and C.-C. Chang, Eds. IEEE Computer Society, 2008, pp. 445-449. http://doi.ieeecomputersociety.org/10.1109/ISDA.2008.126

[7] H. Cheng, X. Yan, and J. Han, “IncSpan: Incremental mining of sequential patterns in large database,” in KDD, W. Kim, R. Kohavi, J. Gehrke, and W. DuMouchel, Eds. ACM, 2004, pp. 527-532. http://doi.acm.org/10.1145/1014052.1014114

[8] Z. Zheng, R. Kohavi, and L. Mason, “Real world performance of association rule algorithms,” in KDD, 2001, pp. 401-406. http://portal.acm.org/citation.cfm?id=502512.502572