Constraint Based Periodicity Mining in Time Series Databases

Full Text (PDF, 1271KB), PP.37-46

Views: 0 Downloads: 0

Author(s)

Ramachandra.V.Pujeri 1,* G.M.Karthik 2

1. KGiSL Institute of Technology, Saravanampatti, Coimbatore -641035, Tamil Nadu, INDIA

2. CSE Dept., SACS MAVMM Engineering College, Madurai -625301, Tamil Nadu, INDIA

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2012.10.04

Received: 20 Jan. 2012 / Revised: 11 Mar. 2012 / Accepted: 28 May 2012 / Published: 8 Sep. 2012

Index Terms

Data Mining, CBPM, FP-tree, Periodicity mining, Time series data, Noise resilient

Abstract

The search for the periodicity in time-series database has a number of application, is an interesting data mining problem. In real world dataset are mostly noisy and rarely a perfect periodicity, this problem is not trivial. Periodicity is very common practice in time series mining algorithms, since it is more likely trying to discover periodicity signal with no time limit. We propose an algorithm uses FP-tree for finding symbol, partial and full periodicity in time series. We designed the algorithm complexity as O (kN), where N is the length of input sequence and k is length of periodic pattern. We have shown our algorithm is fixed parameter tractable with respect to fixed symbol set size and fixed length of input sequences. Experiment results on both synthetic and real data from different domains have shown our algorithms' time efficient and noise-resilient feature. A comparison with some current algorithms demonstrates the applicability and effectiveness of the proposed algorithm.

Cite This Paper

Ramachandra.V.Pujeri, G.M.Karthik, "Constraint Based Periodicity Mining in Time Series Databases", International Journal of Computer Network and Information Security(IJCNIS), vol.4, no.10, pp.37-46, 2012. DOI:10.5815/ijcnis.2012.10.04

Reference

[1]Agrawal, R. and Srikant, R., Mining Sequential Patterns. In: Proceedings of 11th IEEE Int'l Conf. Data Eng. (ICDE). 1995. p. 3-14,
[2]Berberidis, C., Aref, W., Atallah, M., Vlahavas, I. and Elmagarmid, A., Multiple and Partial Periodicity Mining in Time Series Databases. In Proceedings of European Conf. Artificial Intelligence, July 2002.
[3]Blake, C.L. and Merz, C.J., UCI Repository of Machine Learning Databases, University of California, Department of Information and Computer Science. 1998.
[4]Chen, L., Tamer Ozsu, M. and Oria, V., Robust and Fast Similarity Search for Moving Object Trajectories. In: Proceedings of ACM SIGMOD. 2005. p. 491-502
[5]Cheung, C.F., Yu, J.X., and Lu, H., Constructing Suffix Tree for Gigabyte Sequences with Megabyte Memory. IEEE Trans. Knowledge and Data Eng. January 2005. 17(1): p. 90-105,
[6]Das, M. and Dai, H.K., A Survey of DNA Motif Finding Algorithms. BMC Bioinformatics. 2007. 8: p. S21-S33
[7]Dubiner, M. et al., Faster Tree Pattern Matching. Journal of ACM. 1994. 14:205-213
[8]Elfeky, M.G., Aref, W.G. and Elmagarmid, A.K., Periodicity Detection in Time Series Databases. IEEE Trans. Knowledge and Data Eng. 2005. 17(7): p. 875-887
[9]Elfeky, M.G., Aref, W.G. and Elmagarmid, A.K., WARP: Time WARPing for Periodicity Detection. In: Proceedings of Fifth IEEE Int'l Conf. Data Mining, November 2005
[10]Fei Chen, Jie Yuan and Fusheng Yu, Finding periodicity in pseudo periodic time series and forecasting. GrC 2006. 2006. p.534-537
[11]Fu, A.W.C., Keogh, E.J., Lau, L.Y.H. and Ratanamahatana, C.A., Scaling and Time WARPing in Time Series Querying. In: Proceedings of Int'l Conf. Very Large Data Bases (VLDB). 2005. p. 649-660
[12]Glynn, E.F., Chen, J. and Mushegian, A.R., Detecting Periodic Patterns in Unevenly Spaced Gene Expression Time Series Using Lomb-Scargle Periodograms. Bioinformatics, February 2006. 22(3): p. 310-316
[13]Han, J., Gong, W. and Yin, Y., Mining Segment-Wise Periodic Patterns in Time Related Databases. In: Proceedings of ACM Int'l Conf. Knowledge Discovery and Data Mining. 1998. p. 214-218
[14]Han, J., Lakshmanan, L.V.S. and Raymond, T.N., Constraint-Based Multidimensional Data Mining. IEEE Computer. 1999. 32(8): p.46- 50
[15]Han, J., Yin, Y. and Dong, G., Efficient Mining of Partial Periodic Patterns in Time Series Database. In: Proceedings of 15th IEEE International Conference in Data Engineering. 1999. p. 106
[16]Huang, K.Y. and Chang, C.H., SMCA: A General Model for Mining Asynchronous Periodic Patterns in Temporal Databases. IEEE Trans. Knowledge and Data Eng. June 2005. 17(6): p. 774-785
[17]Indyk, P., Koudas, N. and Muthukrishnan, S., Identifying Representative Trends in Massive Time Series Data Sets Using Sketches. In: Proceedings of Int'l Conf. Very Large Data Bases, September 2000.
[18]Keogh, E., Lin, J. and Fu, A., HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence. In Proceedings of Fifth IEEE Int'l Conf. Data Mining. 2005. p. 226-233
[19]Kumar, N., Lolla, N., Keogh, E., Lonardi, S., Ratanamahatana, C.A. and Wei, L., Time-Series Bitmaps: A Practical Visualization Tool for Working with Large Time Series Databases. In: Proceedings of SIAM Int'l Conf. Data Mining. 2005. p. 531-535
[20]Ma, S. and Hellerstein, J., Mining Partially Periodic Event Patterns with Unknown Periods. In:Proceedings of 17th IEEE Int'l Conf. Data Eng. April 2001
[21]Papadimitriou, S., Brockwell, A. and Faloutsos, C., Adaptive, Hands Off-Stream Mining. In: Proceedings of Int'l Conf. Very Large Data Bases (VLDB). 2003. p. 560-571
[22]Ptitsyn, A.A., Zvonic, S. and Gimble, J.M., Permutation test for periodicity in short time series data. BMC Bioinformatics (BMCBI). 2006. 7:(S-2)
[23]Ptitsyn, A.A., Zvonic, S. and Gimble, J.M., Permutation test for periodicity in short time series data. BMC Bioinformatics (BMCBI). 2007. vol 8
[24]Rasheed, F. and Alhajj, R., Using Suffix Trees for Periodicity Detection in Time Series Databases. In: Proceedings of IEEE Int'l Conf. Intelligent Systems, September 2008
[25]Rasheed, F. and Alhajj, R., STNR: A Suffix Tree Based Noise Resilient Algorithm for Periodicity Detection in Time Series Databases. Applied Intelligence. 2010. 32(3): 267-278
[26]Rasheed, F., Al-Shalalfa, M. and Alhajj, R., Efficient Periodicity Mining in Time Series Databases Using Suffix Trees. IEEE Trans. Knowl. Data Eng. (TKDE). 2011. 23(1):79-94
[27]Sandve, G.K. and Drablos, F., A Survey of Motif Discovery Methods in an Integrated Framework. Biology Direct. 2006. 1: p. 11-26
[28]Sau Dan Lee and Luc De Raedt, Constraint Based Mining of First Order Sequences in SeqLog. Database Support for Data Mining Applications. 2004. p. 154-173
[29]Sheng, C., Hsu, W. and Lee, M.L., Efficient Mining of Dense Periodic Patterns in Time Series. Technical report, Nat'l Univ. of Singapore. 2005.
[30]Sheng, C., Hsu, W. and Lee, M.L., Mining Dense Periodic Patterns in Time Series Data. In: Proceedings of 22nd IEEE Int'l Conf. Data Eng. 2005. p. 115
[31]Udechukwu, A., Barker, K. and Alhajj, R., Discovering all Frequent Trends in Time Series. In: Proceedings of Winter Int'l Symp. Information and Comm. Technologies. 2004. 58: p. 1-6
[32]Vlachos, M., Kollios, G. and Gunopulos, D., Discovering Similar Multidimensional Trajectories. In: Proceedings of 18th IEEE Int'l Conf. Data Eng. (ICDE). 2002. p. 673-684
[33]Wang, J. and Han, J., BIDE: Efficient Mining of Frequent Closed Sequences. In: Proceedings of 20th IEEE Int'l Conf. Data Eng. (ICDE). 2004. p. 79-90
[34]Wang, W. and Yang, J., Mining Sequential Patterns from Large Datasets. Springer-Verlag. 2005. vol 28
[35]Weigend, A. and Gershenfeld, N., Time Series Prediction: Forecasting the Future and Understanding the Past. Addison-Wesley. 1994.
[36]Yan, X., Han, J. and Afshar, R., CloSpan: Mining Closed Sequential Patterns in Large Datasets. In: Proceedings of SIAM Int'l Conf. Data Mining (SDM). 2003
[37]Yang, J., Wang, W. and Yu, P., InfoMiner: Mining Partial Periodic Patterns with Gap Penalties. In: Proceedings of Second IEEE Int'l Conf. Data Mining. December 2002
[38]Zaki, M.J., SPADE: An Efficient Algorithm for Mining Frequent Sequences. Machine Learning. 2001. 42(1): 31-60
[39]Zhu, Y. and Shasha, D., WARPing Indexes with Envelope Transforms for Query by Humming. In: Proceedings of ACM SIGMOD. 2003. p. 181-192