A Framework for Mining Coherent Patterns Using Particle Swarm Optimization based Biclustering

Author(s)

Suvendu Kanungo 1,*, Somya Jaiswal 1

1. Department of Computer Science & Engineering, Birla Institute of Technology, Mesra, Allahabad Campus, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2015.11.05

Received: 20 Mar. 2015 / Revised: 18 Jul. 2015 / Accepted: 5 Aug. 2015 / Published: 8 Oct. 2015

Index Terms

Clustering, Biclustering, Gene Expression Data, Particle Swarm Optimization

Abstract

High-throughput microarray technologies have driven the development of robust biclustering algorithms capable of discovering relevant local patterns in gene expression datasets, in which a subset of genes shows coherent expression patterns under a subset of experimental conditions. In this work, we propose an algorithm that combines biclustering with a Particle Swarm Optimization (PSO) structure to extract biologically significant patterns from such datasets. The algorithm extracts biclusters in two phases: a seed finding phase and a seed growing phase. In the seed finding phase, genes and conditions are clustered separately on the gene expression data matrix, and the two clustering results are combined to form small, tightly bound submatrices; those submatrices whose Mean Squared Residue (MSR) is below a defined threshold are used as seeds. In the seed growing phase, genes and conditions are added to these seeds to enlarge them using the PSO structure. Applied to the Yeast Saccharomyces cerevisiae cell cycle expression dataset, our technique yields significant biclusters with larger volume and lower MSR than those obtained by other biclustering algorithms.
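
The MSR criterion used to accept seeds can be illustrated with a minimal sketch. It assumes the standard Cheng and Church definition of the mean squared residue, in which the residue of an entry is its value minus its row mean, minus its column mean, plus the overall submatrix mean. The function names `mean_squared_residue` and `is_valid_seed`, the toy matrices, and the threshold value are hypothetical and are not taken from the paper.

```python
def mean_squared_residue(matrix, rows, cols):
    """MSR of the submatrix picked out by the row and column index lists.

    Assumes the standard Cheng-Church definition; not the authors' code.
    """
    n_rows, n_cols = len(rows), len(cols)
    sub = [[matrix[i][j] for j in cols] for i in rows]

    # Row means, column means, and overall mean of the submatrix.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


def is_valid_seed(matrix, rows, cols, threshold):
    """Hypothetical seed test: keep a submatrix only if its MSR is below threshold."""
    return mean_squared_residue(matrix, rows, cols) < threshold


# Toy data (illustrative only): rows differ by a constant shift,
# so the pattern is perfectly coherent and the MSR is (numerically) zero.
coherent = [[1.0, 2.0, 3.0],
            [2.0, 3.0, 4.0],
            [5.0, 6.0, 7.0]]

# Toy checkerboard pattern with no additive coherence.
incoherent = [[1.0, 0.0],
              [0.0, 1.0]]

print(mean_squared_residue(coherent, [0, 1, 2], [0, 1, 2]))
print(mean_squared_residue(incoherent, [0, 1], [0, 1]))  # 0.25
print(is_valid_seed(coherent, [0, 1], [0, 2], threshold=0.1))
```

A lower MSR indicates a more coherent submatrix, which is why the abstract reports both large volume (number of genes times number of conditions) and low MSR as the quality criteria.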

Cite This Paper

Suvendu Kanungo, Somya Jaiswal, "A Framework for Mining Coherent Patterns Using Particle Swarm Optimization based Biclustering", International Journal of Intelligent Systems and Applications(IJISA), vol.7, no.11, pp.33-40, 2015. DOI:10.5815/ijisa.2015.11.05
