Aligning Molecular Sequences by Wavelet Transform using Cross Correlation Similarity Metric

Full Text (PDF, 781KB), PP.62-70

Author(s)

J. Jayapriya 1,*, Michael Arock 1

1. Department of Computer Applications, National Institute of Technology, Tiruchirappalli, Tamil Nadu, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2017.11.08

Received: 17 Mar. 2017 / Revised: 11 May 2017 / Accepted: 8 Jun. 2017 / Published: 8 Nov. 2017

Index Terms

Sequence alignment, wavelet transform, cross-correlation, EIIP (Electron-Ion Interaction Potentials), PSM (Position Specific Matrix)

Abstract

Sequence alignment is the first step of sequence analysis and underpins the study of the structure and function of molecular sequences. As the volume of biological data grows, the alignment process faces a trade-off between accuracy and computational cost. Sequences can be aligned both locally and globally, and each kind of alignment gives biologists vital information. To address these issues, this work performs local and global alignment of multiple molecular sequences by applying a wavelet transform. Each sequence is first converted into numerical values using the electron-ion interaction potential (EIIP) model. The resulting signal is decomposed using a type of wavelet transform, and the similarity between the sequences is found using the cross-correlation measure. The significance of the similarity is evaluated using two scoring functions, namely the Position Specific Matrix and a new function called Count score. The work is compared with a Fast Fourier Transform based approach, and the results show that the proposed method improves the alignment.
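The pipeline described in the abstract (EIIP encoding, wavelet decomposition, cross-correlation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the wavelet, so a single-level Haar approximation is assumed here, and the function names (encode, haar_approx, cross_correlation, best_lag) are hypothetical. The EIIP values are the commonly tabulated ones for DNA nucleotides. The scoring functions (Position Specific Matrix, Count score) are not defined in the abstract and are left out.

```python
import math

# Commonly tabulated EIIP values for DNA nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}

def encode(seq):
    """Map a DNA string to its EIIP numerical signal."""
    return [EIIP[base] for base in seq.upper()]

def haar_approx(signal):
    """One level of Haar approximation coefficients (wavelet choice is an assumption)."""
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd-length input by repeating the last sample
    return [(signal[i] + signal[i + 1]) / math.sqrt(2)
            for i in range(0, len(signal), 2)]

def cross_correlation(x, y):
    """Return {lag: sum_j x[j + lag] * y[j]} over all overlapping lags, after mean removal."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    scores = {}
    for lag in range(-(len(yc) - 1), len(xc)):
        total = 0.0
        for j, y_val in enumerate(yc):
            i = j + lag
            if 0 <= i < len(xc):
                total += xc[i] * y_val
        scores[lag] = total
    return scores

def best_lag(seq_a, seq_b, use_wavelet=True):
    """Lag at which seq_b correlates best with seq_a (in downsampled units if use_wavelet)."""
    a, b = encode(seq_a), encode(seq_b)
    if use_wavelet:
        a, b = haar_approx(a), haar_approx(b)
    scores = cross_correlation(a, b)
    return max(scores, key=scores.get)

if __name__ == "__main__":
    reference = "ATCAGGTGACTA"
    fragment = reference[4:10]  # "GGTGAC"
    print(best_lag(reference, fragment, use_wavelet=False))  # 4
    print(best_lag(reference, fragment, use_wavelet=True))   # 2 (each Haar sample spans two bases)
```

In this sketch the peak of the cross-correlation gives the shift at which the shorter signal best matches the longer one. After one Haar level each coefficient covers two bases, so a lag of 2 in the wavelet domain corresponds to an offset of 4 bases in the original sequence.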

Cite This Paper

J.Jayapriya, Michael Arock, "Aligning Molecular Sequences by Wavelet Transform using Cross Correlation Similarity Metric", International Journal of Intelligent Systems and Applications(IJISA), Vol.9, No.11, pp.62-70, 2017. DOI:10.5815/ijisa.2017.11.08

Reference

[1]Arabi E. Keshk, "Enhanced Dynamic Algorithm of Genome Sequence Alignments", IJITCS, vol.6, no.6, pp.40-46, 2014. DOI: 10.5815/ijitcs.2014.06.06
[2]Cosic, I.: Macromolecular bioactivity: is it resonant interaction between macromolecules?-theory and applications. Biomedical Engineering, IEEE Transactions on 41(12), 1101–1114 (1994).
[3]Das, S., Abraham, A., Konar, A.: Swarm intelligence algorithms in bioinformatics. In: Computational Intelligence in Bioinformatics, pp. 113–147. Springer (2008).
[4]Huang, X., Miller, W.: Lalign-find the best local alignments between two sequences. Adv. Appl. Math 12, 373 (1991).
[5]Jayapriya J, Michael Arock, "A Novel Distance Metric for Aligning Multiple Sequences Using DNA Hybridization Process", International Journal of Intelligent Systems and Applications(IJISA), Vol.8, No.6, pp.40-47, 2016. DOI: 10.5815/ijisa.2016.06.05
[6]Katoh, K., Misawa, K., Kuma, K.i., Miyata, T.: Mafft: a novel method for rapid multiple sequence alignment based on fast fourier transform. Nucleic acids research 30(14), 3059–3066 (2002).
[7]Kaya, M., Sarhan, A., Alhajj, R.: Multiple sequence alignment with affine gap by using multi-objective genetic algorithm. Computer methods and programs in Biomedicine 114(1), 38–49 (2014).
[8]M. I. Khalil, "A New Heuristic Approach for DNA Sequences Alignment", IJIGSP, vol.7, no.12, pp.18-23, 2015. DOI: 10.5815/ijigsp.2015.12.03
[9]Lee, Z.J., Su, S.F., Chuang, C.C., Liu, K.H.: Genetic algorithm with ant colony optimization (ga-aco) for multiple sequence alignment. Applied Soft Computing 8(1), 55–78 (2008).
[10]Naznin, F., Sarker, R., Essam, D.: Progressive alignment method using genetic algorithm for multiple sequence alignment. Evolutionary Computation, IEEE Transactions on 16(5), 615–631 (2012).
[11]Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970).
[12]Negi, T., Bansal, V.: Time series: Similarity search and its applications. In: Proceedings of the International Conference on Systemics, Cybernetics and Informatics: ICSCI-04, Hyderabad, India, pp. 528–533 (2005).
[13]Notredame, C., Higgins, D.G., Heringa, J.: T-coffee: A novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology 302(1), 205–217 (2000).
[14]Orobitg, M., Cores, F., Guirado, F., Roig, C., Notredame, C.: Improving multiple sequence alignment biological accuracy through genetic algorithms. The Journal of Supercomputing 65(3), 1076–1088 (2013).
[15]Rasmussen, T.K., Krink, T.: Improved hidden markov model training for multiple sequence alignment by a particle swarm optimization evolutionary algorithm hybrid. Biosystems 72(1), 5–17 (2003).
[16]Rice, P., Longden, I., Bleasby, A., et al.: Emboss: the european molecular biology open software suite. Trends in Genetics 16(6), 276–277 (2000).
[17]Rockwood, A.L., Crockett, D.K., Oliphant, J.R., Elenitoba-Johnson, K.S.: Sequence alignment by cross-correlation. Journal of Biomolecular Techniques: JBT 16(4), 453 (2005).
[18]Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147(1), 195–197 (1981).
[19]Thompson, J., Higgins, D.G., Gibson, T.J.: Clustalw. Nucleic Acids Res 22, 4673–4680 (1994).
[20]de Trad, C.H., Fang, Q., Cosic, I.: Protein sequence comparison based on the wavelet transform approach. Protein Engineering 15(3), 193–203 (2002).
[21]Wen, Z.n., Wang, K.l., Li, M.l., Nie, F.s., Yang, Y.: Analyzing functional similarity of protein sequences with discrete wavelet transform. Computational Biology and Chemistry 29(3), 220–228 (2005).