Enhanced Dynamic Algorithm of Genome Sequence Alignments

Full Text (PDF, 375KB), PP.40-46

Author(s)

Arabi E. Keshk 1,*

1. Dept. of Computer Science, Faculty of Computers and Information, Menoufia University, Egypt

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2014.06.06

Received: 20 Jul. 2013 / Revised: 25 Nov. 2013 / Accepted: 23 Feb. 2014 / Published: 8 May 2014

Index Terms

Bioinformatics, Dynamic programming, Sequence alignment, Algorithms

Abstract

The merging of biology and computer science has created a new field, computational biology (also called bioinformatics), which explores the capacity of computers to extract knowledge from biological data. Computational biology is rooted in the life sciences as well as in computer science, information science, and related technologies. A central problem in computational biology is sequence alignment: arranging DNA, RNA, or protein sequences to identify regions of similarity and the relationships between sequences. This paper introduces an enhanced dynamic algorithm for genome sequence alignment, called EDAGSA. It fills only the three main diagonals of the alignment matrix rather than filling the entire matrix with unused data. It obtains the optimal solution while reducing execution time, thereby improving performance. To demonstrate this improvement, the proposed algorithm is compared with traditional methods such as the Needleman-Wunsch, Smith-Waterman, and longest common subsequence algorithms. In addition, a database is implemented so that the algorithm can be used in multi-sequence alignment to search for the stored sequence that best matches a given sequence.
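
The abstract does not give EDAGSA's recurrences, so the following is only a minimal sketch of the general idea it describes: a Needleman-Wunsch-style score table in which only the cells on the three main diagonals (|i - j| <= 1) are computed. The function name `banded_global_score`, the `band` parameter, and the scoring values (match +1, mismatch -1, gap -1) are assumptions made for illustration and are not taken from the paper.

```python
NEG_INF = float("-inf")

def banded_global_score(a, b, band=1, match=1, mismatch=-1, gap=-1):
    """Global alignment score using only cells with |i - j| <= band.

    band=1 corresponds to the three main diagonals. Scoring values are
    illustrative assumptions, not the paper's parameters.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None  # the end cell (n, m) lies outside the band

    # Sparse table: only in-band cells are ever stored.
    score = {(0, 0): 0}
    for i in range(n + 1):
        for j in range(max(0, i - band), min(m, i + band) + 1):
            if i == 0 and j == 0:
                continue
            best = NEG_INF
            if i > 0 and j > 0:  # diagonal move: match or mismatch
                step = match if a[i - 1] == b[j - 1] else mismatch
                best = max(best, score[(i - 1, j - 1)] + step)
            if i > 0 and (i - 1, j) in score:  # gap in b
                best = max(best, score[(i - 1, j)] + gap)
            if j > 0 and (i, j - 1) in score:  # gap in a
                best = max(best, score[(i, j - 1)] + gap)
            score[(i, j)] = best
    return score[(n, m)]


if __name__ == "__main__":
    print(banded_global_score("GATTACA", "GATTACA"))
    print(banded_global_score("GATTACA", "GATTCA"))
```

With `band=1` the table holds roughly three cells per row instead of a full row, which is where the time saving comes from. In this generic sketch the result equals the full-matrix score only when an optimal alignment stays inside the band; how EDAGSA guarantees optimality is described in the paper itself, not here.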

Cite This Paper

Arabi E. Keshk, "Enhanced Dynamic Algorithm of Genome Sequence Alignments", International Journal of Information Technology and Computer Science (IJITCS), vol. 6, no. 6, pp. 40-46, 2014. DOI: 10.5815/ijitcs.2014.06.06
