1. What is the time complexity of the bottom-up string alignment algorithm ? Bottom-up

Download 21.33 Kb.

Date	23.05.2017
Size	21.33 Kb.
	#18916

2.Create a top-down algorithm for the string alignment problem
3.What is the time complexity of the algorithm
Reference 1

Design and analysis of algorithm

Name Ben Maina Wanjohi

Reg P58/76011/2012

Assignment 3

Course code ics801

Date 4^h July 2012

Assignment III

1.What is the time complexity of the bottom-up string alignment algorithm ?
Bottom-up

Bottom up approach is to try solving the sub problems first and use their solutions to build-on and arrive at solutions to bigger sub problems. This is also usually done in a tabular form by iteratively generating solutions to bigger and bigger sub problems by using the solutions to small sub problems.

Example

n e m a t o d e

e 7 7 6 5 5 5 5 5

m 6 6 6 5 5 4 4 4

p 5 5 5 5 5 4 4 4

t 5 5 5 5 5 4 4 4

y 4 4 4 4 4 4 4 4

There are 2 complexity involved a)- Time complexity

b)-space complexity

a) -Time Complexity:

Time complexity of optimal alignment is product O(nm),this is bacause when computing value of the cell (i, j), only cells (i − 1, j − 1), (i, j − 1), (i − 1, j) are examined) constant time. There are (n + 1)(m+ 1) cells ) O(nm) time complexity.

When computing the value for a specific cell (i; j) only cells (i -1; j- 1), (i; j -1) and (i - 1; ) are examined, along with two characters Si and Tj. Hence, to fill in one cell takes a constant number of cell examinations and comparisons. There are n*m cells in the table. So the time complexity is also O(nm).

b)-Space complexity:

Space complexity is O(n), if only M(x, y) is required, and O(nm) for the reconstruction of the alignment. Row-wise computation of the matrix, for computing row k, only row k−1 must be stored) O(n) space. For reconstructing the alignment, all trace back pointers must be stored) O(nm) space

2.Create a top-down algorithm for the string alignment problem

A matrix is built as follows:

$h(i,0) = 0,\; 0\le i\le m$

$h(0,j) = 0,\; 0\le j\le n$

$\text{ if } a_i = b_j$ then $w(a_i, b_j) = w\text{(match)}$ $\text{ or if } a_i != b_j$ then $w(a_i, b_j) = w\text{(mismatch)}$

$h(i,j) = \max \begin{bmatrix} 0 \\ h(i-1,j-1) + \ w(a_i,b_j) & \text{match/mismatch} \\ h(i-1,j) + \ w(a_i,-) & \text{deletion} \\ h(i,j-1) + \ w(-,b_j) & \text{insertion} \end{bmatrix} ,\; 1\le i\le m, 1\le j\le n$

Where:

= Strings over the Alphabet $\sigma$
$m = \text{length}(a)$
$n = \text{length}(b)$
- is the maximum Similarity-Score between a suffix of a[1...i] and a suffix of b[1...j]
$w(c,d),\; c, d\in\sigma\cup\{\'-\'\}$ , '-' is the gap-scoring scheme

Example

Sequence 1 = ACACACTA Sequence 2 = AGCACACA

$h = \begin{pmatrix} &-&a&c&a&c&a&c&t&a \\ -&0&0&0&0&0&0&0&0&0 \\ a&0&2&1&2&1&2&1&0&2 \\ g&0&1&1&1&1&1&1&0&1 \\ c&0&0&3&2&3&2&3&2&1 \\ a&0&2&2&5&4&5&4&3&4 \\ c&0&1&4&4&7&6&7&6&5 \\ a&0&2&3&6&6&9&8&7&8 \\ c&0&1&4&5&8&8&11&10&9 \\ a&0&2&3&6&7&10&10&10&12 \\ \end{pmatrix}$

3.What is the time complexity of the algorithm

Time complexity is when top-down parser tries to parse an ambiguous input, it may need exponential number of steps (with respect to the length of the input) to try all alternatives in order to produce all possible output, which eventually would require exponential memory space. The problem of exponential time complexity in top-down constructed as sets of mutually recursive functions.

The key idea is to store results of applying a parser p at position j in a memotable and to reuse results whenever the same situation arises, we also use memoization for refraining redundant computations to accommodate any form of CFG in polynomial time (Θ(n⁴) for left-recursive grammars and Θ(n³) for non left-recursive grammars). Their top-down parsing algorithm also requires polynomial space for potentially exponential ambiguous parse trees by 'compact representation' and 'local ambiguities grouping'.

Reference

1 ICS 161 -- Dept. Information & Computer Science -- UC Irvine 21 Aug 2011, 08:47:13 PDT.

2 Stephen F. Altschul; and Bruce W. Erickson (1986). "Optimal sequence alignment using affine gap costs". Bulletin of Mathematical Biology 48: 603–616. DOI:10.1007/BF02462326.

3 Miller, Webb and Myers, Eugene (1988). "Optimal alignments in linear space". Computer Applications in Biosciences (CABIOS) 4: 11–17.

4 Smith, Temple F.; and Waterman, Michael S. (1981). "Identification of Common Molecular Subsequences". Journal of Molecular Biology 147: 195–197. DOI:10.1016/0022-2836(81)90087-5. PMID 7265238.

5 Stephen F. Altschul; and Bruce W. Erickson (1986). "Optimal sequence alignment using affine gap costs". Bulletin of Mathematical Biology 48: 603–616. DOI:10.1007/BF02462326.

6 Miller, Webb and Myers, Eugene (1988). "Optimal alignments in linear space". Computer Applications in Biosciences (CABIOS) 4: 11–17.

Directory: courses
courses -> Algorithms for Register Allocation Andrew Pardoe
courses -> Ieor 4600 Mixed-Integer Rounding Inequalities
courses -> San José State University Social Science/Psychology Psych 175, Management Psychology, Section 1, Spring 2014
courses -> Kennan's Telegram (Excerpt)
courses -> The university of british columbia
courses -> A hurricane track density function and empirical orthogonal function approach to predicting seasonal hurricane activity in the Atlantic Basin Elinor Keith April 17, 2007 Abstract
courses -> Cover page need figure & title by Proposed Table of Contents
courses -> Objectives
courses -> Dr. Jeff Masters' WunderBlog

Download 21.33 Kb.

Share with your friends: