Needleman-Wunsch - producing more than the optimal solution -- pleasehelp

almurph · Sep 17, 2008

Folks,

I want to modify the Needleman Wunsch algorithm (pseudo-code here:
http://en.wikipedia.org/wiki/Needleman-Wunsch_algorithm) so that it
produces more than the optimal solution.

What I want is the optimal solution, second optimal solution, third
and so on do the chain.

Problem is though I don't know how to do this. Can anyone help
please? Any comments/suggestions/code-sample would be most
appreciated.

Thanks,
Al.

Ethan Strauss · Sep 17, 2008

Hi Chris,
I am not sure exactly what you have right now, but I have been using a
Smith Waterman alignment algorithm I got from SourceForge
(http://jaligner.sourceforge.net/naligner/) which seems to work pretty well.
I have thought about trying to get it to do pretty much what you have said
you are looking for, but have not yet done so. If it would be helpful,
perhaps we could work on this together? I never have quite wrapped my brain
around these dynamic programming algorithms and perhaps it is time.
Let me know if you are interested.
Ethan

Ethan Strauss Ph.D.
Bioinformatics Scientist
Promega Corporation
2800 Woods Hollow Rd.
Madison, WI 53711
608-274-4330
800-356-9526
(e-mail address removed)

almurph · Sep 18, 2008

Ethan/Peter,

Firstly thank you both for your comments. Peter, I am familiar with
Smith-Waterman. The implementation that you have indicated (Java I
think) is for Gotoh's improvement. I have seen this before and found
it a bit complex. With that in mind could you have a look at the
following URL:
http://mlab.cb.k.u-tokyo.ac.jp/~moris/LSGSP/Chapter6_ApproximateStringMatching.java

There are several algorithms here (including smith-Waterman-Gotoh)
and I prefer this implementation. I think it uses Affile Gap Penalties
(cf. http://en.wikipedia.org/wiki/Gap_penalty)

As for doing a callaboration together, this takes me to my next
point. Peter, thank you for you comments. It's quite a response. I am
aware of the danger here - the problem is exponential, this is true.
However, the website: http://www.avatar.se/molbioinfo2001/dynprog/dynamic.html
(particularly the bit at the bottom concerning "alternative
solutions") leads me to believe that it may be possible - in the sense
of working at the traceback level and using the fact that the
solutions must never exceed the maximum value (smith waterman) or the
bottom right value (needleman-wunsch)

However I do not have a nice metric to offer to rank any solutions
that may be obtained. This still eludes me.

I am particularly interested in solutions that minimise any
introduction of a gap in the corpus text (i.e. the text that you are
searching through). This condition could help to reduce the sheer
number of possible permutations, I think...

I would like to offer one more thing. This is slightly off topic but
I would like to offer it tentatively. I obtained the optimal solution
and compared and contrasted it with the corpus in an effort to extract
a regular expression. I then use this regular expression to search for
similar patterns. Something like this:

CORPUS: ZYGZBCZ
PATTERN: ZYY-BCZ

Hence I extract a regEx like: "ZY[GY]?BCZ" and see if patterns like
this exist elsewhere. Its off topic and a bit of a dodge to be quite
honest but I tentatively offer it.

Al.

Should you simplify this code? Write only code	1	Sep 9, 2008
Changing the floating point arithmetic model in asp.net -- pleasehelp	1	Mar 16, 2009
Wanted: strign searching algorithm <--- suggestions appreciated...	3	Jun 19, 2008
Start more than one async methods but wait previous to finish before start next	4	Apr 23, 2007
Static instance Event fires more than one time	9	May 24, 2007
Uploading file more than 5 mb,sound file, media file.	2	Jul 29, 2004
How to Split Related Records in Two if Greater than 15	4	Mar 22, 2010
Paradigm for multiple IDisposables	3	Sep 22, 2006

Needleman-Wunsch - producing more than the optimal solution -- pleasehelp

almurph

Ethan Strauss

almurph

Ask a Question

Similar Threads