next up previous
Next: Bibliography Up: Probabilistic ancestral sequences and Previous: Indels.

Conclusions

This algorithm computes maximum likelihood multiple alignments given the sequences and their EPT. A maximum likelihood alignment under a Markovian model of evolution is a very desirable type of alignment, as it has well understood properties. This algorithm was implemented in Darwin between 1990 and 1991. It went through several (some major) revisions until about 1993 and it has been in use by the community since then. This is one of the rare cases where the algorithm had a successful service life before being published. It can be used through the Web (http://cbrg.inf.ethz.ch/) or by e-mail to (cbrg.inf.ethz.ch) from an automatic Darwin server. The server handles about 800 requests per month, of which about 34% involve multiple alignments. So far the feedback from users has been very positive.

A more formal evaluation of this algorithm is required. This has been done with a technique where we generate random EPTs and randomly mutate the sequences. From the resulting mutated sequences at the leaves of the tree we try to reconstruct the multiple alignment. It is easy to verify the accuracy, as we know the correct alignment by construction. The comparisons were done for this and several other algorithms available to us. These results and the description of the methodology exceed the scope of the paper and will be available in [9].

One problem affects the quality of these multiple alignments. The results are dependent on a correct EPT, which is not always available. Minor errors in the EPT are of little consequence, unless they straddle incorrectly a point where an indel occurred. This causes the same indel to be in separate subtrees of the EPT. The above algorithm will handle this by repeating the indel. As a result the two indels may not match. Heuristics to try to fix this problem include weighting the indels in a special way in the subtrees, but have not been entirely successful. It should be noted that if the EPT is correct then the algorithm described here will work correctly.


next up previous
Next: Bibliography Up: Probabilistic ancestral sequences and Previous: Indels.
Gaston Gonnet
1998-07-14