source: trunk/GDE/PHYLIP/doc/dolmove.html

Last change on this file was 2176, checked in by westram, 21 years ago

1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
2<HTML>
3<HEAD>
4<TITLE>dolmove</TITLE>
5<META NAME="description" CONTENT="dolmove">
6<META NAME="keywords" CONTENT="dolmove">
7<META NAME="resource-type" CONTENT="document">
8<META NAME="distribution" CONTENT="global">
9<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
10</HEAD>
11<BODY BGCOLOR="#ccffff">
12<DIV ALIGN=RIGHT>
13version 3.6
14</DIV>
15<P>
16<DIV ALIGN=CENTER>
17<H1>DOLMOVE  -- Interactive Dollo and Polymorphism Parsimony</H1>
18</DIV>
19
20<P>
21&#169; Copyright 1986-2002 by the University of
22Washington.  Written by Joseph Felsenstein.  Permission is granted to copy
23this document provided that no fee is charged for it and that this copyright
24notice is not removed.
25<P>
26DOLMOVE is an interactive parsimony program which uses the Dollo and
27Polymorphism parsimony criteria.  It is inspired by Wayne Maddison and
28David Maddison's marvellous program MacClade, which is written for Apple
29Macintosh computers.  DOLMOVE reads in a data set which is prepared in almost
30the same format as one for the Dollo and polymorphism parsimony program
31DOLLOP.  It allows the user to choose an initial tree, and displays this tree
32on the screen.  The user can look at different characters and the way their
33states are distributed on that tree, given the most parsimonious reconstruction
34of state changes for that particular tree.  The user then can specify how the
35tree is to be rearranged, rerooted or written out to a file.  By looking at
36different rearrangements of the tree the user can manually search for the most
37parsimonious tree, and can get a feel for how different characters are affected
38by changes in the tree topology. 
39<P>
40This program is compatible with fewer computer systems than the other
41programs in PHYLIP.  It can be adapted to PCDOS systems or to
42any system whose screen or terminals emulate DEC VT100
43terminals (such as Telnet programs for logging in to remote computers over a
44TCP/IP network,
45VT100-compatible windows in the X windowing system, and any
46terminal compatible with ANSI standard terminals).
47For any other screen types, there is a generic option which does
48not make use of screen graphics characters to display the character
49states.  This will be less effective, as the states will be less
50easy to see when displayed.
51<P>
52The input data file is set up almost identically to the data files for
53DOLLOP.
54<P>
55The user interaction starts with the program presenting a menu.  The
56menu looks like this:
57<P>
58<TABLE><TR><TD BGCOLOR=white>
59<PRE>
60
61Interactive Dollo or polymorphism parsimony, version 3.6a3
62
63Settings for this run:
64  P                        Parsimony method?  Dollo
65  A                     Use ancestral states?  No
66  F                  Use factors information?  No
67  W                           Sites weighted?  No
68  T                 Use Threshold parsimony?  No, use ordinary parsimony
69  A      Use ancestral states in input file?  No
70  U Initial tree (arbitrary, user, specify)?  Arbitrary
71  0      Graphics type (IBM PC, ANSI, none)?  (none)
72  L               Number of lines on screen?  24
73  S                Width of terminal screen?  80
74
75
76Are these settings correct? (type Y or the letter for one to change)
77</PRE>
78</TD></TR></TABLE>
79<P>
80The P (Parsimony Method) option is the one that toggles between polymorphism
81parsimony and Dollo parsimony.  The program defaults to Dollo parsimony.
82<P>
83The T (Threshold), F (Factors), A (Ancestors), and 0 (Graphics type) options
84are the usual
85ones and are described in the main documentation page and in the
86Discrete Characters Program documentation page. 
87(<B>Note: at present DOLMOVE actually
88does not use the A (Ancestral states) information</B>). The F (Factors)
89option is used to inform the program which
90groups of characters are to be counted together in computing the number of
91characters compatible with the tree.  Thus if three binary characters are all
92factors of the same multistate character, the multistate character will
93be counted as compatible with the tree only if all three factors are compatible
94with it.
95<P>
96The L option allows
97the program to take advantage of larger screens if available.  The X (Mixed
98Methods) option is not available in DOLMOVE.  The
99U (initial tree) option allows the user to choose whether
100the initial tree is to be arbitrary, interactively specified by the user, or
101read from a tree file.  Typing U causes the program to change among the
102three possibilities in turn.  I
103would recommend that for a first run, you allow the tree to be set up
104arbitrarily (the default), as the "specify" choice is difficult
105to use and the "user tree" choice requires that you have available a tree file
106with the tree topology of the initial tree.
107Its default name is <TT>intree</TT>.  The program will ask you for its name if
108it looks for the input tree file and does not find one of this name.
109If you wish to set up some
110particular tree you can also do that by the rearrangement commands specified
111below.  The T (threshold) option allows a continuum of methods between
112parsimony and compatibility.  Thresholds less than or equal to 0 do not
113have any
114meaning and should not be used: they will result in a tree dependent only on
115the input order of species and not at all on the data!
116Note that the usual W (Weights) option is not available in DOLMOVE.  We
117hope to add it soon.
118<P>
119After the initial menu is displayed and the choices are made,
120the program then sets up an initial tree and displays it.  Below it will be a
121one-line menu of possible commands, which looks like this:
122<P>
123<PRE>
124NEXT? (Options: R # + - S . T U W O F C H ? X Q) (H or ? for Help)
125</PRE>
126<P>
127If you type H or ? you will get a single screen showing a description of each
128of these commands in a few words.  Here are slightly more detailed
129descriptions:
130<P>
131<DL>
132<DT>R</DT> <DD>("Rearrange").  This command asks for the number of a node which is to be
133removed from the tree.  It and everything to the right of it on the tree is to
134be removed (by breaking the branch immediately below it).  The command also
135asks for the number of a node below which that group is to be inserted.  If an
136impossible number is given, the program refuses to carry out the rearrangement
137and asks for a new command.  The rearranged tree is displayed: it will often
138have a different number of steps than the original.  If you wish to undo a
139rearrangement, use the Undo command, for which see below.</DD>
140<P>
141<DT>#</DT> <DD>This command, and the +, - and S commands described below, determine
142which character has its states displayed on the branches of
143the trees.  The initial tree displayed by the program does not show
144states of sites.  When # is typed, the program does not ask the user which
145character is to be shown but automatically shows the states of the next
146binary character that is not compatible with the tree (the next character that
147does not
148perfectly fit the current tree).  The search for this character "wraps around"
149so that if it reaches the last character without finding one that is not
150compatible with the tree, the search continues at the first character; if no
151incompatible character is found the current character is shown, and if no
152current character is shown then the first character is shown.  If the last
153character has been reached, using + again causes the first
154character to be shown.  The display takes the form of
155different symbols or textures on the branches of the tree.  The state of each
156branch is actually the state of the node above it.  A key of the symbols or
157shadings used for states 0, 1 and ? is shown next to the tree.  State ? means
158that either state 0 or state 1 could exist at that point on the tree, and that
159the user may want to consider the different possibilities, which are usually
160apparent by inspection. </DD>
161<DT>+</DT> <DD>This command is the same as # except that it goes forward one character,
162showing the states of the next character.  If no character has been shown,
163using + will
164cause the first character to be shown.  Once the last character has been
165reached, using + again will show the first character.</DD>
166<P>
167<DT>-</DT> <DD>This command is the same as + except that it goes backwards, showing the
168states of the previous character.  If no character has been shown, using - will
169cause the last character to be shown.  Once character number 1 has been
170reached, using - again will show the last character.</DD>
171<P>
172<DT>S</DT> <DD>("Show").  This command is the same as + and - except that it causes
173the program to ask you for the number of a character.  That character is
174the one whose states will be displayed.  If you give the character number as 0,
175the program will go back to not showing the states of the characters.</DD>
176<P>
177<DT>. (dot)</DT> <DD>This command simply causes the current tree to be redisplayed.  It is of
178use when the tree has partly disappeared off of the top of the screen owing to
179too many responses to commands being printed out at the bottom of the screen.  </DD>
180<P>
181<DT>T</DT> <DD>("Try rearrangements").  This command asks for the number of a node.  The
182part of the tree at and above that node is removed from the tree.  The program
183tries to re-insert it in each possible location on the tree (this may take some
184time, and the program reminds you to wait).  Then it prints out a summary.  For
185each possible location the program prints out the number of the node to the
186right of the
187place of insertion and the number of steps required in each case.  These are
188divided into those that are better, tied, or worse than the current tree.  Once
189this summary is printed out, the group that was removed is inserted into its
190original position.  It is up to you to use the R command to actually carry out
191any of the rearrangements that have been tried. </DD>
192<P>
193<DT>U</DT> <DD>("Undo").  This command reverses the effect of the most recent
194rearrangement, outgroup re-rooting, or flipping of branches.  It returns to the
195previous tree topology.  It will be of great use when rearranging the tree and
196when a rearrangement proves worse than the preceding one -- it permits you to
197abandon the new one and return to the previous one without remembering its
198topology in detail.</DD>
199<P>
200<DT>W</DT> <DD>("Write").  This command writes out the current tree onto a tree output
201file.  If the file already has been written to by this run of DOLMOVE, it will
202ask you whether you want to replace the contents of the file, add the tree to
203the end of the file, or not write out the tree to the file.  The tree
204is written in the standard format used by PHYLIP (a subset of the
205Newick standard).  It is in the proper format to serve as the
206User-Defined Tree for setting up the initial tree in a subsequent run of the
207program.</DD>
208<P>
209<DT>O</DT> <DD>("Outgroup").  This asks for the number of a node which is to be the
210outgroup.  The tree will be redisplayed with that node
211as the left descendant of the bottom fork.  The number of
212steps required on the tree may change on re-rooting.  Note that it is possible
213to use this to make a multi-species group the outgroup (i.e., you can give the
214number of an interior node of the tree as the outgroup, and the program will
215re-root the tree properly with that on the left of the bottom fork).</DD>
216<P>
217<DT>F</DT> <DD>("Flip").  This asks for a node number and then flips the two branches at
218that, so that the left-right order of branches at that node is
219changed.  This does not actually change the tree topology (or the number of
220steps on that tree) but it does change the appearance of the tree.</DD>
221<P>
222<DT>C</DT> <DD>("Clade").  When the data consist of more than 12 species (or more than
223half the number of lines on the screen if this is not 24), it may be
224difficult to display the tree on one screen.  In that case the tree
225will be squeezed down to
226one line per species.  This is too small to see all the interior states of the
227tree.  The C command instructs the program to print out only that part of the
228tree (the "clade") from a certain node on up.  The program will prompt you for
229the number of this node.  Remember that thereafter you are not looking at the
230whole tree.  To go back to looking at the whole tree give the C command again
231and enter "0" for the node number when asked.  Most users will not want to use
232this option unless forced to.</DD>
233<P>
234<DT>H</DT> <DD>("Help").  Prints a one-screen summary of what the commands do, a few
235words for each command.</DD>
236<P>
237<DT>?</DT> <DD>("huh?").  A synonym for H.  Same as Help command.</DD>
238<P>
239<DT>X</DT> <DD>("Exit").  Exit from program.  If the current tree has not yet been saved
240into a file, the program will ask you whether it should be saved.</DD>
241<P>
242<DT>Q</DT> <DD>("Quit").  A synonym for X.  Same as the eXit command.</DD>
243</DL>
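<P>
The wrap-around search described for the # command can be sketched as
follows.  This is an illustrative Python sketch, not code taken from DOLMOVE:
the function name <TT>next_incompatible</TT> and the list of per-character
compatibility flags are assumptions made for the example.

```python
def next_incompatible(compatible, current):
    """Return the 1-based number of the character to display next.

    compatible -- list of booleans, one per character (hypothetical input;
                  True means the character fits the current tree perfectly)
    current    -- number of the character now shown, or 0 if none is shown
    """
    n = len(compatible)
    # Look at the characters after the current one, wrapping around to the
    # first character once the last one has been passed.
    for offset in range(1, n + 1):
        candidate = (current - 1 + offset) % n + 1
        if not compatible[candidate - 1]:
            return candidate
    # No incompatible character exists: keep the current character,
    # or fall back to the first one if nothing is being shown.
    return current if current != 0 else 1


flags = [True, True, True, False, False, True]
print(next_incompatible(flags, 0))   # first incompatible character
print(next_incompatible(flags, 5))   # search wraps past character 6
```

The modular arithmetic is one simple way to express "continue at the first
character"; the program itself may organize the search differently.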
244<P>
245<H2>OUTPUT</H2>
246<P>
247If the A option is used, then
248the program will infer, for any character whose ancestral state is unknown
249("?") whether the ancestral state 0 or 1 will give the fewest changes
250(according to the criterion in use).  If these are tied, then it may not be
251possible for the program to infer the state in the internal nodes, and many of
252these will be shown as "?".  If the A option is not used, then the program will
253assume 0 as the ancestral state.
254<P>
255When reconstructing the placement of forward
256changes and reversions under the Dollo method, keep in mind that each
257polymorphic state in the input data will require one "last minute"
258reversion.  This is included in the counts.  Thus if we have both states 0 and
2591 at a tip of the tree the program will assume that the lineage had state 1 up
260to the last minute, and then state 0 arose in that population by reversion,
261without loss of state 1. 
262<P>
263When DOLMOVE calculates the number of characters
264compatible with the tree, it will take the F option into
265account and count the multistate characters as units, counting a character
266as compatible with the tree only when all of the binary characters
267corresponding to it are compatible with the tree.
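<P>
The counting rule for factors can be illustrated with a short Python sketch.
It assumes, for the purpose of the example only, that the factors information
is available as a string of symbols in which adjacent binary characters
sharing a symbol belong to the same multistate character; the function name
<TT>count_compatible</TT> and the input list are hypothetical and are not part
of DOLMOVE.

```python
from itertools import groupby


def count_compatible(compatible, factors=None):
    """Count characters compatible with the tree.

    compatible -- list of booleans, one per binary character
    factors    -- optional string of symbols, one per binary character;
                  runs of equal symbols mark one multistate character
                  (an assumed representation for this sketch)
    """
    if factors is None:
        # Without factors every binary character is counted on its own.
        return sum(1 for ok in compatible if ok)
    count = 0
    start = 0
    for _, run in groupby(factors):
        size = len(list(run))
        # A multistate character counts only if all of its factors fit.
        if all(compatible[start:start + size]):
            count += 1
        start += size
    return count


flags = [True, True, True, False, True, True]
print(count_compatible(flags))             # binary characters counted singly
print(count_compatible(flags, "111223"))   # three multistate characters
```

In the second call the middle group contains one incompatible factor, so that
multistate character is not counted even though its other factor fits.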
268<P>
269<H2>ADAPTING THE PROGRAM TO YOUR COMPUTER AND TO YOUR TERMINAL</H2>
270<P>
271As we have seen, the initial menu of the program allows you to choose
272among three screen types (PC, ANSI, and none). 
273If you want to
274avoid having to make this choice every time, you can change
275some of the
276constants in the file <TT>phylip.h</TT> to have the terminal type initialize
277itself in the proper way, and recompile.
278The constants that need attention are ANSICRT and IBMCRT.
279Currently these are both set to "false" on Macintosh and on Unix/Linux
280systems, and IBMCRT is set to "true" on Windows systems.  If your system
281has an ANSI compatible terminal, you might want to find the
282definition of ANSICRT in <TT>phylip.h</TT> and set it to "true", and
283IBMCRT to "false".
284<P>
285<H2>MORE ABOUT THE PARSIMONY CRITERION</H2>
286<P>
287DOLMOVE uses as its numerical criterion the Dollo and
288polymorphism parsimony methods.  The program defaults to carrying out Dollo
289parsimony. 
290<P>
291The Dollo parsimony method was
292first suggested in print in verbal form by Le Quesne (1974) and was
293first well-specified by Farris (1977).  The method is named after Louis
294Dollo since he was one of the first to assert that in evolution it is
295harder to gain a complex feature than to lose it.  The algorithm
296explains the presence of the state 1 by allowing up to one forward
297change 0-->1 and as many reversions 1-->0 as are necessary to explain
298the pattern of states seen.  The program attempts to minimize the number
299of 1-->0 reversions necessary.
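<P>
The idea of a single forward change followed by as many reversions as needed
can be sketched for one fixed, rooted tree.  The Python below is only an
illustration under stated assumptions: the tree is written as nested tuples
with an arbitrarily chosen topology, the ancestral state is taken to be 0,
polymorphic and unknown states are not handled, and only reversions are
counted.  The function names are hypothetical, and the totals are not claimed
to match the step counts that DOLMOVE displays.

```python
def tips(node):
    """List the tip names at or above a node (a tip is a plain string)."""
    if isinstance(node, str):
        return [node]
    return [t for child in node for t in tips(child)]


def count_losses(node, states):
    """Count maximal groups lacking state 1 inside a group that has it."""
    if all(states[t] == 0 for t in tips(node)):
        return 1          # one reversion on the branch below this group
    if isinstance(node, str):
        return 0          # a tip that still has state 1
    return sum(count_losses(child, states) for child in node)


def dollo_losses(tree, states):
    """Reversions needed when state 1 arises once, as late as possible."""
    ones = set(t for t in tips(tree) if states[t] == 1)
    if not ones:
        return 0
    node = tree
    # Walk up to the smallest group that contains every tip with state 1.
    while not isinstance(node, str):
        holders = [c for c in node if any(t in ones for t in tips(c))]
        if len(holders) != 1:
            break
        node = holders[0]
    return count_losses(node, states)


# The test data set of this document, on an arbitrarily chosen tree.
data = {"Alpha": "110110", "Beta": "110000", "Gamma": "100110",
        "Delta": "001001", "Epsilon": "001110"}
tree = ((("Alpha", "Beta"), "Gamma"), ("Delta", "Epsilon"))
per_character = [
    dollo_losses(tree, {name: int(row[i]) for name, row in data.items()})
    for i in range(6)
]
print(per_character)        # [0, 0, 0, 2, 2, 0]
print(sum(per_character))   # 4
```

Placing the forward change any lower on the tree than the smallest group
containing all tips with state 1 could only add reversions, which is why the
sketch starts counting from that group.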
300<P>
301The assumptions of this method are in effect:
302<P>
303<OL>
304<LI>We know which state is the ancestral one (state 0).
305<LI>The characters are evolving independently.
306<LI>Different lineages evolve independently.
307<LI>The probability of a forward change (0-->1) is small over the
308evolutionary times involved.
309<LI>The probability of a reversion (1-->0) is also small, but
310still far larger than the probability of a forward change, so
311that many reversions are easier to envisage than even one
312extra forward change.
313<LI>Retention of polymorphism for both states (0 and 1) is highly
314improbable.
315<LI>The lengths of the segments of the true tree are not so
316unequal that two changes in a long segment are as probable as
317one in a short segment.
318</OL>
319<P>
320One problem can arise when using additive binary recoding to
321represent a multistate character as a series of two-state characters.  Unlike
322the Camin-Sokal, Wagner, and Polymorphism methods, the Dollo
323method can reconstruct ancestral states which do not exist.  An example
324is given in my 1979 paper.  It will be necessary to check the output to
325make sure that this has not occurred.
326<P>
327The polymorphism parsimony method was first used by me,
328and the results published
329(without a clear
330specification of the method) by Inger (1967).  The method was
331published by Farris (1978a) and by me (1979).  The method
332assumes that we can explain the pattern of states by no more than one
333origination (0-->1) of state 1, followed by retention of polymorphism
334along as many segments of the tree as are necessary, followed by loss of
335state 0 or of state 1 where necessary.  The program tries to minimize
336the total number of polymorphic characters, where each polymorphism is
337counted once for each segment of the tree in which it is retained.
338<P>
339The assumptions of the polymorphism parsimony method are in effect:
340<P>
341<OL>
342<LI>The ancestral state (state 0) is known in each character.
343<LI>The characters are evolving independently of each other.
344<LI>Different lineages are evolving independently.
345<LI>Forward change (0-->1) is highly improbable over the length of
346time involved in the evolution of the group.
347<LI>Retention of polymorphism is also improbable, but far more
348probable that forward change, so that we can more easily
349envisage much polymorhism than even one additional forward
350change.
351<LI>Once state 1 is reached, reoccurrence of state 0 is very
352improbable, much less probable than multiple retentions of
353polymorphism.
354<LI>The lengths of segments in the true tree are not so unequal
355that we can more easily envisage retention events occurring in
356both of two long segments than one retention in a short
357segment.
358</OL>
359<P>
360That these are the assumptions of parsimony methods has been documented
361in a series of papers of mine: (1973a, 1978b, 1979, 1981b,
3621983b, 1988b).  For an opposing view arguing that the parsimony methods
363make no substantive
364assumptions such as these, see the papers by Farris (1983) and Sober (1983a,
3651983b), but also read the exchange between Felsenstein and Sober (1986). 
366<P>
367Below is a test data set, but we cannot show the
368output it generates because of the interactive nature of the program.
369<P>
370<HR>
371<P>
372<H3>TEST DATA SET</H3>
373<P>
374<TABLE><TR><TD BGCOLOR=white>
375<PRE>
376     5    6
377Alpha     110110
378Beta      110000
379Gamma     100110
380Delta     001001
381Epsilon   001110
382</PRE>
383</TD></TR></TABLE>
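<P>
For readers who want to inspect such a file outside of PHYLIP, the layout of
the test data set can be read with a few lines of Python.  This is a minimal
sketch that assumes the simple case shown above: one line per species, a
species name occupying the first ten columns, and no options or weights in the
file.  The function name <TT>parse_discrete</TT> is hypothetical.

```python
def parse_discrete(text):
    """Parse a small discrete-characters data set into {name: states}."""
    lines = [line for line in text.splitlines() if line.strip()]
    # First line: number of species, then number of characters.
    header = lines[0].split()
    nspecies, nchars = int(header[0]), int(header[1])
    data = {}
    for line in lines[1:1 + nspecies]:
        name = line[:10].strip()              # name field, ten columns
        states = line[10:].replace(" ", "")   # blanks between states ignored
        if len(states) != nchars:
            raise ValueError("wrong number of characters for " + name)
        data[name] = states
    if len(data) != nspecies:
        raise ValueError("wrong number of species")
    return data


example = """     5    6
Alpha     110110
Beta      110000
Gamma     100110
Delta     001001
Epsilon   001110
"""
matrix = parse_discrete(example)
print(matrix["Delta"])   # 001001
```

The dictionary returned here has the same shape as the one used for counting
reversions on a tree, so the two pieces can be combined when experimenting.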
384</BODY>
385</HTML>