source: branches/stable/GDE/PHYLIP/doc/mix.html

Last change on this file was 2176, checked in by westram, 21 years ago

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>mix</TITLE>
<META NAME="description" CONTENT="mix">
<META NAME="keywords" CONTENT="mix">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
</HEAD>
<BODY BGCOLOR="#ccffff">
<DIV ALIGN=RIGHT>
version 3.6
</DIV>
<P>
<DIV ALIGN=CENTER>
<H1>MIX - Mixed method discrete characters parsimony</H1>
</DIV>
<P>
&#169; Copyright 1986-2002 by the University of
Washington.  Written by Joseph Felsenstein.  Permission is granted to copy
this document provided that no fee is charged for it and that this copyright
notice is not removed.
<P>
MIX is a general parsimony program which carries out the Wagner and
Camin-Sokal parsimony methods in mixture, where each character can have
its method specified separately.  The program defaults to carrying out Wagner
parsimony.
<P>
The Camin-Sokal parsimony method explains the data by assuming that
changes 0 --> 1 are allowed but not changes 1 --> 0.  Wagner parsimony
allows both kinds of changes.  (This is under the assumption that 0 is the
ancestral state, though the program allows reassignment of the ancestral
state, in which case we must reverse the state numbers 0 and 1
throughout this discussion.)  The criterion is to find the tree which
requires the minimum number of changes.  The Camin-Sokal method is due
to Camin and Sokal (1965) and the Wagner method to Eck and Dayhoff
(1966) and to Kluge and Farris (1969).
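<P>
As an illustration of the two criteria, the following Python sketch counts
the steps that one fixed tree requires for each binary character.  It is not
the code used by MIX: the nested-tuple tree encoding and the function names
<TT>wagner_steps</TT> and <TT>camin_sokal_steps</TT> are assumptions made for
this example, the Wagner count uses the standard Fitch set-intersection pass,
and only states 0 and 1 are handled (not "?", "P" or "B").  The data and the
tree are the test data set and the first tree of the test set output given at
the end of this document.

```python
# Illustrative sketch only: step counting for binary characters on one tree.
# Tree encoding and function names are assumptions, not part of MIX.

DATA = {
    "Alpha":   "110110",
    "Beta":    "110000",
    "Gamma":   "100110",
    "Delta":   "001001",
    "Epsilon": "001110",
}

# First tree of the test set output, written as nested pairs.
TREE = ((("Epsilon", "Gamma"), ("Delta", "Beta")), "Alpha")


def wagner_steps(tree, data, char):
    """Wagner criterion: a change in either direction costs one step."""
    def visit(node):
        if isinstance(node, str):                  # a tip: its observed state
            return {data[node][char]}, 0
        (lset, lsteps), (rset, rsteps) = visit(node[0]), visit(node[1])
        common = lset & rset
        if common:                                 # children can agree: no step
            return common, lsteps + rsteps
        return lset | rset, lsteps + rsteps + 1    # disagreement: one step
    return visit(tree)[1]


def camin_sokal_steps(tree, data, char):
    """Camin-Sokal criterion with ancestral state 0: only 0 --> 1 is allowed,
    so each maximal clade whose tips are all 1 costs exactly one step."""
    def visit(node):
        # Returns (are all tips below this node 1?, steps inside the clade).
        if isinstance(node, str):
            return data[node][char] == "1", 0
        (lall, lsteps), (rall, rsteps) = visit(node[0]), visit(node[1])
        if lall and rall:
            return True, 0                         # change is counted higher up
        return False, lsteps + rsteps + int(lall) + int(rall)
    all_ones, steps = visit(tree)
    return steps + int(all_ones)                   # change on the root branch


wagner = [wagner_steps(TREE, DATA, c) for c in range(6)]
camin_sokal = [camin_sokal_steps(TREE, DATA, c) for c in range(6)]
print(wagner, sum(wagner))
print(camin_sokal, sum(camin_sokal))
```

<P>
Under the Wagner criterion this tree needs 2, 2, 2, 1, 1 and 1 steps in the
six characters, 9 in all, which agrees with the test set output.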
<P>
Here are the assumptions of these two methods:
<P>
<OL>
<LI>Ancestral states are known (Camin-Sokal) or unknown (Wagner).
<LI>Different characters evolve independently.
<LI>Different lineages evolve independently.
<LI>Changes 0 --> 1 are much more probable than changes 1 --> 0
(Camin-Sokal) or equally probable (Wagner).
<LI>Both of these kinds of changes are a priori improbable over the
evolutionary time spans involved in the differentiation of the
group in question.
<LI>Other kinds of evolutionary event such as retention of polymorphism
are far less probable than 0 --> 1 changes.
<LI>Rates of evolution in different lineages are sufficiently low that
two changes in a long segment of the tree are far less probable
than one change in a short segment.
</OL>
<P>
That these are the assumptions of parsimony methods has been documented
in a series of papers of mine (1973a, 1978b, 1979, 1981b,
1983b, 1988b).  For an opposing view arguing that the parsimony methods
make no substantive
assumptions such as these, see the papers by Farris (1983) and Sober (1983a,
1983b), but also read the exchange between Felsenstein and Sober (1986).
<P>
<H2>INPUT FORMAT</H2>
<P>
The input for MIX is the standard input for discrete characters
programs, described above in the documentation file for the
discrete-characters programs.  States "?", "P", and "B" are allowed.
<P>
The options are selected using a menu:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>

Mixed parsimony algorithm, version 3.6a3

Settings for this run:
  U                 Search for best tree?  Yes
  X                     Use Mixed method?  No
  P                     Parsimony method?  Wagner
  J     Randomize input order of species?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species  1
  T              Use Threshold parsimony?  No, use ordinary parsimony
  A   Use ancestral states in input file?  No
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, ANSI, none)?  (none)
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4     Print out steps in each character  No
  5     Print states at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

Are these settings correct? (type Y or the letter for one to change)

</PRE>
</TD></TR></TABLE>
<P>
The options U, X, J, O, T, A, and M are the usual User Tree, miXed
methods, Jumble, Outgroup, Threshold,
Ancestral States, and Multiple Data Sets options, described either
in the main documentation file or in the Discrete Characters Programs
documentation file.  The
user-defined trees supplied if you use the U option must be given as rooted
trees with two-way splits (bifurcations).  The O option is acted upon only if
the final tree is unrooted and is not a user-defined tree.  One of the
important uses of the O option is to root the tree so that if there are
any characters in which the ancestral states have not been specified, the
program will print out a table showing which ancestral states require the
fewest steps.  Note that when any of the characters has Camin-Sokal parsimony
assumed for it, the tree is rooted and the O option will have no effect.
<P>
The option P toggles between the Camin-Sokal parsimony criterion
and the default Wagner parsimony criterion.  Option X invokes
mixed-method parsimony.  If the A option is invoked, the ancestor is not
to be counted as one of the species.
<P>
The F (Factors)
option is not available in this program, as it would have no effect on
the result even if that information were provided in the input file.
<P>
<H2>OUTPUT FORMAT</H2>
<P>
Output is standard: a list of equally parsimonious trees, which will be printed
as rooted or unrooted depending on which is appropriate, and, if the
user chooses, a table of the
number of changes of state required in each character.  If the Wagner option is
in force for a character, it may not be possible to unambiguously locate the
places on the tree where the changes occur, as there may be multiple
possibilities.  If the user selects menu option 5, a table is printed out
after each tree, showing for each
branch whether there are known to be changes in the branch, and what the states
are inferred to have been at the top end of the branch.  If the inferred state
is a "?" there will be multiple equally-parsimonious assignments of states; the
user must work these out for themselves by hand.
<P>
If the Camin-Sokal parsimony method
is invoked and the Ancestors option is also used, then the program will
infer, for any character whose ancestral state is unknown ("?"), whether the
ancestral state 0 or 1 will give the fewest state changes.  If these are
tied, then it may not be possible for the program to infer the
state in the internal nodes, and these will all be printed as ".".  If this
has happened and you want to know more about the states at the internal
nodes, you will find it helpful to use MOVE to display the tree and examine
its interior states, as the algorithm in MOVE shows all that can be known
in this case about the interior states, including where there is and is not
ambiguity.  The algorithm in MIX gives up more easily on displaying these
states.
<P>
If the A option is not used, then the program will assume 0 as the
ancestral state for those characters following the Camin-Sokal method,
and will assume that the ancestral state is unknown for those characters
following Wagner parsimony.  If any characters have unknown ancestral
states, and if the resulting tree is rooted (even by outgroup),
a table will also be printed out
showing the best guesses of which are the ancestral states in each
character.  You will find it useful to understand the difference between
the Camin-Sokal parsimony criterion with unknown ancestral state and the Wagner
parsimony criterion.
<P>
If the U (User Tree) option is used and more than one tree is supplied, the
program also performs a statistical test of each of these trees against the
best tree.  This test is a version of the test proposed by
Alan Templeton (1983) and evaluated in a test case by me (1985a).  It is
closely parallel to a test using log likelihood differences
invented by Kishino and Hasegawa (1989), and uses the mean and variance of
step differences between trees, taken across characters.  If the mean
differs from zero by more than 1.96 standard deviations, the trees are declared
significantly different.  The program
prints out a table of the steps for each tree, the differences of
each from the lowest one, the variance of that quantity as determined by
the step differences at individual characters, and a conclusion as to
whether that tree is or is not significantly worse than the best one.  It
is important to understand that the test assumes that all the binary
characters are evolving independently, which is unlikely to be true for
many suites of morphological characters.
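<P>
The two-tree comparison can be sketched in Python as follows.  This is a
minimal illustration and not the code used by MIX: the function name
<TT>paired_steps_test</TT> is hypothetical, and the variance of the summed
difference is estimated as n/(n-1) times the sum of squared deviations of the
per-character differences, which is one common form of this statistic and is
assumed here.  The two lists are the steps in each character reported for the
first and second trees in the test set output.

```python
import math


def paired_steps_test(steps_a, steps_b, critical=1.96):
    """Compare two trees from their per-character step counts.

    Returns (summed difference, its estimated variance, z, significant?).
    The variance formula is an assumption of this sketch.
    """
    n = len(steps_a)
    if n != len(steps_b) or n < 2:
        raise ValueError("need two equal-length lists of at least 2 characters")
    diffs = [b - a for a, b in zip(steps_a, steps_b)]
    total = sum(diffs)
    mean = total / n
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    sd = math.sqrt(variance)
    if sd == 0.0:
        # Every character shows the same difference: no sampling variation.
        z = 0.0 if total == 0 else math.copysign(math.inf, total)
    else:
        z = total / sd
    return total, variance, z, abs(z) > critical


tree1 = [2, 2, 2, 1, 1, 1]   # steps per character, first tree
tree2 = [1, 2, 1, 2, 2, 1]   # steps per character, second tree
total, variance, z, significant = paired_steps_test(tree1, tree2)
print(total, variance, z, significant)
```

<P>
Both trees require 9 steps, so the summed difference is zero and the sketch
reports no significant difference between them.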
<P>
If there are more than two trees, the test done is an extension of
the KHT (Kishino-Hasegawa-Templeton) test, the SH test, due to Shimodaira
and Hasegawa (1999).  They pointed out
that a correction for the number of trees was necessary, and they
introduced a resampling method to make this correction.  In the version
used here the variances and covariances of the sums of steps across
characters are computed for all pairs of trees.  To test whether the
difference between each tree and the best one is larger than could have
been expected if they all had the same expected number of steps,
numbers of steps for all trees are sampled with these covariances and equal
means (Shimodaira and Hasegawa's "least favorable hypothesis"),
and a P value is computed from the fraction of times the difference between
the tree's value and the lowest number of steps exceeds that actually
observed.  Note that this sampling needs random numbers, and so the
program will prompt the user for a random number seed if one has not
already been supplied.  With the two-tree KHT test no random numbers
are used.
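<P>
The resampling step can be sketched in Python as follows.  This is an
illustration under stated assumptions and not the code used by MIX: the
function name <TT>sh_test</TT> is hypothetical, the samples are assumed to be
drawn from a multivariate normal distribution, and that distribution is
generated by weighting the centred per-character step counts with independent
standard normal deviates, which reproduces the estimated covariances of the
sums of steps without a matrix factorization.  All means are set equal (to
zero), as in the least favorable hypothesis.

```python
import math
import random


def sh_test(steps, n_samples=1000, seed=12345):
    """steps[t][c] is the number of steps tree t requires in character c.

    Returns one P value per tree (sketch; normal sampling is an assumption).
    """
    n_trees, n_chars = len(steps), len(steps[0])
    if n_chars < 2:
        raise ValueError("need at least two characters")
    rng = random.Random(seed)            # stands in for the seed prompt
    totals = [sum(row) for row in steps]
    best = min(totals)
    observed = [total - best for total in totals]

    # Centre each tree's per-character steps.  Scaling by sqrt(n/(n-1)) makes
    # the simulated sums have covariance n times the sample covariance.
    scale = math.sqrt(n_chars / (n_chars - 1))
    centred = [
        [(s - total / n_chars) * scale for s in row]
        for row, total in zip(steps, totals)
    ]

    exceed = [0] * n_trees
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_chars)]
        sim = [sum(zc * r for zc, r in zip(z, row)) for row in centred]
        lowest = min(sim)
        for t in range(n_trees):
            if sim[t] - lowest > observed[t]:
                exceed[t] += 1
    return [count / n_samples for count in exceed]


# Steps in each character for the four trees of the test set output.
FOUR_TREES = [
    [2, 2, 2, 1, 1, 1],
    [1, 2, 1, 2, 2, 1],
    [2, 2, 2, 1, 1, 1],
    [2, 2, 2, 1, 1, 1],
]
p_values = sh_test(FOUR_TREES)
print(p_values)
```

<P>
Fixing the seed makes the sketch reproducible, which mirrors the reason the
program asks for a random number seed.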
<P>
In either the KHT or the SH test the program
prints out a table of the number of steps for each tree, the differences of
each from the lowest one, the variance of that quantity as determined by
the differences of the numbers of steps at individual characters,
and a conclusion as to
whether that tree is or is not significantly worse than the best one.
<P>
At the beginning of the program is a constant, <TT>maxtrees</TT>,
the maximum number of trees which the program will store for output.
<P>
The program is descended from earlier programs SOKAL and WAGNER, which have
long since been removed from the PHYLIP package, since MIX has all their
capabilities and more.
<P>
<HR>
<P>
<H3>TEST DATA SET</H3>
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
     5    6
Alpha     110110
Beta      110000
Gamma     100110
Delta     001001
Epsilon   001110
</PRE>
</TD></TR></TABLE>
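<P>
A data set laid out like the one above can be read with a few lines of Python.
This is a minimal sketch, not a full reader for the discrete characters
format: the function name <TT>parse_discrete</TT> is hypothetical, and it
assumes that the first line holds only the numbers of species and characters,
that each species occupies a single line, and that the name fills the first
ten columns.

```python
TEST_DATA = """\
     5    6
Alpha     110110
Beta      110000
Gamma     100110
Delta     001001
Epsilon   001110
"""


def parse_discrete(text):
    """Return {species name: string of states} for a simple one-line-per-
    species data set (assumed layout: 10-column name, then the states)."""
    lines = [line for line in text.splitlines() if line.strip()]
    n_species, n_chars = (int(token) for token in lines[0].split()[:2])
    data = {}
    for line in lines[1:1 + n_species]:
        name = line[:10].strip()
        states = "".join(line[10:].split())   # blanks between states are dropped
        if len(states) != n_chars:
            raise ValueError(f"{name}: expected {n_chars} states, got {len(states)}")
        data[name] = states
    if len(data) != n_species:
        raise ValueError(f"expected {n_species} species, found {len(data)}")
    return data


data = parse_discrete(TEST_DATA)
print(data)
```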
<P>
<HR>
<P>
<H3>TEST SET OUTPUT (with all numerical options on)</H3>
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>

Mixed parsimony algorithm, version 3.6a3

5 species, 6 characters

Wagner parsimony method


Name         Characters
----         ----------

Alpha        11011 0
Beta         11000 0
Gamma        10011 0
Delta        00100 1
Epsilon      00111 0



     4 trees in all found




           +--Epsilon
     +-----4
     !     +--Gamma
  +--2
  !  !     +--Delta
--1  +-----3
  !        +--Beta
  !
  +-----------Alpha

  remember: this is an unrooted tree!


requires a total of      9.000

steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       2   2   2   1   1   1

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                1?011 0
  1       2         no     .?... .
  2       4        maybe   .0... .
  4    Epsilon      yes    0.1.. .
  4    Gamma        no     ..... .
  2       3         yes    .?.00 .
  3    Delta        yes    001.. 1
  3    Beta        maybe   .1... .
  1    Alpha       maybe   .1... .




     +--------Gamma
     !
  +--2     +--Epsilon
  !  !  +--4
  !  +--3  +--Delta
--1     !
  !     +-----Beta
  !
  +-----------Alpha

  remember: this is an unrooted tree!


requires a total of      9.000

steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       1   2   1   2   2   1

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                1?011 0
  1       2         no     .?... .
  2    Gamma       maybe   .0... .
  2       3        maybe   .?.?? .
  3       4         yes    001?? .
  4    Epsilon     maybe   ...11 .
  4    Delta        yes    ...00 1
  3    Beta        maybe   .1.00 .
  1    Alpha       maybe   .1... .




     +--------Epsilon
  +--4
  !  !  +-----Gamma
  !  +--2
--1     !  +--Delta
  !     +--3
  !        +--Beta
  !
  +-----------Alpha

  remember: this is an unrooted tree!


requires a total of      9.000

steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       2   2   2   1   1   1

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                1?011 0
  1       4        maybe   .0... .
  4    Epsilon      yes    0.1.. .
  4       2         no     ..... .
  2    Gamma        no     ..... .
  2       3         yes    ...00 .
  3    Delta        yes    0.1.. 1
  3    Beta         yes    .1... .
  1    Alpha       maybe   .1... .




     +--------Gamma
  +--2
  !  !  +-----Epsilon
  !  +--4
--1     !  +--Delta
  !     +--3
  !        +--Beta
  !
  +-----------Alpha

  remember: this is an unrooted tree!


requires a total of      9.000

steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       2   2   2   1   1   1

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                1?011 0
  1       2        maybe   .0... .
  2    Gamma        no     ..... .
  2       4        maybe   ?.?.. .
  4    Epsilon     maybe   0.1.. .
  4       3         yes    ?.?00 .
  3    Delta        yes    0.1.. 1
  3    Beta         yes    110.. .
  1    Alpha       maybe   .1... .


</PRE>
</TD></TR></TABLE>
</BODY>
</HTML>