1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> |
---|
2 | <HTML> |
---|
3 | <HEAD> |
---|
4 | <TITLE>mix</TITLE> |
---|
5 | <META NAME="description" CONTENT="mix"> |
---|
6 | <META NAME="keywords" CONTENT="mix"> |
---|
7 | <META NAME="resource-type" CONTENT="document"> |
---|
8 | <META NAME="distribution" CONTENT="global"> |
---|
9 | <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> |
---|
10 | </HEAD> |
---|
11 | <BODY BGCOLOR="#ccffff"> |
---|
12 | <DIV ALIGN=RIGHT> |
---|
13 | version 3.6 |
---|
14 | </DIV> |
---|
15 | <P> |
---|
16 | <DIV ALIGN=CENTER> |
---|
17 | <H1>MIX - Mixed method discrete characters parsimony</H1> |
---|
18 | </DIV> |
---|
19 | <P> |
---|
20 | © Copyright 1986-2002 by the University of |
---|
21 | Washington. Written by Joseph Felsenstein. Permission is granted to copy |
---|
22 | this document provided that no fee is charged for it and that this copyright |
---|
23 | notice is not removed. |
---|
24 | <P> |
---|
25 | MIX is a general parsimony program which carries out the Wagner and |
---|
26 | Camin-Sokal parsimony methods in mixture, where each character can have |
---|
27 | its method specified separately. The program defaults to carrying out Wagner |
---|
28 | parsimony. |
---|
29 | <P> |
---|
30 | The Camin-Sokal parsimony method explains the data by assuming that |
---|
31 | changes 0 --> 1 are allowed but not changes 1 --> 0. Wagner parsimony |
---|
32 | allows both kinds of changes. (This is under the assumption that 0 is the |
---|
33 | ancestral state, though the program allows reassignment of the ancestral |
---|
34 | state, in which case we must reverse the state numbers 0 and 1 |
---|
35 | throughout this discussion). The criterion is to find the tree which |
---|
36 | requires the minimum number of changes. The Camin-Sokal method is due |
---|
37 | to Camin and Sokal (1965) and the Wagner method to Eck and Dayhoff |
---|
38 | (1966) and to Kluge and Farris (1969). |
---|
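<P>
As an illustration of the two criteria, the following Python sketch counts the
changes of state that each character requires on a given rooted, bifurcating tree.
It is not the code used in MIX. The nested-tuple tree encoding and the function
names <TT>wagner_steps</TT> and <TT>camin_sokal_steps</TT> are assumptions made
for this example. States are assumed to be only 0 or 1 (no "?", "P" or "B"), the
Wagner count uses the standard Fitch set-intersection pass, and the Camin-Sokal
count assumes ancestral state 0 for every character.

```python
# Test data set from this document: 5 species, 6 binary characters.
DATA = {
    "Alpha":   "110110",
    "Beta":    "110000",
    "Gamma":   "100110",
    "Delta":   "001001",
    "Epsilon": "001110",
}

# First tree of the test set output, written as nested pairs.
# This encoding is an assumption of the sketch, not a MIX format.
TREE = ((("Epsilon", "Gamma"), ("Delta", "Beta")), "Alpha")


def wagner_steps(tree, data, char):
    """Minimum number of changes for one binary character when both
    0 -> 1 and 1 -> 0 are allowed (Fitch set-intersection pass)."""
    def visit(node):
        if isinstance(node, str):                 # a tip: its observed state
            return {data[node][char]}, 0
        (ls, lc), (rs, rc) = visit(node[0]), visit(node[1])
        common = ls & rs
        if common:                                # children can agree: no new step
            return common, lc + rc
        return ls | rs, lc + rc + 1               # children disagree: one step
    return visit(tree)[1]


def camin_sokal_steps(tree, data, char):
    """Number of 0 -> 1 changes needed when 1 -> 0 is forbidden and the
    ancestral state at the root is assumed to be 0."""
    def visit(node):
        if isinstance(node, str):
            return data[node][char], 0
        (ls, lc), (rs, rc) = visit(node[0]), visit(node[1])
        if ls == "1" and rs == "1":
            # every tip below is 1: one origin at or above this node suffices
            return "1", lc + rc
        # this node must be 0, so each child in state 1 needs its own origin
        return "0", lc + rc + (ls == "1") + (rs == "1")
    state, steps = visit(tree)
    return steps + (state == "1")                 # origin on the root branch


if __name__ == "__main__":
    print([wagner_steps(TREE, DATA, c) for c in range(6)])
    print([camin_sokal_steps(TREE, DATA, c) for c in range(6)])
```

For the first tree of the test set output, the Wagner counts come to
2 2 2 1 1 1, a total of 9 steps, which agrees with the table printed by the
program. With the same tree rooted as drawn and ancestral state 0 throughout,
the Camin-Sokal counts are 3 2 2 2 2 1, a total of 12.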
39 | <P> |
---|
40 | Here are the assumptions of these two methods: |
---|
41 | <P> |
---|
42 | <OL> |
---|
43 | <LI>Ancestral states are known (Camin-Sokal) or unknown (Wagner). |
---|
44 | <LI>Different characters evolve independently. |
---|
45 | <LI>Different lineages evolve independently. |
---|
46 | <LI>Changes 0 --> 1 are much more probable than changes 1 --> 0 |
---|
47 | (Camin-Sokal) or equally probable (Wagner). |
---|
48 | <LI>Both of these kinds of changes are a priori improbable over the |
---|
49 | evolutionary time spans involved in the differentiation of the |
---|
50 | group in question. |
---|
51 | <LI>Other kinds of evolutionary event such as retention of polymorphism |
---|
52 | are far less probable than 0 --> 1 changes. |
---|
53 | <LI>Rates of evolution in different lineages are sufficiently low that |
---|
54 | two changes in a long segment of the tree are far less probable |
---|
55 | than one change in a short segment. |
---|
56 | </OL> |
---|
57 | <P> |
---|
58 | That these are the assumptions of parsimony methods has been documented |
---|
59 | in a series of papers of mine (1973a, 1978b, 1979, 1981b, |
---|
60 | 1983b, 1988b). For an opposing view arguing that the parsimony methods |
---|
61 | make no substantive |
---|
62 | assumptions such as these, see the papers by Farris (1983) and Sober (1983a, |
---|
63 | 1983b), but also read the exchange between Felsenstein and Sober (1986). |
---|
64 | <P> |
---|
65 | <H2>INPUT FORMAT</H2> |
---|
66 | <P> |
---|
67 | The input for MIX is the standard input for discrete characters |
---|
68 | programs, described in the documentation file for the |
---|
69 | discrete-characters programs. States "?", "P", and "B" are allowed. |
---|
70 | <P> |
---|
71 | The options are selected using a menu: |
---|
72 | <P> |
---|
73 | <TABLE><TR><TD BGCOLOR=white> |
---|
74 | <PRE> |
---|
75 | |
---|
76 | Mixed parsimony algorithm, version 3.6a3 |
---|
77 | |
---|
78 | Settings for this run: |
---|
79 | U Search for best tree? Yes |
---|
80 | X Use Mixed method? No |
---|
81 | P Parsimony method? Wagner |
---|
82 | J Randomize input order of species? No. Use input order |
---|
83 | O Outgroup root? No, use as outgroup species 1 |
---|
84 | T Use Threshold parsimony? No, use ordinary parsimony |
---|
85 | A Use ancestral states in input file? No |
---|
86 | W Sites weighted? No |
---|
87 | M Analyze multiple data sets? No |
---|
88 | 0 Terminal type (IBM PC, ANSI, none)? (none) |
---|
89 | 1 Print out the data at start of run No |
---|
90 | 2 Print indications of progress of run Yes |
---|
91 | 3 Print out tree Yes |
---|
92 | 4 Print out steps in each character No |
---|
93 | 5 Print states at all nodes of tree No |
---|
94 | 6 Write out trees onto tree file? Yes |
---|
95 | |
---|
96 | Are these settings correct? (type Y or the letter for one to change) |
---|
97 | |
---|
98 | </PRE> |
---|
99 | </TD></TR></TABLE> |
---|
100 | <P> |
---|
101 | The options U, X, J, O, T, A, and M are the usual User Tree, miXed |
---|
102 | methods, Jumble, Outgroup, Threshold, |
---|
103 | Ancestral States, and Multiple Data Sets options, described either |
---|
104 | in the main documentation file or in the Discrete Characters Programs |
---|
105 | documentation file. The |
---|
106 | user-defined trees supplied if you use the U option must be given as rooted |
---|
107 | trees with two-way splits (bifurcations). The O option is acted upon only if |
---|
108 | the final tree is unrooted and is not a user-defined tree. One of the |
---|
109 | important uses of the O option is to root the tree so that if there are |
---|
110 | any characters in which the ancestral states have not been specified, the |
---|
111 | program will print out a table showing which ancestral states require the |
---|
112 | fewest steps. Note that when any of the characters has Camin-Sokal parsimony |
---|
113 | assumed for it, the tree is rooted and the O option will have no effect. |
---|
114 | <P> |
---|
115 | The option P toggles between the Camin-Sokal parsimony criterion |
---|
116 | and the default Wagner parsimony criterion. Option X invokes |
---|
117 | mixed-method parsimony. If the A option is invoked, the ancestor is not |
---|
118 | to be counted as one of the species. |
---|
119 | <P> |
---|
120 | The F (Factors) |
---|
121 | option is not available in this program, as it would have no effect on |
---|
122 | the result even if that information were provided in the input file. |
---|
123 | <P> |
---|
124 | <H2>OUTPUT FORMAT</H2> |
---|
125 | <P> |
---|
126 | Output is standard: a list of equally parsimonious trees, which will be printed |
---|
127 | as rooted or unrooted depending on which is appropriate, and, if the |
---|
128 | user chooses, a table of the |
---|
129 | number of changes of state required in each character. If the Wagner option is |
---|
130 | in force for a character, it may not be possible to unambiguously locate the |
---|
131 | places on the tree where the changes occur, as there may be multiple |
---|
132 | possibilities. If the user selects menu option 5, a table is printed out |
---|
133 | after each tree, showing for each |
---|
134 | branch whether there are known to be changes in the branch, and what the states |
---|
135 | are inferred to have been at the top end of the branch. If the inferred state |
---|
136 | is a "?" there will be multiple equally-parsimonious assignments of states; the |
---|
137 | user must work these out for themselves by hand. |
---|
138 | <P> |
---|
139 | If the Camin-Sokal parsimony method |
---|
140 | is invoked and the Ancestors option is also used, then the program will |
---|
141 | infer, for any character whose ancestral state is unknown ("?"), whether the |
---|
142 | ancestral state 0 or 1 will give the fewest state changes. If these are |
---|
143 | tied, then it may not be possible for the program to infer the |
---|
144 | state in the internal nodes, and these will all be printed as ".". If this |
---|
145 | has happened and you want to know more about the states at the internal |
---|
146 | nodes, you will find it helpful to use MOVE to display the tree and examine |
---|
147 | its interior states, as the algorithm in MOVE shows all that can be known |
---|
148 | in this case about the interior states, including where there is and is not |
---|
149 | ambiguity. The algorithm in MIX gives up more easily on displaying these |
---|
150 | states. |
---|
151 | <P> |
---|
152 | If the A option is not used, then the program will assume 0 as the |
---|
153 | ancestral state for those characters following the Camin-Sokal method, |
---|
154 | and will assume that the ancestral state is unknown for those characters |
---|
155 | following Wagner parsimony. If any characters have unknown ancestral |
---|
156 | states, and if the resulting tree is rooted (even by outgroup), |
---|
157 | a table will also be printed out |
---|
158 | showing the best guesses of which are the ancestral states in each |
---|
159 | character. You will find it useful to understand the difference between |
---|
160 | the Camin-Sokal parsimony criterion with unknown ancestral state and the Wagner |
---|
161 | parsimony criterion. |
---|
162 | <P> |
---|
163 | If the U (User Tree) option is used and more than one tree is supplied, the |
---|
164 | program also performs a statistical test of each of these trees against the |
---|
165 | best tree. This test is a version of the test proposed by |
---|
166 | Alan Templeton (1983) and evaluated in a test case by me (1985a). It is |
---|
167 | closely parallel to a test using log likelihood differences |
---|
168 | invented by Kishino and Hasegawa (1989), and uses the mean and variance of |
---|
169 | step differences between trees, taken across characters. If the mean |
---|
170 | is more than 1.96 standard deviations from zero, the trees are declared |
---|
171 | significantly different. The program |
---|
172 | prints out a table of the steps for each tree, the differences of |
---|
173 | each from the lowest one, the variance of that quantity as determined by |
---|
174 | the step differences at individual characters, and a conclusion as to |
---|
175 | whether that tree is or is not significantly worse than the best one. It |
---|
176 | is important to understand that the test assumes that all the binary |
---|
177 | characters are evolving independently, which is unlikely to be true for |
---|
178 | many suites of morphological characters. |
---|
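<P>
The two-tree comparison described above can be sketched in Python as follows.
This is only an illustration. The function name <TT>kht_step_test</TT> and its
<TT>z_crit</TT> parameter are hypothetical, and the exact variance estimator used
by the program is not stated in this document. The sketch assumes the usual sample
variance of the per-character step differences, multiplied by the number of
characters to give the variance of the total difference.

```python
import math


def kht_step_test(steps_a, steps_b, z_crit=1.96):
    """Compare two trees from their per-character step counts.

    Returns (total difference, standard deviation of that total,
    whether the difference is declared significant)."""
    n = len(steps_a)
    diffs = [a - b for a, b in zip(steps_a, steps_b)]
    total = sum(diffs)
    mean = total / n
    # sample variance of the per-character differences (assumed estimator)
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    sd_total = math.sqrt(n * var)          # s.d. of the summed difference
    significant = sd_total > 0 and abs(total) > z_crit * sd_total
    return total, sd_total, significant


if __name__ == "__main__":
    # Per-character steps of the first two trees in the test set output.
    print(kht_step_test([2, 2, 2, 1, 1, 1], [1, 2, 1, 2, 2, 1]))
```

For the first two trees of the test set output the total difference is 0, so
under these assumptions the trees are not declared significantly different.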
196 | <P> |
---|
197 | If there are more than two trees, the test done is an extension of |
---|
198 | the KHT (Kishino-Hasegawa-Templeton) test, due to Shimodaira and Hasegawa (1999). They pointed out |
---|
199 | that a correction for the number of trees was necessary, and they |
---|
200 | introduced a resampling method to make this correction. In the version |
---|
201 | used here the variances and covariances of the sums of steps across |
---|
202 | characters are computed for all pairs of trees. To test whether the |
---|
203 | difference between each tree and the best one is larger than could have |
---|
204 | been expected if they all had the same expected number of steps, |
---|
205 | numbers of steps for all trees are sampled with these covariances and equal |
---|
206 | means (Shimodaira and Hasegawa's "least favorable hypothesis"), |
---|
207 | and a P value is computed from the fraction of times the difference between |
---|
208 | the tree's value and the lowest number of steps exceeds that actually |
---|
209 | observed. Note that this sampling needs random numbers, and so the |
---|
210 | program will prompt the user for a random number seed if one has not |
---|
211 | already been supplied. With the two-tree KHT test no random numbers |
---|
212 | are used. |
---|
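<P>
The resampling just described can be sketched in Python as follows. It is an
illustration under stated assumptions rather than the program's implementation.
The function name <TT>sh_step_test</TT> and its <TT>replicates</TT> and
<TT>seed</TT> parameters are hypothetical, the draws are assumed to be
multivariate normal, and the covariance of the summed steps is estimated as the
number of characters times the sample covariance of the per-character steps.

```python
import math
import random


def sh_step_test(steps, replicates=1000, seed=1):
    """steps[t][c] is the number of steps of tree t at character c.
    Returns one P value per tree."""
    rng = random.Random(seed)              # the sampling needs a random seed
    k, n = len(steps), len(steps[0])
    totals = [sum(row) for row in steps]
    best = min(totals)
    observed = [t - best for t in totals]  # difference from the best tree

    # Centre each tree's per-character steps: all trees get equal means
    # (the "least favorable hypothesis") while covariances are kept.
    centred = [[s - totals[t] / n for s in steps[t]] for t in range(k)]
    scale = math.sqrt(n / (n - 1))         # makes the covariance n * sample cov.

    exceed = [0] * k
    for _ in range(replicates):
        # One standard normal weight per character, shared by all trees,
        # yields sampled totals with the required covariances.
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample = [scale * sum(zc * xc for zc, xc in zip(z, centred[t]))
                  for t in range(k)]
        low = min(sample)
        for t in range(k):
            if sample[t] - low > observed[t]:
                exceed[t] += 1
    return [e / replicates for e in exceed]


if __name__ == "__main__":
    # Per-character steps of the four trees in the test set output.
    four = [[2, 2, 2, 1, 1, 1], [1, 2, 1, 2, 2, 1],
            [2, 2, 2, 1, 1, 1], [2, 2, 2, 1, 1, 1]]
    print(sh_step_test(four))
```

Sharing one set of random weights across all trees is a design choice of the
sketch: it reproduces the covariances between trees without factoring the
covariance matrix, which may be singular when several trees have identical
per-character steps.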
213 | <P> |
---|
214 | In either the KHT or the SH test the program |
---|
215 | prints out a table of the number of steps for each tree, the differences of |
---|
216 | each from the lowest one, the variance of that quantity as determined by |
---|
217 | the differences of the numbers of steps at individual characters, |
---|
218 | and a conclusion as to |
---|
219 | whether that tree is or is not significantly worse than the best one. |
---|
220 | <P> |
---|
221 | At the beginning of the program is a constant, <TT>maxtrees</TT>, |
---|
222 | the maximum number of trees which the program will store for output. |
---|
223 | <P> |
---|
224 | The program is descended from earlier programs SOKAL and WAGNER which have |
---|
225 | long since been removed from the PHYLIP package, since MIX has all their |
---|
226 | capabilities and more. |
---|
227 | <P> |
---|
228 | <HR> |
---|
229 | <P> |
---|
230 | <H3>TEST DATA SET</H3> |
---|
231 | <P> |
---|
232 | <TABLE><TR><TD BGCOLOR=white> |
---|
233 | <PRE> |
---|
234 |     5    6 |
---|
235 | Alpha     110110 |
---|
236 | Beta      110000 |
---|
237 | Gamma     100110 |
---|
238 | Delta     001001 |
---|
239 | Epsilon   001110 |
---|
240 | </PRE> |
---|
241 | </TD></TR></TABLE> |
---|
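<P>
A data set of this shape can be read with a few lines of Python. The sketch below
is a minimal reader, not the parser used by the program. The function name
<TT>parse_discrete</TT> is hypothetical, and it assumes the simplest layout: a
header line giving the numbers of species and characters, then one line per
species with the name in the first ten columns followed by the states. Options
lines, ancestral states and data continued over several lines are not handled.

```python
ALLOWED_STATES = set("01?PB")   # states accepted by MIX according to this document


def parse_discrete(text, name_width=10):
    """Return {species name: string of states} for a simple data set."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_chars = (int(tok) for tok in lines[0].split()[:2])
    data = {}
    for ln in lines[1:1 + n_species]:
        name = ln[:name_width].strip()
        # blanks inside the character field are ignored
        states = "".join(ln[name_width:].split())
        if len(states) != n_chars or not set(states) <= ALLOWED_STATES:
            raise ValueError(f"bad character row for {name!r}: {states!r}")
        data[name] = states
    if len(data) != n_species:
        raise ValueError("fewer species rows than the header announces")
    return data


EXAMPLE = """\
    5    6
Alpha     110110
Beta      110000
Gamma     100110
Delta     001001
Epsilon   001110
"""

if __name__ == "__main__":
    for name, states in parse_discrete(EXAMPLE).items():
        print(name, states)
```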
242 | <P> |
---|
243 | <HR> |
---|
244 | <P> |
---|
245 | <H3>TEST SET OUTPUT (with all numerical options on)</H3> |
---|
246 | <P> |
---|
247 | <TABLE><TR><TD BGCOLOR=white> |
---|
248 | <PRE> |
---|
249 | |
---|
250 | Mixed parsimony algorithm, version 3.6a3 |
---|
251 | |
---|
252 | 5 species, 6 characters |
---|
253 | |
---|
254 | Wagner parsimony method |
---|
255 | |
---|
256 | |
---|
257 | Name Characters |
---|
258 | ---- ---------- |
---|
259 | |
---|
260 | Alpha 11011 0 |
---|
261 | Beta 11000 0 |
---|
262 | Gamma 10011 0 |
---|
263 | Delta 00100 1 |
---|
264 | Epsilon 00111 0 |
---|
265 | |
---|
266 | |
---|
267 | |
---|
268 | 4 trees in all found |
---|
269 | |
---|
270 | |
---|
271 | |
---|
272 | |
---|
273 |            +--Epsilon |
---|
274 |      +-----4 |
---|
275 |      !     +--Gamma |
---|
276 |   +--2 |
---|
277 |   !  !     +--Delta |
---|
278 | --1  +-----3 |
---|
279 |   !        +--Beta |
---|
280 |   ! |
---|
281 |   +-----------Alpha |
---|
282 | |
---|
283 | remember: this is an unrooted tree! |
---|
284 | |
---|
285 | |
---|
286 | requires a total of 9.000 |
---|
287 | |
---|
288 | steps in each character: |
---|
289 |          0   1   2   3   4   5   6   7   8   9 |
---|
290 |      *----------------------------------------- |
---|
291 |     0!       2   2   2   1   1   1 |
---|
292 | |
---|
293 | From    To     Any Steps?    State at upper node |
---|
294 |                              ( . means same as in the node below it on tree) |
---|
295 | |
---|
296 |           1                1?011 0 |
---|
297 |   1       2         no     .?... . |
---|
298 |   2       4        maybe   .0... . |
---|
299 |   4    Epsilon      yes    0.1.. . |
---|
300 |   4    Gamma        no     ..... . |
---|
301 |   2       3         yes    .?.00 . |
---|
302 |   3    Delta        yes    001.. 1 |
---|
303 |   3    Beta        maybe   .1... . |
---|
304 |   1    Alpha       maybe   .1... . |
---|
305 | |
---|
306 | |
---|
307 | |
---|
308 | |
---|
309 | |
---|
310 |      +--------Gamma |
---|
311 |      ! |
---|
312 |   +--2     +--Epsilon |
---|
313 |   !  !  +--4 |
---|
314 |   !  +--3  +--Delta |
---|
315 | --1     ! |
---|
316 |   !     +-----Beta |
---|
317 |   ! |
---|
318 |   +-----------Alpha |
---|
319 | |
---|
320 | remember: this is an unrooted tree! |
---|
321 | |
---|
322 | |
---|
323 | requires a total of 9.000 |
---|
324 | |
---|
325 | steps in each character: |
---|
326 |          0   1   2   3   4   5   6   7   8   9 |
---|
327 |      *----------------------------------------- |
---|
328 |     0!       1   2   1   2   2   1 |
---|
329 | |
---|
330 | From    To     Any Steps?    State at upper node |
---|
331 |                              ( . means same as in the node below it on tree) |
---|
332 | |
---|
333 |           1                1?011 0 |
---|
334 |   1       2         no     .?... . |
---|
335 |   2    Gamma       maybe   .0... . |
---|
336 |   2       3        maybe   .?.?? . |
---|
337 |   3       4         yes    001?? . |
---|
338 |   4    Epsilon     maybe   ...11 . |
---|
339 |   4    Delta        yes    ...00 1 |
---|
340 |   3    Beta        maybe   .1.00 . |
---|
341 |   1    Alpha       maybe   .1... . |
---|
342 | |
---|
343 | |
---|
344 | |
---|
345 | |
---|
346 | |
---|
347 |      +--------Epsilon |
---|
348 |   +--4 |
---|
349 |   !  !  +-----Gamma |
---|
350 |   !  +--2 |
---|
351 | --1     !  +--Delta |
---|
352 |   !     +--3 |
---|
353 |   !        +--Beta |
---|
354 |   ! |
---|
355 |   +-----------Alpha |
---|
356 | |
---|
357 | remember: this is an unrooted tree! |
---|
358 | |
---|
359 | |
---|
360 | requires a total of 9.000 |
---|
361 | |
---|
362 | steps in each character: |
---|
363 |          0   1   2   3   4   5   6   7   8   9 |
---|
364 |      *----------------------------------------- |
---|
365 |     0!       2   2   2   1   1   1 |
---|
366 | |
---|
367 | From    To     Any Steps?    State at upper node |
---|
368 |                              ( . means same as in the node below it on tree) |
---|
369 | |
---|
370 |           1                1?011 0 |
---|
371 |   1       4        maybe   .0... . |
---|
372 |   4    Epsilon      yes    0.1.. . |
---|
373 |   4       2         no     ..... . |
---|
374 |   2    Gamma        no     ..... . |
---|
375 |   2       3         yes    ...00 . |
---|
376 |   3    Delta        yes    0.1.. 1 |
---|
377 |   3    Beta         yes    .1... . |
---|
378 |   1    Alpha       maybe   .1... . |
---|
379 | |
---|
380 | |
---|
381 | |
---|
382 | |
---|
383 | |
---|
384 |      +--------Gamma |
---|
385 |   +--2 |
---|
386 |   !  !  +-----Epsilon |
---|
387 |   !  +--4 |
---|
388 | --1     !  +--Delta |
---|
389 |   !     +--3 |
---|
390 |   !        +--Beta |
---|
391 |   ! |
---|
392 |   +-----------Alpha |
---|
393 | |
---|
394 | remember: this is an unrooted tree! |
---|
395 | |
---|
396 | |
---|
397 | requires a total of 9.000 |
---|
398 | |
---|
399 | steps in each character: |
---|
400 |          0   1   2   3   4   5   6   7   8   9 |
---|
401 |      *----------------------------------------- |
---|
402 |     0!       2   2   2   1   1   1 |
---|
403 | |
---|
404 | From    To     Any Steps?    State at upper node |
---|
405 |                              ( . means same as in the node below it on tree) |
---|
406 | |
---|
407 |           1                1?011 0 |
---|
408 |   1       2        maybe   .0... . |
---|
409 |   2    Gamma        no     ..... . |
---|
410 |   2       4        maybe   ?.?.. . |
---|
411 |   4    Epsilon     maybe   0.1.. . |
---|
412 |   4       3         yes    ?.?00 . |
---|
413 |   3    Delta        yes    0.1.. 1 |
---|
414 |   3    Beta         yes    110.. . |
---|
415 |   1    Alpha       maybe   .1... . |
---|
416 | |
---|
417 | |
---|
418 | </PRE> |
---|
419 | </TD></TR></TABLE> |
---|
420 | </BODY> |
---|
421 | </HTML> |
---|