| 1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> |
|---|
| 2 | <HTML> |
|---|
| 3 | <HEAD> |
|---|
| 4 | <TITLE>mix</TITLE> |
|---|
| 5 | <META NAME="description" CONTENT="mix"> |
|---|
| 6 | <META NAME="keywords" CONTENT="mix"> |
|---|
| 7 | <META NAME="resource-type" CONTENT="document"> |
|---|
| 8 | <META NAME="distribution" CONTENT="global"> |
|---|
| 9 | <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> |
|---|
| 10 | </HEAD> |
|---|
| 11 | <BODY BGCOLOR="#ccffff"> |
|---|
| 12 | <DIV ALIGN=RIGHT> |
|---|
| 13 | version 3.6 |
|---|
| 14 | </DIV> |
|---|
| 15 | <P> |
|---|
| 16 | <DIV ALIGN=CENTER> |
|---|
| 17 | <H1>MIX - Mixed method discrete characters parsimony</H1> |
|---|
| 18 | </DIV> |
|---|
| 19 | <P> |
|---|
| 20 | © Copyright 1986-2002 by the University of |
|---|
| 21 | Washington. Written by Joseph Felsenstein. Permission is granted to copy |
|---|
| 22 | this document provided that no fee is charged for it and that this copyright |
|---|
| 23 | notice is not removed. |
|---|
| 24 | <P> |
|---|
| 25 | MIX is a general parsimony program which carries out the Wagner and |
|---|
| 26 | Camin-Sokal parsimony methods in mixture, where each character can have |
|---|
| 27 | its method specified separately. The program defaults to carrying out Wagner |
|---|
| 28 | parsimony. |
|---|
| 29 | <P> |
|---|
| 30 | The Camin-Sokal parsimony method explains the data by assuming that |
|---|
| 31 | changes 0 --> 1 are allowed but not changes 1 --> 0. Wagner parsimony |
|---|
| 32 | allows both kinds of changes. (This is under the assumption that 0 is the |
|---|
| 33 | ancestral state, though the program allows reassignment of the ancestral |
|---|
| 34 | state, in which case we must reverse the state numbers 0 and 1 |
|---|
| 35 | throughout this discussion). The criterion is to find the tree which |
|---|
| 36 | requires the minimum number of changes. The Camin-Sokal method is due |
|---|
| 37 | to Camin and Sokal (1965) and the Wagner method to Eck and Dayhoff |
|---|
| 38 | (1966) and to Kluge and Farris (1969). |
|---|
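<P>
The difference between the two criteria can be illustrated by counting steps
on one fixed tree. The following is a minimal sketch, not the algorithm MIX
itself uses: the function names, the nested-tuple tree representation, and
the restriction to states 0 and 1 (no "?", "P" or "B") are all assumptions
made for illustration. The data and the tree are the test data set and the
first tree of the test output shown at the end of this document.

```python
# Illustrative sketch only: counts steps for binary (0/1) characters on a
# fixed rooted, bifurcating tree. A tree is a nested tuple of species names.

TEST_DATA = {
    "Alpha":   "110110",
    "Beta":    "110000",
    "Gamma":   "100110",
    "Delta":   "001001",
    "Epsilon": "001110",
}

# First tree of the test output, written as nested pairs.
TREE_1 = ((("Epsilon", "Gamma"), ("Delta", "Beta")), "Alpha")


def wagner_steps(tree, states):
    """Wagner count: a change in either direction costs one step."""
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):          # tip: its observed state
            return {states[node]}
        left, right = visit(node[0]), visit(node[1])
        common = left & right
        if common:                         # the two subtrees can agree
            return common
        steps += 1                         # they cannot: one change needed
        return left | right

    visit(tree)
    return steps


def camin_sokal_steps(tree, states):
    """Camin-Sokal count with ancestral state 0: only 0 --> 1 is allowed,
    so each maximal clade whose tips are all 1 costs exactly one step."""
    steps = 0

    def all_ones(node):
        nonlocal steps
        if isinstance(node, str):
            return states[node] == "1"
        left, right = all_ones(node[0]), all_ones(node[1])
        if left and right:
            return True                    # the change lies above this node
        steps += int(left) + int(right)    # one change per all-1 child clade
        return False

    if all_ones(tree):
        steps += 1                         # change on the branch below the root
    return steps


def steps_per_character(tree, data, count):
    """Apply a counting function to each character (column) in turn."""
    n_chars = len(next(iter(data.values())))
    return [count(tree, {name: row[i] for name, row in data.items()})
            for i in range(n_chars)]
```

Under these assumptions,
<TT>steps_per_character(TREE_1, TEST_DATA, wagner_steps)</TT> gives the
per-character counts 2 2 2 1 1 1 reported for the first tree in the test
output, for a total of 9 steps. The Camin-Sokal count on the same tree is
never smaller, because reversals 1 --> 0 are forbidden.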
| 39 | <P> |
|---|
| 40 | Here are the assumptions of these two methods: |
|---|
| 41 | <P> |
|---|
| 42 | <OL> |
|---|
| 43 | <LI>Ancestral states are known (Camin-Sokal) or unknown (Wagner). |
|---|
| 44 | <LI>Different characters evolve independently. |
|---|
| 45 | <LI>Different lineages evolve independently. |
|---|
| 46 | <LI>Changes 0 --> 1 are much more probable than changes 1 --> 0 |
|---|
| 47 | (Camin-Sokal) or equally probable (Wagner). |
|---|
| 48 | <LI>Both of these kinds of changes are a priori improbable over the |
|---|
| 49 | evolutionary time spans involved in the differentiation of the |
|---|
| 50 | group in question. |
|---|
| 51 | <LI>Other kinds of evolutionary event such as retention of polymorphism |
|---|
| 52 | are far less probable than 0 --> 1 changes. |
|---|
| 53 | <LI>Rates of evolution in different lineages are sufficiently low that |
|---|
| 54 | two changes in a long segment of the tree are far less probable |
|---|
| 55 | than one change in a short segment. |
|---|
| 56 | </OL> |
|---|
| 57 | <P> |
|---|
| 58 | That these are the assumptions of parsimony methods has been documented |
|---|
| 59 | in a series of papers of mine: (1973a, 1978b, 1979, 1981b, |
|---|
| 60 | 1983b, 1988b). For an opposing view arguing that the parsimony methods |
|---|
| 61 | make no substantive |
|---|
| 62 | assumptions such as these, see the papers by Farris (1983) and Sober (1983a, |
|---|
| 63 | 1983b), but also read the exchange between Felsenstein and Sober (1986). |
|---|
| 64 | <P> |
|---|
| 65 | <H2>INPUT FORMAT</H2> |
|---|
| 66 | <P> |
|---|
| 67 | The input for MIX is the standard input for discrete characters |
|---|
| 68 | programs, described above in the documentation file for the |
|---|
| 69 | discrete-characters programs. States "?", "P", and "B" are allowed. |
|---|
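<P>
As a rough illustration of the layout of such a data set, here is a small
reader for input like the test data set shown at the end of this document.
It is a sketch under stated assumptions, not the reader used by the program:
the function name is hypothetical, and it splits each line on blanks, so it
assumes species names contain no blanks and that the data are not
interleaved. The authoritative description of the format is in the
discrete-characters documentation file.

```python
def read_discrete_characters(text):
    """Parse a small discrete-characters data set into {name: states}.

    Assumption (not taken from the program): the name and the character
    states on each line are separated by blanks.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_chars = (int(tok) for tok in lines[0].split()[:2])

    data = {}
    for line in lines[1:1 + n_species]:
        name, *blocks = line.split()
        chars = "".join(blocks)            # tolerate blanks between states
        if len(chars) != n_chars:
            raise ValueError(f"{name}: expected {n_chars} characters")
        unknown = set(chars) - set("01?PB")
        if unknown:
            raise ValueError(f"{name}: unexpected states {sorted(unknown)}")
        data[name] = chars

    if len(data) != n_species:
        raise ValueError("fewer species than the first line announces")
    return data
```

The check against the set <TT>01?PB</TT> simply mirrors the states listed
above as allowed.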
| 70 | <P> |
|---|
| 71 | The options are selected using a menu: |
|---|
| 72 | <P> |
|---|
| 73 | <TABLE><TR><TD BGCOLOR=white> |
|---|
| 74 | <PRE> |
|---|
| 75 | |
|---|
| 76 | Mixed parsimony algorithm, version 3.6a3 |
|---|
| 77 | |
|---|
| 78 | Settings for this run: |
|---|
| 79 | U Search for best tree? Yes |
|---|
| 80 | X Use Mixed method? No |
|---|
| 81 | P Parsimony method? Wagner |
|---|
| 82 | J Randomize input order of species? No. Use input order |
|---|
| 83 | O Outgroup root? No, use as outgroup species 1 |
|---|
| 84 | T Use Threshold parsimony? No, use ordinary parsimony |
|---|
| 85 | A Use ancestral states in input file? No |
|---|
| 86 | W Sites weighted? No |
|---|
| 87 | M Analyze multiple data sets? No |
|---|
| 88 | 0 Terminal type (IBM PC, ANSI, none)? (none) |
|---|
| 89 | 1 Print out the data at start of run No |
|---|
| 90 | 2 Print indications of progress of run Yes |
|---|
| 91 | 3 Print out tree Yes |
|---|
| 92 | 4 Print out steps in each character No |
|---|
| 93 | 5 Print states at all nodes of tree No |
|---|
| 94 | 6 Write out trees onto tree file? Yes |
|---|
| 95 | |
|---|
| 96 | Are these settings correct? (type Y or the letter for one to change) |
|---|
| 97 | |
|---|
| 98 | </PRE> |
|---|
| 99 | </TD></TR></TABLE> |
|---|
| 100 | <P> |
|---|
| 101 | The options U, X, J, O, T, A, and M are the usual User Tree, miXed |
|---|
| 102 | methods, Jumble, Outgroup, Threshold, |
|---|
| 103 | Ancestral States, and Multiple Data Sets options, described either |
|---|
| 104 | in the main documentation file or in the Discrete Characters Programs |
|---|
| 105 | documentation file. The |
|---|
| 106 | user-defined trees supplied if you use the U option must be given as rooted |
|---|
| 107 | trees with two-way splits (bifurcations). The O option is acted upon only if |
|---|
| 108 | the final tree is unrooted and is not a user-defined tree. One of the |
|---|
| 109 | important uses of the O option is to root the tree so that if there are |
|---|
| 110 | any characters in which the ancestral states have not been specified, the |
|---|
| 111 | program will print out a table showing which ancestral states require the |
|---|
| 112 | fewest steps. Note that when any of the characters has Camin-Sokal parsimony |
|---|
| 113 | assumed for it, the tree is rooted and the O option will have no effect. |
|---|
| 114 | <P> |
|---|
| 115 | The option P toggles between the Camin-Sokal parsimony criterion |
|---|
| 116 | and the default Wagner parsimony criterion. Option X invokes |
|---|
| 117 | mixed-method parsimony. If the A option is invoked, the ancestor is not |
|---|
| 118 | to be counted as one of the species. |
|---|
| 119 | <P> |
|---|
| 120 | The F (Factors) |
|---|
| 121 | option is not available in this program, as it would have no effect on |
|---|
| 122 | the result even if that information were provided in the input file. |
|---|
| 123 | <P> |
|---|
| 124 | <H2>OUTPUT FORMAT</H2> |
|---|
| 125 | <P> |
|---|
| 126 | Output is standard: a list of equally parsimonious trees, which will be printed |
|---|
| 127 | as rooted or unrooted depending on which is appropriate, and, if the |
|---|
| 128 | user chooses, a table of the |
|---|
| 129 | number of changes of state required in each character. If the Wagner option is |
|---|
| 130 | in force for a character, it may not be possible to unambiguously locate the |
|---|
| 131 | places on the tree where the changes occur, as there may be multiple |
|---|
| 132 | possibilities. If the user selects menu option 5, a table is printed out |
|---|
| 133 | after each tree, showing for each |
|---|
| 134 | branch whether there are known to be changes in the branch, and what the states |
|---|
| 135 | are inferred to have been at the top end of the branch. If the inferred state |
|---|
| 136 | is a "?" there will be multiple equally-parsimonious assignments of states; the |
|---|
| 137 | user must work these out for themselves by hand. |
|---|
| 138 | <P> |
|---|
| 139 | If the Camin-Sokal parsimony method |
|---|
| 140 | is invoked and the Ancestors option is also used, then the program will |
|---|
| 141 | infer, for any character whose ancestral state is unknown ("?") whether the |
|---|
| 142 | ancestral state 0 or 1 will give the fewest state changes. If these are |
|---|
| 143 | tied, then it may not be possible for the program to infer the |
|---|
| 144 | state in the internal nodes, and these will all be printed as ".". If this |
|---|
| 145 | has happened and you want to know more about the states at the internal |
|---|
| 146 | nodes, you will find it helpful to use MOVE to display the tree and examine |
|---|
| 147 | its interior states, as the algorithm in MOVE shows all that can be known |
|---|
| 148 | in this case about the interior states, including where there is and is not |
|---|
| 149 | ambiguity. The algorithm in MIX gives up more easily on displaying these |
|---|
| 150 | states. |
|---|
| 151 | <P> |
|---|
| 152 | If the A option is not used, then the program will assume 0 as the |
|---|
| 153 | ancestral state for those characters following the Camin-Sokal method, |
|---|
| 154 | and will assume that the ancestral state is unknown for those characters |
|---|
| 155 | following Wagner parsimony. If any characters have unknown ancestral |
|---|
| 156 | states, and if the resulting tree is rooted (even by outgroup), |
|---|
| 157 | a table will also be printed out |
|---|
| 158 | showing the best guesses of which are the ancestral states in each |
|---|
| 159 | character. You will find it useful to understand the difference between |
|---|
| 160 | the Camin-Sokal parsimony criterion with unknown ancestral state and the Wagner |
|---|
| 161 | parsimony criterion. |
|---|
| 162 | <P> |
|---|
| 163 | If the U (User Tree) option is used and more than one tree is supplied, the |
|---|
| 164 | program also performs a statistical test of each of these trees against the |
|---|
| 165 | best tree. This test is a version of the test proposed by |
|---|
| 166 | Alan Templeton (1983) and evaluated in a test case by me (1985a). It is |
|---|
| 167 | closely parallel to a test using log likelihood differences |
|---|
| 168 | invented by Kishino and Hasegawa (1989), and uses the mean and variance of |
|---|
| 169 | step differences between trees, taken across characters. If the mean |
|---|
| 170 | differs from zero by more than 1.96 standard deviations, the trees are declared |
|---|
| 171 | significantly different. The program |
|---|
| 172 | prints out a table of the steps for each tree, the differences of |
|---|
| 173 | each from the lowest one, the variance of that quantity as determined by |
|---|
| 174 | the step differences at individual characters, and a conclusion as to |
|---|
| 175 | whether that tree is or is not significantly worse than the best one. It |
|---|
| 176 | is important to understand that the test assumes that all the binary |
|---|
| 177 | characters are evolving independently, which is unlikely to be true for |
|---|
| 178 | many suites of morphological characters. |
|---|
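<P>
The two-tree comparison described above can be sketched as follows. This is
an illustration of the idea (mean and variance of per-character step
differences, with a 1.96 standard deviation criterion), not a transcription
of the program's code. The function name and the exact variance estimator
(the usual n/(n-1) sample estimate, scaled to the summed difference) are
assumptions.

```python
import math


def paired_steps_test(steps_a, steps_b, z_crit=1.96):
    """Compare two trees from their per-character step counts.

    Returns (total difference, estimated variance of that total,
    whether the difference is declared significant).
    """
    n = len(steps_a)
    diffs = [a - b for a, b in zip(steps_a, steps_b)]
    total = sum(diffs)
    mean = total / n

    # Variance of the summed difference, estimated from the spread of the
    # differences at individual characters (characters assumed independent).
    var_total = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)

    significant = abs(total) > z_crit * math.sqrt(var_total)
    return total, var_total, significant
```

For the first two trees of the test output, whose per-character steps are
2 2 2 1 1 1 and 1 2 1 2 2 1, the differences sum to zero, so under this
sketch neither tree is declared significantly worse than the other.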
| 196 | <P> |
|---|
| 197 | If there are more than two trees, the test done is an extension of |
|---|
| 198 | the KHT (Kishino-Hasegawa-Templeton) test, due to Shimodaira and Hasegawa (1999). They pointed out |
|---|
| 199 | that a correction for the number of trees was necessary, and they |
|---|
| 200 | introduced a resampling method to make this correction. In the version |
|---|
| 201 | used here the variances and covariances of the sums of steps across |
|---|
| 202 | characters are computed for all pairs of trees. To test whether the |
|---|
| 203 | difference between each tree and the best one is larger than could have |
|---|
| 204 | been expected if they all had the same expected number of steps, |
|---|
| 205 | numbers of steps for all trees are sampled with these covariances and equal |
|---|
| 206 | means (Shimodaira and Hasegawa's "least favorable hypothesis"), |
|---|
| 207 | and a P value is computed from the fraction of times the difference between |
|---|
| 208 | the tree's value and the lowest number of steps exceeds that actually |
|---|
| 209 | observed. Note that this sampling needs random numbers, and so the |
|---|
| 210 | program will prompt the user for a random number seed if one has not |
|---|
| 211 | already been supplied. With the two-tree KHT test no random numbers |
|---|
| 212 | are used. |
|---|
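<P>
The resampling step just described can be sketched as below. This is a
hedged illustration, not the program's implementation: the function name,
the default number of replicates, and the way the correlated normal values
are drawn (one standard normal deviate per character, applied to the
centered per-character steps, which reproduces the variances and covariances
of the summed steps) are all assumptions. A difference at least as large as
the observed one is counted as exceeding it, so the best tree always gets
P = 1 here.

```python
import math
import random


def sh_test(steps, replicates=1000, seed=1):
    """steps[t][i] is the number of steps tree t needs in character i.
    Returns one P value per tree under the equal-means hypothesis."""
    rng = random.Random(seed)
    n_chars = len(steps[0])
    totals = [sum(row) for row in steps]
    best = min(totals)
    observed = [t - best for t in totals]   # excess steps over the best tree

    # Center each tree's steps; the scale factor makes the sampled sums
    # have the n/(n-1) sample variances and covariances of the totals.
    scale = math.sqrt(n_chars / (n_chars - 1))
    centered = [[(s - tot / n_chars) * scale for s in row]
                for row, tot in zip(steps, totals)]

    exceed = [0] * len(steps)
    for _ in range(replicates):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_chars)]
        sampled = [sum(c * zi for c, zi in zip(row, z)) for row in centered]
        low = min(sampled)
        for t, value in enumerate(sampled):
            if value - low >= observed[t]:
                exceed[t] += 1
    return [count / replicates for count in exceed]
```

The explicit <TT>seed</TT> argument plays the role of the random number seed
that the program asks for.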
| 213 | <P> |
|---|
| 214 | In either the KHT or the SH test the program |
|---|
| 215 | prints out a table of the number of steps for each tree, the differences of |
|---|
| 216 | each from the lowest one, the variance of that quantity as determined by |
|---|
| 217 | the differences of the numbers of steps at individual characters, |
|---|
| 218 | and a conclusion as to |
|---|
| 219 | whether that tree is or is not significantly worse than the best one. |
|---|
| 220 | <P> |
|---|
| 221 | At the beginning of the program is a constant, <TT>maxtrees</TT>, |
|---|
| 222 | the maximum number of trees which the program will store for output. |
|---|
| 223 | <P> |
|---|
| 224 | The program is descended from earlier programs SOKAL and WAGNER which have |
|---|
| 225 | long since been removed from the PHYLIP package, since MIX has all their |
|---|
| 226 | capabilities and more. |
|---|
| 227 | <P> |
|---|
| 228 | <HR> |
|---|
| 229 | <P> |
|---|
| 230 | <H3>TEST DATA SET</H3> |
|---|
| 231 | <P> |
|---|
| 232 | <TABLE><TR><TD BGCOLOR=white> |
|---|
| 233 | <PRE> |
|---|
| 234 | 5 6 |
|---|
| 235 | Alpha 110110 |
|---|
| 236 | Beta 110000 |
|---|
| 237 | Gamma 100110 |
|---|
| 238 | Delta 001001 |
|---|
| 239 | Epsilon 001110 |
|---|
| 240 | </PRE> |
|---|
| 241 | </TD></TR></TABLE> |
|---|
| 242 | <P> |
|---|
| 243 | <HR> |
|---|
| 244 | <P> |
|---|
| 245 | <H3>TEST SET OUTPUT (with all numerical options on)</H3> |
|---|
| 246 | <P> |
|---|
| 247 | <TABLE><TR><TD BGCOLOR=white> |
|---|
| 248 | <PRE> |
|---|
| 249 | |
|---|
| 250 | Mixed parsimony algorithm, version 3.6a3 |
|---|
| 251 | |
|---|
| 252 | 5 species, 6 characters |
|---|
| 253 | |
|---|
| 254 | Wagner parsimony method |
|---|
| 255 | |
|---|
| 256 | |
|---|
| 257 | Name Characters |
|---|
| 258 | ---- ---------- |
|---|
| 259 | |
|---|
| 260 | Alpha 11011 0 |
|---|
| 261 | Beta 11000 0 |
|---|
| 262 | Gamma 10011 0 |
|---|
| 263 | Delta 00100 1 |
|---|
| 264 | Epsilon 00111 0 |
|---|
| 265 | |
|---|
| 266 | |
|---|
| 267 | |
|---|
| 268 | 4 trees in all found |
|---|
| 269 | |
|---|
| 270 | |
|---|
| 271 | |
|---|
| 272 | |
|---|
| 273 | +--Epsilon |
|---|
| 274 | +-----4 |
|---|
| 275 | ! +--Gamma |
|---|
| 276 | +--2 |
|---|
| 277 | ! ! +--Delta |
|---|
| 278 | --1 +-----3 |
|---|
| 279 | ! +--Beta |
|---|
| 280 | ! |
|---|
| 281 | +-----------Alpha |
|---|
| 282 | |
|---|
| 283 | remember: this is an unrooted tree! |
|---|
| 284 | |
|---|
| 285 | |
|---|
| 286 | requires a total of 9.000 |
|---|
| 287 | |
|---|
| 288 | steps in each character: |
|---|
| 289 | 0 1 2 3 4 5 6 7 8 9 |
|---|
| 290 | *----------------------------------------- |
|---|
| 291 | 0! 2 2 2 1 1 1 |
|---|
| 292 | |
|---|
| 293 | From To Any Steps? State at upper node |
|---|
| 294 | ( . means same as in the node below it on tree) |
|---|
| 295 | |
|---|
| 296 | 1 1?011 0 |
|---|
| 297 | 1 2 no .?... . |
|---|
| 298 | 2 4 maybe .0... . |
|---|
| 299 | 4 Epsilon yes 0.1.. . |
|---|
| 300 | 4 Gamma no ..... . |
|---|
| 301 | 2 3 yes .?.00 . |
|---|
| 302 | 3 Delta yes 001.. 1 |
|---|
| 303 | 3 Beta maybe .1... . |
|---|
| 304 | 1 Alpha maybe .1... . |
|---|
| 305 | |
|---|
| 306 | |
|---|
| 307 | |
|---|
| 308 | |
|---|
| 309 | |
|---|
| 310 | +--------Gamma |
|---|
| 311 | ! |
|---|
| 312 | +--2 +--Epsilon |
|---|
| 313 | ! ! +--4 |
|---|
| 314 | ! +--3 +--Delta |
|---|
| 315 | --1 ! |
|---|
| 316 | ! +-----Beta |
|---|
| 317 | ! |
|---|
| 318 | +-----------Alpha |
|---|
| 319 | |
|---|
| 320 | remember: this is an unrooted tree! |
|---|
| 321 | |
|---|
| 322 | |
|---|
| 323 | requires a total of 9.000 |
|---|
| 324 | |
|---|
| 325 | steps in each character: |
|---|
| 326 | 0 1 2 3 4 5 6 7 8 9 |
|---|
| 327 | *----------------------------------------- |
|---|
| 328 | 0! 1 2 1 2 2 1 |
|---|
| 329 | |
|---|
| 330 | From To Any Steps? State at upper node |
|---|
| 331 | ( . means same as in the node below it on tree) |
|---|
| 332 | |
|---|
| 333 | 1 1?011 0 |
|---|
| 334 | 1 2 no .?... . |
|---|
| 335 | 2 Gamma maybe .0... . |
|---|
| 336 | 2 3 maybe .?.?? . |
|---|
| 337 | 3 4 yes 001?? . |
|---|
| 338 | 4 Epsilon maybe ...11 . |
|---|
| 339 | 4 Delta yes ...00 1 |
|---|
| 340 | 3 Beta maybe .1.00 . |
|---|
| 341 | 1 Alpha maybe .1... . |
|---|
| 342 | |
|---|
| 343 | |
|---|
| 344 | |
|---|
| 345 | |
|---|
| 346 | |
|---|
| 347 | +--------Epsilon |
|---|
| 348 | +--4 |
|---|
| 349 | ! ! +-----Gamma |
|---|
| 350 | ! +--2 |
|---|
| 351 | --1 ! +--Delta |
|---|
| 352 | ! +--3 |
|---|
| 353 | ! +--Beta |
|---|
| 354 | ! |
|---|
| 355 | +-----------Alpha |
|---|
| 356 | |
|---|
| 357 | remember: this is an unrooted tree! |
|---|
| 358 | |
|---|
| 359 | |
|---|
| 360 | requires a total of 9.000 |
|---|
| 361 | |
|---|
| 362 | steps in each character: |
|---|
| 363 | 0 1 2 3 4 5 6 7 8 9 |
|---|
| 364 | *----------------------------------------- |
|---|
| 365 | 0! 2 2 2 1 1 1 |
|---|
| 366 | |
|---|
| 367 | From To Any Steps? State at upper node |
|---|
| 368 | ( . means same as in the node below it on tree) |
|---|
| 369 | |
|---|
| 370 | 1 1?011 0 |
|---|
| 371 | 1 4 maybe .0... . |
|---|
| 372 | 4 Epsilon yes 0.1.. . |
|---|
| 373 | 4 2 no ..... . |
|---|
| 374 | 2 Gamma no ..... . |
|---|
| 375 | 2 3 yes ...00 . |
|---|
| 376 | 3 Delta yes 0.1.. 1 |
|---|
| 377 | 3 Beta yes .1... . |
|---|
| 378 | 1 Alpha maybe .1... . |
|---|
| 379 | |
|---|
| 380 | |
|---|
| 381 | |
|---|
| 382 | |
|---|
| 383 | |
|---|
| 384 | +--------Gamma |
|---|
| 385 | +--2 |
|---|
| 386 | ! ! +-----Epsilon |
|---|
| 387 | ! +--4 |
|---|
| 388 | --1 ! +--Delta |
|---|
| 389 | ! +--3 |
|---|
| 390 | ! +--Beta |
|---|
| 391 | ! |
|---|
| 392 | +-----------Alpha |
|---|
| 393 | |
|---|
| 394 | remember: this is an unrooted tree! |
|---|
| 395 | |
|---|
| 396 | |
|---|
| 397 | requires a total of 9.000 |
|---|
| 398 | |
|---|
| 399 | steps in each character: |
|---|
| 400 | 0 1 2 3 4 5 6 7 8 9 |
|---|
| 401 | *----------------------------------------- |
|---|
| 402 | 0! 2 2 2 1 1 1 |
|---|
| 403 | |
|---|
| 404 | From To Any Steps? State at upper node |
|---|
| 405 | ( . means same as in the node below it on tree) |
|---|
| 406 | |
|---|
| 407 | 1 1?011 0 |
|---|
| 408 | 1 2 maybe .0... . |
|---|
| 409 | 2 Gamma no ..... . |
|---|
| 410 | 2 4 maybe ?.?.. . |
|---|
| 411 | 4 Epsilon maybe 0.1.. . |
|---|
| 412 | 4 3 yes ?.?00 . |
|---|
| 413 | 3 Delta yes 0.1.. 1 |
|---|
| 414 | 3 Beta yes 110.. . |
|---|
| 415 | 1 Alpha maybe .1... . |
|---|
| 416 | |
|---|
| 417 | |
|---|
| 418 | </PRE> |
|---|
| 419 | </TD></TR></TABLE> |
|---|
| 420 | </BODY> |
|---|
| 421 | </HTML> |
|---|