1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> |
---|
2 | <HTML> |
---|
3 | <HEAD> |
---|
4 | <TITLE>pars</TITLE> |
---|
5 | <META NAME="description" CONTENT="pars"> |
---|
6 | <META NAME="keywords" CONTENT="pars"> |
---|
7 | <META NAME="resource-type" CONTENT="document"> |
---|
8 | <META NAME="distribution" CONTENT="global"> |
---|
9 | <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> |
---|
10 | </HEAD> |
---|
11 | <BODY BGCOLOR="#ccffff"> |
---|
12 | <DIV ALIGN=RIGHT> |
---|
13 | version 3.6 |
---|
14 | </DIV> |
---|
15 | <P> |
---|
16 | <DIV ALIGN=CENTER> |
---|
17 | <H1>PARS - Discrete character parsimony</H1> |
---|
18 | </DIV> |
---|
19 | <P> |
---|
20 | © Copyright 1986-2000 by the University of |
---|
21 | Washington. Written by Joseph Felsenstein. Permission is granted to copy |
---|
22 | this document provided that no fee is charged for it and that this copyright |
---|
23 | notice is not removed. |
---|
24 | <P> |
---|
25 | PARS is a general parsimony program which carries out the Wagner |
---|
26 | parsimony method with multiple states. Wagner parsimony |
---|
27 | allows changes among all states. The criterion is to find the tree which |
---|
28 | requires the minimum number of changes. |
---|
29 | The Wagner method was originated by Eck and Dayhoff (1966) and by Kluge and |
---|
30 | Farris (1969). Here are its assumptions: |
---|
31 | <P> |
---|
32 | <OL> |
---|
33 | <LI>Ancestral states are unknown. |
---|
34 | <LI>Different characters evolve independently. |
---|
35 | <LI>Different lineages evolve independently. |
---|
36 | <LI>Changes to all other states are equally probable (Wagner). |
---|
37 | <LI>These changes are a priori improbable over the |
---|
38 | evolutionary time spans involved in the differentiation of the |
---|
39 | group in question. |
---|
40 | <LI>Other kinds of evolutionary event such as retention of polymorphism |
---|
41 | are far less probable than these state changes. |
---|
42 | <LI>Rates of evolution in different lineages are sufficiently low that |
---|
43 | two changes in a long segment of the tree are far less probable |
---|
44 | than one change in a short segment. |
---|
45 | </OL> |
---|
46 | <P> |
---|
47 | That these are the assumptions of parsimony methods has been documented |
---|
48 | in a series of papers of mine (1973a, 1978b, 1979, 1981b, |
---|
49 | 1983b, 1988b). For an opposing view arguing that the parsimony methods |
---|
50 | make no substantive |
---|
51 | assumptions such as these, see the papers by Farris (1983) and Sober (1983a, |
---|
52 | 1983b), but also read the exchange between Felsenstein and Sober (1986). |
---|
53 | <P> |
---|
54 | <H2>INPUT FORMAT</H2> |
---|
55 | <P> |
---|
56 | The input for PARS is the standard input for discrete characters |
---|
57 | programs, described above in the documentation file for the |
---|
58 | discrete-characters programs, except that multiple states (up to 9 of them) |
---|
59 | are allowed. Any character other than "?" may serve as a state symbol. |
---|
60 | In fact, one can |
---|
61 | use different symbols in different columns of the data matrix, |
---|
62 | although it is rather unlikely that you would want to do that. |
---|
63 | The symbols you can use are: |
---|
64 | <UL> |
---|
65 | <LI>The digits <TT>0-9</TT>, |
---|
66 | <LI>The letters <TT>A-Z</TT> and <TT>a-z</TT>, |
---|
67 | <LI>The symbols <TT>!"#$%&amp;'()*+,-./:;&lt;=&gt;@[\]^_`{|}~</TT><BR> |
---|
68 | (of these, probably only + and - will be of interest to most users). |
---|
69 | </UL> |
---|
70 | But note that these do <I>not</I> include blank (" "). Blanks in the |
---|
71 | input data are simply skipped by the program, so that they can be used to |
---|
72 | make characters into groups for ease of viewing. |
---|
73 | The "?" (question mark) symbol has special meaning. It is allowed in the |
---|
74 | input but is not available as the symbol of a state. Rather, it means that |
---|
75 | the state is unknown. |
---|
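<P>
The input rules above can be sketched in a few lines of Python. This is only an illustration, not the parser PARS itself uses. It assumes a sequential (non-interleaved) file, assumes that each species name is the first whitespace-delimited token on its line (a simplification), and reads the 9-state limit as applying to each character separately. The function name <TT>parse_discrete_matrix</TT> is hypothetical.

```python
def parse_discrete_matrix(text, max_states=9):
    """Parse a discrete-characters matrix into {species name: state string}.

    Assumptions (not taken from PARS itself): sequential layout, and the
    species name is the first whitespace-delimited token on each line.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_chars = (int(tok) for tok in lines[0].split()[:2])

    matrix = {}
    for line in lines[1:1 + n_species]:
        parts = line.split(None, 1)
        if len(parts) < 2:
            raise ValueError(f"no character data on line: {line!r}")
        name, rest = parts
        states = "".join(rest.split())  # blanks in the data are skipped
        if len(states) != n_chars:
            raise ValueError(f"{name}: expected {n_chars} characters, got {len(states)}")
        matrix[name] = states

    if len(matrix) != n_species:
        raise ValueError(f"expected {n_species} species, got {len(matrix)}")

    # "?" means "unknown", so it does not count as a state symbol.
    for col in range(n_chars):
        symbols = {row[col] for row in matrix.values()} - {"?"}
        if len(symbols) > max_states:
            raise ValueError(f"character {col + 1} has more than {max_states} states")
    return matrix
```

Blanks inside the data can group characters for readability: a row written as <TT>110 110</TT> is read the same as <TT>110110</TT>.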
76 | <P> |
---|
77 | PARS can handle both bifurcating and multifurcating trees. In doing its |
---|
78 | search for most parsimonious trees, it adds species not only by creating new |
---|
79 | forks in the middle of existing branches, but it also tries putting them at |
---|
80 | the end of new branches which are added to existing forks. Thus it searches |
---|
81 | among both bifurcating and multifurcating trees. If a branch in a tree |
---|
82 | does not have any characters which might change in that branch in the most |
---|
83 | parsimonious tree, it does not save that tree. Thus in any tree that |
---|
84 | results, a branch exists only if some character has a most parsimonious |
---|
85 | reconstruction that would involve change in that branch. |
---|
86 | <P> |
---|
87 | It also saves a number of trees tied for best (you can alter the number |
---|
88 | it saves using the V option in the menu). When rearranging trees, it |
---|
89 | tries rearrangements of all of the saved trees. This makes the algorithm |
---|
90 | slower than earlier programs such as MIX. |
---|
91 | <P> |
---|
92 | The options are selected using a menu: |
---|
93 | <P> |
---|
94 | <TABLE><TR><TD BGCOLOR=white> |
---|
95 | <PRE> |
---|
96 | |
---|
97 | Discrete character parsimony algorithm, version 3.6 |
---|
98 | |
---|
99 | Setting for this run: |
---|
100 | U Search for best tree? Yes |
---|
101 | S Search option? More thorough search |
---|
102 | V Number of trees to save? 100 |
---|
103 | J Randomize input order of sequences? No. Use input order |
---|
104 | O Outgroup root? No, use as outgroup species 1 |
---|
105 | T Use Threshold parsimony? No, use ordinary parsimony |
---|
106 | W Sites weighted? No |
---|
107 | M Analyze multiple data sets? No |
---|
108 | I Input sequences interleaved? Yes |
---|
109 | 0 Terminal type (IBM PC, ANSI, none)? (none) |
---|
110 | 1 Print out the data at start of run No |
---|
111 | 2 Print indications of progress of run Yes |
---|
112 | 3 Print out tree Yes |
---|
113 | 4 Print out steps in each site No |
---|
114 | 5 Print character at all nodes of tree No |
---|
115 | 6 Write out trees onto tree file? Yes |
---|
116 | |
---|
117 | Y to accept these or type the letter for one to change |
---|
118 | </PRE> |
---|
119 | </TD></TR></TABLE> |
---|
120 | <P> |
---|
121 | The Weights (W) option |
---|
122 | takes the weights from a file whose default name is "weights". The weights |
---|
123 | follow the format described in the main documentation file, with integer |
---|
124 | weights from 0 to 35 allowed by using the characters 0, 1, 2, ..., 9 and |
---|
125 | A, B, ... Z. |
---|
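<P>
The mapping from weight symbols to integer weights can be sketched as follows. This is a minimal Python illustration of the 0-9, A-Z convention described above. The function names are hypothetical, and skipping blanks in the weights line is an assumption rather than something stated here.

```python
import string

# "0".."9" map to 0..9 and "A".."Z" map to 10..35.
WEIGHT_SYMBOLS = string.digits + string.ascii_uppercase


def weight_value(symbol):
    """Return the integer weight (0 to 35) for one weight symbol."""
    if len(symbol) != 1 or symbol not in WEIGHT_SYMBOLS:
        raise ValueError(f"not a valid weight symbol: {symbol!r}")
    return WEIGHT_SYMBOLS.index(symbol)


def parse_weights(line):
    """Convert a line of weight symbols to a list of integers.

    Assumption: whitespace in the line is ignored.
    """
    return [weight_value(ch) for ch in line if not ch.isspace()]
```

For example, <TT>parse_weights("01A Z")</TT> gives the weights 0, 1, 10 and 35.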
126 | <P> |
---|
127 | The User tree (option U) is read from a file whose default name is |
---|
128 | <TT>intree</TT>. |
---|
129 | The trees can be multifurcating. They must be preceded in the file by a |
---|
130 | line giving the number of trees in the file. |
---|
131 | <P> |
---|
132 | The options J, O, T, and M are the usual Jumble, Outgroup, |
---|
133 | Threshold parsimony, and Multiple Data Sets options, |
---|
134 | described either |
---|
135 | in the main documentation file or in the Discrete Characters Programs |
---|
136 | documentation file. |
---|
137 | <P> |
---|
138 | The M (multiple data sets) option will ask you whether you want to |
---|
139 | use multiple sets of weights (from the weights file) or multiple data sets. |
---|
140 | The ability to use a single data set with multiple weights means that |
---|
141 | much less disk space will be used for this input data. The bootstrapping |
---|
142 | and jackknifing tool Seqboot has the ability to create a weights file with |
---|
143 | multiple weights. |
---|
144 | <P> |
---|
145 | The O (outgroup) option will have no effect if the U (user-defined tree) |
---|
146 | option is in effect. |
---|
147 | The T (threshold) option allows a continuum of methods |
---|
148 | between parsimony and compatibility. Thresholds less than or equal to 1.0 do |
---|
149 | not have any meaning and should |
---|
150 | not be used: they will result in a tree dependent only on the input |
---|
151 | order of species and not at all on the data! |
---|
152 | <P> |
---|
153 | <H2>OUTPUT FORMAT</H2> |
---|
154 | <P> |
---|
155 | Output is standard: if option 1 is toggled on, the data is printed out, |
---|
156 | with the convention that "." means "the same as in the first species". |
---|
157 | Then comes a list of equally parsimonious trees. |
---|
158 | Each tree has branch lengths. These are computed using an algorithm |
---|
159 | published by Hochbaum and Pathria (1997) which I first heard of from |
---|
160 | Wayne Maddison who invented it independently of them. This algorithm |
---|
161 | averages the number of reconstructed changes of state over all sites and |
---|
162 | over all possible most parsimonious placements of the changes of state |
---|
163 | among branches. Note that it does not correct in any way for multiple |
---|
164 | changes that overlay each other. |
---|
165 | <P> |
---|
166 | If option 4 is |
---|
167 | toggled on, a table of the |
---|
168 | number of changes of state required in each character is also |
---|
169 | printed. If option 5 is toggled |
---|
170 | on, a table is printed |
---|
171 | out after each tree, showing for each branch whether there are known to be |
---|
172 | changes in the branch, and what the states are inferred to have been at the |
---|
173 | top end of the branch. This is a reconstruction of the ancestral sequences |
---|
174 | in the tree. If you choose option 5, a menu item D appears which gives you |
---|
175 | the opportunity to turn off dot-differencing so that complete ancestral |
---|
176 | sequences are shown. If the inferred state is a "?", |
---|
177 | there will be multiple |
---|
178 | equally-parsimonious assignments of states; the user must work these out for |
---|
179 | themselves by hand. |
---|
180 | If option 6 is left in its default state the trees |
---|
181 | found will be written to a tree file, so that they are available to be used |
---|
182 | in other programs. |
---|
183 | <P> |
---|
184 | If the U (User Tree) option is used and more than one tree is supplied, the |
---|
185 | program also performs a statistical test of each of these trees against the |
---|
186 | best tree. This test is a version of the test proposed by |
---|
187 | Alan Templeton (1983) and evaluated in a test case by me (1985a). It is |
---|
188 | closely parallel to a test using log likelihood differences |
---|
189 | due to Kishino and Hasegawa (1989), and |
---|
190 | uses the mean and variance of |
---|
191 | step differences between trees, taken across sites. If the mean |
---|
192 | is more than 1.96 standard deviations from zero, the trees are declared |
---|
193 | significantly different. The program |
---|
194 | prints out a table of the steps for each tree, the differences of |
---|
195 | each from the best one, the variance of that quantity as determined by |
---|
196 | the step differences at individual sites, and a conclusion as to |
---|
197 | whether that tree is or is not significantly worse than the best one. |
---|
198 | It is important to understand that the test assumes that all the discrete |
---|
199 | characters are evolving independently, which is unlikely to be true for |
---|
200 | many suites of morphological characters. |
---|
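<P>
The arithmetic of such a test can be sketched as below. This is one plausible reading of the description above, not the code PARS uses. It takes the per-site step counts of two trees, sums the site-by-site differences, estimates the variance of that sum from the variation among sites, and applies the 1.96 criterion. The function name <TT>compare_trees</TT> and the exact variance estimator are assumptions.

```python
import math


def compare_trees(steps_best, steps_other):
    """Compare two trees from their per-site step counts.

    Returns (total difference, standard deviation, significant?).
    Assumed estimator: variance of the summed difference is
    n / (n - 1) * sum((d - mean) ** 2) over the n sites.
    """
    if len(steps_best) != len(steps_other):
        raise ValueError("both trees need one step count per site")
    n = len(steps_best)
    if n < 2:
        raise ValueError("need at least two sites to estimate a variance")

    diffs = [other - best for best, other in zip(steps_best, steps_other)]
    total = sum(diffs)
    mean = total / n
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    sd = math.sqrt(variance)

    # Declared significantly different if more than 1.96 standard deviations apart.
    return total, sd, abs(total) > 1.96 * sd
```

As the text notes, this treatment of sites as independent observations is exactly the assumption that may fail for suites of morphological characters.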
201 | <P> |
---|
202 | Option 6 in the menu controls whether the tree estimated by the program |
---|
203 | is written onto a tree file. The default name of this output tree file |
---|
204 | is "outtree". If the U option is in effect, all the user-defined |
---|
205 | trees are written to the output tree file. |
---|
206 | <P> |
---|
207 | <HR> |
---|
208 | <P> |
---|
209 | <H3>TEST DATA SET</H3> |
---|
210 | <P> |
---|
211 | <TABLE><TR><TD BGCOLOR=white> |
---|
212 | <PRE> |
---|
213 | 5 6 |
---|
214 | Alpha 110110 |
---|
215 | Beta 110000 |
---|
216 | Gamma 100110 |
---|
217 | Delta 001001 |
---|
218 | Epsilon 001110 |
---|
219 | </PRE> |
---|
220 | </TD></TR></TABLE> |
---|
221 | <P> |
---|
222 | <HR> |
---|
223 | <P> |
---|
224 | <H3>TEST SET OUTPUT (with all numerical options on)</H3> |
---|
225 | <P> |
---|
226 | <TABLE><TR><TD BGCOLOR=white> |
---|
227 | <PRE> |
---|
228 | |
---|
229 | Discrete character parsimony algorithm, version 3.6 |
---|
230 | |
---|
231 | |
---|
232 | One most parsimonious tree found: |
---|
233 | |
---|
234 | |
---|
235 | +Epsilon |
---|
236 | +---------3 |
---|
237 | +----2 +--------------Delta |
---|
238 | | | |
---|
239 | | +Gamma |
---|
240 | | |
---|
241 | 1---------Beta |
---|
242 | | |
---|
243 | +Alpha |
---|
244 | |
---|
245 | |
---|
246 | requires a total of 8.000 |
---|
247 | |
---|
248 | between and length |
---|
249 | ------- --- ------ |
---|
250 | 1 2 0.166667 |
---|
251 | 2 3 0.333333 |
---|
252 | 3 Epsilon 0.000000 |
---|
253 | 3 Delta 0.500000 |
---|
254 | 2 Gamma 0.000000 |
---|
255 | 1 Beta 0.333333 |
---|
256 | 1 Alpha 0.000000 |
---|
257 | |
---|
258 | steps in each site: |
---|
259 | 0 1 2 3 4 5 6 7 8 9 |
---|
260 | *----------------------------------------- |
---|
261 | 0| 1 1 1 2 2 1 |
---|
262 | |
---|
263 | From To Any Steps? State at upper node |
---|
264 | ( . means same as in the node below it on tree) |
---|
265 | |
---|
266 | 1 110110 |
---|
267 | 1 2 yes .0.... |
---|
268 | 2 3 yes 0.1... |
---|
269 | 3 Epsilon no ...... |
---|
270 | 3 Delta yes ...001 |
---|
271 | 2 Gamma no ...... |
---|
272 | 1 Beta yes ...00. |
---|
273 | 1 Alpha no ...... |
---|
274 | |
---|
275 | |
---|
276 | </PRE> |
---|
277 | </TD></TR></TABLE> |
---|
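<P>
The step counts in the output above can be checked by hand or with a short script. The following Python sketch counts the minimum number of changes per character on the reported tree, using a Sankoff-style dynamic programme with unit cost for every change between different states. It is an illustration of the Wagner criterion under that assumption, not the algorithm PARS itself implements. The tree encoding and the function names are hypothetical.

```python
INF = float("inf")

# The test data set, and the reported tree as nested (label, children) tuples.
DATA = {
    "Alpha": "110110",
    "Beta": "110000",
    "Gamma": "100110",
    "Delta": "001001",
    "Epsilon": "001110",
}
TREE = ("1", [("2", [("3", ["Epsilon", "Delta"]), "Gamma"]), "Beta", "Alpha"])


def site_costs(node, site, states, data):
    """Return {state: fewest changes below node if node has that state}."""
    if isinstance(node, str):  # a tip: its observed state is fixed
        observed = data[node][site]
        if observed == "?":  # unknown state costs nothing either way
            return {s: 0 for s in states}
        return {s: (0 if s == observed else INF) for s in states}

    _, children = node
    totals = {s: 0 for s in states}
    for child in children:  # works for forks with any number of branches
        child_costs = site_costs(child, site, states, data)
        for s in states:
            # Each change between different states costs one step.
            totals[s] += min(c + (s != t) for t, c in child_costs.items())
    return totals


def steps_per_site(tree, data):
    """Minimum number of changes for each character on the given tree."""
    n_chars = len(next(iter(data.values())))
    steps = []
    for site in range(n_chars):
        states = sorted({row[site] for row in data.values()} - {"?"})
        if not states:  # every species unknown: no change is needed
            steps.append(0)
            continue
        steps.append(min(site_costs(tree, site, states, data).values()))
    return steps


if __name__ == "__main__":
    per_site = steps_per_site(TREE, DATA)
    print(per_site, sum(per_site))
```

Run on the test data set, this gives 1, 1, 1, 2, 2 and 1 steps for the six characters, 8 in total, matching the table above.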
278 | </BODY> |
---|
279 | </HTML> |
---|