Context Navigation

dollop.html

Visit:

Last change on this file was 2176, checked in by westram, 22 years ago
* empty log message *
Property svn:eol-style set to `native` Property svn:keywords set to `Author Date Id Revision`
File size: 13.5 KB

Line
1	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
2	<HTML>
3	<HEAD>
4	<TITLE>dollop</TITLE>
5	<META NAME="description" CONTENT="dollop">
6	<META NAME="keywords" CONTENT="dollop">
7	<META NAME="resource-type" CONTENT="document">
8	<META NAME="distribution" CONTENT="global">
9	<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
10	</HEAD>
11	<BODY BGCOLOR="#ccffff">
12	<DIV ALIGN=RIGHT>
13	version 3.6
14	</DIV>
15	<P>
16	<DIV ALIGN=CENTER>
17	<H1>DOLLOP -- Dollo and Polymorphism Parsimony Program</H1>
18	</DIV>
19	<P>
20	© Copyright 1986-2002 by the University of
21	Washington. Written by Joseph Felsenstein. Permission is granted to copy
22	this document provided that no fee is charged for it and that this copyright
23	notice is not removed.
24	<P>
25	This program carries out the Dollo and polymorphism parsimony methods. The
26	Dollo parsimony method was
27	first suggested in print in verbal form by Le Quesne (1974) and was
28	first well-specified by Farris (1977). The method is named after Louis
29	Dollo since he was one of the first to assert that in evolution it is
30	harder to gain a complex feature than to lose it. The algorithm
31	explains the presence of the state 1 by allowing up to one forward
32	change 0-->1 and as many reversions 1-->0 as are necessary to explain
33	the pattern of states seen. The program attempts to minimize the number
34	of 1-->0 reversions necessary.
35	<P>
36	The assumptions of this method are in effect:
37	<OL>
38	<LI>We know which state is the ancestral one (state 0).
39	<LI>The characters are evolving independently.
40	<LI>Different lineages evolve independently.
41	<LI>The probability of a forward change (0-->1) is small over the
42	evolutionary times involved.
43	<LI>The probability of a reversion (1-->0) is also small, but
44	still far larger than the probability of a forward change, so
45	that many reversions are easier to envisage than even one
46	extra forward change.
47	<LI>Retention of polymorphism for both states (0 and 1) is highly
48	improbable.
49	<LI>The lengths of the segments of the true tree are not so
50	unequal that two changes in a long segment are as probable as
51	one in a short segment.
52	</OL>
53	<P>
54	One problem can arise when using additive binary recoding to
55	represent a multistate character as a series of two-state characters. Unlike
56	the Camin-Sokal, Wagner, and Polymorphism methods, the Dollo
57	method can reconstruct ancestral states which do not exist. An example
58	is given in my 1979 paper. It will be necessary to check the output to
59	make sure that this has not occurred.
60	<P>
61	The polymorphism parsimony method was first used by me, and the results
62	published (without a clear
63	specification of the method) by Inger (1967). The method was
64	independently published by Farris (1978a) and by me (1979). The method
65	assumes that we can explain the pattern of states by no more than one
66	origination (0-->1) of state 1, followed by retention of polymorphism
67	along as many segments of the tree as are necessary, followed by loss of
68	state 0 or of state 1 where necessary. The program tries to minimize
69	the total number of polymorphic characters, where each polymorphism is
70	counted once for each segment of the tree in which it is retained.
71	<P>
72	The assumptions of the polymorphism parsimony method are in effect:
73	<OL>
74	<LI>The ancestral state (state 0) is known in each character.
75	<LI>The characters are evolving independently of each other.
76	<LI>Different lineages are evolving independently.
77	<LI>Forward change (0-->1) is highly improbable over the length of
78	time involved in the evolution of the group.
79	<LI>Retention of polymorphism is also improbable, but far more
80	probable that forward change, so that we can more easily
81	envisage much polymorhism than even one additional forward
82	change.
83	<LI>Once state 1 is reached, reoccurrence of state 0 is very
84	improbable, much less probable than multiple retentions of
85	polymorphism.
86	<LI>The lengths of segments in the true tree are not so unequal
87	that we can more easily envisage retention events occurring in
88	both of two long segments than one retention in a short
89	segment.
90	</OL>
91	<P>
92	That these are the assumptions of parsimony methods has been documented
93	in a series of papers of mine: (1973a, 1978b, 1979, 1981b,
94	1983b, 1988b). For an opposing view arguing that the parsimony methods
95	make no substantive
96	assumptions such as these, see the papers by Farris (1983) and Sober (1983a,
97	1983b), but also read the exchange between Felsenstein and Sober (1986).
98	<P>
99	The input format is the standard one, with "?", "P", "B" states
100	allowed. The options are selected using a menu:
101	<P>
102	<TABLE><TR><TD BGCOLOR=white>
103	<PRE>
104
105	Dollo and polymorphism parsimony algorithm, version 3.6a3
106
107	Settings for this run:
108	U Search for best tree? Yes
109	P Parsimony method? Dollo
110	J Randomize input order of species? No. Use input order
111	T Use Threshold parsimony? No, use ordinary parsimony
112	A Use ancestral states in input file? No
113	W Sites weighted? No
114	M Analyze multiple data sets? No
115	0 Terminal type (IBM PC, ANSI, none)? (none)
116	1 Print out the data at start of run No
117	2 Print indications of progress of run Yes
118	3 Print out tree Yes
119	4 Print out steps in each character No
120	5 Print states at all nodes of tree No
121	6 Write out trees onto tree file? Yes
122
123	Are these settings correct? (type Y or the letter for one to change)
124	</PRE>
125	</TD></TR></TABLE>
126	<P>
127	The options U, J, T, A, and M are the usual User Tree, Jumble,
128	Ancestral States, and Multiple Data Sets options, described either
129	in the main documentation file or in the Discrete Characters Programs
130	documentation file. The A (Ancestral States)
131	option allows implementation of the unordered Dollo parsimony and unordered
132	polymorphism parsimony methods which I have
133	described elsewhere (1984b). When the A option is used the ancestor is
134	not to be counted as one of the species. The O (outgroup) option is not
135	available since the tree produced is already rooted. Since the Dollo and
136	polymorphism methods produce a rooted
137	tree, the user-defined trees required by the U option have two-way forks
138	at each level.
139	<P>
140	The P (Parsimony Method) option is the one that toggles between polymorphism
141	parsimony and Dollo parsimony. The program defaults to Dollo parsimony.
142	<P>
143	The T (Threshold) option has already been described in
144	the Discrete Characters programs documentation file. Setting T at or below
145	1.0 but above 0 causes the criterion to become compatibility rather than
146	polymorphism parsimony, although there is no advantage to using this
147	program instead of MIX to do a compatibility method. Setting the
148	threshold value higher brings about an intermediate between
149	the Dollo or polymorphism parsimony methods and the compatibility method,
150	so that there is some rationale for doing that. Since the Dollo and
151	polymorphism methods produces a rooted
152	tree, the user-defined trees required by the U option have two-way forks
153	at each level.
154	<P>
155	Using a threshold value of 1.0 or lower, but above 0, one can
156	obtain a rooted (or, if the A option is used with ancestral states of
157	"?", unrooted) compatibility criterion, but there is no particular
158	advantage to using this program for that instead of MIX. Higher
159	threshold values are of course meaningful and provide
160	intermediates between Dollo and compatibility methods.
161	<P>
162	The X
163	(Mixed parsimony methods) option is not available in this program. The
164	Factors option is also not available in this program, as it would have no
165	effect on the result even if that information were provided in the input file.
166	<P>
167	Output is standard: a list of equally parsimonious trees, and, if the
168	user selects menu option 4, a table
169	of the numbers of reversions or retentions of polymorphism necessary
170	in each character. If any of the
171	ancestral states has been specified to be unknown, a table of
172	reconstructed ancestral states is also provided. When reconstructing
173	the placement of forward changes and reversions under the Dollo method,
174	keep in mind that each
175	polymorphic state in the input data will require one "last minute"
176	reversion. This is included in the tabulated counts. Thus if we have
177	both states 0 and 1 at a tip of the tree the program will assume that
178	the lineage had state 1 up to the last minute, and then state 0 arose in
179	that population by reversion, without loss of state 1.
180	<P>
181	If the user selects menu option 5, a table is printed out after each
182	tree, showing for each branch whether
183	there are known to be changes in the branch, and what the states are inferred
184	to have been at the top end of the branch. If the inferred state is a "?"
185	there may be multiple equally-parsimonious assignments of states; the user
186	must work these out for themselves by hand.
187	<P>
188	If the A option is used, then the program will
189	infer, for any character whose ancestral state is unknown ("?") whether the
190	ancestral state 0 or 1 will give the best tree. If these are
191	tied, then it may not be possible for the program to infer the
192	state in the internal nodes, and these will all be printed as ".". If this
193	has happened and you want to know more about the states at the internal
194	nodes, you will find helpful to use DOLMOVE to display the tree and examine
195	its interior states, as the algorithm in DOLMOVE shows all that can be known
196	in this case about the interior states, including where there is and is not
197	amibiguity. The algorithm in DOLLOP gives up more easily on displaying these
198	states.
199	<P>
200	If the U (User Tree) option is used and more than one tree is supplied, the
201	program also performs a statistical test of each of these trees against the
202	best tree. This test, which is a version of the test proposed by
203	Alan Templeton (1983) and evaluated in a test case by me (1985a). It is
204	closely parallel to a test using log likelihood differences
205	invented by Kishino and Hasegawa (1989), and uses the mean and variance of
206	step differences between trees, taken across characters. If the mean
207	is more than 1.96 standard deviations different then the trees are declared
208	significantly different. The program
209	prints out a table of the steps for each tree, the differences of
210	each from the highest one, the variance of that quantity as determined by
211	the step differences at individual characters, and a conclusion as to
212	whether that tree is or is not significantly worse than the best one. It
213	is important to understand that the test assumes that all the binary
214	characters are evolving independently, which is unlikely to be true for
215	many suites of morphological characters.
216	<P>
217	If there are more than two trees, the test done is an extension of
218	the KHT test, due to Shimodaira and Hasegawa (1999). They pointed out
219	that a correction for the number of trees was necessary, and they
220	introduced a resampling method to make this correction. In the version
221	used here the variances and covariances of the sums of steps across
222	characters are computed for all pairs of trees. To test whether the
223	difference between each tree and the best one is larger than could have
224	been expected if they all had the same expected number of steps,
225	numbers of steps for all trees are sampled with these covariances and equal
226	means (Shimodaira and Hasegawa's "least favorable hypothesis"),
227	and a P value is computed from the fraction of times the difference between
228	the tree's value and the lowest number of steps exceeds that actually
229	observed. Note that this sampling needs random numbers, and so the
230	program will prompt the user for a random number seed if one has not
231	already been supplied. With the two-tree KHT test no random numbers
232	are used.
233	<P>
234	In either the KHT or the SH test the program
235	prints out a table of the number of steps for each tree, the differences of
236	each from the lowest one, the variance of that quantity as determined by
237	the differences of the numbers of steps at individual characters,
238	and a conclusion as to
239	whether that tree is or is not significantly worse than the best one.
240	<P>
241	At the beginning of the program is the constant
242	"maxtrees", the maximum number of trees which the program will store for
243	output.
244	<P>
245	The algorithm is a fairly simple adaptation of the one used in
246	the program SOKAL, which was formerly in this package and has been
247	superseded by MIX. It requires two passes through each tree to count the
248	numbers of reversions.
249	<P>
250	<HR>
251	<P>
252	<H3>TEST DATA SET</H3>
253	<P>
254	<TABLE><TR><TD BGCOLOR=white>
255	<PRE>
256	5 6
257	Alpha 110110
258	Beta 110000
259	Gamma 100110
260	Delta 001001
261	Epsilon 001110
262	</PRE>
263	</TD></TR></TABLE>
264	<P>
265	<HR>
266	<P>
267	<H3>TEST SET OUTPUT (with all numerical options on)</H3>
268	<P>
269	<TABLE><TR><TD BGCOLOR=white>
270	<PRE>
271
272	Dollo and polymorphism parsimony algorithm, version 3.6a3
273
274	5 species, 6 characters
275
276	Dollo parsimony method
277
278
279	Name Characters
280	---- ----------
281
282	Alpha 11011 0
283	Beta 11000 0
284	Gamma 10011 0
285	Delta 00100 1
286	Epsilon 00111 0
287
288
289
290	One most parsimonious tree found:
291
292
293
294
295	+-----------Delta
296	--3
297	! +--------Epsilon
298	+--4
299	! +-----Gamma
300	+--2
301	! +--Beta
302	+--1
303	+--Alpha
304
305
306	requires a total of 3.000
307
308	reversions in each character:
309	0 1 2 3 4 5 6 7 8 9
310	*-----------------------------------------
311	0! 0 0 1 1 1 0
312
313	From To Any Steps? State at upper node
314	( . means same as in the node below it on tree)
315
316	root 3 yes ..1.. .
317	3 Delta yes ..... 1
318	3 4 yes ...11 .
319	4 Epsilon no ..... .
320	4 2 yes 1.0.. .
321	2 Gamma no ..... .
322	2 1 yes .1... .
323	1 Beta yes ...00 .
324	1 Alpha no ..... .
325
326
327	</PRE>
328	</TD></TR></TABLE>
329	</BODY>
330	</HTML>

Note: See TracBrowser for help on using the repository browser.

Context Navigation

source: branches/stable/GDE/PHYLIP/doc/dollop.html

Download in other formats: