1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> |
---|
2 | <HTML> |
---|
3 | <HEAD> |
---|
4 | <TITLE>consense</TITLE> |
---|
5 | <META NAME="description" CONTENT="consense"> |
---|
6 | <META NAME="keywords" CONTENT="consense"> |
---|
7 | <META NAME="resource-type" CONTENT="document"> |
---|
8 | <META NAME="distribution" CONTENT="global"> |
---|
9 | <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> |
---|
10 | </HEAD> |
---|
11 | <BODY BGCOLOR="#ccffff"> |
---|
12 | <DIV ALIGN=RIGHT> |
---|
13 | version 3.6 |
---|
14 | </DIV> |
---|
15 | <P> |
---|
16 | <DIV ALIGN=CENTER> |
---|
17 | <H1>CONSENSE -- Consensus tree program</H1> |
---|
18 | </DIV> |
---|
19 | <P> |
---|
20 | © Copyright 1986-2000 by The University of |
---|
21 | Washington. Written by Joseph Felsenstein. Permission is granted to copy |
---|
22 | this document provided that no fee is charged for it and that this copyright |
---|
23 | notice is not removed. |
---|
24 | <P> |
---|
25 | CONSENSE reads a file of computer-readable trees and prints |
---|
26 | out (and may also write out onto a file) a consensus tree. At the moment |
---|
27 | it carries out a family of consensus tree methods called the |
---|
28 | <I>M<SUB>l</SUB></I> |
---|
29 | methods (Margush and McMorris, 1981). These include strict consensus and |
---|
30 | majority rule consensus. Basically the |
---|
31 | consensus tree consists of monophyletic groups |
---|
32 | that occur as often as possible in the data. If a group occurs in more than |
---|
33 | a fraction <EM>l</EM> of all the input trees it will definitely |
---|
34 | appear in the consensus tree. |
---|
35 | <P> |
---|
36 | The tree printed out has at each fork a number indicating how many times the |
---|
37 | group which consists of the species to the right of (descended from) the fork |
---|
38 | occurred. Thus if we read in 15 trees and find that a fork has the number |
---|
39 | 15, that group occurred in all of the trees. The strict consensus tree |
---|
40 | consists of all groups that occurred 100% of the time, the rest of the |
---|
41 | resolution being ignored. The tree printed out here includes groups down |
---|
42 | to 50%, and below it until the tree is fully resolved. |
---|
43 | <P> |
---|
44 | The majority rule consensus tree consists of all groups that occur more than |
---|
45 | 50% of the time. Any other percentage level between 50% and 100% can also |
---|
46 | be used, and that is why the program in effect |
---|
47 | carries out a family of methods. You |
---|
48 | have to decide on the percentage level, figure out for yourself what number |
---|
49 | of occurrences that would be (e.g. 15 in the above case for 100%), and |
---|
50 | resolutely ignore any group below that number. Do not use numbers at or below |
---|
51 | 50%, because some groups occurring (say) 35% of the time will not be shown |
---|
52 | on the tree. The collection of all groups that occur 35% or more of the |
---|
53 | time may include two groups that are mutually contradictory and cannot |
---|
54 | appear in the same tree. In this program, as the default method I have |
---|
55 | included groups that occur |
---|
56 | less than 50% of the time, working downwards in their frequency of occurrence, |
---|
57 | as long as they continue to resolve the tree and do not contradict more |
---|
58 | frequent groups. In this respect the method is similar to the Nelson consensus |
---|
59 | method (Nelson, 1979) as explicated by Page (1989) although it is not identical |
---|
60 | to it. |
---|
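<P>
The downward extension described above can be sketched in Python. This is only an illustration under stated assumptions, not the code CONSENSE itself uses: the names <TT>compatible</TT> and <TT>extended_majority_rule</TT> and the example groups are hypothetical, groups are treated as rooted clusters (two groups can coexist if they are disjoint or one contains the other), and ties in frequency are broken arbitrarily.

```python
def compatible(a, b):
    """Two groups can share a rooted tree if they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def extended_majority_rule(counts):
    """Greedily pick groups, most frequent first, skipping contradictions.

    counts maps each group (a frozenset of species names) to the summed
    weight of the input trees that contain it.
    """
    # Ties are broken by sorted member names. That is an arbitrary choice
    # made only so this sketch is deterministic (an assumption, not the
    # program's documented behaviour).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    chosen = []
    for group, count in ordered:
        if all(compatible(group, other) for other, _ in chosen):
            chosen.append((group, count))
    return chosen


# Hypothetical group counts, not taken from any real run.
example = {
    frozenset("CD"): 3.0,
    frozenset("BCD"): 2.0,
    frozenset("AB"): 1.0,   # contradicts BCD, so it is skipped
    frozenset("BC"): 1.0,   # contradicts CD, so it is skipped
}
for group, count in extended_majority_rule(example):
    print("".join(sorted(group)), count)
```

Groups that occur in more than half of the trees can never contradict one another, so the same loop covers both the majority-rule part and the extension below 50%.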
61 | <P> |
---|
62 | The program can also carry out Strict consensus, Majority Rule consensus |
---|
63 | without the extension which adds groups until the tree is fully |
---|
64 | resolved, and other members of the M<SUB>l</SUB> family, where the |
---|
65 | user supplies the fraction of times the group must appear in the input |
---|
66 | trees to be included in the consensus tree. |
---|
67 | For the moment the program cannot carry out any other |
---|
68 | consensus tree method, such as Adams consensus (Adams, 1972, 1986) or methods |
---|
69 | based on |
---|
70 | quadruples of species (Estabrook, McMorris, and Meacham, 1985). |
---|
71 | <P> |
---|
72 | <H2>INPUT, OUTPUT, AND OPTIONS</H2> |
---|
73 | <P> |
---|
74 | Input is a tree file (called <TT>intree</TT>) |
---|
75 | which contains a series of trees in the Newick |
---|
76 | standard form -- the form used when many of the programs in this package |
---|
77 | write out tree files. Each tree starts on a new line. Each tree can have |
---|
78 | a weight, which is a real number and is located in comment brackets "[" |
---|
79 | and "]" just before the final ";" which |
---|
80 | ends the description of the tree. When the input trees have weights |
---|
81 | (like [0.01000]) then the total number of trees will be the total of those |
---|
82 | weights, which is often a number like 1.00. When a tree doesn't have |
---|
83 | a weight it will be assigned a weight of 1. The use of weights means that when we have |
---|
84 | tied trees (as from a parsimony program) three alternative tied trees can |
---|
85 | be counted as if each was <SUP>1</SUP>/<SUB>3</SUB> of a tree. |
---|
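<P>
As a rough illustration of the weight convention just described, the following Python sketch reads the optional weight from each tree line and totals the weights. The function names and the sample trees are hypothetical, and the regular expression assumes plain decimal weights such as <TT>[0.01000]</TT> placed immediately before the final semicolon.

```python
import re

# A bracketed decimal number just before the closing ";" of a tree.
WEIGHT_RE = re.compile(r"\[\s*([0-9]*\.?[0-9]+)\s*\]\s*;\s*$")


def tree_weight(newick_line):
    """Return the weight in the trailing [..] comment, or 1.0 if absent."""
    match = WEIGHT_RE.search(newick_line.strip())
    return float(match.group(1)) if match else 1.0


def total_weight(lines):
    """Sum the weights of all non-blank tree lines."""
    return sum(tree_weight(line) for line in lines if line.strip())


# Hypothetical weighted trees, e.g. tied trees from a parsimony run.
trees = [
    "(A,(B,C))[0.25];",
    "((A,B),C)[0.25];",
    "((A,C),B)[0.5];",
]
print(total_weight(trees))            # weights sum to the "number of trees"
print(tree_weight("(A,(B,C));"))      # no weight given, so 1.0 is assumed
```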
86 | <P> |
---|
87 | Note that this program can correctly |
---|
88 | read trees whether or not they are bifurcating: in fact they can be |
---|
89 | multifurcating at any level in the tree. |
---|
90 | <P> |
---|
91 | The options are selected from a menu, which looks like this: |
---|
92 | <P> |
---|
93 | <TABLE><TR><TD BGCOLOR=white> |
---|
94 | <PRE> |
---|
95 | |
---|
96 | Majority-rule and strict consensus tree program, version 3.6 |
---|
97 | |
---|
98 | Settings for this run: |
---|
99 | C Consensus type (strict, MR, MRe, Ml) Majority Rule (extended) |
---|
100 | O Outgroup root: No, use as outgroup species 1 |
---|
101 | R Trees to be treated as Rooted: No |
---|
102 | T Terminal type (IBM PC, ANSI, none): none |
---|
103 | 1 Print out the sets of species: Yes |
---|
104 | 2 Print indications of progress of run: Yes |
---|
105 | 3 Print out tree: Yes |
---|
106 | 4 Write out trees onto tree file: Yes |
---|
107 | |
---|
108 | Are these settings correct? (type Y or the letter for one to change) |
---|
109 | </PRE> |
---|
110 | </TD></TR></TABLE> |
---|
111 | <P> |
---|
112 | Option C (Consensus method) selects which of four methods the |
---|
113 | program uses. The program defaults to using the extended Majority |
---|
114 | Rule method. Each time the C option is chosen the program moves on |
---|
115 | to another method, the others being in order Strict, Majority Rule, |
---|
116 | and M<SUB>l</SUB>. Here are descriptions of the methods. In each |
---|
117 | case the fraction of times a set appears among the input trees |
---|
118 | is counted by weighting by the weights of the trees (the numbers |
---|
119 | like <TT>[0.6000]</TT> that appear at the ends of trees in some |
---|
120 | cases). |
---|
121 | <P> |
---|
122 | <DL> |
---|
123 | <DT>Strict</DT> <DD>A set of species must appear in all input trees |
---|
124 | to be included in the strict consensus tree.</DD> |
---|
125 | <P> |
---|
126 | <DT>Majority Rule (extended)</DT> <DD>Any set of species that appears |
---|
127 | in more than 50% of the trees is included. The program then |
---|
128 | considers the other sets of species in order of the frequency with |
---|
129 | which they have appeared, adding to the consensus tree any which are |
---|
130 | compatible with it until the tree is fully resolved. This is the |
---|
131 | default setting.</DD> |
---|
132 | <P> |
---|
133 | <DT>M<SUB>l</SUB></DT> <DD>The user is asked for a fraction between |
---|
134 | 0.5 and 1, and the program then includes in the consensus tree any |
---|
135 | set of species that occurs among the input trees more than that |
---|
136 | fraction of the time. The Strict consensus and the Majority Rule |
---|
137 | consensus are extreme cases of the M<SUB>l</SUB> consensus, being |
---|
138 | for fractions of 1 and 0.5 respectively.</DD> |
---|
139 | <P> |
---|
140 | <DT>Majority Rule</DT> <DD>A set of species is included in the |
---|
141 | consensus tree if it is present in more than half of the |
---|
142 | input trees.</DD> |
---|
143 | </DL> |
---|
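<P>
The threshold rules in the list above can be sketched as a single Python function. This is a minimal sketch, not the program's own code: the name <TT>ml_consensus</TT> is hypothetical, and treating a fraction of 1 as "present in all trees" is an assumption made because no group can occur more than 100% of the time. The counts are a subset of those shown in the test set output at the end of this document, keyed by their dot-and-star patterns.

```python
import math


def ml_consensus(counts, total, fraction=0.5):
    """Return the groups kept at a given fraction (0.5 <= fraction <= 1).

    counts maps a group to the summed weight of trees containing it;
    total is the summed weight of all input trees.
    """
    if fraction >= 1.0:
        # Strict consensus: the group must appear in every tree.
        return {g for g, c in counts.items()
                if c >= total or math.isclose(c, total)}
    # Otherwise the group must occur MORE than the given fraction of the time.
    return {g for g, c in counts.items() if c > fraction * total}


counts = {
    ".......**.": 9.0,
    "..********": 9.0,
    "..***....*": 6.0,
    "..****.***": 6.0,
    "..***.....": 6.0,
    "..*.*.....": 4.0,
    ".....**...": 3.0,
    "..***..***": 2.0,
}
print(sorted(ml_consensus(counts, 9.0, 1.0)))   # strict
print(sorted(ml_consensus(counts, 9.0, 0.5)))   # majority rule
```

With these counts the strict rule keeps the two groups found in all 9 trees, and the majority rule adds the three groups found in 6 of them.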
144 | <P> |
---|
145 | Option R (Rooted) toggles between the default assumption that the input trees |
---|
146 | are unrooted and the alternative, in which each tree |
---|
147 | is treated as a rooted tree and is not |
---|
148 | re-rooted. Under the default, the trees are treated as outgroup-rooted and will |
---|
149 | be re-rooted automatically at the first species encountered on the first |
---|
150 | tree (or at a species designated by the Outgroup option). |
---|
151 | <P> |
---|
152 | Option O is the usual Outgroup rooting option. It is in effect only if |
---|
153 | the Rooted option is not in effect. The trees will be re-rooted |
---|
154 | with a species of your choosing. You will be asked for the number of the |
---|
155 | species that is to be the outgroup. If we want to outgroup-root the tree on |
---|
156 | the line leading to a |
---|
157 | species which appears as the third species (counting left-to-right) in the |
---|
158 | first computer-readable tree in the input file, we would select |
---|
159 | menu option O and specify species 3. |
---|
160 | <P> |
---|
161 | Output is a list of the species (in the order in which they appear in the |
---|
162 | first tree, which is the numerical order used in the program), a list |
---|
163 | of the subsets that appear in the consensus tree, a list of those that |
---|
164 | appeared in one or another of the individual |
---|
165 | trees but did not occur frequently enough to get into the consensus tree, |
---|
166 | followed by a diagram showing the consensus tree. Each subset in the lists |
---|
167 | is shown as a row of symbols, each either "." or "*". The species |
---|
168 | that are in the set are marked by "*". Every ten species there is |
---|
169 | a blank, to help you keep track of the alignment of columns. The |
---|
170 | order of symbols corresponds to the order of species in the species |
---|
171 | list. Thus a set that consisted of the second, seventh, and eighth out |
---|
172 | of 13 species would be represented by: |
---|
173 | <P> |
---|
174 | <PRE> |
---|
175 | .*....**.. ... |
---|
176 | </PRE> |
---|
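<P>
The row-of-symbols notation is easy to reproduce. The Python sketch below is an illustration only: the function name <TT>format_set</TT> and the species names <TT>sp1</TT> to <TT>sp13</TT> are hypothetical, and whether the program writes a trailing blank after an exact multiple of ten species is not addressed here.

```python
def format_set(members, species):
    """Render a set of species as '.'/'*' symbols, with a blank every ten."""
    symbols = ["*" if name in members else "." for name in species]
    blocks = ["".join(symbols[i:i + 10]) for i in range(0, len(symbols), 10)]
    return " ".join(blocks)


# Thirteen hypothetical species, listed in the order of the first tree.
species = ["sp%d" % i for i in range(1, 14)]
print(format_set({"sp2", "sp7", "sp8"}, species))
```

For the second, seventh, and eighth of 13 species this yields the same row as the example above.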
177 | <P> |
---|
178 | Note that if the trees are unrooted the final tree will have one group, |
---|
179 | consisting of every species except the Outgroup (which by default is the |
---|
180 | first species encountered on the first tree), which always appears. It |
---|
181 | will not be listed in either of the lists of sets, but it will be shown in |
---|
182 | the final tree as occurring all of the time. This is hardly surprising: |
---|
183 | in telling the program that this species is the outgroup we have specified |
---|
184 | that the set consisting of all of the others is always a monophyletic set. So |
---|
185 | this is not to be taken as interesting information, despite its dramatic |
---|
186 | appearance. |
---|
187 | <P> |
---|
188 | Option 1 in the menu gives you the option of turning off the writing of |
---|
189 | these sets into the output file. This may be useful if you are primarily |
---|
190 | interested in getting the tree file. |
---|
191 | <P> |
---|
192 | Option 4 is the usual tree file option. If this is on (it is by default) |
---|
193 | then the final tree will be written onto an output tree file (whose default |
---|
194 | name is "outtree"). Note that the lengths on the tree on the output tree file |
---|
195 | are not branch lengths but the number of times that |
---|
196 | each group appeared in the input trees. This |
---|
197 | number is the sum of the weights of the trees in which it appeared, so that |
---|
198 | if there are 11 trees, ten of them having weight 0.1 and one weight 1.0, |
---|
199 | a group that appeared in the last tree and in 6 others would be shown as |
---|
200 | appearing 1.6 times and its branch length will be 1.6. |
---|
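<P>
The arithmetic in this example can be checked with a short Python sketch. The function name <TT>group_count</TT> and the choice of which six of the lighter trees contain the group are hypothetical. Exact fractions are used only so that the sum is not blurred by floating-point rounding.

```python
from fractions import Fraction


def group_count(tree_weights, trees_with_group):
    """Sum the weights of the trees (given by index) that contain a group."""
    return sum(tree_weights[i] for i in trees_with_group)


# Eleven trees: ten of weight 0.1 followed by one of weight 1.0.
weights = [Fraction(1, 10)] * 10 + [Fraction(1)]

# The group appears in the last tree (index 10) and in six of the others.
count = group_count(weights, [0, 1, 2, 3, 4, 5, 10])
print(float(count))   # the value written as the "branch length"
```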
201 | <P> |
---|
202 | <H2>CONSTANTS</H2> |
---|
203 | <P> |
---|
204 | The program uses the consensus tree algorithm originally designed for |
---|
205 | the bootstrap programs. It is quite fast, and execution time is unlikely |
---|
206 | to be limiting for you (assembling the input file will be much more of a |
---|
207 | limiting step). In the future, if possible, more consensus tree methods |
---|
208 | will be incorporated (although the current methods are the ones needed |
---|
209 | for the component analysis of bootstrap estimates of phylogenies, and in |
---|
210 | other respects I also think that the |
---|
211 | present ones are among the best). |
---|
212 | <P> |
---|
214 | <P> |
---|
215 | <HR> |
---|
216 | <P> |
---|
217 | <H3>TEST DATA SET</H3> |
---|
218 | <P> |
---|
219 | <TABLE><TR><TD BGCOLOR=white> |
---|
220 | <PRE> |
---|
221 | (A,(B,(H,(D,(J,(((G,E),(F,I)),C)))))); |
---|
222 | (A,(B,(D,((J,H),(((G,E),(F,I)),C))))); |
---|
223 | (A,(B,(D,(H,(J,(((G,E),(F,I)),C)))))); |
---|
224 | (A,(B,(E,(G,((F,I),((J,(H,D)),C)))))); |
---|
225 | (A,(B,(E,(G,((F,I),(((J,H),D),C)))))); |
---|
226 | (A,(B,(E,((F,I),(G,((J,(H,D)),C)))))); |
---|
227 | (A,(B,(E,((F,I),(G,(((J,H),D),C)))))); |
---|
228 | (A,(B,(E,((G,(F,I)),((J,(H,D)),C))))); |
---|
229 | (A,(B,(E,((G,(F,I)),(((J,H),D),C))))); |
---|
230 | </PRE> |
---|
231 | </TD></TR></TABLE> |
---|
232 | <P> |
---|
233 | <HR> |
---|
234 | <P> |
---|
235 | <H3>TEST SET OUTPUT</H3> |
---|
236 | <P> |
---|
237 | <TABLE><TR><TD BGCOLOR=white> |
---|
238 | <PRE> |
---|
239 | |
---|
240 | Majority-rule and strict consensus tree program, version 3.6 |
---|
241 | |
---|
242 | Species in order: |
---|
243 | |
---|
244 | A |
---|
245 | B |
---|
246 | H |
---|
247 | D |
---|
248 | J |
---|
249 | G |
---|
250 | E |
---|
251 | F |
---|
252 | I |
---|
253 | C |
---|
254 | |
---|
255 | |
---|
256 | Sets included in the consensus tree |
---|
257 | |
---|
258 | Set (species in order) How many times out of 9.00 |
---|
259 | |
---|
260 | .......**. 9.00 |
---|
261 | ..******** 9.00 |
---|
262 | ..***....* 6.00 |
---|
263 | ..****.*** 6.00 |
---|
264 | ..***..... 6.00 |
---|
265 | ..*.*..... 4.00 |
---|
266 | ..***..*** 2.00 |
---|
267 | |
---|
268 | |
---|
269 | Sets NOT included in consensus tree: |
---|
270 | |
---|
271 | Set (species in order) How many times out of 9.00 |
---|
272 | |
---|
273 | .....**... 3.00 |
---|
274 | .....****. 3.00 |
---|
275 | ..**...... 3.00 |
---|
276 | .....***** 3.00 |
---|
277 | ..*.****** 2.00 |
---|
278 | .....*.**. 2.00 |
---|
279 | ..****...* 2.00 |
---|
280 | ....****** 2.00 |
---|
281 | ...******* 1.00 |
---|
282 | |
---|
283 | |
---|
284 | Majority rule consensus (extended to resolve tree) |
---|
285 | |
---|
286 | CONSENSUS TREE: |
---|
287 | the numbers at the forks indicate the number |
---|
288 | of times the group consisting of the species |
---|
289 | which are to the right of that fork occurred |
---|
290 | among the trees, out of 9.00 trees |
---|
291 | |
---|
292 | +-------------------------------------------------------A |
---|
293 | | |
---|
294 | | +-----------------------------------------E |
---|
295 | | | |
---|
296 | | | +------I |
---|
297 | | | +----------------9.0-| |
---|
298 | | | | +------F |
---|
299 | | +--9.0-| | |
---|
300 | | | | +--2.0-| +-------------D |
---|
301 | | | | | | +--6.0-| |
---|
302 | | | | | | | | +------J |
---|
303 | | | | | +--6.0-| +--4.0-| |
---|
304 | +------| +--6.0-| | +------H |
---|
305 | | | | |
---|
306 | | | +--------------------C |
---|
307 | | | |
---|
308 | | +----------------------------------G |
---|
309 | | |
---|
310 | +------------------------------------------------B |
---|
311 | |
---|
312 | |
---|
313 | remember: this is an unrooted tree! |
---|
314 | |
---|
315 | </PRE> |
---|
316 | </TD></TR></TABLE> |
---|
317 | </BODY> |
---|
318 | </HTML> |
---|