source: branches/stable/GDE/PHYLIP/doc/consense.html

Last change on this file was 2176, checked in by westram, 21 years ago

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>consense</TITLE>
<META NAME="description" CONTENT="consense">
<META NAME="keywords" CONTENT="consense">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
</HEAD>
<BODY BGCOLOR="#ccffff">
<DIV ALIGN=RIGHT>
version 3.6
</DIV>
<P>
<DIV ALIGN=CENTER>
<H1>CONSENSE -- Consensus tree program</H1>
</DIV>
<P>
&#169; Copyright 1986-2000 by The University of
Washington.  Written by Joseph Felsenstein.  Permission is granted to copy
this document provided that no fee is charged for it and that this copyright
notice is not removed.
<P>
CONSENSE reads a file of computer-readable trees and prints
out (and may also write out onto a file) a consensus tree.  At the moment
it carries out a family of consensus tree methods called the
<I>M<SUB>l</SUB></I>
methods (Margush and McMorris, 1981).  These include strict consensus and
majority rule consensus.  Basically the
consensus tree consists of monophyletic groups
that occur as often as possible in the data.  If a group occurs in more than
a fraction <EM>l</EM> of all the input trees it will definitely
appear in the consensus tree.
<P>
The tree printed out has at each fork a number indicating how many times the
group which consists of the species to the right of (descended from) the fork
occurred.  Thus if we read in 15 trees and find that a fork has the number
15, that group occurred in all of the trees.  The strict consensus tree
consists of all groups that occurred 100% of the time, the rest of the
resolution being ignored.  The tree printed out here includes groups down
to 50%, and below it until the tree is fully resolved.
<P>
The majority rule consensus tree consists of all groups that occur more than
50% of the time.  Any other percentage level between 50% and 100% can also
be used, and that is why the program in effect
carries out a family of methods.  You
have to decide on the percentage level, figure out for yourself what number
of occurrences that would be (e.g. 15 in the above case for 100%), and
resolutely ignore any group below that number.  Do not use numbers at or below
50%, because some groups occurring (say) 35% of the time will not be shown
on the tree.  The collection of all groups that occur 35% or more of the
time may include two groups that are mutually contradictory and cannot
appear in the same tree.  In this program, as the default method I have
included groups that occur
less than 50% of the time, working downwards in their frequency of occurrence,
as long as they continue to resolve the tree and do not contradict more
frequent groups.  In this respect the method is similar to the Nelson consensus
method (Nelson, 1979) as explicated by Page (1989), although it is not identical
to it.
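<P>
The downward extension described above can be sketched in a few lines of
Python.  This is only an illustration, not the program's own code: the
function names are hypothetical, groups are written in the "." and "*"
notation used in the output file, and two groups are assumed to be able to
share a tree only when they are disjoint or one contains the other (which
holds here because no group contains the outgroup).  The counts are the
groups that occur at least 3 times in the test set output at the end of
this document.  Groups tied at 2.00 are left out because the result would
then depend on how ties are broken, which this page does not specify.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
def members(row):
    """Convert a row such as '..***....*' to the set of starred positions."""
    cells = row.replace(" ", "")
    return frozenset(i for i, ch in enumerate(cells) if ch == "*")


def compatible(a, b):
    """Two groups can share a tree only if they are disjoint or nested."""
    return a.isdisjoint(b) or a.issubset(b) or b.issubset(a)


def extend_by_frequency(counts):
    """Accept groups from most to least frequent, skipping any conflict."""
    accepted = []
    # sorted() is stable, so tied groups keep their input order (an assumption).
    for row, count in sorted(counts.items(), key=lambda item: -item[1]):
        group = members(row)
        if all(compatible(group, members(kept)) for kept, _ in accepted):
            accepted.append((row, count))
    return accepted


# Groups seen at least 3 times in the test set output (9 input trees).
counts = {
    ".......**.": 9.0,
    "..********": 9.0,
    "..***....*": 6.0,
    "..****.***": 6.0,
    "..***.....": 6.0,
    "..*.*.....": 4.0,
    ".....**...": 3.0,
    ".....****.": 3.0,
    "..**......": 3.0,
    ".....*****": 3.0,
}

for row, count in extend_by_frequency(counts):
    print(row, count)
```
</PRE>
</TD></TR></TABLE>
<P>
In this sketch the group occurring 4 times is accepted even though it is
below 50%, because it contradicts nothing already accepted, while each of
the four groups occurring 3 times conflicts with a more frequent group and
is dropped.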
<P>
The program can also carry out Strict consensus, Majority Rule consensus
without the extension which adds groups until the tree is fully
resolved, and other members of the M<SUB>l</SUB> family, where the
user supplies the fraction of times the group must appear in the input
trees to be included in the consensus tree.
For the moment the program cannot carry out any other
consensus tree method, such as Adams consensus (Adams, 1972, 1986) or methods
based on
quadruples of species (Estabrook, McMorris, and Meacham, 1985).
<P>
<H2>INPUT, OUTPUT, AND OPTIONS</H2>
<P>
Input is a tree file (called <TT>intree</TT>)
which contains a series of trees in the Newick
standard form -- the form used when many of the programs in this package
write out tree files.  Each tree starts on a new line.  Each tree can have
a weight, which is a real number and is located in comment brackets "["
and "]" just before the final ";" which
ends the description of the tree.  When the input trees have weights
(like [0.01000]) then the total number of trees will be the total of those
weights, which is often a number like 1.00.  When a tree doesn't have
a weight it will be assigned a weight of 1.  The weights mean that when we have
tied trees (as from a parsimony program) three alternative tied trees,
each carrying a weight of <SUP>1</SUP>/<SUB>3</SUB>, will
be counted as if each was <SUP>1</SUP>/<SUB>3</SUB> of a tree.
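<P>
As an illustration of the weight convention, the following Python sketch
pulls the weight out of the comment brackets before the final ";" and
falls back to 1 when there is none.  The function name, the regular
expression and the three small example trees are assumptions made for this
example; they are not taken from the program.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
import re

# A weight is a number in comment brackets just before the final ";".
WEIGHT = re.compile(r"\[\s*([0-9.eE+-]+)\s*\]\s*;\s*$")


def tree_weight(newick_line):
    """Return the weight written before the final ';', or 1.0 if none."""
    found = WEIGHT.search(newick_line.strip())
    return float(found.group(1)) if found else 1.0


# Three tied trees, each written with a weight of about one third.
tied = [
    "((A,B),(C,D))[0.3333];",
    "((A,C),(B,D))[0.3333];",
    "((A,D),(B,C))[0.3333];",
]
plain = "(A,(B,(C,D)));"

# The tied trees together count as roughly one tree.
print(sum(tree_weight(tree) for tree in tied))
# A tree without a weight counts as one whole tree.
print(tree_weight(plain))
```
</PRE>
</TD></TR></TABLE>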
<P>
Note that this program can correctly
read trees whether or not they are bifurcating: in fact they can be
multifurcating at any level in the tree.
<P>
The options are selected from a menu, which looks like this:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>

Majority-rule and strict consensus tree program, version 3.6

Settings for this run:
 C   Consensus type (strict, MR, MRe, Ml)  Majority Rule (extended)
 O                         Outgroup root:  No, use as outgroup species  1
 R         Trees to be treated as Rooted:  No
 T    Terminal type (IBM PC, ANSI, none):  none
 1         Print out the sets of species:  Yes
 2  Print indications of progress of run:  Yes
 3                        Print out tree:  Yes
 4        Write out trees onto tree file:  Yes

Are these settings correct? (type Y or the letter for one to change)
</PRE>
</TD></TR></TABLE>
<P>
Option C (Consensus method) selects which of four methods the
program uses.  The program defaults to using the extended Majority
Rule method.  Each time the C option is chosen the program moves on
to another method, the others being in order Strict, Majority Rule,
and M<SUB>l</SUB>.  Here are descriptions of the methods.  In each
case the fraction of times a set appears among the input trees
is counted by weighting by the weights of the trees (the numbers
like <TT>[0.6000]</TT> that appear at the ends of trees in some
cases).
<P>
<DL>
<DT>Strict</DT> <DD>A set of species must appear in all input trees
to be included in the strict consensus tree.</DD>
<P>
<DT>Majority Rule (extended)</DT> <DD>Any set of species that appears
in more than 50% of the trees is included.  The program then
considers the other sets of species in order of the frequency with
which they have appeared, adding to the consensus tree any which are
compatible with it until the tree is fully resolved. This is the
default setting.</DD>
<P>
<DT>M<SUB>l</SUB></DT> <DD>The user is asked for a fraction between
0.5 and 1, and the program then includes in the consensus tree any
set of species that occurs among the input trees more than that
fraction of the time.  The Strict consensus and the Majority Rule
consensus are extreme cases of the M<SUB>l</SUB> consensus, being
for fractions of 1 and 0.5 respectively.</DD>
<P>
<DT>Majority Rule</DT> <DD>A set of species is included in the
consensus tree if it is present in more than half of the
input trees.</DD>
</DL>
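<P>
The threshold rule shared by the Majority Rule and M<SUB>l</SUB> methods
can be written as a simple filter on the weighted counts.  The Python
sketch below is illustrative only and its function names are hypothetical.
It reads "more than that fraction" as a strict inequality, so the Strict
method is handled separately as "present in every tree" (with a small
tolerance for rounding in summed weights); that reading is an assumption.
The counts are the sets included in the consensus tree in the test set
output at the end of this document.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
def ml_consensus(counts, total, fraction):
    """Keep sets whose weighted count exceeds fraction * total."""
    return [row for row, count in counts.items() if count > fraction * total]


def strict_consensus(counts, total, tolerance=1e-9):
    """Keep sets present in every input tree (count equal to the total)."""
    return [row for row, count in counts.items() if count >= total - tolerance]


# Weighted counts out of 9.00 trees, from the test set output.
counts = {
    ".......**.": 9.0,
    "..********": 9.0,
    "..***....*": 6.0,
    "..****.***": 6.0,
    "..***.....": 6.0,
    "..*.*.....": 4.0,
    "..***..***": 2.0,
}
total = 9.0

print("Strict:       ", strict_consensus(counts, total))
print("Majority Rule:", ml_consensus(counts, total, 0.5))
print("Ml at 0.7:    ", ml_consensus(counts, total, 0.7))
```
</PRE>
</TD></TR></TABLE>
<P>
With these counts the Majority Rule filter keeps the five sets that occur
9 or 6 times, while the Strict filter and the filter at a fraction of 0.7
both keep only the two sets that occur in all 9 trees.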
<P>
Option R (Rooted) toggles between the default assumption that the input trees
are unrooted trees and the selection that
specifies that each tree is to be treated as a rooted tree and not
re-rooted.  Under the default (unrooted) assumption each tree will be
treated as outgroup-rooted and will
be re-rooted automatically at the first species encountered on the first
tree (or at a species designated by the Outgroup option).
<P>
Option O is the usual Outgroup rooting option.  It is in effect only if
the Rooted option selection is not in effect.  The trees will be re-rooted
with a species of your choosing.  You will be asked for the number of the
species that is to be the outgroup.  If we want to outgroup-root the tree on
the line leading to a
species which appears as the third species (counting left-to-right) in the
first computer-readable tree in the input file, we would select
menu option O and specify species 3.
<P>
Output is a list of the species (in the order in which they appear in the
first tree, which is the numerical order used in the program), a list
of the subsets that appear in the consensus tree, a list of those that
appeared in one or another of the individual
trees but did not occur frequently enough to get into the consensus tree,
followed by a diagram showing the consensus tree.  Each subset in these lists
is shown as a row of symbols, each either "." or "*".  The species
that are in the set are marked by "*".  Every ten species there is
a blank, to help you keep track of the alignment of columns.  The
order of symbols corresponds to the order of species in the species
list.  Thus a set that consisted of the second, seventh, and eighth out
of 13 species would be represented by:
<P>
<PRE>
          .*....**.. ...
</PRE>
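<P>
The notation is easy to reproduce.  The short Python sketch below (the
function name is hypothetical, and species are numbered from 1 in the
order of the species list) builds such a row, with a blank after every ten
species, and rebuilds the example above.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
def format_set(positions, n_species):
    """Mark members with '*' and others with '.', in blocks of ten."""
    symbols = ["*" if i in positions else "." for i in range(1, n_species + 1)]
    blocks = ["".join(symbols[i:i + 10]) for i in range(0, n_species, 10)]
    return " ".join(blocks)


# Second, seventh and eighth out of 13 species.
print(format_set({2, 7, 8}, 13))
```
</PRE>
</TD></TR></TABLE>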
<P>
Note that if the trees are unrooted the final tree will have one group,
consisting of every species except the Outgroup (which by default is the
first species encountered on the first tree), which always appears.  It
will not be listed in either of the lists of sets, but it will be shown in
the final tree as occurring all of the time.  This is hardly surprising:
in telling the program that this species is the outgroup we have specified
that the set consisting of all of the others is always a monophyletic set.  So
this is not to be taken as interesting information, despite its dramatic
appearance.
<P>
Option 1 in the menu gives you the option of turning off the writing of
these sets into the output file.  This may be useful if you are primarily
interested in getting the tree file.
<P>
Option 4 is the usual tree file option.  If this is on (it is by default)
then the final tree will be written onto an output tree file (whose default
name is "outtree"). Note that the lengths on the tree on the output tree file
are not branch lengths but the number of times that
each group appeared in the input trees.  This
number is the sum of the weights of the trees in which it appeared, so that
if there are 11 trees, ten of them having weight 0.1 and one weight 1.0,
a group that appeared in the last tree and in 6 others would be shown as
appearing 1.6 times and its branch length will be 1.6.
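<P>
The arithmetic in this example can be checked with a few lines of Python.
The variable names are made up for the illustration; the only rule used is
the one stated above, that a group's count is the sum of the weights of
the trees containing it.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
import math

# Eleven input trees: ten of weight 0.1 followed by one of weight 1.0.
weights = [0.1] * 10 + [1.0]
# The group is in six of the light trees and in the last tree.
has_group = [True] * 6 + [False] * 4 + [True]

total = math.fsum(weights)
support = math.fsum(w for w, present in zip(weights, has_group) if present)

print(f"{support:.1f} out of {total:.1f}")   # prints "1.6 out of 2.0"
```
</PRE>
</TD></TR></TABLE>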
<P>
<H2>CONSTANTS</H2>
<P>
The program uses the consensus tree algorithm originally designed for
the bootstrap programs.  It is quite fast, and execution time is unlikely
to be limiting for you (assembling the input file will be much more of a
limiting step).  In the future, if possible, more consensus tree methods
will be incorporated (although the current methods are the ones needed
for the component analysis of bootstrap estimates of phylogenies, and in
other respects I also think that the
present ones are among the best).
<P>
<P>
<HR>
<P>
<H3>TEST DATA SET</H3>
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
(A,(B,(H,(D,(J,(((G,E),(F,I)),C))))));
(A,(B,(D,((J,H),(((G,E),(F,I)),C)))));
(A,(B,(D,(H,(J,(((G,E),(F,I)),C))))));
(A,(B,(E,(G,((F,I),((J,(H,D)),C))))));
(A,(B,(E,(G,((F,I),(((J,H),D),C))))));
(A,(B,(E,((F,I),(G,((J,(H,D)),C))))));
(A,(B,(E,((F,I),(G,(((J,H),D),C))))));
(A,(B,(E,((G,(F,I)),((J,(H,D)),C)))));
(A,(B,(E,((G,(F,I)),(((J,H),D),C)))));
</PRE>
</TD></TR></TABLE>
<P>
<HR>
<P>
<H3>TEST SET OUTPUT</H3>
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>

Majority-rule and strict consensus tree program, version 3.6

Species in order:

  A
  B
  H
  D
  J
  G
  E
  F
  I
  C


Sets included in the consensus tree

Set (species in order)     How many times out of    9.00

.......**.                   9.00
..********                   9.00
..***....*                   6.00
..****.***                   6.00
..***.....                   6.00
..*.*.....                   4.00
..***..***                   2.00


Sets NOT included in consensus tree:

Set (species in order)     How many times out of    9.00

.....**...                   3.00
.....****.                   3.00
..**......                   3.00
.....*****                   3.00
..*.******                   2.00
.....*.**.                   2.00
..****...*                   2.00
....******                   2.00
...*******                   1.00


Majority rule consensus (extended to resolve tree)

CONSENSUS TREE:
the numbers at the forks indicate the number
of times the group consisting of the species
which are to the right of that fork occurred
among the trees, out of   9.00 trees

  +-------------------------------------------------------A
  |
  |             +-----------------------------------------E
  |             |
  |             |                                  +------I
  |             |             +----------------9.0-|
  |             |             |                    +------F
  |      +--9.0-|             |
  |      |      |      +--2.0-|             +-------------D
  |      |      |      |      |      +--6.0-|
  |      |      |      |      |      |      |      +------J
  |      |      |      |      +--6.0-|      +--4.0-|
  +------|      +--6.0-|             |             +------H
         |             |             |
         |             |             +--------------------C
         |             |
         |             +----------------------------------G
         |
         +------------------------------------------------B


  remember: this is an unrooted tree!

</PRE>
</TD></TR></TABLE>
</BODY>
</HTML>