source: trunk/GDE/PHYLIP/doc/neighbor.html

Last change on this file was 2176, checked in by westram, 21 years ago

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>neighbor</TITLE>
<META NAME="description" CONTENT="neighbor">
<META NAME="keywords" CONTENT="neighbor">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
</HEAD>
<BODY BGCOLOR="#ccffff">
<DIV ALIGN=RIGHT>
version 3.6
</DIV>
<P>
<DIV ALIGN=CENTER>
<H1>NEIGHBOR -- Neighbor-Joining and UPGMA methods</H1>
</DIV>
<P>
&#169; Copyright 1991-2000 by the University of
Washington.  Written by Joseph Felsenstein.  Permission is granted to copy
this document provided that no fee is charged for it and that this copyright
notice is not removed.
<P>
This program implements the Neighbor-Joining method of Saitou and Nei (1987)
and the UPGMA method of clustering.  The program was written by Mary Kuhner
and Jon Yamato, using some code from program FITCH.  An important part of the
code was translated
from FORTRAN code from the neighbor-joining program written by Naruya Saitou
and by Li Jin, and is used with the kind permission of Drs. Saitou and Jin.
<P>
NEIGHBOR constructs a tree by successive clustering of lineages, setting
branch lengths as the lineages join.  The tree is not rearranged
thereafter.  The tree does not assume an evolutionary clock, so that it
is in effect an unrooted tree.  It should be somewhat similar to the tree
obtained by FITCH.  The program cannot evaluate a User tree, nor can it prevent
branch lengths from becoming negative.  However, the algorithm is far faster
than FITCH or KITSCH.  This makes it particularly effective in their place
for large studies or for bootstrap or jackknife resampling studies which
require runs on multiple data sets.
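<P>
The clustering just described can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not a translation of
the NEIGHBOR source: the function name "neighbor_joining", the integer
labels given to interior nodes, and the particular form of the pair
criterion (choose the pair minimizing (n - 2) d(a,b) - r(a) - r(b), where
r is the sum of a lineage's distances to all others) are choices made for
this example.  As in NEIGHBOR, nothing prevents a branch length from
coming out negative.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
# Minimal neighbor-joining sketch; illustrative only, not NEIGHBOR's code.

def neighbor_joining(names, dist):
    """Return a list of (interior_node, child, branch_length) triples.

    names: list of tip labels (strings).
    dist:  symmetric square matrix (list of lists) in the same order.
    Interior nodes are labelled 1, 2, ... in order of creation
    (a hypothetical labelling chosen for this sketch).
    """
    # Working distances, keyed by unordered pairs of lineages.
    d = {}
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if j > i:
                d[frozenset((a, b))] = dist[i][j]

    def get(a, b):
        return d[frozenset((a, b))]

    active = list(names)
    edges = []
    next_node = 1
    while len(active) > 3:
        n = len(active)
        # r[a]: sum of distances from a to every other active lineage.
        r = {a: sum(get(a, b) for b in active if b != a) for a in active}
        pairs = [(a, b) for i, a in enumerate(active) for b in active[i + 1:]]
        # Pick the pair minimizing (n - 2) d(a,b) - r[a] - r[b].
        a, b = min(pairs, key=lambda p: (n - 2) * get(*p) - r[p[0]] - r[p[1]])
        dab = get(a, b)
        # Branch lengths are set as the lineages join; they may be negative.
        la = 0.5 * dab + (r[a] - r[b]) / (2 * (n - 2))
        lb = dab - la
        u = next_node
        next_node += 1
        edges.append((u, a, la))
        edges.append((u, b, lb))
        # Distances from the new interior node to the remaining lineages.
        for c in active:
            if c != a and c != b:
                d[frozenset((u, c))] = 0.5 * (get(a, c) + get(b, c) - dab)
        active = [c for c in active if c != a and c != b] + [u]

    # Join the last three lineages at one interior node (unrooted tree).
    a, b, c = active
    u = next_node
    edges.append((u, a, 0.5 * (get(a, b) + get(a, c) - get(b, c))))
    edges.append((u, b, 0.5 * (get(a, b) + get(b, c) - get(a, c))))
    edges.append((u, c, 0.5 * (get(a, c) + get(b, c) - get(a, b))))
    return edges
```
</PRE>
</TD></TR></TABLE>
<P>
Applied to the test data set given later in this document, the sketch joins
Bovine and Mouse first, with branch lengths 0.91769 and 0.76891, which agree
with the output shown there.  The node labels it assigns are its own and need
not match the node numbers that NEIGHBOR prints.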
<P>
The UPGMA option constructs a tree by successive (agglomerative) clustering
using an average-linkage method of clustering.  It has some relationship
to KITSCH, in that when the tree topology turns out the same, the
branch lengths with UPGMA will turn out to be the same as with the P = 0
option of KITSCH.
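<P>
Average-linkage clustering of this kind can be sketched as follows.  Again
this is an illustration rather than the program's own code: the function
name "upgma", the representation of a cluster as a tuple of tip names, and
the choice to report each join as a height (half the average distance
between the two clusters) are assumptions made for this example.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
# Minimal UPGMA (average-linkage) sketch; illustrative, not NEIGHBOR's code.

def upgma(names, dist):
    """Return the joins as (cluster_a, cluster_b, height) in join order.

    Each cluster is a tuple of tip names.  The height is the distance from
    the new interior node down to any of its tips, i.e. half the average
    distance between the two clusters being joined.
    """
    clusters = [(name,) for name in names]
    d = {}
    for i, a in enumerate(clusters):
        for j, b in enumerate(clusters):
            if j > i:
                d[frozenset((a, b))] = dist[i][j]

    merges = []
    while len(clusters) > 1:
        pairs = [(a, b) for i, a in enumerate(clusters) for b in clusters[i + 1:]]
        # Join the two clusters with the smallest average distance.
        a, b = min(pairs, key=lambda p: d[frozenset(p)])
        dab = d[frozenset((a, b))]
        merges.append((a, b, dab / 2))
        new = a + b
        for c in clusters:
            if c != a and c != b:
                # Average over all tip pairs: weight each side by its size.
                d[frozenset((new, c))] = (
                    len(a) * d[frozenset((a, c))] + len(b) * d[frozenset((b, c))]
                ) / (len(a) + len(b))
        clusters = [c for c in clusters if c != a and c != b] + [new]
    return merges
```
</PRE>
</TD></TR></TABLE>
<P>
Because every tip below a join sits at the same height, the resulting tree is
rooted and clocklike, in contrast to the Neighbor-Joining tree.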
<P>
The options for NEIGHBOR are selected through the menu, which looks like
this:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>

Neighbor-Joining/UPGMA method version 3.6a3

Settings for this run:
  N       Neighbor-joining or UPGMA tree?  Neighbor-joining
  O                        Outgroup root?  No, use as outgroup species  1
  L         Lower-triangular data matrix?  No
  R         Upper-triangular data matrix?  No
  S                        Subreplicates?  No
  J     Randomize input order of species?  No. Use input order
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, ANSI, none)?  (none)
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4       Write out trees onto tree file?  Yes


  Y to accept these or type the letter for one to change

</PRE>
</TD></TR></TABLE>
<P>
Most of the input options
(L, R, S, J, and M) are as described in the Distance Matrix Programs
documentation file, and their input format is the same as given there.
The O (Outgroup) option
is described in the main
documentation file of this package.  It is not available when the
UPGMA option is selected.  The Jumble option (J) does not allow
multiple jumbles (as most of the other programs that have it do),
as there is no objective way of choosing which of the multiple
results is best, there being no explicit criterion for optimality of the tree.
<P>
Option N chooses between the Neighbor-Joining and UPGMA methods. Option
S is the usual Subreplication option.  Here, however, it is present only
to allow NEIGHBOR to read the input data: the number of replicates is
actually ignored, even though it is read in.  Note that this means that
one cannot use it to have missing data in the input file, if NEIGHBOR is
to be used.
<P>
The output consists of a tree (rooted if UPGMA, unrooted if Neighbor-Joining)
and the lengths of the
interior segments.  The Average Percent Standard Deviation is not
computed or printed out.  If the tree found by NEIGHBOR is fed into FITCH
as a User Tree, FITCH will compute this quantity if one also selects the
N option of FITCH to ensure that none of the branch lengths is re-estimated.
<P>
As NEIGHBOR runs it prints out an account of the successive clustering
levels, if you allow it to.  This is mostly for reassurance and can be
suppressed using menu option 2.  In this printout of cluster levels
the word "OTU" refers to a tip species, and the word "NODE" to an
interior node of the resulting tree.
<P>
The constants available for modification at the beginning of the
program are "namelength", which gives the length of a
species name, and the usual boolean
constants that initialize the terminal type.  There is no feature for saving
multiple trees tied for best,
partly because we do not expect exact ties except in cases where the branch
lengths make the nature of the tie obvious, as when a branch is of zero
length.
<P>
The major advantage of NEIGHBOR is its speed: it requires a time only
proportional to the square of the number of species.  It is significantly
faster than version 3.5 of this program.  By contrast, FITCH
and KITSCH require a time that rises as the fourth power of the number
of species.  Thus NEIGHBOR is well-suited to bootstrapping studies and
to analysis of very large trees.  Our simulation studies (Kuhner
and Felsenstein, 1994) show that, contrary to statements in the
literature by others, NEIGHBOR does not get as accurate an estimate of
the phylogeny as does FITCH.  However, it does nearly as well, and in
view of its speed this makes it a quite useful program.
<P>
<HR>
<P>
<H3>TEST DATA SET</H3>
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
    7
Bovine      0.0000  1.6866  1.7198  1.6606  1.5243  1.6043  1.5905
Mouse       1.6866  0.0000  1.5232  1.4841  1.4465  1.4389  1.4629
Gibbon      1.7198  1.5232  0.0000  0.7115  0.5958  0.6179  0.5583
Orang       1.6606  1.4841  0.7115  0.0000  0.4631  0.5061  0.4710
Gorilla     1.5243  1.4465  0.5958  0.4631  0.0000  0.3484  0.3083
Chimp       1.6043  1.4389  0.6179  0.5061  0.3484  0.0000  0.2692
Human       1.5905  1.4629  0.5583  0.4710  0.3083  0.2692  0.0000
</PRE>
</TD></TR></TABLE>
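<P>
For readers who want to experiment with a matrix laid out like the one above,
the following Python sketch reads the square format.  It rests on assumptions
that go beyond this document: that each species name occupies the first 10
characters of its line (the default value of the "namelength" parameter here
is a guess at the usual setting), and that each row fits on a single line.
The function name "read_square_matrix" is hypothetical, and the
lower-triangular, upper-triangular and subreplicate forms are not handled.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
# Sketch of a reader for a square distance matrix; not part of the package.

def read_square_matrix(text, namelength=10):
    """Parse a square distance matrix into (names, rows).

    text:       the whole input file as one string.
    namelength: number of leading characters reserved for the species
                name (10 is an assumption made for this sketch).
    """
    lines = [line for line in text.splitlines() if line.strip()]
    # The first non-blank line starts with the number of species.
    n = int(lines[0].split()[0])
    names, rows = [], []
    for line in lines[1:n + 1]:
        # The name is taken from a fixed-width field, then stripped.
        names.append(line[:namelength].strip())
        row = [float(value) for value in line[namelength:].split()]
        if len(row) != n:
            raise ValueError("expected %d distances for %s" % (n, names[-1]))
        rows.append(row)
    return names, rows
```
</PRE>
</TD></TR></TABLE>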
<P>
<HR>
<P>
<H3>OUTPUT FROM TEST DATA SET (with all numerical options on)</H3>
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>

   7 Populations

Neighbor-Joining/UPGMA method version 3.6a3


 Neighbor-joining method

 Negative branch lengths allowed


Name                       Distances
----                       ---------

Bovine        0.00000   1.68660   1.71980   1.66060   1.52430   1.60430
              1.59050
Mouse         1.68660   0.00000   1.52320   1.48410   1.44650   1.43890
              1.46290
Gibbon        1.71980   1.52320   0.00000   0.71150   0.59580   0.61790
              0.55830
Orang         1.66060   1.48410   0.71150   0.00000   0.46310   0.50610
              0.47100
Gorilla       1.52430   1.44650   0.59580   0.46310   0.00000   0.34840
              0.30830
Chimp         1.60430   1.43890   0.61790   0.50610   0.34840   0.00000
              0.26920
Human         1.59050   1.46290   0.55830   0.47100   0.30830   0.26920
              0.00000


  +---------------------------------------------Mouse
  !
  !                        +---------------------Gibbon
  1------------------------2
  !                        !  +----------------Orang
  !                        +--5
  !                           ! +--------Gorilla
  !                           +-4
  !                             ! +--------Chimp
  !                             +-3
  !                               +------Human
  !
  +------------------------------------------------------Bovine


remember: this is an unrooted tree!

Between        And            Length
-------        ---            ------
   1          Mouse           0.76891
   1             2            0.42027
   2          Gibbon          0.35793
   2             5            0.04648
   5          Orang           0.28469
   5             4            0.02696
   4          Gorilla         0.15393
   4             3            0.03982
   3          Chimp           0.15167
   3          Human           0.11753
   1          Bovine          0.91769


</PRE>
</TD></TR></TABLE>
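<P>
One way to read the table of branch lengths is to add up the lengths along
the path between two species and compare the total with the corresponding
input distance.  The Python sketch below does this for the table above.  The
names "EDGES" and "path_length" are introduced only for this example, and the
interior nodes are written as the strings "1" to "5" to keep them distinct
from species names.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
# Branch lengths copied from the "Between / And / Length" table above.
EDGES = [
    ("1", "Mouse", 0.76891), ("1", "2", 0.42027), ("2", "Gibbon", 0.35793),
    ("2", "5", 0.04648), ("5", "Orang", 0.28469), ("5", "4", 0.02696),
    ("4", "Gorilla", 0.15393), ("4", "3", 0.03982), ("3", "Chimp", 0.15167),
    ("3", "Human", 0.11753), ("1", "Bovine", 0.91769),
]


def path_length(edges, start, goal):
    """Sum the branch lengths along the unique path between two nodes."""
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))
    # Depth-first walk; a tree has no cycles, so skipping the node we
    # came from is enough to avoid going back on ourselves.
    stack = [(start, None, 0.0)]
    while stack:
        node, parent, total = stack.pop()
        if node == goal:
            return total
        for nxt, length in graph[node]:
            if nxt != parent:
                stack.append((nxt, node, total + length))
    raise ValueError("no path between the two nodes")
```
</PRE>
</TD></TR></TABLE>
<P>
For Chimp and Human the path length is 0.15167 + 0.11753 = 0.26920, equal to
the input distance of 0.2692.  For Gorilla and Human it is
0.15393 + 0.03982 + 0.11753 = 0.31128, against an input distance of 0.3083:
the tree is not required to reproduce every input distance exactly.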
</BODY>
</HTML>