1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> |
---|
2 | <HTML> |
---|
3 | <HEAD> |
---|
4 | <TITLE>neighbor</TITLE> |
---|
5 | <META NAME="description" CONTENT="neighbor"> |
---|
6 | <META NAME="keywords" CONTENT="neighbor"> |
---|
7 | <META NAME="resource-type" CONTENT="document"> |
---|
8 | <META NAME="distribution" CONTENT="global"> |
---|
9 | <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> |
---|
10 | </HEAD> |
---|
11 | <BODY BGCOLOR="#ccffff"> |
---|
12 | <DIV ALIGN=RIGHT> |
---|
13 | version 3.6 |
---|
14 | </DIV> |
---|
15 | <P> |
---|
16 | <DIV ALIGN=CENTER> |
---|
17 | <H1>NEIGHBOR -- Neighbor-Joining and UPGMA methods</H1> |
---|
18 | </DIV> |
---|
19 | <P> |
---|
20 | © Copyright 1991-2000 by the University of |
---|
21 | Washington. Written by Joseph Felsenstein. Permission is granted to copy |
---|
22 | this document provided that no fee is charged for it and that this copyright |
---|
23 | notice is not removed. |
---|
24 | <P> |
---|
25 | This program implements the Neighbor-Joining method of Saitou and Nei (1987) |
---|
26 | and the UPGMA method of clustering. The program was written by Mary Kuhner |
---|
27 | and Jon Yamato, using some code from program FITCH. An important part of the |
---|
28 | code was translated |
---|
29 | from FORTRAN code from the neighbor-joining program written by Naruya Saitou |
---|
30 | and by Li Jin, and is used with the kind permission of Drs. Saitou and Jin. |
---|
31 | <P> |
---|
32 | NEIGHBOR constructs a tree by successive clustering of lineages, setting |
---|
33 | branch lengths as the lineages join. The tree is not rearranged |
---|
34 | thereafter. The tree does not assume an evolutionary clock, so that it |
---|
35 | is in effect an unrooted tree. It should be somewhat similar to the tree |
---|
36 | obtained by FITCH. The program cannot evaluate a User tree, nor can it prevent |
---|
37 | branch lengths from becoming negative. However, the algorithm is far faster |
---|
38 | than FITCH or KITSCH. This makes it a particularly effective substitute for them |
---|
39 | in large studies or in bootstrap or jackknife resampling studies, which |
---|
40 | require runs on multiple data sets. |
---|
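<P>
The successive clustering described above can be sketched in a few lines of Python. This is a minimal illustration of the textbook neighbor-joining recurrence, not a transcription of NEIGHBOR's own code. The function name neighbor_joining, the "nodeK" labels for interior nodes, and the edge-list return format are assumptions made for this example, and the node labels will not match the numbering in NEIGHBOR's printout.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Return (parent, child, length) edges of an unrooted tree.

    Assumes at least three uniquely named species and a symmetric
    matrix with zeros on the diagonal. Interior labels are hypothetical.
    """
    d = {a: dict(zip(names, row)) for a, row in zip(names, dist)}
    active = list(names)
    edges = []
    for step in range(len(names) - 3):
        n = len(active)
        # Net divergence of each lineage from all other active lineages.
        r = {a: sum(d[a][b] for b in active) for a in active}
        # Join the pair that minimizes (n - 2) * d(a, b) - r(a) - r(b).
        a, b = min(combinations(active, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        # Branch lengths are set as the lineages join; they may be negative.
        len_a = d[a][b] / 2 + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        node = "node%d" % (step + 1)
        edges += [(node, a, len_a), (node, b, len_b)]
        # Distances from the new interior node to the remaining lineages.
        rest = [k for k in active if k not in (a, b)]
        d[node] = {node: 0.0}
        for k in rest:
            d[node][k] = d[k][node] = (d[a][k] + d[b][k] - d[a][b]) / 2
        active = rest + [node]
    # The last three lineages meet at a trifurcation (the tree is unrooted).
    x, y, z = active
    centre = "node%d" % (len(names) - 2)
    edges.append((centre, x, (d[x][y] + d[x][z] - d[y][z]) / 2))
    edges.append((centre, y, (d[x][y] + d[y][z] - d[x][z]) / 2))
    edges.append((centre, z, (d[x][z] + d[y][z] - d[x][y]) / 2))
    return edges

# The test data set from this document.
names = ["Bovine", "Mouse", "Gibbon", "Orang", "Gorilla", "Chimp", "Human"]
dist = [
    [0.0000, 1.6866, 1.7198, 1.6606, 1.5243, 1.6043, 1.5905],
    [1.6866, 0.0000, 1.5232, 1.4841, 1.4465, 1.4389, 1.4629],
    [1.7198, 1.5232, 0.0000, 0.7115, 0.5958, 0.6179, 0.5583],
    [1.6606, 1.4841, 0.7115, 0.0000, 0.4631, 0.5061, 0.4710],
    [1.5243, 1.4465, 0.5958, 0.4631, 0.0000, 0.3484, 0.3083],
    [1.6043, 1.4389, 0.6179, 0.5061, 0.3484, 0.0000, 0.2692],
    [1.5905, 1.4629, 0.5583, 0.4710, 0.3083, 0.2692, 0.0000],
]
edges = neighbor_joining(names, dist)
tip_length = {child: length for _, child, length in edges}
print("Bovine %.5f  Mouse %.5f" % (tip_length["Bovine"], tip_length["Mouse"]))
```

<P>
On the test data set the first pair joined is Bovine and Mouse, and the lengths computed for those two tips (0.91769 and 0.76891) agree with the sample output at the end of this document.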
41 | <P> |
---|
42 | The UPGMA option constructs a tree by successive (agglomerative) clustering |
---|
43 | using an average-linkage method of clustering. It has some relationship |
---|
44 | to KITSCH, in that when the tree topology turns out the same, the |
---|
45 | branch lengths with UPGMA will turn out to be the same as with the P = 0 |
---|
46 | option of KITSCH. |
---|
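<P>
As a rough illustration of average-linkage clustering, the sketch below repeatedly merges the two closest clusters and places the new node at half the distance between them. It is a generic UPGMA outline under stated assumptions, not NEIGHBOR's implementation: the function name upgma, the tuple representation of clusters, and the (cluster, cluster, height) return format are invented for the example, and only four species from the test data set are used to keep it short.

```python
from itertools import combinations

def upgma(names, dist):
    """Return the merges made, in order, as (cluster_a, cluster_b, height).

    Clusters are tuples of species names (a hypothetical representation).
    Assumes unique names and a symmetric distance matrix.
    """
    clusters = [(name,) for name in names]
    d = {frozenset([(a,), (b,)]): dist[i][j]
         for (i, a), (j, b) in combinations(enumerate(names), 2)}
    merges = []
    for _ in range(len(names) - 1):
        # Merge the two clusters separated by the smallest average distance.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset([a, b])] / 2
        merged = a + b
        clusters = [c for c in clusters if c not in (a, b)]
        for c in clusters:
            # Size-weighted mean, so every species pair counts equally.
            d[frozenset([merged, c])] = (
                len(a) * d[frozenset([a, c])] + len(b) * d[frozenset([b, c])]
            ) / len(merged)
        clusters.append(merged)
        merges.append((a, b, height))
    return merges

# Four species taken from the test data set in this document.
names = ["Orang", "Gorilla", "Chimp", "Human"]
dist = [
    [0.0000, 0.4631, 0.5061, 0.4710],
    [0.4631, 0.0000, 0.3484, 0.3083],
    [0.5061, 0.3484, 0.0000, 0.2692],
    [0.4710, 0.3083, 0.2692, 0.0000],
]
merges = upgma(names, dist)
for a, b, height in merges:
    print(a, b, round(height, 5))
```

<P>
The height of each merge is the distance from the new node down to the tips, so the length of a branch is the difference between the heights at its two ends. The same height on both sides of every node is what makes the resulting tree rooted and clocklike.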
47 | <P> |
---|
48 | The options for NEIGHBOR are selected through the menu, which looks like |
---|
49 | this: |
---|
50 | <P> |
---|
51 | <TABLE><TR><TD BGCOLOR=white> |
---|
52 | <PRE> |
---|
53 | |
---|
54 | Neighbor-Joining/UPGMA method version 3.6a3 |
---|
55 | |
---|
56 | Settings for this run: |
---|
57 | N Neighbor-joining or UPGMA tree? Neighbor-joining |
---|
58 | O Outgroup root? No, use as outgroup species 1 |
---|
59 | L Lower-triangular data matrix? No |
---|
60 | R Upper-triangular data matrix? No |
---|
61 | S Subreplicates? No |
---|
62 | J Randomize input order of species? No. Use input order |
---|
63 | M Analyze multiple data sets? No |
---|
64 | 0 Terminal type (IBM PC, ANSI, none)? (none) |
---|
65 | 1 Print out the data at start of run No |
---|
66 | 2 Print indications of progress of run Yes |
---|
67 | 3 Print out tree Yes |
---|
68 | 4 Write out trees onto tree file? Yes |
---|
69 | |
---|
70 | |
---|
71 | Y to accept these or type the letter for one to change |
---|
72 | |
---|
73 | </PRE> |
---|
74 | </TD></TR></TABLE> |
---|
75 | <P> |
---|
76 | Most of the input options |
---|
77 | (L, R, S, J, and M) are as given in the Distance Matrix Programs |
---|
78 | documentation file, |
---|
79 | and their input format is the same as given there. |
---|
80 | The O (Outgroup) option |
---|
81 | is described in the main |
---|
82 | documentation file of this package. It is not available when the |
---|
83 | UPGMA option is selected. The Jumble option (J) does not allow |
---|
84 | multiple jumbles (as most of the other programs that have it do), |
---|
85 | as there is no objective way of choosing which of the multiple |
---|
86 | results is best, there being no explicit criterion for optimality of the tree. |
---|
87 | <P> |
---|
88 | Option N chooses between the Neighbor-Joining and UPGMA methods. Option |
---|
89 | S is the usual Subreplication option. Here, however, it is present only |
---|
90 | to allow NEIGHBOR to read the input data: the number of replicates is |
---|
91 | actually ignored, even though it is read in. Note that this means that |
---|
92 | subreplication cannot be used to indicate missing data in the input file if NEIGHBOR is |
---|
93 | to be used. |
---|
94 | <P> |
---|
95 | The output consists of a tree (rooted if UPGMA, unrooted if Neighbor-Joining) |
---|
96 | and the lengths of the |
---|
97 | interior segments. The Average Percent Standard Deviation is not |
---|
98 | computed or printed out. If the tree found by NEIGHBOR is fed into FITCH |
---|
99 | as a User Tree, it will compute this quantity if one also selects the |
---|
100 | N option of FITCH to ensure that none of the branch lengths is re-estimated. |
---|
101 | <P> |
---|
102 | As NEIGHBOR runs it prints out an account of the successive clustering |
---|
103 | levels, if you allow it to. This is mostly for reassurance and can be |
---|
104 | suppressed using menu option 2. In this printout of cluster levels |
---|
105 | the word "OTU" refers to a tip species, and the word "NODE" to an |
---|
106 | interior node of the resulting tree. |
---|
107 | <P> |
---|
108 | The constants available for modification at the beginning of the |
---|
109 | program are "namelength", which gives the length of a |
---|
110 | species name, and the usual boolean |
---|
111 | constants that initialize the terminal type. There is no feature saving |
---|
112 | multiple trees tied for best, |
---|
113 | partly because we do not expect exact ties except in cases where the branch |
---|
114 | lengths make the nature of the tie obvious, as when a branch is of zero |
---|
115 | length. |
---|
116 | <P> |
---|
117 | The major advantage of NEIGHBOR is its speed: it requires a time only |
---|
118 | proportional to the square of the number of species. It is significantly |
---|
119 | faster than version 3.5 of this program. By contrast, FITCH |
---|
120 | and KITSCH require a time that rises as the fourth power of the number |
---|
121 | of species. Thus NEIGHBOR is well-suited to bootstrapping studies and |
---|
122 | to analysis of very large trees. Our simulation studies (Kuhner |
---|
123 | and Felsenstein, 1994) show that, contrary to statements in the |
---|
124 | literature by others, NEIGHBOR does not get as accurate an estimate of |
---|
125 | the phylogeny as does FITCH. However, it does nearly as well, and in |
---|
126 | view of its speed this makes it quite a useful program. |
---|
127 | <P> |
---|
128 | <HR> |
---|
129 | <P> |
---|
130 | <H3>TEST DATA SET</H3> |
---|
131 | <P> |
---|
132 | <TABLE><TR><TD BGCOLOR=white> |
---|
133 | <PRE> |
---|
134 | 7 |
---|
135 | Bovine 0.0000 1.6866 1.7198 1.6606 1.5243 1.6043 1.5905 |
---|
136 | Mouse 1.6866 0.0000 1.5232 1.4841 1.4465 1.4389 1.4629 |
---|
137 | Gibbon 1.7198 1.5232 0.0000 0.7115 0.5958 0.6179 0.5583 |
---|
138 | Orang 1.6606 1.4841 0.7115 0.0000 0.4631 0.5061 0.4710 |
---|
139 | Gorilla 1.5243 1.4465 0.5958 0.4631 0.0000 0.3484 0.3083 |
---|
140 | Chimp 1.6043 1.4389 0.6179 0.5061 0.3484 0.0000 0.2692 |
---|
141 | Human 1.5905 1.4629 0.5583 0.4710 0.3083 0.2692 0.0000 |
---|
142 | </PRE> |
---|
143 | </TD></TR></TABLE> |
---|
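<P>
For readers who want to prepare or check such a file outside the package, the sketch below reads a square distance matrix laid out like the one above: a first line giving the number of species, then one line per species with the name followed by its distances. It is only an illustration. The function name read_square_matrix is hypothetical, the name field is assumed to be padded to 10 characters (the value of "namelength" is taken to be 10 here), and each row is assumed to fit on a single line. The triangular, subreplicate, and multiple-data-set layouts are not handled.

```python
def read_square_matrix(text, namelength=10):
    """Parse a square distance matrix into (names, rows).

    namelength=10 is an assumption about the fixed-width name field.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    n = int(lines[0].split()[0])
    names, rows = [], []
    for line in lines[1:n + 1]:
        # The name occupies a fixed-width field; distances follow it.
        names.append(line[:namelength].strip())
        rows.append([float(token) for token in line[namelength:].split()])
    if len(rows) != n:
        raise ValueError("expected %d species, found %d" % (n, len(rows)))
    for name, row in zip(names, rows):
        if len(row) != n:
            raise ValueError("row for %s does not have %d distances" % (name, n))
    return names, rows

# Three species taken from the test data set, with names padded to 10 columns.
sample = "\n".join([
    "    3",
    "Bovine".ljust(10) + "0.0000  1.6866  1.7198",
    "Mouse".ljust(10) + "1.6866  0.0000  1.5232",
    "Gibbon".ljust(10) + "1.7198  1.5232  0.0000",
])
names, rows = read_square_matrix(sample)
print(names)
print(rows[1])
```

<P>
Taking the name from a fixed-width field rather than splitting on blanks is a deliberate choice in this sketch, since it lets a species name contain spaces.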
144 | <P> |
---|
145 | <HR> |
---|
146 | <P> |
---|
147 | <H3>OUTPUT FROM TEST DATA SET (with all numerical options on)</H3> |
---|
148 | <P> |
---|
149 | <TABLE><TR><TD BGCOLOR=white> |
---|
150 | <PRE> |
---|
151 | |
---|
152 | 7 Populations |
---|
153 | |
---|
154 | Neighbor-Joining/UPGMA method version 3.6a3 |
---|
155 | |
---|
156 | |
---|
157 | Neighbor-joining method |
---|
158 | |
---|
159 | Negative branch lengths allowed |
---|
160 | |
---|
161 | |
---|
162 | Name Distances |
---|
163 | ---- --------- |
---|
164 | |
---|
165 | Bovine 0.00000 1.68660 1.71980 1.66060 1.52430 1.60430 |
---|
166 | 1.59050 |
---|
167 | Mouse 1.68660 0.00000 1.52320 1.48410 1.44650 1.43890 |
---|
168 | 1.46290 |
---|
169 | Gibbon 1.71980 1.52320 0.00000 0.71150 0.59580 0.61790 |
---|
170 | 0.55830 |
---|
171 | Orang 1.66060 1.48410 0.71150 0.00000 0.46310 0.50610 |
---|
172 | 0.47100 |
---|
173 | Gorilla 1.52430 1.44650 0.59580 0.46310 0.00000 0.34840 |
---|
174 | 0.30830 |
---|
175 | Chimp 1.60430 1.43890 0.61790 0.50610 0.34840 0.00000 |
---|
176 | 0.26920 |
---|
177 | Human 1.59050 1.46290 0.55830 0.47100 0.30830 0.26920 |
---|
178 | 0.00000 |
---|
179 | |
---|
180 | |
---|
181 | +---------------------------------------------Mouse |
---|
182 | ! |
---|
183 | ! +---------------------Gibbon |
---|
184 | 1------------------------2 |
---|
185 | ! ! +----------------Orang |
---|
186 | ! +--5 |
---|
187 | ! ! +--------Gorilla |
---|
188 | ! +-4 |
---|
189 | ! ! +--------Chimp |
---|
190 | ! +-3 |
---|
191 | ! +------Human |
---|
192 | ! |
---|
193 | +------------------------------------------------------Bovine |
---|
194 | |
---|
195 | |
---|
196 | remember: this is an unrooted tree! |
---|
197 | |
---|
198 | Between And Length |
---|
199 | ------- --- ------ |
---|
200 | 1 Mouse 0.76891 |
---|
201 | 1 2 0.42027 |
---|
202 | 2 Gibbon 0.35793 |
---|
203 | 2 5 0.04648 |
---|
204 | 5 Orang 0.28469 |
---|
205 | 5 4 0.02696 |
---|
206 | 4 Gorilla 0.15393 |
---|
207 | 4 3 0.03982 |
---|
208 | 3 Chimp 0.15167 |
---|
209 | 3 Human 0.11753 |
---|
210 | 1 Bovine 0.91769 |
---|
211 | |
---|
212 | |
---|
213 | </PRE> |
---|
214 | </TD></TR></TABLE> |
---|
215 | </BODY> |
---|
216 | </HTML> |
---|