source: trunk/HELP_SOURCE/source/dist.hlp

Last change on this file was 18769, checked in by westram, 3 years ago
  • move all helpfiles to new source location
  • Property svn:eol-style set to native
  • Property svn:keywords set to Author Date Id Revision
File size: 8.2 KB
1#Please insert up references in the next lines (line starts with keyword UP)
2UP      arb.hlp
3UP      glossary.hlp
4UP      mark.hlp
5UP      phylo.hlp
6
7#Please insert subtopic references  (line starts with keyword SUB)
8SUB     user_matrix.hlp
9SUB     savedef.hlp
10SUB     props_frame.hlp
11SUB     sel_fil.hlp
12SUB     awt_csp.hlp
13SUB     bootstrap.hlp
14
15
16# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}
17
18#************* Title of helpfile !! and start of real helpfile ********
19TITLE           Neighbour joining
20
21OCCURRENCE      ARB_NT/Tree/Build tree from sequence data/Distance matrix methods/Distance matrix + ARB NJ
22
23DESCRIPTION     Reconstructs a tree for all or marked species by first
24                calculating pairwise (binary) distances and subsequently applying
25                the neighbour joining method.
26
27                The tree topology is stored in the database and can be displayed
28                within the tree display area of the 'ARB_NT' window.
29
30                1. Mark all interesting species.
31
32                2. Select all or marked species from the 'Select Species' menu
33                   of the 'NEIGHBOUR JOINING' window.
34
35                3. Select Alignment from the 'Select Alignment' subwindow of
36                   the 'NEIGHBOUR JOINING' window.
37
38                4. Display the 'Select Filter' window by pressing the button
39                   after the 'Filter' prompt and define an alignment-associated
40                   mask which defines alignment positions to include for treeing.
41
42                5. Define Weights: @@@ not implemented
43
44                6. Select rate matrix (only implemented for some corrections; see LINK{user_matrix.hlp})
45
46                7. Type characters for the exclusion of alignment positions into
47                   the 'Exclude Column' subwindow. The positions are
48                   excluded from the calculation of binary distance values
49                   if one of the specified characters is present in one or
50                   both sequences. The described function acts as a second
51                   filter and affects only the particular sequence
52                   pairs, not the whole alignment.
53
54                   Possible settings are "" (empty), "." and ".-".
55
56                      - "" treats dots as separate gap-type (i.e. '.' vs. '-' counted as mutation)
57                      - "." does not count dots (over-rates gaps running over multiple columns)
58                      - ".-" does not count gaps (ignores insertions and deletions)
59
60                   Note: This setting is ignored when using a LINK{user_matrix.hlp} or
61                         if Kimura correction is selected.
62
63                8. Select the type of distance correction from the 'Distance
64                   Correction' submenu. You can use the program to detect
65                   the best correction for you by pressing the AUTODETECT
66                   button.
67
68                   none:
69
70                        Differences/Sequence length. May be a good
71                        choice for short sequences (length < 300).
72
73                   similarity:
74
75                        1.0 - Differences/Sequence_Length
76
77                   jukes-cantor:
78
79                        Accounts for multiple base changes, assumes
80                        equal base frequencies.
81                        Good choice for medium-sized sequences
82                        (sequence length 300 - 1000/2000).
83
84                   felsenstein:
85
86                        Similar to jukes-cantor transformation. Allows
87                        unequal base frequencies.
88                        ( length > 1000/2000 )
89
90                   olsen:
91
92                        As Felsenstein, except the base frequencies are
93                        calculated for each pair of sequences.
94
95                   from selected tree:
96
97                        This is NOT a distance correction! By selecting 'from selected tree'
98                        distances are not calculated using sequence data.
99                        Instead they are extracted from the
100                        tree currently selected in the 'Trees in Database' selection list.
101
102                        The distance between two species is defined as the sum of the lengths
103                        of all branches that connect these two species.
104
105                        Please note that this is an experimental feature. The distances between two species
106                        are not directly based on the sequence differences between these two species.
107                        Instead they reflect the evolutionary distance assumed by the tree reconstruction
108                        algorithm used to build the tree.
109
110                        The distances extracted from a tree are expected to be (slightly) bigger than
111                        the distances directly calculated from the sequences.
112                        This seems reasonable, because it is very unlikely that evolution always took the
113                        shortest possible way (which is represented by the direct sequence distance).
114                        This effect increases for more distant (unrelated) species, reflecting the
115                        detours evolution most likely made.
116
117                   Please note:
118
119                          The other correction functions are in an experimental state.
120                          Wait for a new release.
121
122                9. Select a name for the tree from the 'Trees in Database'
123                   subwindow or type a new tree name.
124                   The tree name has to be 'tree_*'.
125                   An existing tree with that name will be deleted.
126
127                10. Press the 'CALCULATE TREE' button.
128
129                11. Now you may display the new tree in the ARB_NT main window
130                        by selecting its name from the <Tree/Select> subwindow.
131                        If its name is already selected, you will not need to
132                        reselect it.
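
                The effect of the 'Exclude Column' characters (step 7) and of a
                distance correction (step 8) can be illustrated with a minimal
                Python sketch. This is not ARB's implementation. The function
                names are hypothetical, dividing by the number of compared columns
                is an assumption (the text above only says "Differences/Sequence
                length"), and the Jukes-Cantor formula is the textbook form
                d = -3/4 * ln(1 - 4/3 * p).

```python
import math


def uncorrected_distance(seq_a, seq_b, exclude=".-"):
    """Fraction of differing columns among the columns compared for this pair."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        # A column is skipped for this pair only, if either sequence
        # holds one of the excluded characters there.
        if a in exclude or b in exclude:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable columns left")
    return differences / compared


def jukes_cantor(p):
    """Textbook Jukes-Cantor correction of an uncorrected distance p."""
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


seq_a = "ACGU-ACGU."
seq_b = "ACGA-AC-UA"
for setting in ("", ".", ".-"):
    p = uncorrected_distance(seq_a, seq_b, exclude=setting)
    print(repr(setting), round(p, 4), round(jukes_cantor(p), 4))
```

                With "" every column is compared, so '.' vs. 'A' and 'G' vs. '-'
                count as differences. With ".-" both columns are skipped for this
                pair. The uncorrected value always stays within [0.0 .. 1.0],
                while the corrected value can exceed 1.0 for distant pairs.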
133
134                The distance matrix can be written to an ASCII file:
135
136                        Press the <SAVE MATRIX> button to display the 'SAVE
137                        MATRIX' window. Select a file from the 'Directories
138                        and Files' subwindow or type a file name into the 'FILE
139                        NAME' subwindow. Press the <SAVE> button.
140                        The suffix displayed in the 'SUFFIX' subwindow is added
141                        to the typed file name and defines the selection of
142                        files listed in the 'Directories and Files' subwindow.
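
                For the 'from selected tree' option described in step 8, the
                distance between two species is the sum of the branch lengths on
                the path connecting them. A minimal Python sketch of that
                definition follows. The tree representation (a 'parent' mapping and
                a 'branch_length' mapping keyed by child node) and all names are
                assumptions made for illustration and do not reflect ARB's data
                structures.

```python
def path_lengths_to_root(parent, branch_length, leaf):
    """Map the leaf and each of its ancestors to the summed branch length from the leaf."""
    dist = {leaf: 0.0}
    node, total = leaf, 0.0
    while node in parent:
        total += branch_length[node]
        node = parent[node]
        dist[node] = total
    return dist


def tree_distance(parent, branch_length, leaf_a, leaf_b):
    """Sum of the lengths of all branches connecting leaf_a and leaf_b."""
    up_a = path_lengths_to_root(parent, branch_length, leaf_a)
    node, total = leaf_b, 0.0
    # Walk up from leaf_b until the first node shared with leaf_a's path.
    while node not in up_a:
        total += branch_length[node]
        node = parent[node]
    return total + up_a[node]


# Hypothetical tree: root -> n1 (0.1), root -> C (0.5), n1 -> A (0.2), n1 -> B (0.3)
parent = {"A": "n1", "B": "n1", "n1": "root", "C": "root"}
branch_length = {"A": 0.2, "B": 0.3, "n1": 0.1, "C": 0.5}
print(tree_distance(parent, branch_length, "A", "B"))
print(tree_distance(parent, branch_length, "A", "C"))
```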
143
144SECTION         Calculate compressed matrix
145
146                You may select a tree to calculate a compressed matrix. A
147                compressed matrix contains columns for all folded groups
148                visible in the displayed tree (i.e. not for unfolded groups
149                and not for folded groups inside other folded groups).
150               
151                Species inside such groups are NOT listed as single entries.
152               
153                The distance shown for each group is the arithmetic average
154                of the distances of all contained species.
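
                The averaging can be sketched in Python as follows. The help text
                does not state exactly which pairs enter the average, so this
                sketch assumes the mean over all cross pairs between two groups.
                An ungrouped species is treated as a group of one. The function
                name and the dictionary layout are hypothetical.

```python
def compress_matrix(dist, groups):
    """Average a species distance matrix into a group-by-group matrix.

    dist:   dist[s][t] is the distance between species s and t
    groups: group name -> list of contained species
    """
    compressed = {}
    for g, members_g in groups.items():
        for h, members_h in groups.items():
            if g == h:
                compressed[(g, h)] = 0.0
                continue
            # Assumption: arithmetic mean over every cross pair of species.
            pairs = [dist[s][t] for s in members_g for t in members_h]
            compressed[(g, h)] = sum(pairs) / len(pairs)
    return compressed


dist = {
    "A": {"A": 0.0, "B": 0.1, "C": 0.4},
    "B": {"A": 0.1, "B": 0.0, "C": 0.6},
    "C": {"A": 0.4, "B": 0.6, "C": 0.0},
}
groups = {"group1": ["A", "B"], "C": ["C"]}
compressed = compress_matrix(dist, groups)
print(compressed[("group1", "C")])
```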
155
156SECTION         Automatic calculation
157
158                There are two toggles in the ARB_DIST main window that
159                trigger instant recalculation:
160
161                - 'Auto recalculate' will force recalculation of the matrix
162                - 'Auto calculate tree' will force calculation of the tree
163
164                If 'Auto calculate tree' is checked, the tree will be calculated
165                whenever the matrix has been updated.
166
167                If 'Auto recalculate' is checked, the matrix will be recalculated
168                whenever any input changes, e.g. if
169
170                - the filter is changed (or its underlying SAI changes),
171                - the excluded columns change,
172                - the user defined matrix changes,
173                - the correction is changed,
174                - the sequence data changes or species marks change or
175                - the tree selected for compression or sorting changes or a different
176                  tree is selected.
177
178                As you might have guessed, this is only useful for smaller sets of sequences.
179
180                Some suggestions:
181
182                     You might for example display the resulting NJ tree in one window and
183                     play with the distance parameters to instantly see their effect on the tree.
184
185                     Or you may unmark unwanted species/subtrees in the tree display.
186
187                     Or you might align sequences to see the effect on the resulting topology.
188
189
190NOTES           Computing time can be estimated using the following formula:
191
192                        time = (Sequence_Length * Nr.of.Spec * Nr.of.Spec)/
193                                Computer Power
194
195                                Example: Sparc 10, 74 Sequences, length 8000 characters
196                                         -> 10 Seconds
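
                The formula can be turned into a small Python sketch. 'Computer
                Power' has no stated unit, so the sketch calibrates it from one
                measured run (here the Sparc 10 example above) and reuses it for
                other inputs. The function names are hypothetical, and the result
                is only a rough estimate.

```python
def calibrate_power(sequence_length, n_species, measured_seconds):
    """Derive the 'Computer Power' constant from one timed run."""
    return sequence_length * n_species * n_species / measured_seconds


def estimate_seconds(sequence_length, n_species, computer_power):
    """time = (Sequence_Length * Nr.of.Spec * Nr.of.Spec) / Computer Power"""
    return sequence_length * n_species * n_species / computer_power


# Calibrate with the example: 74 sequences, length 8000, 10 seconds.
power = calibrate_power(8000, 74, 10)
# Doubling the number of species quadruples the estimated time.
print(estimate_seconds(8000, 148, power))
```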
197
198
199WARNINGS        Don't try to build a tree with the 'similarity' distance
200                correction selected.
201
202                Distance values calculated without distance correction always lie within the range [0.0 .. 1.0].
203                The same is true for the 'similarity' distance "correction".
204                With distance corrections, the range varies depending on the correction method.
205
206BUGS            None
207