| 1 | #Please insert up references in the next lines (line starts with keyword UP) |
|---|
| 2 | UP arb.hlp |
|---|
| 3 | UP glossary.hlp |
|---|
| 4 | |
|---|
| 5 | #Please insert subtopic references (line starts with keyword SUB) |
|---|
| 6 | #SUB subtopic.hlp |
|---|
| 7 | |
|---|
| 8 | # Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain} |
|---|
| 9 | |
|---|
| 10 | #************* Title of helpfile !! and start of real helpfile ******** |
|---|
| 11 | TITLE Branch analysis |
|---|
| 12 | |
|---|
| 13 | OCCURRENCE ARB_NT/Tree/Branch analysis |
|---|
| 14 | |
|---|
| 15 | DESCRIPTION Branch analysis functions |
|---|
| 16 | |
|---|
| 17 | Functions provided here may be useful to |
|---|
| 18 | - detect wrongly placed species or groups (or poor data, which might also lead to wrong placement) |
|---|
| 19 | - detect anomalies caused by (wrong) tree reconstruction |
|---|
| 20 | - gather information about tree topologies. |
|---|
| 21 | |
|---|
| 22 | Each function reports several values gathered during execution. |
|---|
| 23 | |
|---|
| 24 | SECTION Analyse distances in tree |
|---|
| 25 | |
|---|
| 26 | For the whole tree |
|---|
| 27 | - the in-tree-distance (ITD = sum of all branchlengths) and |
|---|
| 28 | - the per-species-distance (PSD = ITD / number of species) are displayed. |
|---|
| 29 | |
|---|
| 30 | For all leafs in the tree the following values are calculated: |
|---|
| 31 | - mean distance to all other leafs |
|---|
| 32 | - minimum distance to any other leaf |
|---|
| 33 | - maximum distance to any other leaf |
|---|
| 34 | |
|---|
| 35 | The function reports the mean and the range of each of these three values, |
|---|
| 36 | separately for all species and for marked species. |
|---|
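The values described in this section can be illustrated with a minimal Python sketch. It is not ARB's implementation: the toy tree `EDGES`, the function name `tree_stats`, and the assumption that "number of species" equals the number of leafs are all made up for illustration.

```python
# Hypothetical toy tree as (parent, child, branch length) triples.
EDGES = [
    ("root", "n1", 0.1),
    ("root", "C", 0.4),
    ("n1", "A", 0.2),
    ("n1", "B", 0.3),
]

def tree_stats(edges):
    """Return ITD, PSD and per-leaf (mean, min, max) distance to all other leafs."""
    parent = {child: (par, length) for par, child, length in edges}
    inner = {par for par, _, _ in edges}
    leafs = sorted(child for child in parent if child not in inner)

    def path_to_root(node):
        # Map the node and each of its ancestors to the distance from the node.
        dist, path = 0.0, {node: 0.0}
        while node in parent:
            node_parent, length = parent[node]
            dist += length
            path[node_parent] = dist
            node = node_parent
        return path

    paths = {leaf: path_to_root(leaf) for leaf in leafs}

    def distance(a, b):
        # Among all common ancestors, the lowest one gives the smallest sum.
        return min(paths[a][n] + paths[b][n] for n in paths[a] if n in paths[b])

    itd = sum(length for _, _, length in edges)  # in-tree-distance
    psd = itd / len(leafs)                       # per-species-distance
    per_leaf = {}
    for leaf in leafs:
        others = [distance(leaf, other) for other in leafs if other != leaf]
        per_leaf[leaf] = (sum(others) / len(others), min(others), max(others))
    return itd, psd, per_leaf

itd, psd, per_leaf = tree_stats(EDGES)
```

For this toy tree the ITD is the sum of the four branch lengths, and leaf A has B as its nearest and C as its farthest neighbour.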
| 37 | |
|---|
| 38 | SECTION Using the PSD to compare trees |
|---|
| 39 | |
|---|
| 40 | The PSD is useful when comparing tree topologies (based on similar sets |
|---|
| 41 | of species) that were reconstructed using different methods. Imagine |
|---|
| 42 | you have two trees: |
|---|
| 43 | |
|---|
| 44 | - tree_raxml (reconstructed with RAxML) |
|---|
| 45 | - tree_arbpars (reconstructed with ARB parsimony) |
|---|
| 46 | |
|---|
| 47 | The PSDs of both trees will be quite different (maybe by a factor of 50 or 60). |
|---|
| 48 | Calculating the ratio of both PSDs gives you a good value for scaling |
|---|
| 49 | the branchlengths of (a copy of) one of the trees. For example, the PSDs |
|---|
| 50 | might be |
|---|
| 51 | |
|---|
| 52 | - PSDr = PSD(tree_raxml) = 0.002219 |
|---|
| 53 | - PSDp = PSD(tree_arbpars) = 0.118483 |
|---|
| 54 | |
|---|
| 55 | Now you may scale the branchlengths of tree_arbpars by a factor of 0.01873 (=PSDr/PSDp) or those |
|---|
| 56 | of tree_raxml by a factor of 53.39 (=PSDp/PSDr) to ease comparison of the two trees. |
|---|
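The arithmetic above can be checked with a few lines of Python. The variable names and the helper `scale_branchlengths` are hypothetical; only the two PSD values come from the example.

```python
psd_raxml = 0.002219    # PSDr from the example above
psd_arbpars = 0.118483  # PSDp from the example above

scale_arbpars = psd_raxml / psd_arbpars  # factor for tree_arbpars
scale_raxml = psd_arbpars / psd_raxml    # factor for tree_raxml

def scale_branchlengths(lengths, factor):
    """Return a scaled copy of a list of branch lengths (hypothetical helper)."""
    return [length * factor for length in lengths]

print(f"{scale_arbpars:.5f}")  # 0.01873
print(f"{scale_raxml:.2f}")    # 53.39
```

Only one of the two factors is needed: apply it to a copy of the corresponding tree and leave the other tree unchanged.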
| 57 | |
|---|
| 58 | SECTION Mark long branches |
|---|
| 59 | |
|---|
| 60 | For each furcation in the tree, the relative difference between the distances of its |
|---|
| 61 | subtrees is calculated. |
|---|
| 62 | |
|---|
| 63 | 'Distance' here is the sum of all branches between the furcation and |
|---|
| 64 | the least distant leaf of the left or right subtree, respectively. |
|---|
| 65 | |
|---|
| 66 | Relative difference Meaning |
|---|
| 67 | The nearer subtree has at least |
|---|
| 68 | 10% 90% |
|---|
| 69 | 50% 50% |
|---|
| 70 | 75% 25% |
|---|
| 71 | 90% 10% |
|---|
| 72 | of the farther subtree's distance. |
|---|
| 73 | |
|---|
| 74 | Starting from the tree-tips, this function marks the more distant subtree of any furcation |
|---|
| 75 | where both the relative and the absolute difference are above the specified minima. |
|---|
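The table above is consistent with a relative difference of (far - near) / far. The following Python sketch assumes that formula, assumes the absolute difference is far - near, and reads "above" as strictly greater; the function names are hypothetical and this is not ARB's code.

```python
def relative_difference(dist_a, dist_b):
    """Relative difference between two subtree distances (formula inferred from the table)."""
    near, far = sorted((dist_a, dist_b))
    if far == 0:
        return 0.0  # both subtrees at distance zero: no difference
    return (far - near) / far

def should_mark_farther(dist_a, dist_b, min_rel_diff, min_abs_diff):
    """True if the more distant subtree exceeds both specified minima."""
    near, far = sorted((dist_a, dist_b))
    return (relative_difference(near, far) > min_rel_diff
            and (far - near) > min_abs_diff)
```

For example, subtree distances of 0.25 and 1.0 give a relative difference of 75%, meaning the nearer subtree has 25% of the farther subtree's distance.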
| 76 | |
|---|
| 77 | When a subtree has been marked, all further furcations between that subtree and the root |
|---|
| 78 | of the whole tree will be ignored. |
|---|
| 79 | |
|---|
| 80 | Poorly aligned sequences often result in long branches in the tree. Being able to identify |
|---|
| 81 | those branches quickly helps to find those sequences. |
|---|
| 82 | |
|---|
| 83 | The intended workflow is |
|---|
| 84 | * search long branches |
|---|
| 85 | * check alignment and data and fix any problems |
|---|
| 86 | * recalculate tree parts (see LINK{pa_add.hlp}) |
|---|
| 87 | * search again. Now you may find other branches, nearer to the tree root. |
|---|
| 88 | |
|---|
| 89 | SECTION Mark deep leafs |
|---|
| 90 | |
|---|
| 91 | Marks all leafs in the tree that have |
|---|
| 92 | - depth above min.depth and |
|---|
| 93 | - root-distance above min.root-distance |
|---|
| 94 | |
|---|
| 95 | 'depth' is the number of branches between root and leaf. Multifurcations |
|---|
| 96 | are respected properly. |
|---|
| 97 | |
|---|
| 98 | The 'root-distance' is the sum of the lengths of all branches between the root and |
|---|
| 99 | a leaf. |
|---|
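As a rough illustration of the two criteria, the Python sketch below walks a hypothetical toy tree and collects the leafs that exceed both limits. The tree layout, the function name `deep_leafs`, and the strict "greater than" comparison are assumptions; ARB's special handling of multifurcations is not modelled.

```python
# Hypothetical toy tree: (name, branch length to parent, children).
TREE = ("root", 0.0, [
    ("n1", 0.1, [
        ("A", 0.2, []),
        ("n2", 0.3, [("B", 0.1, []), ("C", 0.5, [])]),
    ]),
    ("D", 0.4, []),
])

def deep_leafs(node, min_depth, min_root_distance, depth=0, dist=0.0):
    """Return names of leafs whose depth and root-distance both exceed the minima."""
    name, _, children = node
    if not children:
        hit = depth > min_depth and dist > min_root_distance
        return [name] if hit else []
    marked = []
    for child in children:
        # One more branch and its length are added on the way down.
        marked += deep_leafs(child, min_depth, min_root_distance,
                             depth + 1, dist + child[1])
    return marked

print(deep_leafs(TREE, 2, 0.6))  # ['C']
```

Leaf B has the same depth as C (3 branches), but its root-distance of 0.5 is not above 0.6, so only C qualifies.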
| 100 | |
|---|
| 101 | SECTION Mark degenerated branches |
|---|
| 102 | |
|---|
| 103 | Branches are considered degenerated when two subtrees of an inner node |
|---|
| 104 | differ in size (=number of members) by a considerable factor. |
|---|
| 105 | |
|---|
| 106 | This function allows you to specify that degeneration factor. |
|---|
| 107 | |
|---|
| 108 | For each degenerated inner node, the smaller subtree will be marked as a whole. |
|---|
| 109 | The unmarked subtree will be examined for further degenerated nodes. |
|---|
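The marking rule can be sketched in Python as follows. This is an illustration under assumptions, not ARB's implementation: the tree is strictly binary, a node counts as degenerated when the larger subtree has at least `factor` times as many leafs as the smaller one, and both subtrees of a non-degenerated node are examined. All names are hypothetical.

```python
# Hypothetical binary tree: a leaf is a string, an inner node is a pair.
TREE = ((((("A", "B"), "C"), "D"), "E"), "F")

def leafs_of(node):
    """Return the leaf names below a node."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leafs_of(left) + leafs_of(right)

def mark_degenerated(node, factor, marked=None):
    """Collect leafs of the smaller subtree at every degenerated inner node."""
    marked = [] if marked is None else marked
    if isinstance(node, str):
        return marked
    small, large = sorted(node, key=lambda sub: len(leafs_of(sub)))
    if len(leafs_of(large)) >= factor * len(leafs_of(small)):
        marked.extend(leafs_of(small))           # mark smaller subtree as a whole
        mark_degenerated(large, factor, marked)  # continue in the unmarked subtree
    else:
        # Assumption: a balanced node is not marked, but both sides are examined.
        mark_degenerated(small, factor, marked)
        mark_degenerated(large, factor, marked)
    return marked

print(mark_degenerated(TREE, 3))  # ['F', 'E', 'D']
```

In this caterpillar-shaped tree the single leafs F, E and D each face a subtree at least three times their size; at the node joining (A, B) with C the ratio is only 2, so nothing further is marked.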
| 110 | |
|---|
| 111 | Common reasons for degenerated trees: |
|---|
| 112 | - successively adding species using the 'quick add marked'-feature of |
|---|
| 113 | ARB parsimony without ever optimizing the whole tree. |
|---|
| 114 | - some "phylogenetic areas" are explored more thoroughly than others, resulting in |
|---|
| 115 | unbalanced representation of the evolution as it took place. |
|---|
| 116 | This is especially relevant if your database contains many clone-variants and you try |
|---|
| 117 | to calculate a tree. |
|---|
| 118 | |
|---|
| 119 | Solutions: |
|---|
| 120 | - Optimize your tree. For big trees you might try to |
|---|
| 121 | - mark questionable species using this function and then |
|---|
| 122 | - perform local/global optimization of marked species in ARB parsimony. |
|---|
| 123 | - Replace over-represented areas by one or few representatives (see also LINK{di_clusters.hlp}). |
|---|
| 124 | Calculate a new or optimize an existing tree with that subset of species. |
|---|
| 125 | Then quick-add previously removed species into that tree. |
|---|
| 126 | |
|---|
| 127 | SECTION Automarking |
|---|
| 128 | |
|---|
| 129 | If the 'Auto mark?'-toggle is checked, changing any of the parameters will |
|---|
| 130 | instantly trigger the execution of the corresponding mark function. |
|---|
| 131 | |
|---|
| 132 | NOTES To compare the information of two or more trees, |
|---|
| 133 | open a new ARB_NT-window for each tree using 'File/New window' and pop up their |
|---|
| 134 | 'Branch analysis'-windows. |
|---|
| 135 | |
|---|
| 136 | EXAMPLES None |
|---|
| 137 | |
|---|
| 138 | WARNINGS None |
|---|
| 139 | |
|---|
| 140 | BUGS No bugs known |
|---|