| 19 | | NOTES The branch lengths reflect the significance of edges rather than |
| 20 | | the number of changed residues. |
| 21 | | |
| | 18 | |
| | 19 | SECTION Inner branches |
| | 20 | |
| | 21 | To calculate the lengths of non-terminal branches, branch swapping |
| | 22 | is used on them. |
| | 23 | |
| | 24 | Branch swapping (aka NNI = Nearest Neighbor Interchange) is the smallest |
| | 25 | topology change possible at an inner branch and, as such, has an effect on |
| | 26 | the overall costs of the tree. |
| | 27 | |
| | 28 | That effect is used as the branch length of the inner branch. |
| | 29 | |
| | 30 | The branch length reflects the significance of the branch, i.e. |
| | 31 | |
| | 32 | - the exact topology around SHORT inner branches has little influence |
| | 33 | on the overall tree costs, i.e. the calculated topology most |
| | 34 | likely does NOT reflect the "real phylogenetic topology". |
| | 35 | |
| | 36 | - conversely, the exact topology around LONG inner branches has a big influence |
| | 37 | on the overall tree costs, i.e. the calculated topology most |
| | 38 | likely does reflect the "real phylogenetic topology". |
| | 39 | |
| | 40 | SECTION Terminal branches |
| | 41 | |
| | 42 | For terminal branches ARB_PARSIMONY checks by how much the overall |
| | 43 | tree costs change when the species is added to the tree. That price |
| | 44 | is divided by the base-count of the species. |
| | 45 | |
| | 46 | For example: |
| | 47 | - if the species has an identical relative in the tree and is added |
| | 48 | as neighbor of that relative, the resulting branch length will be zero. |
| | 49 | - if adding the species increases the tree costs by 50 and the species |
| | 50 | contains 100 bases, the resulting branch length will be 0.5. |
| | 51 | |
| | 52 | This reflects quite accurately the proportion of residues changed |
| | 53 | against the rest of the tree. |
| | 54 | |
| | 55 | SECTION Partial sequences |
| | 56 | |
| | 57 | If you add species with partial sequences as full-length species, they will |
| | 58 | group together in distant subtrees. |
| | 59 | |
| | 60 | But if species are flagged to contain "partial sequences" (this is done by |
| | 61 | "Add marked partial species"), they are handled differently: |
| | 62 | |
| | 63 | - Each partial species corresponds to one non-partial (full) species. |
| | 64 | - The partial species is always inserted "below" the corresponding full species. |
| | 65 | Multiple partials may correspond to the same full species. |
| | 66 | - By adding that partial sequence to the tree, the tree costs only rise |
| | 67 | by the weighted mismatches in the region that overlaps (in contrast, non-partial |
| | 68 | sequences would also count the missing part as "gap insertions", i.e. |
| | 69 | the costs for adding a sequence as "partial" are MUCH cheaper). |
| | 70 | |
| | 71 | Species with partial sequences have the field "ARB_partial" set to 1. |
| | 72 | |
| | 73 | SECTION Used terms |
| | 74 | |
| | 75 | - overall tree costs: minimum number of mutations in the tree |
| | 76 | - base-count: number of bases, excluding filtered positions; affected by specified weights. |
| | 77 | |
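
The terminal-branch arithmetic described in "SECTION Terminal branches" can be sketched as below. This is only an illustration under stated assumptions: the function name `terminal_branch_length` and its parameters are hypothetical and are not taken from the ARB_PARSIMONY source, and the base-count is assumed to be already filtered and weighted.

```python
def terminal_branch_length(cost_increase: float, base_count: float) -> float:
    """Hypothetical helper (not ARB code): tree-cost increase caused by
    adding a species, divided by that species' base-count."""
    if base_count <= 0:
        raise ValueError("base_count must be positive")
    return cost_increase / base_count


# An identical relative is already in the tree: adding the species costs nothing.
identical = terminal_branch_length(0, 100)

# Adding the species raises the tree costs by 50 and it contains 100 bases.
half = terminal_branch_length(50, 100)

print(identical, half)
```

With the two cases from the help text, the sketch yields a branch length of zero for the identical relative and 0.5 for the species that adds 50 to the tree costs.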