source: trunk/HELP_SOURCE/source/pa_value.hlp

Last change on this file was 18769, checked in by westram, 3 years ago
  • move all helpfiles to new source location
#Please insert up references in the next lines (line starts with keyword UP)
UP      arb.hlp
UP      arb_pars.hlp
UP      glossary.hlp

#Please insert subtopic references  (line starts with keyword SUB)
#SUB    subtopic.hlp

# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}

#************* Title of helpfile !! and start of real helpfile ********
TITLE           Parsimony value

OCCURRENCE      displayed in the top area of the ARB_PARSIMONY main window

DESCRIPTION     The parsimony value indicates the
                quality of a tree's topology.

                Basically, it counts the minimum number of base mutations that
                must have occurred
                if we assume that the current topology represents the path evolution took.
                Therefore, smaller values indicate better topologies.

                Several parameters influence the absolute parsimony value:

                - if you specify a filter (in the parsimony startup window), only mutations
                  in the remaining, unfiltered alignment columns are counted, i.e. filtering
                  will normally lower the resulting parsimony value.
                - if you specify a weighting mask (in the parsimony startup window), higher
                  weighted sites count more strongly and raise the absolute parsimony value.
                - adding more species to a tree will normally raise the number of mutations,
                  i.e. a tree with many species has a higher parsimony value than a tree
                  with fewer species (see also LINK{pa_reset.hlp}).

                If you compare parsimony values of different topologies, you need to use
                the same alignment, the same filter, the same weighting mask and the
                same set of species.
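
                The influence of filter, weighting mask and species set can be
                illustrated with a small Python sketch. It uses the classical Fitch
                counting scheme on a tiny rooted binary tree. This is an assumption
                made for illustration only and is not ARB_PARSIMONY's actual code;
                the function names, the tuple encoding of the tree and the toy
                sequences are all hypothetical.

```python
def fitch_column(tree, column, seqs):
    """Return (state_set, mutations) for one alignment column (Fitch counting)."""
    if isinstance(tree, str):                 # leaf: species name
        return {seqs[tree][column]}, 0
    left, right = tree
    lset, lmut = fitch_column(left, column, seqs)
    rset, rmut = fitch_column(right, column, seqs)
    common = lset & rset
    if common:                                # children agree: no extra mutation
        return common, lmut + rmut
    return lset | rset, lmut + rmut + 1       # children disagree: one mutation


def parsimony_value(tree, seqs, column_filter=None, weights=None):
    """Sum weighted mutation counts over all columns that pass the filter."""
    length = len(next(iter(seqs.values())))
    total = 0
    for col in range(length):
        if column_filter is not None and not column_filter[col]:
            continue                          # filtered column: not counted
        weight = 1 if weights is None else weights[col]
        total += weight * fitch_column(tree, col, seqs)[1]
    return total


# Hypothetical toy data: four species, five alignment columns.
seqs = {"A": "ACGTA", "B": "ACTTC", "C": "ATGAC", "D": "GTGAC"}
tree = (("A", "B"), ("C", "D"))

plain = parsimony_value(tree, seqs)
filtered = parsimony_value(tree, seqs, column_filter=[1, 1, 1, 1, 0])
weighted = parsimony_value(tree, seqs, weights=[1, 3, 1, 1, 1])
fewer = parsimony_value(("A", ("C", "D")), {k: seqs[k] for k in "ACD"})
print(plain, filtered, weighted, fewer)
```

                In this toy data set the filtered value and the value for fewer
                species are lower than the plain value, and the weighted value is
                higher. This is why values are only comparable when all four
                settings are identical.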

SECTION         Dots

                ARB uses dots ('.') as a special gap type.
                The meaning of a dot is "might be a gap or a nucleotide/aa".
                It indicates the lack of any information about the sequence data at the
                position where it is used.

                In contrast, a normal gap ('-') clearly states that it is KNOWN that the
                sequence does NOT CONTAIN any bases at the positions of the gaps - the gaps have
                only been inserted for alignment purposes.

                And, in contrast to a gap, an 'N' (or 'X' for amino acid sequences) clearly states
                that it is KNOWN that the sequence CONTAINS some nucleotide/aa at that position.

                In ARB databases you should use dots at both ends of the alignment.
                Doing so means: you know that the sequence continues in both directions - it just
                has not been sequenced completely.

                You may also use dots in the middle of the alignment whenever you have
                strong indications that a gap might in fact be a sequencing error.

SECTION         Mutations against dots

                When the parsimony value is calculated, dots do not cause mutations.
                They match any base, gap or other dot.

                This means that dots at both sequence ends compensate for some of the
                negative effects that are normally caused by using sequences of different
                lengths (e.g. clustering of LINK{partial_sequences.hlp}).
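
                One way to picture this rule is to treat every alignment character
                as a set of possible states and to count a mutation only when two
                sets have nothing in common. The Python sketch below does this for
                a pair of nucleotide sequences. The helper names and the pairwise
                comparison are assumptions made for illustration and are not taken
                from the ARB sources; in particular, scoring 'N' against a gap as a
                mismatch is an assumption derived from the definitions above.

```python
BASES = set("ACGT")


def states(char):
    """Map an alignment character to the set of states it may stand for."""
    if char == ".":
        return BASES | {"-"}      # dot: unknown, may be any base or a gap
    if char == "N":
        return set(BASES)         # N: some base is known to be present (assumed scoring)
    return {char}                 # plain base or normal gap '-'


def mismatches(seq_a, seq_b):
    """Count columns where the two sequences cannot share a state."""
    return sum(1 for a, b in zip(seq_a, seq_b) if not (states(a) & states(b)))


full = "ACGTACGT"
partial_dots = "..GTAC.."   # unsequenced ends marked with dots
partial_gaps = "--GTAC--"   # same data, ends marked as normal gaps
print(mismatches(full, partial_dots), mismatches(full, partial_gaps))
```

                With dots at the ends the partial sequence causes no mismatches
                against the full sequence, while the same sequence padded with
                normal gaps is penalized at every missing position.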


SECTION         Differences between sequence types

                For nucleotide sequences:

                    Mutations are simply counted for single nucleotides.

                For amino acid sequences:

                    Mutations are determined on an amino acid basis. This differs
                    from what would be done when using the corresponding
                    DNA alignment:

                    - in DNA, several different codons (combinations of 3 nucleotides)
                      may represent the same amino acid. Therefore a mutation may
                      be counted for DNA where no mutation is counted for AA.
                    - the parsimony value for amino acid alignments does not count
                      the number of amino acid mutations. It counts the minimum
                      number of nucleotide(!) mutations needed to mutate from one
                      amino acid to another, while assuming that there is no
                      selection pressure when mutating a codon into another codon that
                      translates into the same amino acid (see also EXAMPLES below).

                    ARB generally uses the "Standard code" to calculate the
                    mutations between different amino acids when determining
                    the parsimony value.
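
                    The minimum number of nucleotide mutations between two amino
                    acids can be derived from the standard genetic code, as the
                    following Python sketch shows. It is a plain reimplementation of
                    the rule described above, not code from ARB; the names
                    CODON_TABLE, codons_for and min_nucleotide_mutations are
                    hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid):
    """All codons that translate into the given amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def min_nucleotide_mutations(aa_from, aa_to):
    """Smallest number of base changes between any codon pair of the two amino acids."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(aa_from)
        for c2 in codons_for(aa_to)
    )


for pair in ("FQ", "FL", "LQ", "LL"):
    print(pair, min_nucleotide_mutations(*pair))
```

                    Taking the minimum over all codon pairs is what makes changes
                    between synonymous codons free: an amino acid compared with
                    itself always has distance 0.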

NOTES           The parsimony value is also used to LINK{pa_branchlengths.hlp}.

EXAMPLES        for an amino acid mutation

                Imagine an alignment position P and three species, where

                 - species F has an 'F' (Phenylalanine) at position P,
                 - species Q has a 'Q' (Glutamine) at position P and
                 - species L has an 'L' (Leucine) at position P.

                These amino acids may be represented by the following codons:

                 - F = TTT | TTC
                 - Q = CAA | CAG
                 - L = TTA | TTG | CTN

                Based on the minimum codon distances, the mutation costs used
                in ARB_PARSIMONY are:

                 - F -> Q = 3 mutations
                 - F -> L = 1 mutation (e.g. TTT -> TTA)
                 - L -> Q = 1 mutation (e.g. CTA -> CAA)

                This results in the following parsimony values for the
                possible subtree-rearrangements (R=Rest of whole tree):

                         R
                         |
                         |
                         F              pars value = 4
                        / \
                       /   \
                      Q     L


                         R
                         |
                         |
                         Q              pars value = 4
                        / \
                       /   \
                      F     L


                         R
                         |
                         |
                         L              pars value = 2 (!)
                        / \
                       /   \
                      Q     F


                Assuming the third topology (which is the "best" according to the parsimony
                value) means assuming that the ancestor of Q and F had an L at position P.
                As no selection pressure is assumed for mutating that 'L' codon (e.g.
                from 'TTA' into 'CTA'), no mutation penalty is counted for that change when
                calculating the parsimony value.
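
                The three values can be checked with a few lines of Python. The
                sketch simply sums the mutation costs listed above from the assumed
                ancestor to the two other species; the names COST and pars_value
                are hypothetical and the contribution of the rest of the tree (R)
                is ignored.

```python
# Mutation costs as stated in the example above.
COST = {frozenset("FQ"): 3, frozenset("FL"): 1, frozenset("LQ"): 1}


def pars_value(ancestor, children):
    """Sum of mutation costs from the assumed ancestor state to both children."""
    return sum(COST[frozenset((ancestor, child))] for child in children)


values = {anc: pars_value(anc, [s for s in "FQL" if s != anc]) for anc in "FQL"}
print(values)
```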

WARNINGS        None

BUGS            No bugs known