source: tags/arb-6.0.5/HELP_SOURCE/oldhelp/chimera_check.hlp

Last change on this file was 12267, checked in by westram, 10 years ago
#Please insert up references in the next lines (line starts with keyword UP)
UP      arb.hlp
UP      glossary.hlp

#Please insert subtopic references  (line starts with keyword SUB)
#SUB    subtopic.hlp

# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}

#************* Title of helpfile !! and start of real helpfile ********
TITLE           Chimera check

OCCURRENCE      ARB_NT/Sequence/Chimera check

DESCRIPTION     Takes sequences, a tree and a column statistic as input,
                and generates a short sequence quality output string, which
                will be stored into the database under a user defined key.

                First the sequences are split into different slices:

                        - 2 pieces (front and back half)
                        - 5 equally sized pieces
                        - user defined pieces
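
                The slicing step can be pictured with the following minimal
                Python sketch. It is not taken from ARB: the function names
                are hypothetical, and it assumes that "user defined pieces"
                means a user-chosen number of roughly equal pieces.

```python
def split_into_slices(sequence, pieces):
    """Split `sequence` into `pieces` contiguous slices of near-equal length."""
    if pieces < 1:
        raise ValueError("pieces must be at least 1")
    n = len(sequence)
    # Integer boundaries; slice lengths differ by at most one position.
    bounds = [n * i // pieces for i in range(pieces + 1)]
    return [sequence[bounds[i]:bounds[i + 1]] for i in range(pieces)]


def slice_sets(sequence, user_pieces):
    """Return the three slice sets; keys mirror the a/b/c prefixes of the output.

    Assumption: `user_pieces` is a plain piece count chosen by the user.
    """
    return {
        "a": split_into_slices(sequence, 2),            # front and back half
        "b": split_into_slices(sequence, 5),            # 5 equally sized pieces
        "c": split_into_slices(sequence, user_pieces),  # user-defined pieces
    }
```

                For example, split_into_slices("ACGUACGUAC", 2) yields the
                two halves "ACGUA" and "CGUAC".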

                The program sums up the weighted mutations for each sequence slice
                using a maximum likelihood technique.

                For each slice a Student's t-test (see LINK{http://en.wikipedia.org/wiki/T-test}) is
                performed and its result is written into the XXX portions of the entries mentioned below.
                The t-test checks whether the likelihood of a specific sequence slice (of one species) follows
                the t-distribution of the likelihoods for that sequence slice in all examined species.

                Each X encodes the result of the t-test (the "t-value") as follows:
                    * if the t-test succeeds, the value of X is '1' up to '8' (where '5' is shown as '-').
                    * if the t-test fails, '0' or '9' is written to the X's
                    * if there is not enough data to perform the t-test, '.' is written to the X's

                Rule of thumb: Values near '0' or '9' indicate regions with an abnormal number of
                weighted mutations; values near '5' ('-') indicate regions with a normal (i.e. expected) number.
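
                The symbol rules above can be summarised in a small Python
                sketch. The function names and the returned labels are
                hypothetical; only the symbol meanings come from this help
                text.

```python
def symbol_value(symbol):
    """Return the t-value digit encoded by one X symbol, or None for '.'."""
    if len(symbol) != 1:
        raise ValueError("expected a single character")
    if symbol == ".":
        return None          # not enough data for the t-test
    if symbol == "-":
        return 5             # '5' is displayed as '-'
    if symbol in "012346789":
        return int(symbol)
    raise ValueError(f"unexpected quality symbol: {symbol!r}")


def classify_symbol(symbol):
    """Map one X symbol to 'passed', 'failed' or 'no data'."""
    value = symbol_value(symbol)
    if value is None:
        return "no data"
    if value in (0, 9):
        return "failed"      # t-test failed: abnormal number of weighted mutations
    return "passed"          # values 1..8: t-test succeeded
```

                With these rules, the string "-9." would read as one normal
                slice, one abnormal slice and one slice without enough data.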

                The sequence quality string written into a user-definable species
                field has the following format:

                    MED SUM aXX bXXXXX cXXXXX...XXXX

                    where:
                        * MED is the median of all t-values (0.0 = normal; <5.0 = succeeds t-test (mean); >5.0 = abnormal)
                        * SUM is the sum of all t-values
                        * aXX shows the quality for 2 pieces
                        * bXXXXX shows the quality for 5 pieces
                        * cXXXXX...XXXX shows the quality for user defined slices
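
                A script that reads the field back could take the string
                apart roughly as sketched below. This is an illustration, not
                ARB code: it assumes the five parts are separated by
                whitespace and that MED and SUM are written as plain numbers.
                The class name, function name and the sample string are made
                up.

```python
from dataclasses import dataclass


@dataclass
class QualityString:
    median: float       # MED
    total: float        # SUM
    halves: str         # the XX after 'a'
    fifths: str         # the XXXXX after 'b'
    user_slices: str    # the X's after 'c'


def parse_quality(text):
    """Parse 'MED SUM aXX bXXXXX cXXXXX...XXXX' (whitespace-separated, assumed)."""
    parts = text.split()
    if len(parts) != 5:
        raise ValueError("expected 5 whitespace-separated parts")
    med, total, a, b, c = parts
    if not (a.startswith("a") and b.startswith("b") and c.startswith("c")):
        raise ValueError("slice groups must start with 'a', 'b' and 'c'")
    if len(a) != 3 or len(b) != 6:
        raise ValueError("'a' carries 2 symbols and 'b' carries 5 symbols")
    return QualityString(float(med), float(total), a[1:], b[1:], c[1:])


# Made-up sample value, for illustration only.
quality = parse_quality("4.5 31 a-4 b-3-90 c--4.-2-9")
```

                Here quality.fifths is "-3-90", i.e. the last two of the five
                pieces carry the '9' and '0' symbols that mark a failed
                t-test.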

                Optionally a 'quality' entry may be written to the alignment so that
                it can be displayed in EDIT4 below the sequence. That quality entry is
                simply a "blown up" version of the "cXXXXX...XXXX" part of the sequence
                quality field.
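
                One way to picture the "blown up" entry is the Python sketch
                below. It assumes that each slice symbol is simply repeated
                over the alignment columns of its slice and that the slices
                are of near-equal length; the help text does not spell this
                out, and the function name is hypothetical.

```python
def blow_up(slice_symbols, alignment_length):
    """Stretch one symbol per slice to one symbol per alignment column."""
    pieces = len(slice_symbols)
    if pieces == 0:
        raise ValueError("need at least one slice symbol")
    # Same near-equal boundaries as an even split of the alignment (assumed).
    bounds = [alignment_length * i // pieces for i in range(pieces + 1)]
    return "".join(
        symbol * (bounds[i + 1] - bounds[i])
        for i, symbol in enumerate(slice_symbols)
    )
```

                The result has exactly alignment_length characters, so it can
                be shown column by column below the sequence.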

NOTES           Only sequences that are in the tree are used.

                Slices in high-variance regions pass the t-test more easily.

                Slices from sequences with higher overall variance pass the t-test more easily.

WARNINGS        Needs a lot of computer memory!

BUGS            Does not delete the destination field of species that are not in
                the tree.