chimera_check.hlp

Last change on this file was 12267, checked in by westram, 10 years ago
full update from 'trunk' into 'rc' because i was (nearly) only fixing bugs in the last 5 weeks adds: log:branches/fix@11985,11990:11997,12008:12013,12023:12042,12049:12053,12058:12137,12147:12156 log:branches/gcc@12076:12079 log:branches/tree@12004:12019,12116:12120 log:trunk@11971:11972,11974:11980,11982:11989,11991:12260
Property svn:eol-style set to `native` Property svn:keywords set to `Author Date Id Revision`
File size: 3.1 KB

Line
1	#Please insert up references in the next lines (line starts with keyword UP)
2	UP arb.hlp
3	UP glossary.hlp
4
5	#Please insert subtopic references (line starts with keyword SUB)
6	#SUB subtopic.hlp
7
8	# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}
9
10	#*********** Title of helpfile !! and start of real helpfile ******
11	TITLE Chimera check
12
13	OCCURRENCE ARB_NT/Sequence/Chimera check
14
15	DESCRIPTION Takes sequences, a tree and a column statistic as input,
16	and generates a short sequence quality output string, which
17	will be stored into the database under a user defined key.
18
19	First the sequences are split into different slices:
20
21	- 2 pieces (front and back half)
22	- 5 equally sized pieces
23	- user defined pieces
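
The three slicing schemes above can be sketched as follows. This is only an illustration, not ARB's implementation: it assumes that slices are contiguous column ranges of near-equal width, and the names `slice_bounds` and `slice_sequence` are hypothetical. How ARB distributes leftover columns is not stated in this help text.

```python
from typing import Dict, List, Tuple


def slice_bounds(length: int, pieces: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges for near-equal slices.

    Assumption: integer division decides where each slice ends, so slice
    widths differ by at most one column.
    """
    if pieces < 1 or length < pieces:
        raise ValueError("need 1 <= pieces <= length")
    return [(i * length // pieces, (i + 1) * length // pieces)
            for i in range(pieces)]


def slice_sequence(seq: str, user_pieces: int) -> Dict[str, List[str]]:
    """Cut one sequence into the 2-piece, 5-piece and user-defined slicings.

    The keys 'a', 'b' and 'c' mirror the prefixes of the quality string.
    """
    return {
        key: [seq[start:end] for start, end in slice_bounds(len(seq), n)]
        for key, n in (("a", 2), ("b", 5), ("c", user_pieces))
    }
```

For example, `slice_sequence("ACGTACGTAC", 3)["c"]` yields the three slices `ACG`, `TAC` and `GTAC`.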
24
25	The program sums up the weighted mutations for each sequence slice
26	using a maximum likelihood technique.
27
28	For each slice a Student's t-test (see LINK{http://en.wikipedia.org/wiki/T-test}) is
29	performed and its result is written into the XXX portions of the entries described below.
30	The t-test tests whether the likelihood of a specific sequence slice (of one species) follows
31	a t-distribution of the likelihoods for that sequence slice in all examined species.
32
33	Each X encodes the result of the t-test (the "t-value") as follows:
34	* if the t-test succeeds, X is a value from '1' to '8' (where '5' is shown as '-')
35	* if the t-test fails, '0' or '9' is written to X
36	* if there is not enough data to perform the t-test, '.' is written to X
37
38	Rule of thumb: values near '0' or '9' indicate regions with an abnormal number of
39	weighted mutations; values near '5' ('-') indicate regions with a normal (i.e. expected) number.
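
The symbol rules above can be read back programmatically. The following is a minimal sketch for interpreting a single X character. The function name `decode_symbol` and the status strings are hypothetical, and the way ARB derives the digit from the t-test is not described here, so the sketch does not attempt it.

```python
from typing import Optional, Tuple


def decode_symbol(ch: str) -> Tuple[Optional[int], str]:
    """Interpret one X character of the quality string.

    Returns (value, status): value is 0..9, or None when there was not
    enough data; status is 'passed', 'failed' or 'not enough data'.
    """
    if len(ch) != 1 or ch not in "0123456789-.":
        raise ValueError("unexpected quality symbol: %r" % ch)
    if ch == ".":
        return None, "not enough data"
    value = 5 if ch == "-" else int(ch)  # '-' is the display form of '5'
    status = "failed" if value in (0, 9) else "passed"
    return value, status
```

Applied to a whole slice string, `[decode_symbol(c) for c in "-4.9"]` gives one (value, status) pair per slice.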
40
41	The sequence quality string written into a user-definable species
42	field has the following format:
43
44	MED SUM aXX bXXXXX cXXXXX...XXXX
45
46	where:
47	* MED is the median of all t-values (0.0 = normal; <5.0 = succeeds t-test (mean); >5.0 = abnormal)
48	* SUM is the sum of all t-values
49	* aXX shows the quality for 2 pieces
50	* bXXXXX shows the quality for 5 pieces
51	* cXXXXX...XXXX shows the quality for user defined slices
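
A field in this format can be taken apart with a few lines of code. The sketch below assumes that the five parts are separated by whitespace and that MED and SUM are written as plain decimal numbers, neither of which is confirmed by this help text. The names `Quality` and `parse_quality` and the sample value in the usage note are made up for illustration.

```python
from typing import NamedTuple


class Quality(NamedTuple):
    med: float    # MED: median of all t-values
    total: float  # SUM: sum of all t-values
    halves: str   # the two symbols after 'a'
    fifths: str   # the five symbols after 'b'
    user: str     # the symbols after 'c' (user-defined slices)


def parse_quality(field: str) -> Quality:
    """Split a 'MED SUM aXX bXXXXX cXXX...' field into its parts."""
    parts = field.split()
    if len(parts) != 5:
        raise ValueError("expected 5 whitespace-separated parts")
    med, total, a, b, c = parts
    for tag, token, width in (("a", a, 2), ("b", b, 5), ("c", c, None)):
        symbols = token[1:]
        if not token.startswith(tag) or not symbols:
            raise ValueError("malformed %r part: %r" % (tag, token))
        if width is not None and len(symbols) != width:
            raise ValueError("%r part needs %d symbols" % (tag, width))
        if set(symbols) - set("0123456789-."):
            raise ValueError("unexpected symbol in %r" % token)
    return Quality(float(med), float(total), a[1:], b[1:], c[1:])
```

With the invented value `"4.5 36 a-4 b-46-3 c-4.56-09"`, `parse_quality(...).user` is `"-4.56-09"`, i.e. eight user-defined slices.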
52
53	Optionally a 'quality' entry may be written to the alignment, so that
54	it can be displayed in EDIT4 below the sequence. That quality entry is simply
55	a "blown up" version of the "cXXXXX...XXXX" part of the sequence quality
56	field.
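
One plausible reading of "blown up" is that each slice symbol is repeated across the alignment columns its slice covers. The sketch below works under that assumption, with slices of near-equal width. The name `blow_up` is hypothetical and ARB's actual column mapping may differ.

```python
def blow_up(symbols: str, alignment_length: int) -> str:
    """Stretch one symbol per slice to one symbol per alignment column.

    Assumption: slice i covers the half-open column range
    [i * L // n, (i + 1) * L // n) for n slices and alignment length L.
    """
    n = len(symbols)
    if n == 0 or alignment_length < n:
        raise ValueError("need 1 <= number of symbols <= alignment length")
    out = []
    for i, ch in enumerate(symbols):
        start = i * alignment_length // n
        end = (i + 1) * alignment_length // n
        out.append(ch * (end - start))  # repeat the symbol across its slice
    return "".join(out)
```

For instance, `blow_up("-49", 9)` returns `"---444999"`, a string that lines up column by column with a 9-column alignment.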
57
58	NOTES Only sequences which are in the tree are used.
59
60	Slices in high variance regions more easily pass the t-test.
61
62	Slices from sequences with higher overall variance more easily pass the t-test.
63
64	WARNINGS Needs a very large amount of computer memory!
65
66	BUGS Does not delete the destination field of species not in
67	the tree.
