max_freq.hlp

Last change on this file was 19532, checked in by westram, 4 months ago
reintegrates 'help' into 'trunk' tweak arb documentation: automatically link ticket references to arb tracker (only affects html version). found URLs. page titles warn about long titles. introduce `SUBTITLE`s (automatically triggered by multi-line titles in source files). increase allowed length (limited by subwindow width). cleanup header sections in all helpfiles. fix and/or update several help files. document syntax of help sources. build issues: when xml validation fails, next build no longer uses invalid xml ⇒ keeps failing. remove output files on error (including files below `ARBHOME/lib`). pipe output through logs to ensure proper wrapping in `Entering/Leaving` lines. moves `Tree admin` + `NDS` menu entries to top of menu adds: log:branches/help@18783:19531
Property svn:eol-style set to `native` Property svn:keywords set to `Author Date Id Revision`
File size: 2.5 KB

1	# main topics:
2	UP arb.hlp
3	UP glossary.hlp
4
5	# sub topics:
6	#SUB subtopic.hlp
7
8	# format described in ../help.readme
9
10
11	TITLE Maximum base frequency
12
13	Calculate the Percentage of the Most Frequent Base
14
15	OCCURRENCE ARB_NT/SAI/create SAI/Max Frequency
16
17	DESCRIPTION Finds the most frequent base (or gap) in each column for all marked
18	species. The number of sequences with this base is then
19	divided by:
20
21	* the number of all marked sequences (if not ignoring gaps)
22	* the number of bases in this column (if ignoring gaps)
23
24	The tens digit (i.e. the second last digit) of the resulting
25	percentage is then taken:
26
27	0% - 19% -> '1' (does not occur for nucleotides)
28	20% - 29% -> '2'
29	30% - 39% -> '3'
30	...
31	90% - 99% -> '9'
32	100% -> '0'
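
The table above can be sketched as a small function. This is only an illustration in Python, not the ARB implementation; the name `tens_digit` is hypothetical, and the input is assumed to be a percentage that has already been reduced to an integer between 0 and 100.

```python
def tens_digit(percent: int) -> str:
    """Map an integer percentage (0..100) to the character from the table.

    Hypothetical helper for illustration; not taken from the ARB sources.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent < 20:
        # The table maps the whole range 0% - 19% to '1'.
        return "1"
    # Second last digit of the percentage; 100 wraps around to '0'.
    return str((percent // 10) % 10)
```

For instance, `tens_digit(64)` gives `'6'` and `tens_digit(100)` gives `'0'`.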
33
34
35	NOTE The result can be used as a conservation profile and filter.
36	Rule of thumb:
37	the higher the number, the more conserved the position (but mind the '0' which means 100%!).
38
39	Internally the SAI consists of two lines: the main line called 'data' and a second line called 'dat2'.
40
41	The first is used when you use the SAI as conservation profile or filter and
42	contains the SECOND LAST digit of the calculated frequencies.
43
44	The second contains the LAST digit of the calculated frequencies. It is not used and only
45	shows up when you load the SAI into ARB_EDIT4, which displays both lines.
46
47	EXAMPLES Say one column contains 7 A's, 4 G's and 5 gaps.
48
49	* ignoring gaps will result in 7/11 == 64% (rounded), which is converted to '6'.
50	* otherwise we get 7/16 == 44% (rounded), which will be indicated by a '4' in the target sequence.
51
52	SECTION Gaps
53
54	If gaps are ignored, '-' are treated like '.': both get removed and frequency is calculated on non-gaps only.
55
56	If gaps are NOT ignored, '-' are treated like non-gaps, i.e. a column containing only '-' will be assigned a
57	max. frequency of 100%. '.' are treated as gaps.
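
The two gap modes can be sketched as follows. This is a rough Python illustration under stated assumptions, not the ARB code: the function name `max_frequency` is hypothetical, '.' is assumed to be dropped from both numerator and denominator in both modes, an empty column is assumed to yield 0, and ambiguity codes are counted as ordinary symbols here.

```python
from collections import Counter


def max_frequency(column: str, ignore_gaps: bool) -> float:
    """Return the percentage of the most frequent symbol in one column.

    Hypothetical sketch; `column` holds one character per marked species.
    """
    # Assumption: '.' never counts; '-' is dropped only when ignoring gaps.
    dropped = "-." if ignore_gaps else "."
    counted = [c for c in column.upper() if c not in dropped]
    if not counted:
        return 0.0  # assumption: nothing left to count
    top_count = Counter(counted).most_common(1)[0][1]
    return 100.0 * top_count / len(counted)


column = "A" * 7 + "G" * 4 + "-" * 5
print(round(max_frequency(column, ignore_gaps=True)))   # 7/11
print(round(max_frequency(column, ignore_gaps=False)))  # 7/16
```

With the column from the EXAMPLES section this reproduces the 64% and 44% given there, and a column containing only '-' yields 100% when gaps are not ignored.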
58
59	SECTION Ambiguities
60
61	Ambiguities are counted proportionally, i.e.
62
63	* an 'N' counts as 1/4 'A', 1/4 'C', 1/4 'G' and 1/4 'T'
64	* a 'D' counts as 1/3 'A', 1/3 'G' and 1/3 'T'
65	* a 'Y' counts as 1/2 'C' and 1/2 'T'
66
67	Example:
68
69	A column containing 9 'C' and one 'Y' results in a max. frequency of 95% (=9.5 'C').
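
Proportional counting can be sketched like this. The snippet is an illustrative Python sketch, not the ARB implementation; the names `IUPAC` and `weighted_base_counts` are hypothetical, the table uses the standard IUPAC nucleotide codes for DNA, and gaps or unknown symbols are simply skipped here.

```python
from collections import defaultdict
from fractions import Fraction

# Standard IUPAC nucleotide codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def weighted_base_counts(column: str) -> dict:
    """Count bases in a column, splitting each ambiguity code evenly."""
    counts = defaultdict(Fraction)
    for symbol in column.upper():
        bases = IUPAC.get(symbol)
        if bases is None:
            continue  # assumption: gaps and unknown symbols are skipped
        for base in bases:
            counts[base] += Fraction(1, len(bases))
    return dict(counts)


counts = weighted_base_counts("C" * 9 + "Y")
print(counts["C"])                      # 9 'C' plus half of the 'Y'
print(float(max(counts.values()) / 10 * 100))
```

For the column from the example above, 'C' ends up with a count of 19/2, which is 95% of the 10 sequences.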
70
71	BUGS No bugs known yet
