awt_csp.hlp

Last change on this file was 6142, checked in by westram, 15 years ago
backport [6141] (parts not affecting code at all, i.e. helpfiles, figs, ..)
Property svn:eol-style set to `native` Property svn:keywords set to `Author Date Id Revision`
File size: 2.5 KB

#Please insert up references in the next lines (line starts with keyword UP)
UP              arb.hlp
UP              glossary.hlp

#Please insert subtopic references (line starts with keyword SUB)
SUB             pos_var_pars.hlp

# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}

#*********** Title of helpfile !! and start of real helpfile strunk******
TITLE           Estimate Parameters from Column Statistics

OCCURRENCE      ARB_DIST

DESCRIPTION     In a standard RNA, base frequencies are not equally
                distributed. Especially in the archaea subclass we find
                extremely G+C-rich sequences.
                This resulted in a number of new rate corrections, algorithms
                and programs which:

                        - calculate the average G+C content of all/two sequences
                        - correct the distance.

                Further research showed that the G+C frequencies are
                not equally distributed within a sequence either. In particular,
                helical parts have a significantly higher G+C content than
                non-helical parts.
                One straightforward algorithm would calculate each frequency
                independently for each column.
                For small datasets in particular, the resulting frequencies
                would look like random data, because too few examples are
                analyzed.
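                The independent per-column calculation can be sketched as
                follows. This is only an illustration in Python, not ARB's
                code; the function name and the toy alignment are made up
                for the example.

```python
from collections import Counter

def column_gc_frequencies(alignment):
    """Return the G+C fraction of each alignment column.

    Gaps and ambiguity codes are ignored. A column without any
    A/C/G/U/T residue yields None.
    """
    freqs = []
    for column in zip(*alignment):
        counts = Counter(base.upper() for base in column)
        gc = counts["G"] + counts["C"]
        total = gc + counts["A"] + counts["U"] + counts["T"]
        freqs.append(gc / total if total else None)
    return freqs

# Hypothetical toy alignment: three sequences, five columns.
alignment = ["GCAU-", "GCGU-", "GAAUC"]
freqs = column_gc_frequencies(alignment)
```

                The last column contains a single residue, so its estimate
                is 1.0 from one observation. This is the kind of noisy value
                that makes purely independent estimates unreliable on small
                datasets.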

                In ARB we implemented a combination of the two approaches.
                Let's say we want to estimate a parameter 'P' with
                a maximum variance 'maxvar'; for that we need a minimum
                number of samples 'minsap'.
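                How 'minsap' follows from 'maxvar' is not spelled out here.
                As a rough sketch, assume 'P' is a frequency estimated from
                n independent observations, so that its variance is
                P*(1-P)/n. This binomial model is an assumption made for
                illustration, not necessarily what ARB uses.

```python
import math

def min_samples(p, maxvar):
    """Smallest n with p * (1 - p) / n <= maxvar.

    Assumes a binomial proportion; this model is an illustrative
    assumption and is not taken from ARB.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if maxvar <= 0.0:
        raise ValueError("maxvar must be positive")
    return max(1, math.ceil(p * (1.0 - p) / maxvar))

# Worst case p = 0.5 with a variance limit of 1/64 needs 16 samples.
minsap = min_samples(0.5, 1 / 64)
```

                Under this model a tighter variance limit raises the number
                of samples needed, which is why single columns of a small
                dataset rarely qualify on their own.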

                - All sequence positions are clustered according to

                        - helical/non-helical region
                        - variability

                  The size of the cluster is chosen with respect
                  to the variability of the sequences to get a
                  minimum number of independent events.

                - The final parameter estimate for a column is a
                  weighted sum of the estimate for the
                  cluster and the estimate for the single position.
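                The clustering step could look roughly like the sketch
                below. It is a simplification: the input layout, the
                function name and the fixed number of variability bins are
                assumptions for the example, whereas ARB chooses the cluster
                size from the variability of the sequences.

```python
from collections import defaultdict

def pooled_cluster_estimates(columns, n_bins=4):
    """Pool per-column counts into (helical, variability bin) clusters.

    `columns` holds (is_helical, variability, gc_count, total_count)
    tuples with variability scaled to [0, 1]; this layout is hypothetical.
    Returns the cluster key of every column and the pooled G+C fraction
    of every non-empty cluster.
    """
    pooled = defaultdict(lambda: [0, 0])
    keys = []
    for is_helical, variability, gc, total in columns:
        var_bin = min(int(variability * n_bins), n_bins - 1)
        key = (is_helical, var_bin)
        keys.append(key)
        pooled[key][0] += gc
        pooled[key][1] += total
    estimates = {k: gc / total for k, (gc, total) in pooled.items() if total}
    return keys, estimates

# Hypothetical per-column counts.
columns = [
    (True, 0.10, 8, 10),
    (True, 0.20, 7, 10),
    (False, 0.10, 4, 10),
    (False, 0.90, 1, 2),
]
keys, estimates = pooled_cluster_estimates(columns)
```

                The two helical, low-variability columns share one cluster,
                so their pooled estimate rests on 20 observations instead
                of 10 each.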

                You can give your preferred method a higher weight by
                controlling the smoothing parameter:

                        Less smoothing -> independent parameter estimates

                        Much smoothing -> clustered parameter estimates

                To get a good tree we recommend trying all selections.
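                The effect of the smoothing parameter on the weighted sum
                can be pictured with a simple linear blend. The exact
                weighting used by ARB is not given in this help text; the
                0..1 scale and the function name are assumptions.

```python
def smoothed_estimate(position_est, cluster_est, smoothing):
    """Blend a per-position estimate with its cluster estimate.

    smoothing = 0.0 keeps the independent (per-position) estimate,
    smoothing = 1.0 keeps the clustered estimate. The linear form and
    the [0, 1] scale are illustrative assumptions.
    """
    if not 0.0 <= smoothing <= 1.0:
        raise ValueError("smoothing must lie in [0, 1]")
    return (1.0 - smoothing) * position_est + smoothing * cluster_est

# A noisy single-column estimate of 1.0 in a cluster that averages 0.5.
for smoothing in (0.0, 0.5, 1.0):
    print(smoothing, smoothed_estimate(1.0, 0.5, smoothing))
```

                With these inputs the estimate moves from 1.0 (no
                smoothing) through 0.75 to 0.5 (full smoothing).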

NOTES           To get parameters from a column statistic, you first have
                to create one.
                Do this with <ARB_NT/SAI/Positional Variability (Parsimony M.)>

WARNINGS        Problems may occur when

                        1. independent parameter estimates are selected and
                        2. your dataset is quite small (<100 sequences) and
                        3. one sequence is bad or badly aligned

                or

                        1. much smoothing of parameters is selected and
                        2. you are analyzing ribosomal RNA and
                        3. 'Use Helix Information' is turned off


BUGS            No bugs known
