DNAml_rates.help

Last change on this file was 2657, checked in by westram, 21 years ago
moved from $ARBHOME/TOOLS
Property svn:eol-style set to `native` Property svn:keywords set to `Author Date Id Revision`
File size: 3.5 KB

DNAml_rates_1_0

Gary J. Olsen

August 14, 1992


The DNAml_rates program takes a set of sequences and a corresponding
phylogenetic tree and produces a maximum likelihood estimate of the
rate of nucleotide substitution at each sequence position.

Input is read from standard input.  The format is very much like that
of the fastDNAml program.  The first line of the input file gives the
number of sequences and the number of bases per sequence.  Also on
this line are the requested program option letters.  Any auxiliary
data required by the options follow on subsequent lines.  Either the
user must specify the empirical base frequencies (F) option, or
immediately preceding the data matrix there must be a line of data
with the frequencies of A, C, G and T.  Next, the program expects a
data matrix.  The first 10 characters of the first line of data for a
given sequence are interpreted as the name (blanks are counted).
Elsewhere in the data matrix, blanks and numbers are ignored.  The
default data matrix format is interleaved.  If all the data for a
sequence are on one input line, then interleaved and noninterleaved
are equivalent.  Following the data matrix there must be a line with
the number of user-specified trees for which rates are to be estimated
(as with the U option in fastDNAml).  The rest of the input file is
one or more user-specified trees with branch lengths (as with the U
and L options in fastDNAml).

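As an illustration of the input layout described above, the following
Python sketch assembles a minimal input stream.  It is not part of the
DNAml_rates distribution: the function name build_input, its parameters,
and the single-space separators on the first line are assumptions made
for this example.  Each sequence is written on one line, so the
interleaved and noninterleaved formats coincide.  If the F option is not
among the option letters, the caller is assumed to supply the base
frequency line as the last entry of aux_lines.

```python
def build_input(sequences, trees, options="F", aux_lines=()):
    """Assemble input text in the order described in the help text.

    sequences: dict mapping name -> aligned sequence (all the same length)
    trees:     list of user tree strings, each with branch lengths
    options:   option letters for the first line (hypothetical default "F")
    aux_lines: auxiliary data lines required by the chosen options
    """
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()

    # First line: number of sequences, number of bases, option letters.
    lines = [f"{len(sequences)} {n_sites} {options}".rstrip()]

    # Auxiliary data for the options follow on subsequent lines.
    lines.extend(aux_lines)

    # Data matrix: the first 10 characters of each line hold the name.
    for name, seq in sequences.items():
        lines.append(f"{name[:10]:<10}{seq}")

    # Number of user trees, then the trees themselves.
    lines.append(str(len(trees)))
    lines.extend(trees)
    return "\n".join(lines) + "\n"
```

The returned text could then be written to a file and redirected to the
program's standard input.
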
The program writes to standard output.  The output lists the estimated
rate of change at every site in the sequence, or "Undefined" if there
are not sufficient unambiguous data at the site.

If the C option is specified, the program also categorizes the rates
into the requested number of categories.  The current categorization
algorithm is rather crude, but is probably adequate if the number of
categories is large enough.  A weighting mask is also created in which
sites with Undefined rates are assigned a weight of zero.

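The help text does not describe the categorization algorithm itself, so
the following Python sketch is only a stand-in: it bins the defined
rates into equal-width intervals on a logarithmic scale, which is an
assumption and not necessarily what DNAml_rates does.  The weighting
mask, by contrast, follows the rule stated above: sites with Undefined
rates (represented here as None) get a weight of zero.  The function
name and the use of 1 for defined sites are also assumptions.

```python
import math


def weights_and_categories(rates, n_categories):
    """Return (weights, categories) for a list of per-site rates.

    rates: positive floats, or None where the rate is Undefined.
    Categories are numbered 1..n_categories; Undefined sites get None.
    The log-scale binning below is an illustrative assumption.
    """
    # Weighting mask: zero for Undefined sites, one otherwise.
    weights = [0 if r is None else 1 for r in rates]

    defined = [r for r in rates if r is not None]
    if not defined:
        return weights, [None] * len(rates)

    lo = math.log(min(defined))
    hi = math.log(max(defined))
    span = hi - lo
    width = span / n_categories if span > 0 else 1.0

    categories = []
    for r in rates:
        if r is None:
            categories.append(None)
        else:
            k = int((math.log(r) - lo) / width)
            # Clamp so the largest rate falls in the last category.
            categories.append(min(k, n_categories - 1) + 1)
    return weights, categories
```
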
If the Y option is specified, the program writes the weights and
categories data to a file in a format appropriate for use by
fastDNAml.


Options summary:

1 - print data.  Toggles print data option (default = noprint).

C - write categories.  Requires an auxiliary line with a C and the desired
    number of categories.

F - empirical base frequencies.  Calculates base frequencies from the data
    matrix, rather than expecting a base frequency input line.

I - interleave.  Toggles the data interleave option (default = interleave).

L - userlengths.  This is implicit in the program, so the option is ignored.

M - minimum informative sequences.  Requires an auxiliary data line with an
    M and the minimum number of sequences in which a sequence position
    (alignment column) must have unambiguous information in order for the
    rate at the site to be defined (default = 4).

T - transition/transversion ratio.  Requires an auxiliary line with a T and
    the ratio of observed transitions to transversions (default = 2.0).

U - user trees.  This is implicit in the program, so the option is ignored.

W - user weights.  Requires weights auxiliary data.

Y - categories file.  Writes the weights and categories to a file.


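The M option's threshold can be illustrated with a short Python sketch.
It marks an alignment column as defined when at least min_sequences
sequences carry an unambiguous base there.  The function name and the
choice of A, C, G, T and U as the unambiguous characters are assumptions
made for this example; the program's own rules for ambiguity codes may
differ.

```python
# Characters treated as unambiguous in this sketch (an assumption).
UNAMBIGUOUS = set("ACGTU")


def informative_sites(alignment, min_sequences=4):
    """Return one boolean per alignment column.

    alignment:     list of equal-length aligned sequence strings
    min_sequences: minimum number of sequences with an unambiguous base
                   for the column's rate to be defined (default 4,
                   matching the default of the M option)
    """
    n_sites = len(alignment[0])
    return [
        sum(seq[i].upper() in UNAMBIGUOUS for seq in alignment) >= min_sequences
        for i in range(n_sites)
    ]
```

Columns for which this returns False correspond to sites that would be
reported as "Undefined" under the stated threshold.
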
The option scripts usertree, weights, n_categories and categories_file are
useful for adding the appropriate options to the input data matrix.

The option script weights_categories is useful for adding the resulting
outfile to a fastDNAml input file.

