contchar.html

Last change on this file was 2176, checked in by westram, 22 years ago
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>contchar</TITLE>
<META NAME="description" CONTENT="contchar">
<META NAME="keywords" CONTENT="contchar">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
</HEAD>
<BODY BGCOLOR="#ccffff">
<DIV ALIGN=RIGHT>
version 3.6
</DIV>
<P>
<DIV ALIGN=CENTER>
<H1>Gene Frequencies and Continuous Character Data Programs</H1>
</DIV>
<P>
© Copyright 1986-2000 by the University of
Washington. Written by Joseph Felsenstein. Permission is granted to copy
this document provided that no fee is charged for it and that this copyright
notice is not removed.
<P>
The programs in this group
use gene frequencies and quantitative character values. One (CONTML)
constructs maximum likelihood estimates of the phylogeny, another
(GENDIST) computes genetic distances for use in the distance matrix
programs, and the third (CONTRAST) examines correlation of traits as
they evolve along a given phylogeny.
<P>
When gene frequency data are used in CONTML or GENDIST, the following
assumptions are made:
<P>
<OL>
<LI>Different lineages evolve independently.
<LI>After two lineages split, their characters change
independently.
<LI>Each gene frequency changes by genetic drift, with or without mutation
(this varies from method to method).
<LI>Different loci or characters drift independently.
</OL>
<P>
How these assumptions affect the methods is discussed in my papers on
inference of phylogenies from gene frequency and continuous character
data (Felsenstein, 1973b, 1981c, 1985c).
<P>
The input formats are fairly similar to those of the discrete-character
programs, but with one difference. When CONTML is used in the gene-frequency
mode (its usual, default mode), or when GENDIST is used,
the first line contains the number of species (or
populations), the number of loci, and the options information.
There then follows a line which
gives the number of alleles at each locus, in order. This must be
the full number of alleles, not the number of alleles which will be input:
i.e. for a two-allele locus the number should be 2, not 1. There
then follow the species (population) data, each species beginning
on a new line. The first 10 characters are taken as the name, and
thereafter the values of the individual characters are read free-format,
preceded and separated by blanks. They can go to a new line if desired,
though of course not in the middle of a number. Missing data are not
allowed, which is an important limitation. In the default configuration, for
each locus, the numbers should be
the frequencies of all but one allele. The menu option A (All) signals that
the frequencies of all alleles are provided in the input data; the
program will then automatically ignore the last of them. So without the
A option, for a
three-allele locus there should be two numbers, the frequencies of
two of the alleles (and of course it must always be the same
two!). Here is a typical data set without the A option:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
    5    3
2 3 2
Alpha      0.90 0.80 0.10 0.56
Beta       0.72 0.54 0.30 0.20
Gamma      0.38 0.10 0.05 0.98
Delta      0.42 0.40 0.43 0.97
Epsilon    0.10 0.30 0.70 0.62
</PRE>
</TD></TR></TABLE>
<P>
whereas here is what it would have to look like if the A option were
invoked:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
    5    3
2 3 2
Alpha      0.90 0.10 0.80 0.10 0.10 0.56 0.44
Beta       0.72 0.28 0.54 0.30 0.16 0.20 0.80
Gamma      0.38 0.62 0.10 0.05 0.85 0.98 0.02
Delta      0.42 0.58 0.40 0.43 0.17 0.97 0.03
Epsilon    0.10 0.90 0.30 0.70 0.00 0.62 0.38
</PRE>
</TD></TR></TABLE>
<P>
The first line has the number of species (or populations) and the number
of loci. The second line has the number of alleles for each of the 3 loci.
The species lines have names (filled out to 10 characters with blanks)
followed by the gene frequencies of the 2 alleles for the first locus, the
3 alleles for the second locus, and the 2 alleles for the third locus.
You can start a new line after any of these allele frequencies, and
continue to give the frequencies on that line (without repeating the
species name).
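<P>
The layout just described can be illustrated with a short reader. This is
a minimal sketch in Python, not code taken from CONTML or GENDIST: the function
name <TT>parse_gene_frequencies</TT> and its <TT>all_alleles</TT> argument
(standing in for the A menu option) are hypothetical, and it assumes that only
the first two numbers on the first line matter.

```python
def parse_gene_frequencies(text, all_alleles=False):
    """Parse gene-frequency input into (allele_counts, {name: [freqs]})."""
    lines = text.splitlines()
    n_species, n_loci = (int(tok) for tok in lines[0].split()[:2])
    allele_counts = [int(tok) for tok in lines[1].split()]
    if len(allele_counts) != n_loci:
        raise ValueError("expected one allele count per locus")
    # Values per species: every allele with A, otherwise all but one per locus.
    per_species = sum(allele_counts) if all_alleles else sum(allele_counts) - n_loci

    data = {}
    name, values = None, []
    for line in lines[2:]:
        if not line.strip():
            continue
        if name is None:
            # A new species starts here: the first 10 characters are its name.
            name, line = line[:10].strip(), line[10:]
        values.extend(float(tok) for tok in line.split())
        if len(values) == per_species:
            data[name] = values
            name, values = None, []
        elif len(values) > per_species:
            raise ValueError("too many values for species " + name)
    if name is not None or len(data) != n_species:
        raise ValueError("input ended before all species were read")
    return allele_counts, data


# The data set without the A option; Beta continues on a second line.
example = """\
    5    3
2 3 2
Alpha      0.90 0.80 0.10 0.56
Beta       0.72 0.54
 0.30 0.20
Gamma      0.38 0.10 0.05 0.98
Delta      0.42 0.40 0.43 0.97
Epsilon    0.10 0.30 0.70 0.62
"""
counts, freqs = parse_gene_frequencies(example)
print(counts, freqs["Beta"])
```

<P>
Slicing off exactly 10 characters for the name, rather than splitting on the
first blank, is what makes the blank padding of short names necessary.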
<P>
If all alleles of a locus are given, it is important that they add up
to 1. Roundoff of the frequencies may cause the program to conclude that
the numbers do not sum to 1, and stop with an error message.
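<P>
A data set can be screened for this problem before it is run. The Python
sketch below is an illustration only: the function names are hypothetical and
the tolerance of 0.0001 is an assumed value, not the one the programs use.
It also shows one way to repair a rounded locus, by recomputing the last
frequency as 1 minus the sum of the others.

```python
def check_locus_sums(freqs, alleles_per_locus, tolerance=1e-4):
    """Return (locus index, sum) for each locus whose frequencies do not sum to 1.

    The tolerance is an assumption made for this sketch.
    """
    problems = []
    pos = 0
    for locus, n_alleles in enumerate(alleles_per_locus):
        total = sum(freqs[pos:pos + n_alleles])
        pos += n_alleles
        if abs(total - 1.0) > tolerance:
            problems.append((locus, total))
    return problems


def complete_locus(given):
    """Append the last allele's frequency, computed as 1 minus the others."""
    return list(given) + [1.0 - sum(given)]


# Thirds rounded to two decimals sum to about 0.99, which trips the check.
rounded = [0.33, 0.33, 0.33]
print(check_locus_sums(rounded, [3]))

# Recomputing the last frequency by subtraction removes the discrepancy.
fixed = complete_locus(rounded[:-1])
print(check_locus_sums(fixed, [3]))
```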
<P>
While many compilers may be more tolerant, it is probably wise to
make sure that each number, including the first, is preceded by a blank,
and that there are digits both preceding and following any decimal
points.
<P>
CONTML and CONTRAST also treat quantitative characters (the
continuous-characters mode in CONTML, which is option C). It is assumed
that each character is evolving according to a Brownian motion model, at the
same rate, and independently. In
reality it is almost always impossible to guarantee this. The issue is
discussed at length
in my review article in Annual Review of Ecology and Systematics (Felsenstein,
1988a), where I point out the difficulty of transforming the characters so
that they are not only genetically independent but have independent selection
acting on them. If you are going to use CONTML to model evolution of
continuous characters, then you should at least make some attempt to remove
genetic correlations between the characters. Usually all one can do is remove
phenotypic correlations by transforming the characters so that there is no
within-population covariance and so that the within-population
variances of the characters are equal; this is equivalent to using
Canonical Variates. However, this will only guarantee that one has
removed phenotypic covariances between characters. Genetic covariances
could only be removed by knowing the coheritabilities of the characters,
which would require genetic experiments, and removing selective covariances
(covariances due to covariation of selection pressures) would require
knowledge of the sources and extent of selection pressure in all variables.
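<P>
The phenotypic transformation mentioned above can be sketched as follows.
This is an illustration in Python under stated assumptions, not a PHYLIP
routine: it assumes that measurements on individuals within each population are
available, pools their within-population covariance matrix, and rescales the
characters with a Cholesky factor of that matrix, which is one of several
transformations that leave no within-population covariance and equal
within-population variances. The function names and the measurements are
hypothetical.

```python
import math


def pooled_within_covariance(populations):
    """Pool covariance over populations; each is a list of individuals' trait vectors."""
    p = len(populations[0][0])
    sums = [[0.0] * p for _ in range(p)]
    dof = 0
    for individuals in populations:
        n = len(individuals)
        means = [sum(ind[j] for ind in individuals) / n for j in range(p)]
        for ind in individuals:
            dev = [ind[j] - means[j] for j in range(p)]
            for a in range(p):
                for b in range(p):
                    sums[a][b] += dev[a] * dev[b]
        dof += n - 1
    return [[sums[a][b] / dof for b in range(p)] for a in range(p)]


def cholesky(matrix):
    """Lower-triangular L with L * L^T equal to matrix (must be positive definite)."""
    p = len(matrix)
    lower = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = math.sqrt(s) if i == j else s / lower[j][j]
    return lower


def whiten(vector, lower):
    """Solve L y = x by forward substitution."""
    y = []
    for i, x_i in enumerate(vector):
        y.append((x_i - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i])
    return y


# Made-up measurements: two populations, three individuals, two characters.
populations = [
    [[1.0, 2.0], [2.0, 3.5], [3.0, 4.0]],
    [[5.0, 1.0], [6.0, 2.5], [7.5, 3.0]],
]
within = pooled_within_covariance(populations)
lower = cholesky(within)
transformed = [[whiten(ind, lower) for ind in pop] for pop in populations]
# After the transformation the pooled within-population covariance is the identity.
print(pooled_within_covariance(transformed))
```

<P>
The population means, transformed in the same way, would be the values to enter
in a continuous-characters data set. As the paragraph above stresses, this
removes only phenotypic covariances.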
<P>
CONTRAST is a program designed to infer, for a given phylogeny that is
provided to the program, the covariation between characters in a data
set. Thus we have a program in this set that allows us to take information
about the covariation and rates of evolution of characters and make an
estimate of the phylogeny (CONTML), and a program that takes an estimate of the
phylogeny and infers the variances and covariances of the character
changes. But we have no program that infers both the phylogenies and
the character covariation from the same data set.
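<P>
The kind of calculation involved in going from a phylogeny to the covariances
of character changes can be sketched with independent contrasts under the
Brownian motion model. The Python below is an illustration only and is not
taken from CONTRAST; see that program's documentation for what it actually
computes. The tree, its branch lengths, the character values and the function
names are all made up for the example.

```python
import math


def contrasts(node, values, out):
    """Return (value, branch length) for node, appending standardized contrasts to out.

    node is (name, length) for a tip or ((left, right), length) for a fork.
    """
    content, length = node
    if isinstance(content, str):
        return values[content], length
    x1, v1 = contrasts(content[0], values, out)
    x2, v2 = contrasts(content[1], values, out)
    # The difference between sister lineages, scaled by its expected spread.
    out.append((x1 - x2) / math.sqrt(v1 + v2))
    # Value at the fork: average of the descendants weighted by 1/v1 and 1/v2.
    x = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    # That value is only an estimate, so the branch below the fork is lengthened.
    return x, length + v1 * v2 / (v1 + v2)


def contrast_covariance(tree, values_x, values_y):
    """Estimate the covariance of changes in two characters from their contrasts."""
    cx, cy = [], []
    contrasts(tree, values_x, cx)
    contrasts(tree, values_y, cy)
    return sum(a * b for a, b in zip(cx, cy)) / len(cx)


# Hypothetical three-species tree: A and B are sisters, C joins at the root.
ab = ((("A", 1.0), ("B", 1.0)), 0.5)
tree = ((ab, ("C", 1.5)), 0.0)
x_values = {"A": 1.0, "B": 3.0, "C": 5.0}
y_values = {"A": 2.0, "B": 2.5, "C": 4.0}
print(contrast_covariance(tree, x_values, y_values))
```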
<P>
In the quantitative characters mode, a typical small data set would be:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
    5    6
Alpha      0.345 0.467 1.213 2.2 -1.2 1.0
Beta       0.457 0.444 1.1 1.987 -0.2 2.678
Gamma      0.6 0.12 0.97 2.3 -0.11 1.54
Delta      0.68 0.203 0.888 2.0 1.67
Epsilon    0.297 0.22 0.90 1.9 1.74
</PRE>
</TD></TR></TABLE>
<P>
Note that in this mode there is no line giving the numbers
of alleles at each locus. Nor is any square-root
transformation of the coordinates done: each is assumed to give
directly the position on the Brownian motion scale.
<P>
For further discussion of options and modifiable constants in CONTML,
GENDIST, and CONTRAST see the documentation files for those programs.
</BODY>
</HTML>
