1 | #Please insert up references in the next lines (line starts with keyword UP) |
---|
2 | UP arb.hlp |
---|
3 | UP arb_ntree.hlp |
---|
4 | UP e4.hlp |
---|
5 | UP arb_edit4.hlp |
---|
6 | UP glossary.hlp |
---|
7 | |
---|
8 | #Please insert subtopic references (line starts with keyword SUB) |
---|
9 | |
---|
10 | # Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain} |
---|
11 | |
---|
12 | #************* Title of helpfile !! and start of real trunk ******** |
---|
13 | TITLE Protein Alignments |
---|
14 | |
---|
15 | OCCURRENCE ARB_EDIT4 |
---|
16 | ARB_NTREE |
---|
17 | |
---|
18 | DESCRIPTION |
---|
19 | |
---|
20 | Protein gene sequences and (predicted) protein primary structures (= amino |
---|
21 | acid sequences) as well as protein secondary structures can be stored in the |
---|
22 | ARB database, and protein alignments can be created. Using import filters, |
---|
23 | amino acid sequences and/or protein secondary structures can be imported from |
---|
24 | DSSP files. See LINK{arb_import.hlp} and especially LINK{dssp_ift.hlp} for |
---|
25 | information on how this is done. A description of the DSSP code and format, |
---|
26 | as well as an example file, can be found there, too. |
---|
27 | |
---|
28 | Once a protein secondary structure is present as a species in the database, |
---|
29 | it can be converted to an SAI (see LINK{sp_sp_2_ext.hlp}) and used as a |
---|
30 | reference for comparing other protein secondary structures or amino acid |
---|
31 | sequences. SAIs can also be created from the protein secondary structure |
---|
32 | information in a special field named 'sec_struct' (see LINK{pfold_sai.hlp}). |
---|
33 | This is useful if one has a protein secondary structure aligned along with |
---|
34 | the amino acid sequence. |
---|
35 | |
---|
36 | An approach for visualizing matches between protein structures has been |
---|
37 | incorporated in ARB. The match computation for sequences and secondary |
---|
38 | structures is based on the Chou-Fasman algorithm (see below) or adaptations |
---|
39 | of it, and depends on the match method used. The match methods are described |
---|
40 | in detail in LINK{pfold_props.hlp} along with all other related settings that |
---|
41 | can be configured via the 'Properties' menu. |
---|
42 | |
---|
43 | SECTION Overview of the Chou-Fasman Algorithm |
---|
44 | |
---|
45 | The Chou-Fasman algorithm is a statistical method for predicting a protein |
---|
46 | secondary structure from its amino acid sequence. It is based on the fact |
---|
47 | that certain amino acids tend to form or break alpha-helices ('H'), |
---|
48 | beta-sheets ('E') and beta-turns ('T'). The experimentally obtained |
---|
49 | Chou-Fasman parameters (former and breaker values) are used to predict the |
---|
50 | possible occurrence of the individual structure types, which can then be |
---|
51 | merged to create a secondary structure summary. Further information on how |
---|
52 | this approach is used for protein structure match computation can be found |
---|
53 | in LINK{pfold_props.hlp} in section 'Description of Match Methods'. |
---|
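The idea of former and breaker values can be illustrated with a small Python
sketch. It is not the implementation used in ARB and it does not reproduce the
full Chou-Fasman nucleation and extension rules. It only averages helix and
sheet propensities over a sliding window and merges the result into a summary
string. The function name predict_summary, the window size, the threshold and
the reduced propensity table are assumptions made for this example. The
propensity values are illustrative and should be checked against the original
paper before being relied upon. Turn prediction ('T') is omitted.

```python
# Illustrative (helix, sheet) propensities for a handful of residues.
# Assumption: values are examples only; verify against Chou and Fasman (1978).
PROPENSITY = {
    "A": (1.42, 0.83),
    "E": (1.51, 0.37),
    "L": (1.21, 1.30),
    "M": (1.45, 1.05),
    "V": (1.06, 1.70),
    "I": (1.08, 1.60),
    "G": (0.57, 0.75),
    "P": (0.57, 0.55),
}


def predict_summary(sequence, half_window=2, threshold=1.0):
    """Return a rough 'H' / 'E' / '-' summary string for an amino acid sequence.

    Simplified sketch: each residue is labelled by the average propensity of
    the residues in a window centred on it (truncated at the sequence ends).
    Residues missing from PROPENSITY raise a KeyError.
    """
    summary = []
    for i in range(len(sequence)):
        window = sequence[max(0, i - half_window): i + half_window + 1]
        helix = sum(PROPENSITY[aa][0] for aa in window) / len(window)
        sheet = sum(PROPENSITY[aa][1] for aa in window) / len(window)
        if helix >= threshold and helix > sheet:
            summary.append("H")   # helix formers dominate the window
        elif sheet >= threshold and sheet > helix:
            summary.append("E")   # sheet formers dominate the window
        else:
            summary.append("-")   # breakers dominate or no clear preference
    return "".join(summary)


if __name__ == "__main__":
    for seq in ("AEAEAEAE", "VIVIVIVI", "GPGPGP"):
        print(seq, predict_summary(seq))
```

A window rich in helix formers such as A and E is labelled 'H', a window rich
in sheet formers such as V and I is labelled 'E', and a window dominated by
breakers such as G and P receives no structure label.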
54 | |
---|
55 | SECTION REFERENCES |
---|
56 | |
---|
57 | [1] Chou-Fasman Algorithm |
---|
58 | |
---|
59 | Details on the Chou-Fasman algorithm can be found in the original |
---|
60 | paper: "Chou, P. and Fasman, G. (1978). Prediction of the secondary |
---|
61 | structure of proteins from their amino acid sequence. Advances in |
---|
62 | Enzymology and Related Areas of Molecular Biology, 47, 45-148." |
---|
63 | |
---|
64 | [2] DSSP |
---|
65 | |
---|
66 | The DSSP program was developed to standardize secondary structure |
---|
67 | assignment. It assigns protein secondary structures to amino acid |
---|
68 | sequences from the amino acids' crystallographic atom coordinates |
---|
69 | as specified by protein entries in the Protein Data Bank (PDB). The |
---|
70 | program can be found on the web at |
---|
71 | "LINK{http://swift.cmbi.ru.nl/gv/dssp/}". Details on the algorithm |
---|
72 | can be found in "Kabsch, W. and Sander, C. (1983). Dictionary of |
---|
73 | protein secondary structure: pattern recognition of hydrogen-bonded |
---|
74 | and geometrical features. Biopolymers, 22 (12), 2577-2637. |
---|
75 | PMID: 6667333; UI: 84128824." |
---|
76 | |
---|
77 | NOTES |
---|
78 | |
---|
79 | The method used for protein secondary structure prediction, i.e. the |
---|
80 | Chou-Fasman algorithm, is fast, which was the main reason for choosing it. |
---|
81 | Performance matters when a large number of sequences is loaded in the editor. |
---|
82 | However, the method is not very accurate and should only be used as a rough |
---|
83 | estimate. Thus, the match computation can only give an approximate indication |
---|
84 | of whether a given amino acid sequence matches a certain secondary structure. |
---|
85 | |
---|
86 | EXAMPLES None |
---|
87 | |
---|
88 | WARNINGS Protein secondary structure in the field 'sec_struct' is not aligned |
---|
89 | automatically with the sequence (yet). It has to be aligned manually! |
---|
90 | |
---|
91 | BUGS The editor might be unstable and may crash if sequences are not formatted. |
---|