| 1 | #Please insert up references in the next lines (line starts with keyword UP) |
|---|
| 2 | UP arb.hlp |
|---|
| 3 | UP arb_ntree.hlp |
|---|
| 4 | UP e4.hlp |
|---|
| 5 | UP arb_edit4.hlp |
|---|
| 6 | UP glossary.hlp |
|---|
| 7 | |
|---|
| 8 | #Please insert subtopic references (line starts with keyword SUB) |
|---|
| 9 | |
|---|
| 10 | # Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain} |
|---|
| 11 | |
|---|
| 12 | #************* Title of helpfile !! and start of real strunk ******** |
|---|
| 13 | TITLE Protein Alignments |
|---|
| 14 | |
|---|
| 15 | OCCURRENCE ARB_EDIT4 |
|---|
| 16 | ARB_NTREE |
|---|
| 17 | |
|---|
| 18 | DESCRIPTION |
|---|
| 19 | |
|---|
| 20 | Protein gene sequences and (predicted) protein primary structures (= amino |
|---|
| 21 | acid sequences) as well as protein secondary structures can be stored in the |
|---|
| 22 | ARB database, and protein alignments can be created. Using import filters, |
|---|
| 23 | amino acid sequences and/or protein secondary structures can be imported from |
|---|
| 24 | DSSP files. Please refer to LINK{arb_import.hlp} and especially |
|---|
| 25 | LINK{dssp_ift.hlp} for information on how this is done. A description of the |
|---|
| 26 | DSSP code and format as well as an example file can be found there, too. |
|---|
| 27 | |
|---|
| 28 | Once a protein secondary structure is present as a species in the database, it |
|---|
| 29 | can be converted to an SAI (see LINK{sp_sp_2_ext.hlp}) and used as a reference |
|---|
| 30 | for comparing other protein secondary structures or amino acid sequences. SAIs |
|---|
| 31 | can also be created from the protein secondary structure information in a |
|---|
| 32 | special field named 'sec_struct' (see LINK{pfold_sai.hlp}). This is useful if |
|---|
| 33 | one has a protein secondary structure aligned along with the amino acid |
|---|
| 34 | sequence. |
|---|
| 35 | |
|---|
| 36 | An approach for visualizing matches between protein structures has been |
|---|
| 37 | incorporated in ARB. The match computation for sequences and secondary |
|---|
| 38 | structures is based on the Chou-Fasman algorithm (see below) or adaptations |
|---|
| 39 | of it, depending on the match method used. The match methods are described |
|---|
| 40 | in detail in LINK{pfold_props.hlp} along with all other related settings that |
|---|
| 41 | can be configured via the 'Properties' menu. |
|---|
| 42 | |
|---|
| 43 | SECTION Overview of the Chou-Fasman Algorithm |
|---|
| 44 | |
|---|
| 45 | The Chou-Fasman algorithm is a statistical method for predicting a protein |
|---|
| 46 | secondary structure from its amino acid sequence. It is based on the fact |
|---|
| 47 | that certain amino acids tend to form or break alpha-helices ('H'), |
|---|
| 48 | beta-sheets ('E') and beta-turns ('T'). The experimentally obtained |
|---|
| 49 | Chou-Fasman parameters (former and breaker values) are used to predict the |
|---|
| 50 | possible occurrence of the individual structure types, which can then be |
|---|
| 51 | merged to create a secondary structure summary. Further information on how |
|---|
| 52 | this approach is used for protein structure match computation can be found |
|---|
| 53 | in LINK{pfold_props.hlp} in section 'Description of Match Methods'. |
|---|
| 54 | |
|---|
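The idea of propensity-based prediction can be illustrated with a minimal
Python sketch. This is not ARB's implementation and not the full Chou-Fasman
procedure (nucleation and extension rules and beta-turns are omitted): it only
averages helix and sheet propensities over a sliding window. The names
PROPENSITY, NEUTRAL and predict are hypothetical, and the numeric values are
illustrative approximations of commonly quoted parameters for a few residues;
consult reference [1] for the real table.

```python
# Illustrative (helix, sheet) propensities for a few residues only.
# Assumption: approximate values; verify against the original paper [1].
PROPENSITY = {
    "A": (1.42, 0.83),
    "E": (1.51, 0.37),
    "L": (1.21, 1.30),
    "M": (1.45, 1.05),
    "V": (1.06, 1.70),
    "I": (1.08, 1.60),
    "Y": (0.69, 1.47),
    "G": (0.57, 0.75),
    "P": (0.57, 0.55),
    "N": (0.67, 0.89),
}
# Residues missing from the table are treated as neutral (assumption).
NEUTRAL = (1.0, 1.0)


def predict(seq, window=4):
    """Return one symbol per residue: 'H' (helix), 'E' (sheet) or '-'."""
    result = []
    for i in range(len(seq)):
        # Window placed around residue i and clipped at the sequence ends.
        start = max(0, i - window // 2)
        end = min(len(seq), start + window)
        chunk = seq[start:end]
        helix = sum(PROPENSITY.get(c, NEUTRAL)[0] for c in chunk) / len(chunk)
        sheet = sum(PROPENSITY.get(c, NEUTRAL)[1] for c in chunk) / len(chunk)
        # A type is assigned only if its average propensity exceeds 1.0
        # and is the larger of the two averages.
        if helix > 1.0 and helix >= sheet:
            result.append("H")
        elif sheet > 1.0 and sheet > helix:
            result.append("E")
        else:
            result.append("-")
    return "".join(result)


if __name__ == "__main__":
    print(predict("AEAEAEAEVIVIVIVI"))
```

The predicted string has one symbol per residue, so it can be compared
position by position with a secondary structure obtained from a DSSP file.
The actual match methods used by ARB are described in LINK{pfold_props.hlp}.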
| 55 | SECTION REFERENCES |
|---|
| 56 | |
|---|
| 57 | [1] Chou-Fasman Algorithm |
|---|
| 58 | |
|---|
| 59 | Details on the Chou-Fasman algorithm can be found in the original |
|---|
| 60 | paper: "Chou, P. and Fasman, G. (1978). Prediction of the secondary |
|---|
| 61 | structure of proteins from their amino acid sequence. Advances in |
|---|
| 62 | Enzymology and Related Areas of Molecular Biology, 47, 45-148.". |
|---|
| 63 | |
|---|
| 64 | [2] DSSP |
|---|
| 65 | |
|---|
| 66 | The DSSP program was developed to standardize secondary structure |
|---|
| 67 | assignment. It assigns protein secondary structures to amino acid |
|---|
| 68 | sequences from the amino acids' crystallographic atom coordinates |
|---|
| 69 | as specified by protein entries in the Protein Data Bank (PDB). The |
|---|
| 70 | program can be found on the web at |
|---|
| 71 | "LINK{http://swift.cmbi.ru.nl/gv/dssp/}". Details on the algorithm |
|---|
| 72 | can be found in "Kabsch, W. and Sander, C. (1983). Dictionary of |
|---|
| 73 | protein secondary structure: pattern recognition of hydrogen-bonded |
|---|
| 74 | and geometrical features. Biopolymers, 22 (12), 2577-2637. |
|---|
| 75 | PMID: 6667333; UI: 84128824." |
|---|
| 76 | |
|---|
| 77 | NOTES |
|---|
| 78 | |
|---|
| 79 | The method used for protein secondary structure prediction, i.e. the |
|---|
| 80 | Chou-Fasman algorithm, is fast, which was the main reason for choosing it. |
|---|
| 81 | Performance matters when a large number of sequences is loaded in the editor. |
|---|
| 82 | However, the algorithm is not very accurate and should only be used as a rough |
|---|
| 83 | estimate. Thus, the match computation can only give an approximate indication |
|---|
| 84 | of whether a given amino acid sequence matches a certain secondary structure. |
|---|
| 85 | |
|---|
| 86 | EXAMPLES None |
|---|
| 87 | |
|---|
| 88 | WARNINGS Protein secondary structure in the field 'sec_struct' is not aligned |
|---|
| 89 | automatically with the sequence (yet). It has to be aligned manually! |
|---|
| 90 | |
|---|
| 91 | BUGS The editor might be unstable and may crash if sequences are not formatted. |
|---|