source: branches/port5/HELP_SOURCE/oldhelp/gde.hlp

1UP      arb.hlp
2UP      glossary.hlp
3UP      save.hlp
4
5SUB     arb_edit.hlp
6SUB     ale.hlp
7
8TITLE           GDE Interface and Editor
9
10DESCRIPTION     Starts the GDE Editor designed by Steven Smith.
11                See next chapter of this text for the original help text.
12                As GDE originally used its own built-in database, it had to be
13                slightly modified to run under ARB. So
14
15                **** READ THE WARNINGS/BUGS CAREFULLY ****
16
17WARNINGS        As soon as you start GDE, it creates a copy of the selected
18                sequences. That means that you may change the sequences
19                with either GDE or ARB, but not both. Therefore, if you have started
20                GDE, do nothing but sequence editing in GDE till you quit GDE.
21                To really save sequences to disc, you have to send the sequence
22                changes to ARB and then use ARB to save the ARB database.
23
24
25BUGS            Many functions, especially
26
27                                        -deleting,
28                                        -moving,
29                                        -duplicating,
30                                        -creating,
31                                        -importing,
32
33                species do not work correctly.
34
35
36
37        ********* Part of the Original GDE HELPTEXT ******************
38
39SECTION Introduction
40
41        The Genetic Data Environment is part of a growing
42        set of programs for manipulating and analyzing
43        "genetic" data. It differs in design from other
44        analysis programs in that it is intended to be an
45        expandable and customizable system, while still
46        being easy to use.
47
48        There are a tremendous number of publicly available
49        programs for sequence analysis. Many of these
50        programs have found their way into commercial
51        packages which incorporate them into integrated,
52        easy to use systems. The goal of the GDE is to
53        minimize the amount of effort required to integrate
54        sequence analysis functions into a common
55        environment. The GDE takes care of the user
56        interface issues, and allows the programmer to
57        concentrate on the analysis itself. Existing programs
58        can be tied into the GDE in a matter of hours (or
59        minutes) as opposed to days or weeks. Programs
60        may be written in any language, and still seamlessly
61        be incorporated into the GDE.
62
63        These programs are, and will continue to be,
64        available at no charge. It is the hope that this
65        system will grow in functionality as more and more
66        people see the benefits of a modular analysis
67        environment. Users are encouraged to make
68        modifications to the system, and forward all changes
69        and additions to Steven Smith at
70        smith@bioimage.millipore.com.
71
72SECTION What's New for this Release
73
74        GDE 2.2 is a maintenance release. Several small
75        bugs have been fixed, and new editing features and
76        user interface elements have been added. Also, I have
77        tried to update all of the contributed external
78        programs to their latest release. Updated programs
79        include:
80
81                - Phylip
82                - Treetool
83                - LoopTool
84                - Readseq
85                - Blast
86                - Fasta
87
88        Improved versions of printing and translate are
89        included as well. As for new editing features, a
90        useful "yanking" feature has been added by Scott
91        Ferguson from Exxon Research, along with the capability
92        to export the colormap for a sequence (see
93        appendices A/C). Among the bugs fixed in this
94        release are:
95
96        - Selection mask problems when exporting to
97          Genbank (fixed in 2.1)
98        - Memory leaks (fixed in 2.1)
99        - Correct handling of circular sequences
100        - More liberal interpretation of Genbank formatted
101          files (not column dependent)
102
103
104SECTION System Requirements
105
106        GDE 2.2 currently runs on the Sun family of
107        workstations. This includes the Sun3 and Sun4
108        (Sparcstation) systems. It was written in XView,
109        and runs on Suns using OpenWindows 3.0 or MIT's
110        X Windows. It runs in both monochrome and color,
111        and can be run remotely on any system capable of
112        running X Windows Release 4. You should have at
113        least 15 meg of free disk space available. The binary
114        release for SparcStations was compiled under
115        SunOS 4.1.2 and OpenWindows 3.0.
116
117        We are also supporting a DECStation version of
118        GDE. This is running under XView 3.0/X11R5. We
119        encourage interested people to port the programs to
120        their favorite Unix platform. There are informal
121        ports to the SGI line of Unix machines.
122
123SECTION Note to Motif users
124
125        GDE2.2 can be run using different window
126        managers. The most common alternative to olwm is
127        the Motif window manager (mwm). The only
128        problem in using another window manager is that
129        the status line is not displayed. We have added a
130        "Message panel" as an option under
131        "File->Properties" which displays all of the information
132        contained on the status line.
133
134        People using other window managers may also
135        prefer using xterm and xedit as the default terminal and
136        file editor. This can be accomplished by replacing
137        all occurrences of 'shelltool' and 'textedit' with
138        'xterm -e' and 'xedit' in the
139        $GDE_HELP_DIR/.GDEmenus file.
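
        The replacement described above can be scripted. The
        following Python sketch is only an illustration: the
        function name use_x_defaults and the sample itemmethod
        line are hypothetical, and it assumes that a plain
        textual substitution is sufficient for your .GDEmenus
        file.

```python
def use_x_defaults(menu_text):
    """Return menu_text with the OpenWindows tools swapped for X ones."""
    return (menu_text
            .replace("shelltool", "xterm -e")
            .replace("textedit", "xedit"))


# Hypothetical .GDEmenus line, used only to show the effect.
sample = "itemmethod:(textedit OUTPUT; shelltool less OUTPUT) &"
print(use_x_defaults(sample))
```

        To apply it to a real file, read the contents of
        $GDE_HELP_DIR/.GDEmenus, pass them through the function,
        and write the result back after keeping a backup copy.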
140
141
142        FastA and Blast need to have the properly formatted
143        databases installed in the $GDE_HELP_DIR under
144        the directories FASTA/PIR, FASTA/GENBANK,
145        BLAST/pir BLAST/genbank. For FASTA, simply
146        copy a version of PIR and Genbank into the proper
147        directory. Alternately, the PIR and GENBANK
148        files can be symbolic links to copies of Genbank
149        held elsewhere on your system. You may need to
150        look at the .GDEmenus file in $GDE_HELP_DIR to
151        verify that you are using the same divisions for
152        these databases.
153
154        Blast installation involves converting PIR and
155        GENBANK to a temporary FASTA format (using
156        pir2fasta and gb2fasta) and then using pressdb for
157        nucleic acid, and setdb for amino acid to reformat the
158        databases again into blast format. The .GDEmenus
159        file is currently set up to search with blast using the
160        following databases: pir, genpept, genupdate, and
161        genbank. If you wish to divide these into
162        subdivisions, then the .GDEmenus file will have to
163        be edited.
164
165        The most up to date release of blast can be obtained
166        via anonymous ftp to ncbi.nlm.nih.gov. The most
167        recent release of FASTA can be obtained via
168        anonymous ftp to uvaarpa.virginia.edu. It is
169        strongly recommended that you retrieve these copies,
170        and become familiar with their setup.
171
172SECTION Using the GDE
173
174        It is assumed that the user is familiar with the Unix
175        and OpenWindows/X Windows environments. It is
176        also assumed that people running standard MIT
177        X Windows will be using the OpenLook window
178        manager (olwm). Other window managers work
179        with varied success. If you are not certain as to how
180        your system is set up, please contact your systems
181        administrator.
182
183
184        The GDE uses a menu description language to
185        define what external programs it can call, and what
186        parameters and data to pass to each function. This
187        language allows users to customize their own
188        environment to suit individual needs.
189
190        The following is how the GDE handles external
191        programs when selected from a menu:
192
193        Each step in this process is described in a file
194        .GDEmenus in the user's current or home directory.
195
196        The language used in this file describes three phases
197        to an external function call. The first phase
198        describes the menu item as it will appear, and the
199        Unix command line that is actually run when it is
200        selected. The second phase describes how to prompt
201        for the parameters needed by the function. The third
202        phase describes what data needs to be passed as
203        input to the external function, and what data (if any)
204        needs to be read back from its output.
205
206        The form of the language is a simple keyword/value
207        list delimited by the colon (:) character. The
208        language retains old values until new ones are set.
209        For example, setting the menu name is done once for
210        all items in that menu, and is only reset when the
211        next menu is reached.
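
        The keyword/value form and the "retain old values" rule
        can be illustrated with a short Python sketch. This is
        not the GDE parser, only a minimal reading of the rules
        stated above. The function name parse_gde_menus, the
        dictionary layout, and the sample menu name "Edit" and
        item "Second item" are assumptions made for the example.

```python
def parse_gde_menus(text):
    """Parse keyword:value lines into a list of menu items.

    The menu name is "sticky": it applies to every following item
    until another menu: line appears.
    """
    items = []
    current_menu = None
    current_item = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and lines without a keyword
        # Split at the first colon only, so values may contain colons.
        keyword, value = line.split(":", 1)
        keyword, value = keyword.strip(), value.strip()
        if keyword == "menu":
            current_menu = value
        elif keyword == "item":
            current_item = {"menu": current_menu, "item": value, "fields": []}
            items.append(current_item)
        elif current_item is not None:
            # Everything else belongs to the most recent item.
            current_item["fields"].append((keyword, value))
    return items


sample = """\
menu:Edit
item:All caps
itemmethod:tr '[a-z]' '[A-Z]' <INPUT > OUTPUT
in:INPUT
informat:flat
out:OUTPUT
outformat:flat
item:Second item
itemmethod:cat INPUT
"""

for entry in parse_gde_menus(sample):
    print(entry["menu"], "/", entry["item"])
```

        Both items end up under the same menu because the menu:
        line is given only once.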
212
213        The keywords for phase one are:
214
215                menu:menu name      Name of current menu
216                item:item name      Name of current menu item
217                itemmeta:meta_key   Meta key equivalence (quick keys)
218                itemhelp:help_file  Help file (either full path, or in GDE_HELP_DIR)
219                itemmethod:         Unix command
220
221        The item method command is a bit more involved: it
222        is the Unix command that will actually run the
223        external program intended. It is one line long, and
224        can be up to 256 characters in length. It can have
225        embedded variable names (starting with a '$') that
226        will be replaced with appropriate values later on. It
227        can consist of multiple Unix commands separated by
228        semi-colons (;), and may contain shell scripts and
229        background processes as well as simple command
230        names. Examples will be given later.
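
        The substitution of embedded variable names can be
        sketched in Python as follows. How the GDE performs the
        replacement internally is not described here, so the
        function expand_itemmethod, the variable names ARG1 and
        ARG2, and the rule for what counts as a name are
        assumptions made for illustration.

```python
import re


def expand_itemmethod(template, values):
    """Replace each $name in template with values[name].

    Names that are not present in values are left untouched so
    that a missing argument is easy to spot in the final command.
    """
    def replace(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))

    return re.sub(r"\$([A-Za-z_][A-Za-z0-9_]*)", replace, template)


command = expand_itemmethod(
    "program_name -a1 $ARG1 -a2 $ARG2 -f INPUT > OUTPUT",
    {"ARG1": 5, "ARG2": "fast"},
)
print(command)
```

        Matching whole names with a regular expression avoids
        the problem of one variable name being a prefix of
        another.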
231
232        The keywords for phase two are:
233
234            arg:argument_variable_name
235
236                        Name of this variable. It will appear
237                        in the itemmethod: line with a dollar
238                        sign ($) in front of it.
239
240            argtype:slider,chooser,choice_menu or text
241
242                        The type of graphic object
243                        representing this argument.
244
245            arglabel:descriptive label
246
247                        A short description of what this
248                        argument represents
249
250            argmin:minimum_value (integer)
251
252                        Used for sliders.
253
254            argmax:maximum_value (integer)
255
256                        Used for sliders.
257
258            argvalue:default_value (integer)
259
260                        It is the numeric value associated with
261                        sliders or the default choice in
262                        choosers, choice_menus, and choice_lists
263                        (the first choice is 0, the second is 1 etc.)
264
265            argtext:default value
266
267                        Used for text fields.
268
269            argchoice:displayed value:passed value
270
271                        Used for choosers and
272                        choice_menus. The first value is
273                        displayed on screen, and the second
274                        value is passed to the itemmethod
275                        line.
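
        The relationship between argchoice: and argvalue: can be
        shown with a small Python sketch. The helper names and
        the two sample choices ("Linear:l", "Circular:c") are
        hypothetical. The sketch only assumes what is stated
        above: each choice has a displayed and a passed value,
        and argvalue is a zero-based index into the choices.

```python
def parse_argchoice(value):
    """Split 'displayed value:passed value' at the first colon."""
    displayed, passed = value.split(":", 1)
    return displayed.strip(), passed.strip()


def resolve_choice(choices, argvalue=0):
    """Return the passed value of the choice selected by argvalue.

    choices  -- list of (displayed, passed) pairs, one per argchoice: line
    argvalue -- zero-based index of the default choice
    """
    if not 0 <= argvalue < len(choices):
        raise IndexError("argvalue %d is outside the list of choices" % argvalue)
    displayed, passed = choices[argvalue]
    return passed


# Hypothetical choices for a linear/circular option.
choices = [parse_argchoice("Linear:l"), parse_argchoice("Circular:c")]
print(resolve_choice(choices, 1))
```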
276
277        The keywords for phase three are as follows:
278
279            in:input_file
280
281                        GDE will replace this name with a
282                        randomly generated temporary file
283                        name. It will then write the selected
284                        data out to this file.
285
286            informat:file_format
287
288                        Write data to this file for input to
289                        this function. Currently supported
290                        values are Genbank, and flat.
291
292            inmask:
293
294                        This data can be controlled by a
295                        selection mask.
296
297            insave:
298
299                        Do not remove this file after running
300                        the external function. This is useful
301                        for functions put in the background.
302
303            out:output_file
304
305                        GDE will replace this name with a
306                        randomly generated temporary file
307                        name. It is up to the external function
308                        to fill this file with any results that
309                        might be read back into the GDE.
310
311            outformat:file_format
312
313                        The data in the output file will be in
314                        this format. Currently supported
315                        values are colormask, Genbank, and
316                        flat.
317
318            outsave:
319
320                        Do not remove this file after reading.
321                        This is useful for background tasks.
322
323            outoverwrite:
324
325                        Overwrite existing sequences in the current
326                        GDE window. Currently supported with
327                        "gde" format only.
328
329
330        Here is a sample dialog box, and its entry in the
331        .GDEmenus file:
332
333        Using the default parameters given in the dialog
334        box, the executed Unix command line would be:
335
336             (tr '[a-z]' '[A-Z]' < .gde_001 >.gde_001.tmp ; mv .gde_001.tmp CAPS ; gde CAPS -Wx medium ; rm .gde_001 ) &
337
338        where .gde_001 is the name of the temporary file
339        generated by the GDE which contains the selected
340        sequences in flat file format. Since the GDE runs
341        this command in the background ('&' at the end) it
342        is necessary to specify the insave: line, and to
343        remove all temporary files manually. There is no
344        output file specified because the data is not loaded
345        back into the current GDE window, but rather a new
346        GDE window is opened on the file. A simpler
347        command that reloads the data after conversion
348        might be:
349
350              item:          All caps
351              itemmethod:    tr '[a-z]' '[A-Z]' <INPUT > OUTPUT
352              in:            INPUT
353              informat:      flat
354              out:           OUTPUT
355              outformat:     flat
356
357        In this example, no arguments are specified, and so
358        no dialog box will appear. The command is not run
359        in the background, so the GDE can clean up after
360        itself automatically. The converted sequence is
361        automatically loaded back into the current GDE
362        window.
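
        The foreground case can be mimicked in Python to make
        the sequence of steps concrete: write the selected data
        to a temporary input file, substitute the file names
        into the command, run it, read the output back, and
        clean up. This is a sketch under assumptions, not the
        GDE implementation. The function run_item is
        hypothetical, the temporary names come from Python's
        tempfile module rather than the .gde_001 scheme, and the
        placeholder names are replaced by simple text
        substitution. It needs a POSIX shell and the tr command.

```python
import os
import shlex
import shutil
import subprocess
import tempfile


def run_item(itemmethod, selected_text, in_name="INPUT", out_name="OUTPUT"):
    """Run a foreground itemmethod and return what it wrote to its output file."""
    workdir = tempfile.mkdtemp(prefix="gde_")
    in_path = os.path.join(workdir, "in.tmp")
    out_path = os.path.join(workdir, "out.tmp")
    try:
        # Step 1: write the selected data to the temporary input file.
        with open(in_path, "w") as handle:
            handle.write(selected_text)
        # Step 2: put the real (shell-quoted) file names into the command.
        command = (itemmethod
                   .replace(in_name, shlex.quote(in_path))
                   .replace(out_name, shlex.quote(out_path)))
        # Step 3: run the command and wait for it to finish.
        subprocess.run(command, shell=True, check=True)
        # Step 4: read the results back.
        with open(out_path) as handle:
            return handle.read()
    finally:
        # Step 5: remove the temporary files, as a foreground call allows.
        shutil.rmtree(workdir, ignore_errors=True)


print(run_item("tr '[a-z]' '[A-Z]' <INPUT > OUTPUT", "acgu\n"))
```

        A background command would return before step 4, which
        is why such items need insave: and their own cleanup.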
363
364        In general, the easiest type of program to integrate
365        into the GDE is a program completely driven from a
366        Unix command line. Interactive programs can be
367        tied in (MFOLD for example), however shell scripts
368        must be used to drive the parameter entry for these
369        programs. Programs of the form:
370
371                program_name -a1 argument1 -a2 argument2 -f inputfile -er errorfile > outputfile
372
373        can be specified in the .GDEmenus file directly. As
374        this is the general form of most Unix commands,
375        these tend to be simpler to implement under the
376        GDE.
377
378        As functions grow in complexity, they may begin to
379        need a user interface of their own. In these cases, the
380        command line calling arguments are still necessary
381        in order to allow the GDE to hand them the
382        appropriate data, and possibly retrieve results after
383        some external manipulation.
384
385
386SECTION Appendix C, External functions
387
388    ClustalV - Cluster multiple sequence alignment
389
390        Author: Des Higgins.
391
392        Reference: Higgins,D.G. Bleasby,A.J. and Fuchs,R. (1991)
393
394        CLUSTAL V: improved software for multiple sequence alignment. ms. submitted to CABIOS
395
396        Parameters:
397
398                k-tuple pairwise search Word size for pairwise comparisons
399                Window size             Smaller values give faster alignments,
400                                        larger values are more sensitive.
401                Transitions weighted    Can weight transitions twice as high as
402                                        transversions (DNA only).
403                Fixed gap penalty       Gap insertion penalty, lower value, more gaps
404                Floating gap penalty    Gap extension penalty, lower value, longer gaps
405
406
407
408        Comments:
409
410                        ClustalV is a directed multiple sequence alignment algorithm that
411                        aligns a set of sequences based on their level of similarity. It first
412                        uses a Lipman-Pearson pairwise similarity scoring to find "clusters"
413                        of similar sequences, and pre-aligns those sequences. It then adds
414                        other sequences to the alignment in the order of their similarity so as
415                        to produce the cleanest alignment.
416
417        Warning:
418
419                        ClustalV only uses unambiguous character codes. It will also
420                        convert all sequences to upper case in the process of aligning. Clustal
421                        does not pass back comments, author etc. Be sure to keep copies of your
422                        sequences if you do not wish to lose this information.
423
424
425    MFOLD - RNA secondary structure prediction
426
427        Author: Michael Zuker
428
429        Reference:
430
431                        M. Zuker
432                        On Finding All Suboptimal Foldings of an RNA Molecule.
433                        Science, 244, 48-52, (1989)
434
435                        J. A. Jaeger, D. H. Turner and M. Zuker
436                        Improved Predictions of Secondary Structures for RNA.
437                        Proc. Natl. Acad. Sci. USA, BIOCHEMISTRY, 86, 7706-7710, (1989)
438
439                        J. A. Jaeger, D. H. Turner and M. Zuker
440                        Predicting Optimal and Suboptimal Secondary Structure for RNA.
441                        in "Molecular Evolution: Computer Analysis of Protein and
442                        Nucleic Acid Sequences", R. F. Doolittle ed.
443                        Methods in Enzymology, 183, 281-306 (1989)
444
445        Parameters:
446
447                        Linear/circular RNA fold
448
449                        ct File to save results
450
451        Comments:
452
453                        MFOLD passes its output to a program Zuk_to_gen that translates the secondary
454                        structure prediction to a nested bracket ([]) notation.
455                        This notation can then be used in the Highlight Helix, and Draw
456                        Secondary structure (LoopTool) functions.
457
458                        MFOLD currently does not support much in the way of additional parameters.
459                        We hope to have all additional parameters available soon.
460
461
462    Blast - Basic Local Alignment Search Tool
463
464        Reference:
465
466                        Karlin, Samuel and Stephen F. Altschul (1990). Methods for
467                        assessing the statistical significance of molecular sequence
468                        features by using general scoring schemes, Proc. Natl. Acad.
469                        Sci. USA 87:2264-2268.
470
471                        Altschul, Stephen F., Warren Gish, Webb Miller, Eugene W.
472                        Myers, and David J. Lipman (1990). Basic local alignment
473                        search tool, J. Mol. Biol. 215:403-410.
474
475                        Altschul, Stephen F. (1991). Amino acid substitution
476                        matrices from an information theoretic perspective. J. Mol.
477                        Biol. 219:555-565.
478
479
480
481        Parameters:
482
483                        Which Database          Which nucleic or amino acid database
484                                                to search.
485
486                        Word Size               Length of initial hit. After locating a match of
487                                                this length, alignment extension is attempted.
488                        Blastn:
489                        Match score             Score for matches in secondary alignment extension
490                        Mismatch score          Score for mismatches in secondary alignment extension
491                        Blastx, tblastn, blastp, blast3:
492                        Substitution Matrix     PAM120 or PAM250
493
494        Comments:
495
496                        The report is loaded into a text editor. This should be saved as a new file
497                        as the default file is removed after execution. The latest version of blast
498                        can be obtained via anonymous ftp to ncbi.nlm.nih.gov.
499
500    FastA - Similarity search
501
502                Reference:
503
504                        W. R. Pearson and D. J. Lipman (1988),
505                        "Improved Tools for Biological Sequence Analysis", PNAS 85:2444-2448
506
507                        W. R. Pearson (1990) "Rapid and Sensitive Sequence
508                        Comparison with FASTP and FASTA" Methods in Enzymology 183:63-98
509
510                Parameters:
511
512                        Database        Which database to search
513                        Number of alignments to report
514                        SMATRIX         Which similarity matrix to use
515
516
517                Comments:
518
519                        The FastA package includes several additional programs for pairwise alignment.
520                        We have only included a bare bones link to FastA. We hope to include a more
521                        complete setup for the actual 2.2 release.
522
523
524
525
526    Assemble Contigs - CAP Contig Assembly Program
527
528                Author
529
530                        Xiaoqiu Huang
531                        Department of Computer Science
532                        Michigan Technological University
533                        Houghton, MI 49931
534                        E-mail: huang@cs.mtu.edu
535
536                        Minor modifications for I/O by S. Smith
537
538                Reference
539
540                        "A Contig Assembly Program Based on Sensitive Detection of
541                        Fragment Overlaps" (submitted to Genomics, 1991)
542
543                Parameters:
544
545                        Minimum overlap                 Number of bases required for overlap
546                        Percent match within overlap    Percentage match required in the overlap
547                                                        region before merge is allowed.
548
549                Comments:
550
551                        CAP returns the aligned sequences to the current editor window. The sequences are
552                        placed into contigs by setting the groupid. Cap does not change the order of the
553                        sequences, and so the results should be sorted by group and offset (see sort under
554                        the Edit menu).
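
                        Sorting by group and then by offset is a two-key sort. The Python
                        sketch below shows the idea on hypothetical records. The field names
                        "name", "groupid" and "offset" and the sample values are assumptions
                        for illustration and do not reflect GDE's internal data layout.

```python
from operator import itemgetter


def sort_by_group_and_offset(records):
    """Return records ordered by groupid first, then by offset."""
    return sorted(records, key=itemgetter("groupid", "offset"))


# Hypothetical fragments as they might look after contig assembly.
fragments = [
    {"name": "frag3", "groupid": 2, "offset": 0},
    {"name": "frag1", "groupid": 1, "offset": 120},
    {"name": "frag2", "groupid": 1, "offset": 0},
]

for record in sort_by_group_and_offset(fragments):
    print(record["groupid"], record["offset"], record["name"])
```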
555
556
557    Lsadt - Least squares additive tree analysis
558
559        Author:
560
561                Geert De Soete,
562                'C' implementation by Mike Maciukenas,
563                University of Illinois
564
565        Reference:
566
567                LSADT, 1983 Psychometrika, 1984,
568                Quality and Quantity
569
570        Parameters:
571
572                        Distance correction to use in distance matrix calculations (see count below).
573                        What should be used for initial parameters estimates.
574                        Random number seed.
575                        Display method (See TreeTool below).
576
577        Comments:
578
579                        The program has been rewritten in 'C' and will be included with the rRNA Database
580                        phylogenetic package being written at the University of Illinois Department of
581                        Microbiology.
582
583                        Count is a short program to calculate a distance matrix from a sequence
584                        alignment (see below).
585
586
587    Count - Distance matrix calculator
588
589        Author: Steven Smith
590
591        Parameters:
592
593                        Correction method (currently Jukes-Cantor or none),
594                        Include dashed columns,
595                        Match upper case to lower
596
597
598        Comments:
599
600                        Passes back a distance matrix in a format readable by LSADT.
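
                        For orientation, the standard Jukes-Cantor correction converts the
                        observed fraction p of differing positions into a distance
                        d = -(3/4) ln(1 - (4/3) p). The Python sketch below applies it to one
                        pair of aligned sequences. It is not the source of Count: the function
                        name and the way the two options (dashed columns, case matching) are
                        handled are assumptions made for illustration.

```python
import math


def jukes_cantor_distance(seq_a, seq_b, include_dashes=False, match_case=True):
    """Jukes-Cantor distance between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        # Optionally skip columns where either sequence has a gap.
        if not include_dashes and (a == "-" or b == "-"):
            continue
        # Optionally treat upper and lower case as the same character.
        if match_case:
            a, b = a.upper(), b.upper()
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    p = mismatches / compared
    if p == 0:
        return 0.0
    if p >= 0.75:
        return math.inf  # the correction is undefined at or beyond saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


print(jukes_cantor_distance("ACGT", "ACGA"))
```

                        A full distance matrix is obtained by calling the function for every
                        pair of sequences in the alignment.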
601
602
603
604    Treetool - Tree drawing/manipulation
605
606        Author: Michael Maciukenas, University of Illinois
607
608        Comments: See included documentation for TreeTool usage.
609
610
611
612    Readseq - format conversion program
613
614        Author: Don Gilbert
615
616        Parameters: Many, but can easily be run in interactive mode.
617
618        Comments:
619
620                        Readseq is a very useful program for format conversion.
621                        The latest version supports over a dozen different file formats, as
622                        well as formatting capabilities for publication. GDE makes use of Readseq
623                        for importing and exporting sequences, and as a filtering tool for
624                        some external functions.
625
626
627SECTION Copyright Notice
628
629        The Genetic Data Environment (GDE) software and
630        documentation are not in the public domain.
631        Portions of this code are owned and copyrighted by
632        the Board of Trustees of the University of
633        Illinois and by Steven Smith. External functions
634        used by GDE are the property of their respective
635        authors. This release of the GDE program and
636        documentation may not be sold, or incorporated into
637        a commercial product, in whole or in part without
638        the expressed written consent of the University of
639        Illinois and of its author, Steven Smith.
640
641        All interested parties may redistribute the GDE as
642        long as all copies are accompanied by this
643        documentation, and all copyright notices remain
644        intact. Parties interested in redistribution must do
645        so on a non-profit basis, charging only for cost of
646        media. Modifications to the GDE core editor should
647        be forwarded to the author Steven Smith. External
648        programs used by the GDE are copyright by, and are
649        the property of their respective authors unless
650        otherwise stated.
651
652
653        While all attempts have been made to ensure the
654        integrity of these programs:
655
656SECTION Disclaimer
657
658        THE UNIVERSITY OF ILLINOIS, HARVARD
659        UNIVERSITY AND THE AUTHOR, STEVEN
660        SMITH GIVE NO WARRANTIES, EXPRESSED
661        OR IMPLIED FOR THE SOFTWARE AND
662        DOCUMENTATION PROVIDED, INCLUDING,
663        BUT NOT LIMITED TO WARRANTY OF
664        MERCHANTABILITY AND WARRANTY OF
665        FITNESS FOR A PARTICULAR PURPOSE.
666        User understands the software is a research tool for
667        which no warranties as to capabilities or accuracy are
668        made, and user accepts the software "as is." User
669        assumes the entire risk as to the results and
670        performance of the software and documentation. The
671        above parties cannot be held liable for any direct,
672        indirect, consequential or incidental damages with
673        respect to any claim by user or any third party on
674        account of, or arising from the use of software and
675        associated materials. This disclaimer covers both the
676        GDE core editor and all external programs used by
677        the GDE.
678
679