Last change on this file was 10842, checked in by westram, 12 years ago
reintegrates 'help' into 'trunk': adds: log:branches/help@10647:10841 log:branches/helptest@10704:10720
Property svn:eol-style set to `native` Property svn:keywords set to `Author Date Id Revision`
2	| * ReadSeq -- 1 Feb 93
3	| *
4	| * Reads and writes nucleic/protein sequences in various
5	| * formats. Data files may have multiple sequences.
6	| *
7	| * Copyright 1990 by d.g.gilbert
8	| * biology dept., indiana university, bloomington, in 47405
9	| * e-mail: gilbertd@bio.indiana.edu
10	| *
11	| * This program may be freely copied and used by anyone.
12	| * Developers are encouraged to incorporate parts in their
13	| * programs, rather than devise their own private sequence
14	| * format.
15	| *
16	| * This should compile and run with any ANSI C compiler.
17	| * Please advise me of any bugs, additions or corrections.
18
19	Readseq has been updated. There have been a number of enhancements
20	and a few bug corrections since the previous general release in Nov 91
21	(see below). If you are using earlier versions, I recommend you update to
22	this release.
23
24	Readseq is particularly useful as it automatically detects many
25	sequence formats, and interconverts among them.
26	Formats added to this release include
27	+ MSF multi sequence format used by GCG software
28	+ PAUP's multiple sequence (NEXUS) format
29	+ PIR/CODATA format used by PIR
30	+ ASN.1 format used by NCBI
31	+ Pretty print with various options for nice looking output.
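
To give a feel for what automatic format detection involves, here is a minimal Python sketch that guesses a format from the first non-blank line of a file. It is an illustration only and does not reproduce readseq's actual detection logic; the function name sniff_format and the small set of rules are assumptions made for this example. The leading markers it checks (">" for Pearson/Fasta, ">XX;" for NBRF, "LOCUS" for GenBank, "ID" for EMBL, "#NEXUS" for PAUP) are the conventional starts of those formats.

```python
def sniff_format(text):
    """Guess a sequence format from the first non-blank line.

    Illustrative heuristic only; not readseq's real detection code.
    """
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.startswith(">"):
            # NBRF headers look like ">P1;name"; plain ">" is Pearson/Fasta.
            if len(line) > 3 and line[3] == ";":
                return "NBRF"
            return "Pearson/Fasta"
        if line.startswith("LOCUS"):
            return "GenBank/GB"
        if line.startswith("ID   "):
            return "EMBL"
        if line.upper().startswith("#NEXUS"):
            return "PAUP"
        return "unknown"
    return "unknown"


print(sniff_format(">seq1 example\nACGT\n"))
print(sniff_format("LOCUS       TEST\n"))
```

A real detector has to look further than one line, since several formats share similar openings.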
32
33	As well, Phylip format can now be used as input. Options to
34	reverse-complement and to degap sequences have been added. A menu
35	addition for users of the GDE sequence editor is included.
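
The two sequence operations mentioned above can be sketched in a few lines of Python. This is a minimal illustration of what degapping and reverse-complementing mean for a nucleotide string, not a copy of readseq's C code. The function names are hypothetical, the default gap symbol "-" is assumed from the -degap[=-] option, and the complement table uses the standard IUPAC nucleotide codes.

```python
# Complement table for IUPAC nucleotide codes, upper and lower case.
_COMPLEMENT = str.maketrans(
    "ACGTUMRWSYKVHDBNacgtumrwsykvhdbn",
    "TGCAAKYWSRMBDHVNtgcaakywsrmbdhvn",
)


def degap(seq, gap="-"):
    """Remove every occurrence of the gap symbol from the sequence."""
    return seq.replace(gap, "")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


aligned = "AAC--GT"
print(degap(aligned))
print(reverse_complement(degap(aligned)))
```

Characters that are not in the table, such as gap symbols, pass through the complement step unchanged, so degapping first keeps the result free of gaps.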
36
37	This program is available thru Internet gopher, as
38
39	gopher ftp.bio.indiana.edu
40	browse into the IUBio-Software+Data/molbio/readseq/ folder
41	select the readseq.shar document
42
43	Or thru anonymous FTP in this manner:
44	my_computer> ftp ftp.bio.indiana.edu (or IP address 129.79.224.25)
45	username: anonymous
46	password: my_username@my_computer
47	ftp> cd molbio/readseq
48	ftp> get readseq.shar
49	ftp> bye
50
51	readseq.shar is a Unix shell archive of the readseq files.
52	This file can be edited by any text editor to reconstitute the
53	original files, for those who do not have a Unix system or an
54	Unshar program. Read the top of this .shar file for further
55	instructions.
56
57	There are also pre-compiled executables for the following computers:
58	Silicon Graphics Iris, Sparc (Sun Sparcstation & clones), VMS-Vax,
59	Macintosh. Use binary ftp to transfer these, except Macintosh. The
60	Mac version is just the command-line program in a window, not very
61	handy.
62
63	C source files:
64	readseq.c ureadseq.c ureadasn.c ureadseq.h
65	Document files:
66	Readme (this doc)
67	Readseq.help (longer than this doc)
68	Formats (description of sequence file formats)
69	add.gdemenu (GDE program users can add this to the .GDEmenu file)
70	Stdfiles -- test sequence files
71	Makefile -- Unix make file
72	Make.com -- VMS make file
73	*.std -- files for testing validity of readseq
74
75
76	Example usage:
77	    readseq
78	        -- for interactive use
79	    readseq my.1st.seq my.2nd.seq -all -format=genbank -output=my.gb
80	        -- convert all of two input files to one genbank format output file
81	    readseq my.seq -all -form=pretty -nameleft=3 -numleft -numright -numtop -match
82	        -- output to standard output a file in a pretty format
83	    readseq my.seq -item=9,8,3,2 -degap -CASE -rev -f=msf -out=my.rev
84	        -- select 4 items from input, degap, reverse, and uppercase them
85	    cat *.seq | readseq -pipe -all -format=asn > bunch-of.asn
86	        -- pipe a bunch of data thru readseq, converting all to asn
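
When readseq is driven from a script, it can help to build the argument list in one place. The Python sketch below assembles the second example command from its parts. The helper name readseq_command and its parameters are hypothetical; only the flags -all, -format= and -output= are taken from the examples above.

```python
import shlex


def readseq_command(inputs, fmt, output=None, all_seqs=True, extra=()):
    """Build an argument list for readseq (hypothetical helper)."""
    args = ["readseq", *inputs]
    if all_seqs:
        args.append("-all")
    args.append(f"-format={fmt}")
    if output is not None:
        args.append(f"-output={output}")
    args.extend(extra)  # any further options, passed through unchanged
    return args


cmd = readseq_command(["my.1st.seq", "my.2nd.seq"], "genbank", output="my.gb")
print(shlex.join(cmd))
```

If a readseq executable is on the PATH, the list can be handed to subprocess.run(cmd, check=True). Passing a list rather than a single string avoids shell quoting problems with unusual file names.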
87
88
89	The brief usage of readseq is as follows. The "[]" denote
90	optional parts of the syntax:
91
92	readseq -help
93	readSeq (27Dec92), multi-format molbio sequence reader.
94	usage: readseq [-options] in.seq > out.seq
95	options
96	    -a[ll]              select All sequences
97	    -c[aselower]        change to lower case
98	    -C[ASEUPPER]        change to UPPER CASE
99	    -degap[=-]          remove gap symbols
100	    -i[tem=2,3,4]       select Item number(s) from several
101	    -l[ist]             List sequences only
102	    -o[utput=]out.seq   redirect Output
103	    -p[ipe]             Pipe (command line, <stdin, >stdout)
104	    -r[everse]          change to Reverse-complement
105	    -v[erbose]          Verbose progress
106	    -f[ormat=]#         Format number for output, or
107	    -f[ormat=]Name      Format name for output:
108	        |  1. IG/Stanford        10. Olsen (in-only)
109	        |  2. GenBank/GB         11. Phylip3.2
110	        |  3. NBRF               12. Phylip
111	        |  4. EMBL               13. Plain/Raw
112	        |  5. GCG                14. PIR/CODATA
113	        |  6. DNAStrider         15. MSF
114	        |  7. Fitch              16. ASN.1
115	        |  8. Pearson/Fasta      17. PAUP
116	        |  9. Zuker              18. Pretty (out-only)
117
118	Pretty format options:
119	    -wid[th]=#                  sequence line width
120	    -tab=#                      left indent
121	    -col[space]=#               column space within sequence line on output
122	    -gap[count]                 count gap chars in sequence numbers
123	    -nameleft, -nameright[=#]   name on left/right side [=max width]
124	    -nametop                    name at top/bottom
125	    -numleft, -numright         seq index on left/right side
126	    -numtop, -numbot            index on top/bottom
127	    -match[=.]                  use match base for 2..n species
128	    -inter[line=#]              blank line(s) between sequence blocks
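
The numbered format list in the help text above maps naturally onto a small lookup table. The Python sketch below resolves a -format value, given as either a number or a name, to a format number that is valid for output. The numbers and names come from the help text. The matching rule (case-insensitive, exact alias first, then a unique prefix) is an assumption for illustration and may differ from how readseq itself matches names.

```python
FORMATS = {
    1: "IG/Stanford", 2: "GenBank/GB", 3: "NBRF", 4: "EMBL", 5: "GCG",
    6: "DNAStrider", 7: "Fitch", 8: "Pearson/Fasta", 9: "Zuker",
    10: "Olsen", 11: "Phylip3.2", 12: "Phylip", 13: "Plain/Raw",
    14: "PIR/CODATA", 15: "MSF", 16: "ASN.1", 17: "PAUP", 18: "Pretty",
}
INPUT_ONLY = {10}   # Olsen is marked (in-only) in the help text
OUTPUT_ONLY = {18}  # Pretty is marked (out-only) in the help text


def resolve_output_format(value):
    """Return the format number for an output format given by number or name."""
    text = str(value).strip().lower()
    if text.isdigit():
        number = int(text)
    else:
        # Each slash-separated part of a name counts as an alias.
        aliases = [(alias.lower(), n)
                   for n, name in FORMATS.items()
                   for alias in name.split("/")]
        exact = {n for alias, n in aliases if alias == text}
        prefix = {n for alias, n in aliases if alias.startswith(text)}
        candidates = exact or prefix
        if len(candidates) != 1:
            raise ValueError(f"unknown or ambiguous format name: {value!r}")
        number = candidates.pop()
    if number not in FORMATS:
        raise ValueError(f"unknown format number: {number}")
    if number in INPUT_ONLY:
        raise ValueError(f"{FORMATS[number]} can only be read, not written")
    return number


for name in ("genbank", "asn", "msf", "pretty", "2"):
    print(name, "->", resolve_output_format(name))
```

Preferring an exact alias over a prefix lets "phylip" select format 12 even though "Phylip3.2" begins with the same letters.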
129
130
131
132	Recent changes:
133
134	4 May 92
135
136	+ added 32 bit CRC checksum as alternative to GCG 6.5bit checksum
137
138	Aug 92
139
140	= fixed Olsen format input to handle files w/ more sequences,
141	not to mess up when more than one seq has same identifier,
142	and to convert number masks to symbols.
143	= IG format fix to understand ^L
144
145	30 Dec 92
146
147	* revised command-line & interactive interface. Suggested form is now
148
149	readseq infile -format=genbank -output=outfile -item=1,3,4 ...
150
151	but remains compatible with prior commandlines:
152
153	readseq infile -f2 -ooutfile -i3 ...
154
155	+ added GCG MSF multi sequence file format
156	+ added PIR/CODATA format
157	+ added NCBI ASN.1 sequence file format
158	+ added Pretty, multi sequence pretty output (only)
159	+ added PAUP multi seq format
160	+ added degap option
161	+ added Gary Williams (GWW, G.Williams@CRC.AC.UK) reverse-complement option.
162	+ added support for reading Phylip formats (interleave & sequential)
163	* string fixes, dropped need for compiler flags NOSTR, FIXTOUPPER, NEEDSTRCASECMP
164	* changed 32bit checksum to default, -DSMALLCHECKSUM for GCG version
165
166	1 Feb 93
167
168	= reverted Genbank output format to fixed left margin
169	(change in 30 Dec release), so GDE and others relying on fixed margin
170	can read this.
