source: branches/stable/lib/submit/submiss.embl

                       SEQUENCE DATA SUBMISSION FORM


This form solicits the information needed for a  nucleotide  or  amino  acid
sequence  database  entry.   It  can  be  filled in using any text editor or
printed and filled in by  hand.   By  completing  and  returning  it  to  us
promptly  you  help  us  to  enter  your data in the database accurately and
rapidly.  These data will be shared among  the  following  databases:   EMBL
Data Library (Heidelberg, W.  Germany); GenBank (Los Alamos, NM, U.S.A.  and
Mountain View, CA, U.S.A.); DNA Data Bank of  Japan  (DDBJ;  Tokyo,  Japan);
National  Biomedical  Research  Foundation  Protein  Identification Resource
(NBRF-PIR; Washington, D.C.,  U.S.A.);  Martinsried  Institute  for  Protein
Sequence  Data  (MIPS;  Martinsried,  W.  Germany) and International Protein
Information Database in Japan (JIPID; Tokyo).

Please answer all questions which apply to your data.  If you  submit  2  or
more  non-contiguous  sequences,  copy  and  fill  out  this  form  for each
additional sequence.  When  submitting  nucleic  acid  sequences  containing
protein  coding  regions,  please include a translation (SEPARATELY from the
nucleic acid sequence).  Then send us (1) this form, (2) a pre-  or  reprint
of  any manuscript which pertains to these data, and (3) your sequence data.
You can send these materials (a) electronically via computer network, (b) on
magnetic tape, or (c) on a floppy diskette.  More detailed information about
formats for submitted data is included at the end of this form.

  our mailing address:     EMBL Data Library Submissions, Postfach 10.2209
                           D-6900 Heidelberg,  West Germany
  telephone:               (06221) 387 258
  computer network:        datasubs@embl.earn (for data submissions)
                           datalib@embl.earn  (for general inquiries)

Please include in your submission any additional sequence data which are not
reported  in  your  manuscript  but  which have been reliably determined (for
example, introns or flanking sequences).

When we receive this material we will assign the data an  accession  number,
which  serves  as  a  reference  that  permanently  identifies  them  in the
database.  We will inform you what accession  number  your  data  have  been
given  and  we  recommend  that you cite this number when referring to these
data in publications.

If new data become available  which  would  make  the  database  entry  more
informative  (e.g.,  function  of  the gene product or location of important
sites within the sequence), or if you discover errors in  the  sequence,  we
urge you to contact us so that we can update your entry.

Thank you.


I.  GENERAL INFORMATION
==============================================================================
Your name    $(YOUR NAME)
------------------------------------------------------------------------------
Institution  $(INSTITUTION)
------------------------------------------------------------------------------
Address      $(ADDRESS)
------------------------------------------------------------------------------
Computer mail address $(MAIL)  Telex number
------------------------------------------------------------------------------
Telephone $(PHONE)       Telefax number  $(TELEFAX)
==============================================================================
On what medium and in what format are you sending us your sequence data?
(see instructions at the end of this form)
  [X] electronic mail
  [ ] diskette
        computer:Commodore        operating system:MS DOS           editor:
  [ ] magnetic tape
        record length:             blocksize:              label type:
        density         [ ] 800    [ ] 1600     [ ] 6250
        character code  [ ] ASCII  [ ] EBCDIC
==============================================================================


II.  CITATION INFORMATION
==============================================================================
These data are  [ ] published  [X] in press  [ ] submitted  [ ] in preparation
                [ ] no plans to publish
------------------------------------------------------------------------------
authors $(author)
------------------------------------------------------------------------------
title of paper $(title)
------------------------------------------------------------------------------
journal       volume, first-last pages, $(journal)
------------------------------------------------------------------------------
Do you agree that these  data can be made  available in the  database before
they appear in print?
  [X] yes    [ ] no, they should be made available only after publication.
                 estimated date: $(DATE)
==============================================================================
Does the sequence  which you are  sending with this form  include  data that
do NOT appear in the above citation?
  [X] no    [ ] yes, from position ______ to ______  [ ] base pairs OR
                                                     [ ] amino acid residues
             (If your sequence contains 2 or more such spans, use the feature
              table in section IV to indicate their positions)
If so, how should these data be cited in the database?
  [ ] published  [ ] in press  [ ] submitted  [ ] in preparation
  [ ] no plans to publish
------------------------------------------------------------------------------
authors
------------------------------------------------------------------------------
address (if different from that given in section I)


------------------------------------------------------------------------------
title of paper

------------------------------------------------------------------------------
journal                     volume, first-last pages, year
==============================================================================
List references to papers  and/or  database  entries which report sequences
overlapping with that submitted here.

1st author     journal, vol., pages, year and/or database, accession number
------------------------------------------------------------------------------

------------------------------------------------------------------------------

==============================================================================


III.  DESCRIPTION OF SEQUENCED SEGMENT

Wherever possible, please use standard  nomenclature or conventions.  If  a
question  is not applicable to your sequence, answer by writing N.A. in the
appropriate space; if the information is relevant but not available,  write
a question mark (?).
==============================================================================
What kind of molecule did you sequence?   (check all boxes which apply)

 [X] genomic DNA     [ ] genomic RNA      [ ] virus  or  [ ] provirus
 [ ] cDNA to mRNA    [ ] cDNA to genomic RNA
 [ ] organelle DNA   [ ] organelle RNA    please specify organelle:
 [ ] tRNA            [ ] rRNA             [ ] snRNA      [ ] scRNA
 [ ] other nucleic acid.  please specify:
 [ ] peptide  [ ] sequence assembled by  [ ] overlap of sequenced fragments
                                         [ ] homology with related sequence
                                         [ ] other.  please specify:

              [ ] partial:               [ ] N-terminal
                                         [ ] C-terminal
                                         [ ] internal fragment
==============================================================================
length of sequence   $(SEQ_LEN)     [X] base pairs  or  [ ] amino acid residues
------------------------------------------------------------------------------
gene name(s) (e.g., lacZ) $(gene)
------------------------------------------------------------------------------
gene product name(s) (e.g., beta-D-galactosidase)  $(gene)
------------------------------------------------------------------------------
Enzyme Commission number (e.g., EC 3.2.1.23)
------------------------------------------------------------------------------
gene product subunit structure (e.g., hemoglobin alpha-2 beta-2)
==============================================================================
The following items refer to the  original source of the  molecule you have
sequenced.
  organism ---- name  $(full_name)
------------------------------------------------------------------------------
  sub-species                         strain  $(strain)
------------------------------------------------------------------------------
  name/number of individual or isolate  (e.g., patient 123; influenza virus
  A/PR/8/34)
------------------------------------------------------------------------------
  developmental stage                        [ ] germ line   [ ] rearranged
------------------------------------------------------------------------------
  haplotype                    tissue type                cell type
==============================================================================
The  following  items  refer  to the  immediate experimental  source of the
submitted sequence.
  name of cell line (e.g., HeLa; 3T3-L1)
------------------------------------------------------------------------------
  library (type; name)                          clone(s)
==============================================================================
The following items refer to the  position of the submitted sequence in the
genome.
  chromosome (or segment) name/number
------------------------------------------------------------------------------
  map position                   units:  [ ] genome %  [ ] nucleotide number
                                         [ ] other:
==============================================================================
Using single words or short phrases, describe the properties of the sequence
in terms of:

  -  its associated phenotype(s);
  -  the biological/enzymatic activity of its product;
  -  the general functional  classification of the gene  and/or gene product;
  -  macromolecules to which the gene product can bind  (e.g., DNA, calcium,
     other proteins);
  -  subcellular localization of the gene product;
  -  any other relevant information.

Example (for the viral erbB nucleotide sequence): transforming capacity; EGF
receptor-related; tyrosine kinase; oncogene; transmembrane protein.


==============================================================================


IV.  FEATURES OF THE SEQUENCE

Please  list  below  the  types  and  locations of all significant  features
experimentally  identified within the sequence.   Be sure that your sequence
is numbered beginning with "1."

In the column marked                   fill in

      feature          type of feature (see information below)
      from             number of first base/amino acid in the feature
      to               number of last base/amino acid in the feature
      bp               x, if numbering  refers to position of a base pair in
                       a nucleotide sequence
      aa               x, if numbering  refers to  position of an amino acid
                       residue in a peptide sequence
      id               indicate  method by which the feature was identified.
                       E  =  experimentally;  S  =  by similarity  to  known
                       sequence or to an established consensus sequence; P =
                       by similarity  to  some  other  pattern,  such  as an
                       open reading frame
      comp             x, if feature is  located on the  nucleic acid strand
                       complementary to that reported here
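
The column definitions above can be checked in a small script before the rows are typed into the form. The sketch below is illustrative only: the Feature class, its field names, and the row() layout are assumptions made for this example and are not part of the form or of any EMBL software.

```python
from dataclasses import dataclass

# Hypothetical record for one feature-table row; all names are illustrative.
ID_METHODS = {"E", "S", "P"}  # experimental, similarity to sequence, other pattern


@dataclass
class Feature:
    name: str                 # "feature" column
    start: int                # "from" column (1-based)
    end: int                  # "to" column (1-based, inclusive)
    unit: str = "bp"          # "bp" or "aa"
    method: str = ""          # "id" column: E, S, P, or blank
    complement: bool = False  # "comp" column

    def __post_init__(self):
        # Assumes linear numbering that starts at 1, as the form requests.
        if self.start < 1 or self.end < self.start:
            raise ValueError(f"bad span {self.start}..{self.end}")
        if self.unit not in ("bp", "aa"):
            raise ValueError(f"unit must be 'bp' or 'aa', got {self.unit!r}")
        if self.method and self.method not in ID_METHODS:
            raise ValueError(f"id must be E, S or P, got {self.method!r}")
        if self.complement and self.unit != "bp":
            raise ValueError("comp applies only to nucleic acid features")

    def row(self):
        """Return the seven column values in the order used by the table."""
        return [
            self.name, str(self.start), str(self.end),
            "x" if self.unit == "bp" else "",
            "x" if self.unit == "aa" else "",
            self.method,
            "x" if self.complement else "",
        ]


tata = Feature("TATA box", 1, 8, method="S")
print(tata.row())
```

Rejecting a span whose end precedes its start is a design choice of this sketch; it follows from the "from" and "to" definitions above but may need relaxing for circular molecules.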

Significant features include:

  -  regulatory signals (e.g., promoters, attenuators, enhancers)
  -  transcribed  regions  (e.g., mRNA, rRNA, tRNA).  (indicate reading frame
     if start and stop codons are not present)
  -  regions  subject to  post-transcriptional modification  (e.g.,  introns,
     modified bases)
  -  translated regions
  -  extent of  signal  peptide,  prepropeptide,  propeptide,  mature peptide
  -  regions subject to post-translational modification  (e.g.,  glycosylated
     or phosphorylated sites)
  -  other  domains/sites  of  interest  (e.g.,  extracellular  domain,  DNA-
     binding domain, active site, inhibitory site)
  -  sites involved in bonding (disulfide, thiolester, intrachain, interchain)
  -  regions of protein secondary structure  (e.g., alpha helix or beta sheet)
  -  conflicts with sequence data reported by other authors
  -  variations and polymorphisms

The first 2 lines of the table are filled in with examples.

==============================================================================
Numbering for features on submitted sequence  [X] matches manuscript
                                              [ ] does not match manuscript
==============================================================================
             feature             from        to         bp  aa   id    comp
------------------------------------------------------------------------------
EXAMPLE     TATA box              1           8          x        S
------------------------------------------------------------------------------
EXAMPLE      exon 1               9          264         x
==============================================================================
          $(gene)                 1     $(SEQ_LEN)      x

------------------------------------------------------------------------------
$(tax)
------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

==============================================================================



FORMATS FOR SUBMITTED DATA

We are happy to accept data submitted in any of the following formats:

(1) Electronic file transfer:  files can be sent via  computer  network  to:
DATASUBS@EMBL.EARN.   This  BITNET/EARN  address  can be reached via various
gateways from Arpanet, Usenet, JANET, etc.  Ask your  local  network  expert
for help or phone us.

(2) Magnetic tapes:  9-track only  (fixed-length  records  preferred);  800,
1600 or 6250 bpi (any blocksize); ASCII or EBCDIC character codes; any label
type or unlabelled.

(3) Floppy disks:  we can read Macintosh diskettes and 5-1/4" diskettes from
MS-DOS systems.

Whatever format you choose, we would appreciate receiving the sequence  data
in a form which  conforms as closely as possible to the  following standards.

  -  Each sequence should include the names of the authors.

  -  Each distinct sequence should be listed separately using the same number
     of  bases/residues  per  line.   The  length of each  sequence in bases/
     residues should be clearly indicated.

  -  Enumeration should begin with a "1" and continue in the  direction 5' to
     3' (or amino- to carboxy-terminus).

  -  Amino acid sequences should be listed using the one-letter code.

  -  Translations of protein  coding  regions in  nucleotide sequences should
     be submitted in a  separate computer file from the  nucleotide sequences
     themselves.

  -  The code for representing the sequence  characters should conform to the
     IUPAC-IUB standards, which are described in:  Nucl. Acids Res. 13: 3021-
     3030  (1985)  (for  nucleic  acids)  and  J. Biol. Chem. 243:  3557-3559
     (1968) and Eur. J. Biochem. 5: 151-153 (1968) (for amino acids).
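
As a rough illustration of the standards listed above, a nucleotide sequence could be checked and laid out with a few lines of Python before it is sent. This is a sketch under stated assumptions, not a required tool: the function name layout_sequence and the default of 60 bases per line are invented for this example, and the form itself does not prescribe a line width.

```python
# Hypothetical helper; the name and the 60-per-line default are assumptions.
# Single-letter nucleotide codes from the 1985 IUPAC-IUB recommendations.
IUPAC_NUCLEOTIDES = set("ACGTURYMKSWHBVDN")


def layout_sequence(raw, per_line=60):
    """Return (length, lines) with the same number of bases on every full line."""
    seq = "".join(raw.split()).upper()  # drop whitespace, normalise case
    unknown = sorted(set(seq) - IUPAC_NUCLEOTIDES)
    if unknown:
        raise ValueError(f"non-IUPAC characters: {''.join(unknown)}")
    lines = [seq[i:i + per_line] for i in range(0, len(seq), per_line)]
    return len(seq), lines


width = 5
length, lines = layout_sequence("acgtacgtnn ryk", per_line=width)
print(f"length: {length}")
for index, line in enumerate(lines):
    # The printed number is the 1-based position of the first base on the line.
    print(f"{index * width + 1:>6} {line}")
```

Only the final line may be shorter than the others, and numbering starts at 1 in the 5' to 3' direction, matching the standards above.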

$(SEQUENCE)
