rev 1/21/94

                         GenBank DATA SUBMISSION FORM

This form solicits the information needed for a nucleotide and/or amino acid
sequence data bank entry. By completing and returning it to us promptly you
help us to enter your data in the database accurately and rapidly.

Please answer all questions which apply to your data. If you submit two or
more non-contiguous sequences, please copy and fill out this form for each
additional sequence. Please include in your submission any additional
sequence data which is not reported in your manuscript but which has been
reliably determined (for example, introns or flanking sequences). When
submitting nucleic acid sequences containing protein coding regions, please
include a translation (SEPARATELY from the nucleic acid sequence).

Then send (1) this form, (2) a copy of your manuscript (if available) and
(3) your sequence data (in machine readable form) to the address shown below.
Information about the various ways you can send us your data and about
formats for the sequence data is given in the following sections.

SUBMITTING DATA TO GENBANK

We can process sequence and annotation data submitted in any of the following
ways:

1. ELECTRONIC FILE TRANSFER: files can be sent via Internet e-mail to
   gb-sub@ncbi.nlm.nih.gov

2. FLOPPY DISKS: Macintosh or DOS systems (all sizes and densities): if using
   word processing software, the file should be sent as an ASCII text file
   rather than as a software-specific file.

3. PRINTED COPY: as a last resort only! Please do not reduce the size of the
   letters in the sequence.

Our address is:

    GenBank Submissions
    National Center for Biotechnology Information
    8600 Rockville Pike
    Building 38A, Room 8N-803
    Bethesda, MD 20894
    U.S.A.

    Telephone:       (301) 496-2475
    Electronic mail: gb-sub@ncbi.nlm.nih.gov

ACCESSION NUMBERS

An accession number is permanently assigned to each sequence submitted to the
database.
We will assign an accession number upon receipt of this form and return it to
you within seven days, or contact you if there are errors. We recommend that
you cite this number when referring to both these data and the article where
they were originally reported. If you are forwarding this number on to a
journal, please send a photocopy or facsimile of the notification received
from GenBank; do not send the number over the telephone.

If your manuscript has already been accepted for publication, the accession
number should be included at the galley proof stage as a note added in proof.
If the journal has not already provided a format, we suggest that the note
added to the manuscript or in the galley proof should be inserted as a
footnote on the title page and read approximately as follows:

    "The nucleotide sequence data reported in this paper have been submitted
    to GenBank and assigned the accession number M12345."

FORMATS FOR SUBMITTED DATA

We would appreciate receiving the sequence data in a form which conforms as
closely as possible to the following standards:

o Each sequence should include the names of the authors.

o Each distinct sequence should be listed separately using the same number
  of bases/residues per line and clearly indicating its length in
  bases/residues.

o Enumeration of distinct sequences should begin with a "1" and ascend in
  the direction 5' to 3' (amino- to carboxy-terminus).

o Amino acid sequences should be listed using the one-letter code.

The code for representing the sequence characters should conform to the
IUPAC-IUB standards, which are described in the following references:
Nucl. Acids Res. 13: 3021-3030 (1985) for nucleotides, and
J. Biol. Chem. 243: 3557-3559 (1968) for amino acids.
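
The formatting standards above can be checked before a file is sent. The
following Python sketch is an illustration only, not a GenBank tool: the
function names, the header lines and the default of 60 bases per line are all
assumptions made for this example (the form asks only that every line use the
same number of bases). It rejects characters outside the IUPAC-IUB nucleotide
code and numbers the sequence from 1.

```python
# IUPAC-IUB nucleotide codes (Nucl. Acids Res. 13: 3021-3030, 1985).
IUPAC_NUCLEOTIDES = set("ACGTURYKMSWBDHVN")


def invalid_positions(sequence):
    """Return (position, character) pairs for non-IUPAC characters.

    Positions are numbered from 1, as the form requires.
    """
    return [(i, ch) for i, ch in enumerate(sequence, start=1)
            if ch.upper() not in IUPAC_NUCLEOTIDES]


def format_sequence(sequence, authors, per_line=60):
    """Lay out a sequence with a fixed number of bases per line.

    per_line=60 and the "Authors:"/"Length:" header lines are
    assumptions made for this sketch, not a GenBank requirement.
    """
    bad = invalid_positions(sequence)
    if bad:
        raise ValueError(f"non-IUPAC characters at: {bad}")
    lines = [f"Authors: {authors}", f"Length: {len(sequence)} bases"]
    for start in range(0, len(sequence), per_line):
        # Prefix each line with the 1-based position of its first base.
        lines.append(f"{start + 1:>8} {sequence[start:start + per_line]}")
    return "\n".join(lines)
```

For example, print(format_sequence(my_sequence, "Smith J.")) writes the
author line, the length line, and then the numbered sequence lines; a
sequence containing a character such as "X" raises an error that lists the
offending positions.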
_________________________________________________

These data will be shared among the following databases: EMBL Data Library
(Heidelberg, Federal Republic of Germany); GenBank (Bethesda, MD, U.S.A.);
DNA Data Bank of Japan (DDBJ; Mishima, Japan); National Biomedical Research
Foundation Protein Identification Resource (NBRF-PIR; Washington, D.C.,
U.S.A.); Martinsried Institute for Protein Sequence Data (MIPS; Martinsried,
Federal Republic of Germany) and International Protein Information Database
in Japan (JIPID; Noda, Japan).

I. GENERAL INFORMATION

==============================================================================
Your last name                  first name              middle initials
------------------------------------------------------------------------------
Institution
------------------------------------------------------------------------------
Address
------------------------------------------------------------------------------
Computer mail address                         Telex number
------------------------------------------------------------------------------
Telephone                                     Telefax number
==============================================================================
On what medium and in what format are you sending us your sequence data?
(see instructions at the beginning of this form)

[ ] electronic mail
[ ] diskette        computer:               operating system:
                    editor:                 filename:
[ ] magnetic tape (specify format)
==============================================================================

II. CITATION INFORMATION

==============================================================================
These data represent  [ ] new submission
                      [ ] correction (if correction, Accession number:      )
==============================================================================
These data are  [ ] published  [ ] in press  [ ] submitted
                [ ] in preparation  [ ] no plans to publish
------------------------------------------------------------------------------
authors
------------------------------------------------------------------------------
title of paper
------------------------------------------------------------------------------
journal                         volume, first-last pages, year
------------------------------------------------------------------------------
Do you agree that these data can be made available in the database before
they appear in print?

[ ] yes  [ ] no, they can be made available after:              (date)
==============================================================================
Does the sequence which you are sending with this form include data that does
NOT appear in the above citation?

[ ] no  [ ] yes, from position _______ to _______
                 [ ] bases OR [ ] amino acid residues

(If your sequence contains 2 or more such spans, use the feature table in
section IV to indicate their positions)

If so, how should these data be cited in the database?
[ ] published  [ ] in press  [ ] submitted
[ ] in preparation  [ ] no plans to publish
------------------------------------------------------------------------------
authors
------------------------------------------------------------------------------
address (if different from that given in section I)
------------------------------------------------------------------------------
title of paper
------------------------------------------------------------------------------
journal                         volume, first-last pages, year
==============================================================================
List references to papers and/or database entries which report sequences
overlapping with that submitted here.

1st author      journal, vol., pages, year and/or database, accession number
------------------------------------------------------------------------------

------------------------------------------------------------------------------

==============================================================================

III. DESCRIPTION OF SEQUENCED SEGMENT

Wherever possible, please use standard nomenclature or conventions. If a
question is not applicable to your sequence, answer by writing N.A. in the
appropriate space; if the information is relevant but not available, write a
question mark (?).

==============================================================================
What kind of molecule did you sequence? (check all boxes which apply)

[ ] genomic DNA    [ ] genomic RNA
[ ] cDNA to mRNA   [ ] cDNA to genomic RNA
[ ] organelle DNA  [ ] organelle RNA      please specify organelle:
[ ] tRNA  [ ] rRNA  [ ] snRNA  [ ] scRNA

for viruses:  [ ] virus or [ ] provirus or [ ] viroid
              [ ] DNA or [ ] RNA
              [ ] ds or [ ] ss or [ ] circular
              [ ] enveloped or [ ] nonenveloped

[ ] other nucleic acid. please specify:

[ ] peptide
    [ ] sequence assembled by  [ ] overlap of sequenced fragments
                               [ ] homology with related sequence
                               [ ] other. please specify:
    [ ] partial:  [ ] N-terminal  [ ] C-terminal  [ ] internal fragment
==============================================================================
length of sequence              [ ] bases or [ ] amino acid residues
------------------------------------------------------------------------------
gene name(s) (e.g., lacZ)
------------------------------------------------------------------------------
gene product name(s) (e.g., beta-D-galactosidase)
------------------------------------------------------------------------------
Enzyme Commission number (e.g., EC 3.2.1.23)
------------------------------------------------------------------------------
gene product subunit structure (e.g., hemoglobin alpha-2 beta-2)
==============================================================================
The following items refer to the original source of the molecule you have
sequenced.

organism (species) (e.g., Mus musculus)       plant cultivar
------------------------------------------------------------------------------
strain (e.g., K12, BALB/c)                    substrain
------------------------------------------------------------------------------
name/number of individual/isolate (e.g., patient 123; influenza virus
A/PR/8/34)
------------------------------------------------------------------------------
developmental stage                           [ ] germ line  [ ] rearranged
------------------------------------------------------------------------------
haplotype                 tissue type                 cell type
------------------------------------------------------------------------------
allele                    variant                     [ ] macronuclear
==============================================================================
The following items refer to the immediate experimental source of the
submitted sequence.
name of cell line (e.g., HeLa; 3T3-L1) or plant cultivar
------------------------------------------------------------------------------
clone library                   clone(s), subclone(s)
==============================================================================
The following items refer to the position of the submitted sequence in the
genome.

chromosome (or segment) name/number
------------------------------------------------------------------------------
map position        units:  [ ] genome %  [ ] nucleotide number  [ ] other:
==============================================================================
Using single words or short phrases, describe the properties of the sequence
in terms of:

- its associated phenotype(s);
- the biological/enzymatic activity of its product;
- the general functional classification of the gene and/or gene product;
- macromolecules to which the gene product can bind (e.g., DNA, calcium,
  other proteins);
- subcellular localization of the gene product;
- any other relevant information.

Example (for the viral erbB nucleotide sequence): transforming capacity; EGF
receptor-related; tyrosine kinase; oncogene; transmembrane protein.

==============================================================================

IV. FEATURES OF THE SEQUENCE

Please list below the types and locations of all significant features
experimentally identified within the sequence. Be sure that your sequence is
numbered beginning with "1." Use < or > if a feature extends beyond the
beginning or end of the indicated sequence span.

In the column
marked          fill in

feature         type of feature (see information below)
from            number of first base/amino acid in the feature
to              number of last base/amino acid in the feature
bp              an "x" if numbering refers to position of a base pair in a
                nucleotide sequence
aa              an "x" if numbering refers to position of an amino acid
                residue in a peptide sequence
id              indicate method by which the feature was identified.
                E = experimentally; S = by similarity with known sequence
                or to an established consensus sequence; P = by similarity
                to some other pattern, such as an open reading frame
comp            an "x" for a nucleotide sequence feature located on strand
                complementary to that reported here

Significant features include:

- regulatory signals (e.g., promoters, attenuators, enhancers)
- transcribed regions (e.g., mRNA, rRNA, tRNA). (indicate reading frame if
  start and stop codons are not present)
- regions subject to post-transcriptional modification (e.g., introns,
  modified bases)
- translated regions
- extent of signal peptide, prepropeptide, propeptide, mature peptide
- regions subject to post-translational modification (e.g., glycosylated or
  phosphorylated sites)
- other domains/sites of interest (e.g., extracellular domain, DNA-binding
  domain, active site, inhibitory site)
- sites involved in bonding (disulfide, thiolester, intrachain, interchain)
- regions of protein secondary structure (e.g., alpha helix or beta sheet)
- conflicts with sequence data reported by other authors
- variations and polymorphisms

The first 2 lines of the table are filled in with examples.
==============================================================================
Numbering for features on submitted sequence
[ ] matches manuscript  [ ] does not match manuscript
==============================================================================
feature                       from      to        bp    aa    id    comp
------------------------------------------------------------------------------
EXAMPLE  TATA box             1         8         x           S
------------------------------------------------------------------------------
EXAMPLE  exon 1               9         >264      x
==============================================================================

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

==============================================================================

V. SEQUENCE DATA

Please enter the nucleotide sequence data here:



Please enter the translated amino acid sequence here: