CA1341067C

CA1341067C - Construction of synthetic dna and its use in large polypeptide synthesis

Info

Publication number: CA1341067C
Application number: CA000550963A
Authority: CA
Inventors: Franco A. Ferrari; Joseph Capello; Charles Richardson; John W. Crissman; Thomas J. Pollock; Stuart C. Causey; James Chambers
Original assignee: Protein Polymer Technologies Inc
Current assignee: Protein Polymer Technologies Inc
Priority date: 1986-11-04
Filing date: 1987-11-03
Publication date: 2000-08-01
Anticipated expiration: 2017-08-01

Abstract

Methods are provided for the production of large polypeptides containing repeating sequences of amino acids utilizing biochemical techniques, specifically DNA sequences coding for the expression of the large polypeptides. Systems utilizing exogenous transcriptional and translational regions to control the production of the large polypeptides are also provided.

Description

CONSTRUCTION OF SYNTHETIC DNA AND ITS
USE IN LARGE POLYPEPTIDE SYNTHESIS

The field is related to the production of high-molecular-weight polymers, either nucleic acids or peptides that are the expression products of the nucleic acids, and is particularly related to the production of high-molecular-weight peptides containing repeating sequences by biochemical processes, the peptides finding use as structured materials.
Recombinant DNA technology has been applied in the isolation of natural genes and the expression of these genes in a variety of host cells. Typically, this technology has had utility in producing biologically active polypeptides, such as interferons or peptide hormones, which were impractical to produce in useful amounts by other means. It was also possible to produce modified proteins by isolating natural genes and utilizing the techniques of site-specific, in vitro mutagenesis to alter these genes and thereby change the polypeptides produced. Other polypeptides have been created by combining sections of various native genes to produce new polypeptides that are chimeric molecules of the several naturally occurring molecules.
With the advent of efficient and automated methods for the chemical synthesis of DNA, it has become possible to synthesize entire genes and to modify such synthetic genes at will during the course of synthesis. However, these various technologies have been applied to the production of natural or modified versions of natural polypeptides. There have been very few attempts to use these technologies to create substantially new polypeptides. In nature, polypeptides have a wide range of chemical, physical and physiological characteristics. Nevertheless, there are commercial applications for which known, naturally occurring polypeptides are not appropriate.
While biotechnology is versatile, usually it has been limited in its applications to naturally occurring products or modifications of naturally occurring molecules. One great strength of organic chemical synthesis, by contrast, has been the ability to transform inexpensive carbon materials to a wide variety of polymeric molecules, including naturally occurring molecules, but most importantly entirely new chemical structures, such as polypropylene and polyacrylates, which have defined and predicted chemical properties not associated with naturally occurring molecules.
Such materials, particularly high-molecular-weight polymers containing repeating sequences of amino acids, have proven difficult to produce by biochemical means. The genes necessary for producing large peptides containing repeating units of amino acids were unstable and often underwent intermolecular recombination causing deletions of repeating units in the gene. The development of a biotechnology which would produce polymeric molecules by biological processes similar to those available by organic synthesis would significantly broaden the range of applications of biotechnology.
The cloning of multiple lactose operators up to four in tandem is disclosed by Sadler et al., Gene, (1980) 8:279-300. Hybrid bacterial plasmids containing highly repeated satellite DNA are disclosed by Brutlag et al., Cell, (1977) 10:509-519. The synthesis of a poly(aspartyl-phenylalanine) in bacteria is disclosed by Doel et al., Nucleic Acids Research, (1980) 8:4575-4592. A method for enriching for proline content by cloning a plasmid which codes for the production of a proline polymer was disclosed by Kangas et al., Applied and Environmental Microbiology, (1982) 43:629-635. The biological limitations on the length of highly repetitive DNA sequences that may be stably maintained within plasmid replicons are discussed by Gupta et al. in Bio/Technology, p. 602-609, September 1983.
This invention provides a DNA sequence encoding a peptide containing an oligopeptide repeating unit, which repeating unit is characterized by containing at least three different amino acids and a total of from 4 to 30 amino acids, there being at least two repeating units in said peptide and at least two identical amino acids in each repeating unit and wherein said units are optionally joined by an amino acid bridge of from about 1 to 15 amino acids, said DNA sequence having the following formula:
$K_k (W M X_x N Y_y)_i L_l$
wherein:
K and L are each DNA sequences encoding an amino acid sequence of from about 1 to 100 amino acids, K and L being fewer than about 20 number % of the total amino acids;
k and l are 0 or 1;

W is of the formula:
$[(A)_n (B)_p]_q$
A is a DNA sequence coding for said oligopeptide repeating unit where A will contain from about 12 to 90 nt, at least two codons coding for said identical amino acid in said repeating units being different, where there will be at least two different A's differing by at least one nucleotide;
B is a DNA sequence different from A coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45 nt, where the B units may be the same or different;
n is an integer in the range of 1 to 100;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90 nt;
M and N are the same or different and are a DNA sequence of codons and are of 0 to 18 nt in reading frame with W and X respectively;
X is the same as or different from W and is of the formula:
$[(A1)_{n1} (B1)_{p1}]_{q1}$
Y is the same as or different from W and is of the formula:
$[(A2)_{n2} (B2)_{p2}]_{q2}$
wherein:
all of the symbols come within the definitions of their letter counterparts;
x and y are 0 or 1;
i is 1 to 100; and
the total of q, q1, and q2 is at least 1 and not greater than about 50.
This invention also provides a DNA sequence encoding a peptide containing an oligopeptide repeating unit, which repeating unit is characterized by containing at least three different amino acids and a total of from 4 to 30 amino acids, there being at least two repeating units in said peptide and at least two identical amino acids in each repeating unit and wherein said units are optionally joined by an amino acid bridge of from about 1 to 15 amino acids, said DNA sequence having the following formula:
$K_k [(A)_n (B)_p]_q L_l$
wherein:
K and L are each DNA sequences encoding an amino acid sequence of from about 1 to 100 amino acids, K and L being fewer than about 20 number % of the total amino acids;
k and l are 0 or 1;
A is a DNA sequence coding for said oligopeptide repeating unit where A will contain from about 12 to 90 nt, at least two codons coding for said identical amino acid in said repeating units being different, where there will be at least two different A's differing by at least one nucleotide;
B is a DNA sequence different from A coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45 nt, where the B units may be the same or different;
n is an integer in the range of 1 to 100;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90 nt.
This invention also provides a DNA sequence encoding a peptide containing an oligopeptide repeating unit, which repeating unit is characterized by comprising the amino acid sequence GAGAGS or GVGVP, said DNA sequence comprising the following formula:
$[(A)_n (B)_p]_q$
wherein:
A is a DNA sequence coding for said oligopeptide repeating unit, wherein the codons for at least two G's in the same or different A's are different, where there will be at least two different A's differing by at least one nucleotide;
B is a DNA sequence different from A coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45 nt, where the B units may be the same or different;
n is an integer in the range of 5 to 25;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90 nt.
This invention also provides a polypeptide comprising the recombinant DNA expression product of a DNA sequence encoding a peptide having repeating units of an oligopeptide, which oligopeptide is characterized by having at least three different amino acids and a total of from 4 to 30 amino acids, there being at least two repeating units in said peptide and at least two identical amino acids in each repeating unit and wherein said units are optionally joined by an amino acid bridge of from about 1 to 15 amino acids, said DNA sequence having the following formula:
$K_k (W M X_x N Y_y)_i L_l$
wherein:
K and L are each DNA sequences encoding an amino acid sequence of from about 1 to 100 amino acids, K and L being fewer than about 20 number % of the total amino acids;
k and l are 0 or 1;
W is of the formula:
$[(A)_n (B)_p]_q$
A is a DNA sequence coding for said oligopeptide repeating unit where A will contain from about 12 to 90 nt, at least two codons coding for said identical amino acids in said repeating unit being different, where there will be at least two different A's differing by at least one nucleotide;
B is a DNA sequence different from A coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45 nt, where the B units may be the same or different;
n is an integer in the range of 1 to 100;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90 nt.
This invention also provides a peptide comprising the recombinant DNA expression product of a DNA sequence encoding a peptide having repeating units of an oligopeptide, which oligopeptide is characterized by having at least three different amino acids and a total of from 4 to 30 amino acids, there being at least two repeating units in said peptide and at least two identical amino acids in each repeating unit and wherein said units are optionally joined by an amino acid bridge of from about 1 to 15 amino acids, said DNA sequence having the following formula:
$K_k [(A)_n (B)_p]_q L_l$
wherein:
K and L are each DNA sequences encoding an amino acid sequence of from about 1 to 100 amino acids, K and L being fewer than about 20 number % of the total amino acids;
k and l are 0 or 1;
A is a DNA sequence coding for said oligopeptide repeating unit where A will contain from about 12 to 90 nt, at least two codons coding for said identical amino acids in said repeating unit being different, where there will be at least two different A's differing by at least one nucleotide;
B is a DNA sequence different from A coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45 nt, where the B units may be the same or different;
n is an integer in the range of 1 to 100;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90 nt.
This invention also provides a regulatory system for regulated transcription in a prokaryotic host comprising:
(1) a DNA sequence which comprises a structural gene encoding an RNA polymerase exogenous to said host which is capable of transcribing DNA to messenger RNA and which is under the transcriptional control of an inducible promoter and
(2) a structural gene encoding a polypeptide of interest under the transcriptional control of a promoter which is not functional with the endogenous RNA polymerase of said host, but is functional with said exogenous RNA polymerase.
This invention also provides a method for producing a peptide comprising the recombinant DNA expression product of a DNA sequence encoding a peptide having repeating units of an oligopeptide, which oligopeptide is characterized by having at least three different amino acids and a total of from 4 to 30 amino acids, there being at least two repeating units in said peptide and at least two identical amino acids in each repeating unit and wherein said units are optionally joined by an amino acid bridge of from about 1 to 15 amino acids, said DNA sequence having the following formula:
$K_k (W M X_x N Y_y)_i L_l$
wherein:
K and L are each DNA sequences encoding an amino acid sequence of from about 1 to 100 amino acids, K and L being fewer than about 20 number % of the total amino acids;
k and l are 0 or 1;
W is of the formula:
$[(A)_n (B)_p]_q$
A is a DNA sequence coding for said oligopeptide repeating unit where A will contain from about 12 to 90 nt, at least two codons coding for said identical amino acids in said repeating unit being different, where there will be at least two different A's differing by at least one nucleotide;
B is a DNA sequence different from A coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45 nt, where the B units may be the same or different;
n is an integer in the range of 1 to 100;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90 nt;
M and N are the same or different and are a DNA sequence of codons and are of 0 to 18 nt in reading frame with W and X respectively, encoding an amino acid sequence;
X is the same as or different from W and is of the formula:
$[(A1)_{n1} (B1)_{p1}]_{q1}$
Y is the same as or different from W and is of the formula:
$[(A2)_{n2} (B2)_{p2}]_{q2}$
wherein:
all of the symbols come within the definitions of their letter counterparts;
x and y are 0 or 1;
i is 1 to 100; and
the total of q, q1, and q2 is at least 1 and not greater than about 50;
said method comprising:
growing a prokaryotic host according to claim 20, wherein said DNA sequence is under the transcriptional and translational regulation of initiation and termination regulatory regions functional in said host, whereby said peptide is expressed; and
isolating the expression product.
This invention also provides a method of preparing a synthetic DNA sequence having at least about 80% of the synthesized sequence encoding repeating units of from 4 to 8 amino acids, said repeating units being varied as to nucleotide sequence utilizing codon redundancy and encoding a protein of at least 10 kdal, said method comprising:
(1) synthesizing a DNA monomer encoding not greater than 200 amino acids,
(2) cloning said monomer in a cloning vector,
(3) sequencing said monomer either in portions or in its entirety,
(4) excising said monomer from said cloning vector, and
(5) oligomerizing said monomer to provide at least one multimer comprising at least two monomers;
wherein two or more different multimers encoding different amino acid units may be joined together to form a block copolymer and wherein the sequences of said monomer and vector are selected to permit insertion of said segments and excision of said monomer by restriction enzyme digestion.
This invention also provides a method of preparing a synthetic DNA sequence having at least about 80% of the synthesized sequence encoding repeating units of from 4 to 8 amino acids, said repeating units being varied as to nucleotide sequence utilizing codon redundancy and encoding a protein of at least 10 kdal, said method comprising:
in a first step synthesizing a monomer by:
(A) (1) synthesizing at least two different double stranded sections of DNA of from about 12 to 120 bases, (2) cloning said sections of DNA in a cloning vector to form a prior segment, (3) sequencing said prior segment to ensure the fidelity of replication of said segment, (4) sequentially adding one or more additional DNA segments comprising said sections of DNA 3' or 5' of the prior segment in reading frame with the prior segment by repeatedly cloning said sections of DNA into a cloning vector with the prior segment to form a monomer, and sequencing the prior segment and additional segments to ensure fidelity of replication, and (5) excising the monomer from the cloning vector;
or (B) (1) synthesizing at least two different pairs of single strands of DNA of from about 12 to 120 bases, wherein each of the strands of a pair overlap, (2) hybridizing each of said pairs of single strands to provide successive segments, (3) cloning said segments in a cloning vector to form a monomer, (4) sequencing said monomer to ensure the fidelity of replication of each of said segments, and (5) excising the monomer from the cloning vector;
or (C) (1) synthesizing a DNA monomer encoding not greater than 200 amino acids, (2) cloning said monomer in a cloning vector, and (3) excising the monomer from the cloning vector;
and oligomerizing said monomer to provide at least one multimer comprising at least two monomers; wherein two or more different multimers encoding different amino acid units may be joined together to form a block copolymer, and wherein the sequences of said segments, monomer and vector are selected to permit insertion of said segments and excision of said monomer by restriction enzyme digestion.
Methods and compositions are provided for the production of polypeptides having repetitive oligomeric units by expression of a synthetic structural gene. The individual units coding for an oligomeric peptide sequence are varied as to nucleotide sequence utilizing amino acid codon redundancy. Long nucleic acid sequences are built up by synthesizing nucleic acid oligomers which express a plurality of individual repetitive peptide units, and the oligomers are joined to provide a polynucleotide of the desired length. Expression systems are used which provide for the growth of the subject host to high density prior to significant expression of the polypeptide product, followed by induction of expression to provide high yields of the polypeptide product, which can be isolated from the host cells. In one embodiment, a system is employed where the transcription initiation system of the synthetic gene is not recognized by the host RNA polymerase and a gene expressing a functional RNA polymerase under inducible regulation is included in the host.
In the drawings:
Figure 1: Plasmid pSY701 structure.
Figure 2: Immunoblots of polypeptide products using antibody to (A) beta-lactamase or to (B) gly-ala peptide.
Figure 3: Construction of plasmids pG9/silkI (also named pG9/SlpI-4; containing 4 units of SlpI) and pG10/silkI (also named pG10/SlpI-4; containing 4 units of SlpI).
Figure 4: Immunoblots of polypeptide products (A) T7gp10/silkI (synonymous to T7gp10/SlpI and T7gp10/SlpI-4) with anti-Slp Ab, (B) T7gp9/silkI (synonymous to T7gp9/SlpI and T7gp10/SlpI-4) with anti-Slp Ab, or (C) staining with Coomassie blue. The fusion proteins comprise 4 units of the silk-like protein I monomer fused to either T7gp10 or T7gp9 sequences. The respective fusion polypeptides are indicated as gp10/silkI fusion and gp9/silkI fusion.
Figure 5: Construction flowchart for plasmid pSY856.
Figure 6: Time course for accumulation of the kanamycin-resistance gene product with the T7 system.
Figure 7: Construction flowchart for plasmids pSY857, pSY325 and pSY9=;7.
Figure 8: Construction flowchart for plasmid pSY980.
Figure 9: (A) Amido black stain of gel containing the product of beta-galactosidase/SlpIII gene fusion; (B) immunoblot of same product with anti-Slp antibody.
Figure 10: Construction flowchart for plasmid pSY1280.
Novel polypeptides are provided which are polyoligomers of repeating, relatively short, amino acid sequence units. The oligomers may be linked by spacers of different amino acid sequence. The novel polypeptides therefore contain repetitive amino acid sequences and are particularly useful as fibrous proteins, including elastomeric. The gene encoding the repeating-unit-containing peptides is produced to particularly avoid problems previously associated with genes containing multiple repeating units.
The genes of the subject invention comprise multimers of DNA sequences encoding the same amino acid sequence unit, where two or more different multimers encoding different amino acid units may be joined together to form a block copolymer. The individual units will have from 4 to 30 amino acids (12 to 90 nt), more usually 4 to 25 amino acids (12 to 75 nt), particularly 4 to 8 amino acids, usually having the same amino acid appear at least twice in the same unit, generally separated by at least one amino acid. The units of the multimer coding for the same amino acid sequence may involve two or more nucleotide sequences, relying on the codon redundancy to achieve the same amino acid sequence.
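As an illustration of the codon redundancy described above, the following minimal Python sketch encodes the same six-residue unit with two different nucleotide sequences. The unit GAGAGS is taken from the specification; the particular codon choices, the helper names `encode`, `translate` and `mismatches`, and the reduced codon table (standard genetic code entries for glycine, alanine and serine only) are assumptions made for the example, not sequences disclosed here.

```python
# Standard genetic code entries for the three amino acids used in this example.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def encode(peptide, choice):
    """Encode a peptide, using codon number choice[i] for residue i."""
    return "".join(CODONS[aa][choice[i]] for i, aa in enumerate(peptide))


def translate(dna):
    """Translate an in-frame DNA string back to one-letter amino acids."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


unit = "GAGAGS"                               # repeating unit (hypothetical choice of codons below)
a_first = encode(unit, [0, 0, 1, 1, 2, 0])    # GGT GCT GGC GCC GGA TCT
a_second = encode(unit, [1, 0, 1, 2, 2, 0])   # GGC GCT GGC GCA GGA TCT

print(a_first, a_second, mismatches(a_first, a_second))
```

Both sequences translate to the same unit while differing at two nucleotide positions, and the three glycines within `a_first` already use three different codons.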
For the most part the DNA compositions of this invention may be depicted by the following formula:
$K_k (W M X_x N Y_y)_i L_l$
wherein:
K is a DNA sequence encoding an amino acid sequence of from about 1 to 100 amino acids, usually 1 to 60 amino acids, which may be any sequence, generally being fewer than about 20% of the total number of amino acids, more generally being fewer than about 10% of the total number of amino acids, particularly a naturally occurring sequence where the multimer structural gene has been fused to another DNA sequence in reading frame. K will have the initiation methionine codon.
k is 0 or 1;

W has the formula:
$[(A)_n (B)_p]_q$
wherein:
A is a DNA sequence coding each time that it appears for the same amino acid sequence unit, normally having at least one amino acid appear at least twice in the sequence, where A will generally be from about 12 to 90 nucleotides (nt), more usually from about 12 to 75 nucleotides;
where there will usually be at least two different A's, usually not more than ten different A's, more usually not more than six different A's, which code for the same amino acid sequence but differ from each other by at least one nucleotide and may differ by as many as ten nucleotides, usually not differing by more than about five nucleotides from another A sequence, each of the different A's usually being repeated at least twice; at least two different codons are employed for the same amino acid, e.g., GGC and GGA for glycine, in different A's coding for the same amino acid sequence unit;
n will be an integer of at least 2, usually at least about 8, and not more than about 250, usually not more than about 200, frequently not more than about 125, and in some instances may not exceed about 50;
B is a DNA sequence different from A coding for an amino acid sequence other than the amino acid sequence unit coded by the A unit and serves as a linking unit between oligomers of A units. B will generally have from about 3 to 45 nt (1 to 15 amino acids), more usually from about 3 to 30 nt (1 to 10 amino acids);
where the B units appearing in the gene may be the same or different, there usually not being more than about 10 different B units, more usually not more than about 5 different B units, where the B units may differ in from 1 to 45 nt, more usually from about 1 to 15 nt, where the different B's may code for the same or different amino acid sequence;
p is 0 or 1 and may differ each time there is a successive A unit;
q is an integer of at least 1 and will vary with the number of nucleotides in A and B, as well as the values of n and p. The variable q will be selected so as to provide for at least 90 nucleotides for the multimeric portion of the structural gene, preferably at least about 150nt, more preferably at least ~150nt, and most preferably at least 900 nucleotides, and the number of nucleotides will usually not exceed about 10,000, more usually not exceeding about 8,000, generally being in the range of about 900 to 6,000, more usually to about 5,000; and
M is a DNA nucleotide sequence of 0 to 18 nt, which may encode any amino acid sequence, usually including amino acids of A and/or B, generally limited to the amino acids of A and/or B;
X may be the same as or different from W, usually different, and will have the formula:
$[(A1)_{n1} (B1)_{p1}]_{q1}$
wherein:
A1, B1, n1, p1 and q1 are the same as or different from A, B, n, p and q respectively, at least one being different, wherein the analogous symbols come within the same definition as their counterparts;
x is 0 or 1;
N is the same as or different from M and comes within the same definition as M;
Y may be the same as or different from W, usually different, and will have the formula:
$[(A2)_{n2} (B2)_{p2}]_{q2}$
wherein:
A2, B2, n2, p2 and q2 are the same as or different from A, B, n, p and q respectively, at least one being different, wherein the analogous symbols come within the same definitions as their counterparts;
y is 0 or 1;
i is 1 to 100, usually 1 to 50, more usually 1 to 30, particularly 1, when x and y are 0;
when x or y are 1, q, q1 and q2 will be a total of at least 2, usually at least 5 and not more than about 50, usually not more than about 30.
The total number of nucleotides will be at least 90 nucleotides, usually at least about 150 nt, preferably at least about 900 nt, and may be 20 knt (kilonucleotides), usually not more than about 15 knt, more usually not more than about 10 knt.
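The length limits above follow from simple arithmetic on the formula for W. The sketch below, offered only as an illustration, counts the nucleotides in one block [(A)n (B)p]q when every A has the same length and p is allowed to differ between repeats. The function name `block_length_nt` and the example values (an 18 nt A unit, n = 9, a 6 nt B unit, q = 6) are hypothetical and are not taken from the specification.

```python
def block_length_nt(len_a, n, len_b, p_values):
    """Nucleotides in [(A)_n (B)_p]_q.

    len_a, len_b: lengths of the A and B units in nt (whole codons).
    n: number of A units in each repeat.
    p_values: one entry (0 or 1) per repeat, so q == len(p_values).
    """
    if len_a % 3 or len_b % 3:
        raise ValueError("A and B must each be a whole number of codons")
    if any(p not in (0, 1) for p in p_values):
        raise ValueError("each p must be 0 or 1")
    return sum(n * len_a + p * len_b for p in p_values)


# Hypothetical block: A = 18 nt (6 codons), n = 9, B = 6 nt (2 codons), q = 6, p = 1 throughout.
total_nt = block_length_nt(18, 9, 6, [1] * 6)
print(total_nt, total_nt // 3)       # nucleotides and encoded amino acids
print(total_nt >= 90, total_nt >= 900)
```

With these assumed values the block is 6 x (9 x 18 + 6) = 1008 nt, encoding 336 amino acids, which satisfies both the 90 nt minimum and the preferred 900 nt threshold.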
The polypeptide encoded by the above DNA sequence will have the following formula:
$K'_k (W' M' X'_x N' Y'_y)_i L'_l$
wherein:
W' will have the following formula:
$[(D)_n (E)_p]_q$
wherein:
D is the amino acid sequence encoded for by A and therefore has the numerical limitations based on 3 nucleotides defining a codon that codes for one amino acid;
E is the amino acid sequence encoded for by B, and therefore has the numerical limitations based on 3 nucleotides defining a codon, where each E may be the same or different, depending upon the coding of B;

and, wherein, likewise K', W', M', X', N', Y' and L' are the amino acid sequences encoded for by K, W, M, X, N, Y and L respectively. However, in the case of K and L, subsequent processing, such as protease treatment, cyanogen bromide treatment, etc. may result in partial or complete removal of the N- or C-terminal non-multimeric chains.
n, p, q, k, i and l have the same definitions as previously indicated.
Particular polymeric compositions having repeating multimeric units having the same compositions (A) will have the following formula where x and y are 0:
$K'_k [(D)_n (E)_p]_q L'_l$
where all of the symbols have been defined previously; and the DNA sequence will have the formula:
$K_k [(A)_n (B)_p]_q L_l$
where all of the symbols have been defined previously.
Particular DNA sequences encoding copolymeric compositions having a repeating unit of two to three multimeric blocks will have the following formula:
$K_k (W'' M'' X''_x N'' Y''_y)_{i''} L_l$
wherein:
W'' is a multimer having the formula:
$[(A3)_{n3} (B3)_{p3}]_{q3}$
where A3 is of 4 to 8, usually 4 to 6 codons, otherwise coming within the definition of A;
n3 will be from about 2 to 12, usually 2 to 10;
B3 is of from 2 to 8, usually 4 to 6 codons;
p3 is 0 or 1;
q3 is of from about 2 to 25, usually 2 to 20;
X'' and Y'' are the same as or different from W'', usually different, coming within the same definitions as W'';
M'' and N'' come within the definitions of M' and N';
i'' is at least 2, usually at least 5 and not more than about 75, usually not more than about 50, generally not exceeding 30;
with the other symbols as defined previously.
The compositions of the invention will usually have a molecular weight of at least about 5 kDal, usually 10 kDal, preferably 15 kDal, and may have molecular weights as high as or higher than 400 kDal, usually not exceeding 300 kDal, more usually not exceeding about 250 kDal, the higher ranges generally being the multimer combinations, with the individual multimer usually being less than about 150 kDal, usually less than about 100 kDal.
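To relate the molecular weight ranges above to a number of repeating units, a rough estimate can be made from average amino acid residue masses. The sketch below is an approximation under stated assumptions: it uses rounded average residue masses for glycine, alanine and serine, treats the polypeptide as a pure multimer of GAGAGS with no K or L sequences, and the helper names `mass_da` and `units_needed` are hypothetical.

```python
import math

# Rounded average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {"G": 57.05, "A": 71.08, "S": 87.08}
WATER = 18.02  # one water per chain accounts for the free N- and C-termini


def mass_da(peptide):
    """Approximate average mass of a linear peptide in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def units_needed(unit, target_da):
    """Smallest number of repeats of `unit` whose mass reaches target_da."""
    per_unit = sum(RESIDUE_MASS[aa] for aa in unit)
    return math.ceil((target_da - WATER) / per_unit)


n_units = units_needed("GAGAGS", 10_000)   # repeats needed to reach about 10 kDal
print(n_units, round(mass_da("GAGAGS" * n_units), 2))
```

Under these assumptions each GAGAGS unit contributes about 400 Da, so roughly 25 units are needed to reach 10 kDal; a real construct with terminal K and L sequences would need correspondingly fewer.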
The nucleotide sequences which are employed will be synthesized, so that the repetitive units will have different codons for the same amino acid as described above. Usually, at least about 25%, more usually at least about 40%, and generally at least about 60%, but not greater than about 95%, preferably not greater than about 90% of the nucleotide sequences encoding the repetitive units will be the same.
Greater diversity within those ranges will be employed where the initial constructs are experimentally shown to undergo spontaneous recombination events.
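One way to check a design against the percentages above is to count how many of the repeat units share an identical nucleotide sequence. The sketch below adopts that reading, which is an assumption about how the percentages are meant to be measured; the function name `most_common_share` and the example unit sequences are hypothetical.

```python
from collections import Counter


def most_common_share(units):
    """Fraction of repeat units that carry the most frequent nucleotide sequence."""
    counts = Counter(units)
    _, top = counts.most_common(1)[0]
    return top / len(units)


# Ten hypothetical 18 nt units, all encoding GAGAGS, in three nucleotide variants.
units = (
    ["GGTGCTGGCGCCGGATCT"] * 6
    + ["GGCGCTGGCGCAGGATCT"] * 3
    + ["GGCGCAGGAGCTGGTAGC"] * 1
)
share = most_common_share(units)
print(share, 0.25 <= share <= 0.95)
```

In this example six of ten units are identical, so the share is 0.6, inside the 25% to 95% window; a design that recombined in practice would be revised toward a lower share.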
Of particular interest are polypeptides which have as a repeating unit SGAGAG (G = glycine; A = alanine; S = serine). This repeating unit is found in a naturally occurring silk fibroin protein, which can be represented as GAGAG(SGAGAG)8SGAAGY (Y = tyrosine). In the subject invention, the repeating unit is designed where the N-terminus may be MGAGAG or any other sequence of generally at least about 3 amino acids, usually at least about 5 amino acids, more usually 12 amino acids and not greater than 200, usually not greater than 100 amino acids, which may be different from the repetitive unit. Generally, a different N-terminus will be the result of insertion of the gene into a vector in a manner that results in expression of a fusion protein. Any protein which does not interfere with the desired properties of the product may provide the N-terminus. Particularly, endogenous host proteins, e.g. bacterial proteins, may be employed. The choice of protein may depend on the nature of the transcriptional initiation region. Similarly, the C-terminus may have an amino acid sequence different from the repeat sequence. Conveniently, there may be from 1 to 100, usually 1 to 25 amino acids, which may be the C-terminus of a naturally occurring structural gene, which again typically results from the formation of a fusion product.
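The compact representation GAGAG(SGAGAG)8SGAAGY can be expanded mechanically, which makes its length and composition easy to inspect. The following sketch is purely illustrative; the function name `fibroin_block` is hypothetical and the repeat count of 8 is read from the representation above.

```python
from collections import Counter


def fibroin_block(repeats=8):
    """Expand GAGAG(SGAGAG)_repeats SGAAGY into a one-letter amino acid string."""
    return "GAGAG" + "SGAGAG" * repeats + "SGAAGY"


block = fibroin_block()
composition = Counter(block)
print(len(block), block.count("SGAGAG"), dict(composition))
```

The expanded block is 59 residues long, contains eight copies of SGAGAG, and is dominated by glycine and alanine with a single terminal tyrosine.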
A silk-like-protein (Slp) gene may be produced by providing oligomers of from about 5 to 25 repeat units as described above, more usually of about 10 to 20 repeat units. By having different cohesive ends, the oligomers may be concatemerized to provide for the polymer having 2 or more of the oligomeric units, usually not more than about 50 oligomeric units, more usually not more than about 30 oligomeric units, and frequently not more than about 25 oligomeric units.
The silk-like proteins may be varied by having alternate multimers with the same or different handedness. For example, in the formula, (B)p may provide an even or odd number of amino acids. In silk, the hydrogens of the glycine may align on one side and the methyls and hydroxyls of alanine and serine on the other. If (E)p is even, there will be continuous alignment; if odd, there will be alternating alignment of (A)n. Thus, different properties can be achieved by changing the number of amino acids encoded by (B)p.
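The even/odd effect described above can be pictured with a simplified model in which successive residues along an extended chain point to alternating sides, so that a residue's side is given by the parity of its position. That model, the function name `glycine_faces`, and the example values (n = 2, four blocks, spacers of 2 or 3 residues) are assumptions made for this sketch only.

```python
def glycine_faces(unit, n, spacer_len, blocks):
    """For each (unit)_n block, return the side (0 or 1) of its first glycine.

    A residue at chain position i is assigned side i % 2; consecutive
    blocks are separated by a spacer of spacer_len residues.
    """
    faces = []
    offset = 0
    for _ in range(blocks):
        faces.append((offset + unit.index("G")) % 2)
        offset += len(unit) * n + spacer_len
    return faces


even_spacer = glycine_faces("GAGAGS", 2, 2, 4)   # spacer with an even number of residues
odd_spacer = glycine_faces("GAGAGS", 2, 3, 4)    # spacer with an odd number of residues
print(even_spacer, odd_spacer)
```

Because the GAGAGS blocks themselves have an even length, an even spacer keeps the glycines of every block on the same side, while an odd spacer flips the side from one block to the next.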
Of particular interest are polypeptides which mimic the composition and physical properties of silk of Bombyx mori.
Also of interest are polypeptides which have as a base repeating unit GVGVP (G = glycine, V = valine, P = proline), which may be found in naturally occurring elastin. In the subject invention, the N-terminus may be any convenient sequence and, if desired, may be in whole or in part removed by a protease. Usually the N-terminal sequence which does not have the subject motif will be less than about 100 amino acids, more usually less than about 60 amino acids.
Of particular interest is a base sequence of about 6 to 10, preferably 8, units separated by a sequence of about 6 to 20 amino acids, usually 8 to amino acids, which may include an internal repeat different from the basic repeating unit of from a to amino acids. For example, the second repeat sequence could be GAGAGS, repeated twice. The total number of base repeating units will generally be in the range of about 150 to 300, usually 175 to 250. The C-terminus or portion thereof may terminate with a repetitive unit or a different sequence of from 1 to 100, usually 1 to 30 amino acids. The C-terminus is not critical to the invention and will be selected primarily for convenience. As with the N-terminus, it may be designed for proteolytic cleavage. As in the case of the silk protein, the subject elastin-like (ELP) protein may be similarly engineered.
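The block arrangement described above can be sketched in Python. This is only an illustration under stated assumptions: the base unit is taken to be GVGVP, each block holds the preferred 8 units, the spacer is GAGAGS repeated twice, and the choice of 22 blocks (176 base units, inside the stated 175 to 250 range) is hypothetical. The function name `build_repeat_polymer` is likewise invented for this sketch, and no N- or C-terminal sequences are modeled.

```python
def build_repeat_polymer(base_unit="GVGVP", units_per_block=8,
                         spacer="GAGAGS" * 2, n_blocks=22):
    """Return an illustrative amino acid sequence of spaced repeat blocks.

    All default values are assumptions chosen to fall inside the ranges
    given in the text; they are not taken from a specific construct.
    """
    block = base_unit * units_per_block      # e.g. 8 x GVGVP = 40 residues
    return spacer.join([block] * n_blocks)   # spacer only between blocks


sequence = build_repeat_polymer()
total_base_units = sequence.count("GVGVP")
print(total_base_units, len(sequence))
```

Changing `units_per_block`, `spacer`, or `n_blocks` shows how the number and spacing of units alter the overall composition of the polypeptide.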

Of particular interest are proteins which mimic the properties of elastin and provide for elastomeric properties.


Throughout the specification, the term "SLP" refers to "silk-like protein" (i.e., a protein comprising silk-like repeating units). The term "ELP" refers to "elastin-like protein" (i.e., a protein comprising elastin-like repeating units). The term "SELP" refers to silk- and elastin-like protein (i.e., a protein comprising both silk-like repeating units and elastin-like repeating units). Roman numerals, such as I, II, III, etc., in combination with the above terms refer to particular variants of "SLP", "ELP", or "SELP". Numbers, such as 1, 2, 3, etc., following the above terms refer to the number of the repeating units.

The copolymer involving repeating units is a powerful method for varying properties, by appropriate choice of the different units, the number of units in each multimer, the spacing between them, and the number of repeats of the multimer combination assembly. Thus, by varying the number and arrangement of primary monomers, a variety of different physical and chemical properties can be achieved.
Exemplary of the use of the block copolymers are combinations of silk units and elastin units to provide products having properties distinctive from polymers only having the same monomeric unit.
To prepare the structural genes, various approaches can be employed. To prepare the oligomers, complementary strands of DNA may be synthesized, so that upon hybridization double-stranded DNA is obtained with the appropriate termini. If desired, each of the oligomeric units may be the same, so that in a single step, a concatemer may be obtained depending upon the conditions of the hybridization. Normally, conventional annealing and ligating conditions will be employed, such as are described in the examples that follow.
If desired, two different oligomeric units may be prepared where the termini of the two units are complementary one with the other but the termini of the same unit are unable to bind together. In this way one can build individual oligomeric units and then join them together to form the concatemer, where the intervening linking sequences are defined at least in part by the termini. Depending upon the construct, the 5' terminus may provide for the initiation codon methionine, or the structural gene may be joined to an adapter which may provide for a unique sequence (optionally cleavable by a specific enzyme) at the 5' terminus or may be inserted into a portion of a gene, usually endogenous to the host, in proper reading frame so as to provide for a fusion product. By providing for appropriate complementary termini between the adapter or truncated gene and the 5' end of the subject structural gene, the sequences can be joined in proper reading frame to provide for the desired protein. Advantages that may be achieved by employing adapters or fusion proteins include having specific sequences for special purposes, such as linking, secretion, complex formation with other proteins, affinity purification, or the like.
Once the structural gene has been assembled, it may be cloned; clones having the desired gene, particularly as to sequence length, may be isolated; and the gene may be removed and used to join to a sequence for expression.
The expression construct will include transcriptional and translational initiation and termination regulatory regions, 5' and 3', respectively, of the structural gene. As already indicated, these regions may be created by employing a fusion protein, where the subject structural gene is inserted into a different structural gene downstream from its initiation codon and in reading frame with the initiation codon. Alternatively, various transcriptional and translational initiation regions are available from a wide variety of genes for use in expression hosts, so that these transcriptional and translational initiation regions may be joined to the subject structural gene to provide for transcription and translation initiation of the subject structural genes. A wide variety of termination regions are available which may be from the same gene as the transcriptional initiation region or from a different gene. Numerous constructs have been disclosed in the literature, and the same procedures may be applied with the subject gene as have been employed with other structural genes.

Of particular interest is the use of an inducible transcription initiation region. In this manner, the host strain may be grown to high density prior to significant expression of the desired product. Providing for inducible transcription is particularly useful where the peptide is retained in the cellular host rather than secreted by the host.
A number of inducible transcription initiation regions exist or can be employed in particular situations. The inducible regions may be controlled by a particular chemical, such as isopropyl thiogalactoside (IPTG) for inducing the beta-galactosidase gene. Other inducible regions include lambda left and right promoters; various amino acid polycistrons, e.g., histidine and tryptophan; temperature sensitive promoters; and regulatory genes, e.g., cIts 857.
An alternative system which may be employed with advantage is use of a combination of transcription initiation regions. A first transcription initiation region is employed which regulates the expression of the desired gene but which is not functional in the expression host because it fails to function with the endogenous RNA polymerase. A second transcription initiation region, such as an inducible region, can then be employed to regulate the expression of an RNA polymerase with which the first transcription initiation region is functional. In this manner expression only occurs upon activation of the regulatory region controlling the expression of the exogenous RNA polymerase. In the subject application, this system is illustrated with the T7 phage transcription initiation region, specifically the initiation regions of genes 9 and 10 of T7 phage.
An alternative system relies on the use of mutants which undergo a developmental change based on a change in the environment, such as a lack of a nutrient, temperature, osmotic pressure, salinity, or the like. Illustrative of this system, strains of B. subtilis can be obtained which are incapable of sporulation but which can produce those components which initiate expression of products involved with sporulation. Therefore, by a change in the condition of the medium, a transcription initiation region associated with sporulation will be activated. In this situation, the host provides the necessary inducing agent or activator to initiate expression.
Various other techniques exist for providing for inducible regulation of transcription and translation of a gene in a particular host.
For the most part, the host will be a unicellular organism, either a prokaryote or a eukaryote, selected from bacteria, algae, fungi, filamentous fungi, etc. Illustrative hosts include E. coli, B. subtilis, B. stearothermophilus, S. cerevisiae, and the like.
The expression construct for expression of the desired gene, by itself or in conjunction with any auxiliary genes involved with transcription, will normally be joined to an appropriate vector for introduction into the expression host. A large number of vectors are commercially available with others being described in the literature. The vectors are normally characterized by having one or more unique restriction sites, a replication system for extrachromosomal maintenance in the host, and one or more markers which allow for selective pressure on the host. The markers may provide complementation, prototrophy to an auxotrophic host, resistance to a biocide, e.g., an antibiotic such as penicillin or kanamycin, or the like. In some instances, rather than selective pressure, a marker gene is employed which allows for detection of particular colonies containing the gene. This situation is illustrated by the gene for beta-galactosidase, where a substrate is employed which provides for a colored product.

The expression construct, including any auxiliary genes, may be introduced into the expression vector in accordance with known techniques, particularly employing restriction, insertion, and ligation.
The expression construct may then be used for transformation of the appropriate host. Depending upon the host, either intact cells or protoplasts may be employed, where transformation or conjugation is employed. Conveniently, calcium-phosphate-precipitated DNA or non-ionic detergents may be employed to introduce the plasmid into the host. It should be appreciated that it is not necessary to employ vectors for host transformation, since bare DNA can be introduced for integration into the genome. However, even where integration is desired, a much greater efficiency of integration is achieved employing the vectors, thus favoring the employment of vectors.
Depending upon the nature of the vector, the expression construct may be maintained on an extrachromosomal element or become integrated into the host. Where integration is desired, it will usually be desirable with prokaryotes and some eukaryotes to have a sequence homologous to a sequence in the chromosome of the host. Usually the sequence will be at least about 200 bp and not more than about 5000 bp, usually not more than about 2000 bp. The choice of the homologous sequence is somewhat arbitrary, but may be used for complementation, where the host is an auxotrophic mutant and the homology provides prototrophy.
The transformants or integrants may then be grown in an appropriate nutrient medium to high density, followed by induction of transcription in accordance with the nature of the transcriptional system of the expression construct. Where the desired protein is retained in the cytoplasm, these cells are harvested and lysed, and, depending upon the use of the protein, the protein may be further purified in accordance with conventional techniques, such as chromatography, solvent-solvent extraction, affinity chromatography, and the like.
The repetitive proteins can find a variety of uses. The Slp proteins may be used in producing fibers having unique properties, as a substitute for silk, and the like. Collagen proteins can be produced, where the collagen is free of the telopeptide or contains the telopeptide, depending upon its function. Atelopeptide collagen should have little if any immunogenicity, so as to be a useful structural element for a variety of prosthetic devices or for use as a collagen substitute in other applications. Similarly, other proteins having repetitive sequences, such as keratin, can also be prepared in accordance with the subject invention. Other useful repetitive proteins can be prepared based on sequences of spider silks and other repetitive animal fibers. Artificial peptides useful for immunization could also be prepared based on repeating sequences present in various surface antigens of disease-causing microorganisms, such as parasites, bacteria, and viruses.
The following examples are offered by way of illustration and not by way of limitation.
Example 1
DNA Preparation Methods
1. Preparation of plasmid DNA from E. coli:
A. Small scale: Plasmid DNA was prepared from 1.5 ml cultures by either the boiling procedure or the alkaline lysis method (Maniatis et al., Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor. (1982)).
B. Large scale: A plasmid-carrying strain was grown overnight in 1 liter of Luria broth with the appropriate antibiotic. The cells were collected by centrifugation at 10,000xg for 5 min and resuspended in ml of ice-cold TE (10mM Tris-HCl pH 8, 1mM EDTA). The cells were centrifuged again, resuspended in 4 ml of TES (TE and 25% w/v sucrose) and homogenized by vortexing. The samples were kept on ice for the following steps. Lysozyme (1 ml of 10 mg/ml) was added to the cell suspension and incubated for 5 min before the addition of 2 ml of 0.5M EDTA pH 8. After 10 min incubation, 50 ul of proteinase K (40 mg/ml) were added, followed 10 min later by 15 ml of lysing buffer (0.1% Triton X-100, 1mM EDTA, 50mM Tris-HCl pH 8). After 15 min, the cell lysate was centrifuged at 35,000xg for 90-120 min. The supernatant (19.8 ml) was transferred to a plastic tube with 20 g of CsCl and 400 ul of ethidium bromide (10 mg/ml). After dissolution, the mixture was divided into two polyallomer ultracentrifuge tubes, sealed with heat and centrifuged in a Beckman Ti 65 rotor at 60,000 rpm for 24 hr. The lower plasmid DNA band was removed from the tube with a hypodermic needle. The ethidium bromide was extracted three times with an equal volume of NaCl-saturated isopropanol. Two volumes of H2O were added to the DNA solution, and then the DNA was precipitated with ethanol.
2. Preparation of double-stranded DNA:
A culture of JM103 was grown to an OD600 of about 0.2 and then divided into aliquots of 2 ml. Each aliquot was infected with a fresh plaque of M13 and incubated at 37°C for about 6 hr with vigorous shaking. Then the cells were pelleted and the supernatant was saved for subsequent infections. The double-stranded phage DNA was extracted by the boiling method (Maniatis et al.).
*Trade-mark
3. Deproteinization:
Phenol extraction was performed on a convenient volume of DNA sample, typically between 100 ul to ml. The DNA sample was diluted in 0.01M Tris-HCl pH 7.5, 1mM EDTA and an equal volume of water-saturated phenol was added. The sample was vortexed briefly and placed on ice for 3 min. After centrifugation for 3 min in a microfuge, the aqueous layer was removed to a new tube and extracted once with an equal volume of chloroform:isoamyl alcohol (24:1).
4. Ethanol precipitation:
DNA in an aqueous buffer was concentrated by ethanol precipitation. To the DNA sample was added 1/10 volume of 3M sodium acetate pH 7.5 and 2-3 volumes of cold ethanol. The DNA was precipitated for 30 min at -70°C or overnight at -20°C and then pelleted by centrifugation in the microfuge for 15 min at 4°C. The pellet was washed once with 200 ul of cold 80% ethanol and pelleted again for 10 min at 4°C. After air drying or lyophilization, the pellets were resuspended in the appropriate buffer.
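The volume arithmetic of the ethanol precipitation step can be sketched as a small helper. This is a minimal sketch, assuming that both the 1/10 volume of sodium acetate and the 2-3 volumes of ethanol are measured relative to the starting sample volume; the text does not say whether the ethanol volumes should instead be relative to the sample plus acetate. The function name and the default of 2.5 volumes are hypothetical.

```python
def ethanol_precipitation_volumes(sample_ul, ethanol_volumes=2.5):
    """Return (sodium acetate ul, cold ethanol ul) for a DNA sample.

    Assumption: both additions are computed from the original sample volume.
    """
    if not 2 <= ethanol_volumes <= 3:
        raise ValueError("the protocol calls for 2-3 volumes of ethanol")
    acetate_ul = sample_ul / 10              # 1/10 volume of 3M sodium acetate
    ethanol_ul = sample_ul * ethanol_volumes
    return acetate_ul, ethanol_ul


acetate, ethanol = ethanol_precipitation_volumes(100)
print(f"add {acetate} ul sodium acetate and {ethanol} ul cold ethanol")
```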
5. Phosphatase treatment of DNA:
Phosphatase treatment of DNA was performed by adding 1 ul (25 units) of calf intestinal phosphatase (Boehringer Mannheim) directly to the restriction enzyme digestion reaction and continuing the incubation for 30 min at 37°C. The phosphatase was inactivated for 60 min at 65°C prior to deproteinization by phenol extraction.

6. Fill-in reaction with DNA polymerase I:
DNA was resuspended in buffer containing 50mM Tris-HCl pH 7.4, 50mM KCl, 5mM MgCl2, and 400 uM each of the four deoxynucleotide triphosphates. Ten units of Klenow* DNA polymerase (BRL) were added, and the reaction was allowed to proceed for 15 min at room temperature. The DNA was then phenol extracted and ethanol precipitated.

7. T4 polynucleotide kinase reaction:
The reaction (10 ul) contained: T4 polynucleotide kinase (BRL), 150 ng of DNA, 1 ul of 10x kinase buffer (0.7M Tris-HCl pH 7.6, 0.1M MgCl2, 50mM DTT) and [32P]-ATP (200-300 nCi). This was incubated at 37°C for 30 min and then the DNA was purified using a NACS column (Bethesda Research Labs).

8. Digestion with restriction endonucleases:
DNA was digested with restriction endonucleases (REN) in 1 x "AA" buffer [10 x AA buffer is 330mM Tris-acetate, pH 7.9, 660mM potassium acetate, 100mM magnesium acetate, 50mM dithiothreitol (DTT) and 1 mg/ml bovine serum albumin (nuclease free)]. Whenever possible, the concentration of DNA was kept below 1 ug/25 ul. Incubation was at 37°C for 1-4 hrs for most restriction endonucleases except for BalI, BanI and NaeI digestions, which were incubated overnight.

9. Analytical agarose gel electrophoresis of DNA:
To DNA samples for gel analysis we added 0.2 volumes of loading buffer (5 x electrophoresis buffer, 0.01% bromphenol blue dye, 50mM EDTA, and 50% glycerol). Then the samples were loaded into lanes of a horizontal submerged electrophoresis unit containing a 1.0% (w/v) agarose gel. The electrophoresis buffer was either 1 x TAC or 1/2 x TBE. The 1 x TAC is 40mM Tris-base, 10mM EDTA, adjusted to pH 7.8 with acetic acid. The 1/2 x TBE is 0.045M Tris-base, 0.045M boric acid, 1mM EDTA, pH 8. The gel was run at 40-50V for 18 hr, then removed and stained with 0.5 ug/ml ethidium bromide for 30 min. The DNA bands were visualized on a long wavelength UV transilluminator.

10. Preparative agarose gel electrophoresis:
The procedures and materials are the same as for the analytical agarose gel electrophoresis. The only difference is the use of low melting point agarose, ranging in concentration from 0.5 to 2.5% (w/v) depending on the size of the DNA fragment to be purified. DNA restriction fragments were excised from the LMP agarose gels after visualization with ethidium bromide.

11. NACS purification:
Gel fragments containing DNA were melted at 70°C for 5 min and diluted approximately 5 fold with TE1 (10mM Tris-HCl pH 7.5, 0.2M NaCl). The gel solution was applied to a NACS column (BRL). The column was washed with 5 ml of the same buffer. The bound DNA was eluted with 300 ul of either TE2 (10mM Tris-HCl pH 7.5, 1.0M NaCl) for DNA fragments smaller than 1000 bp or TE3 (10mM Tris-HCl pH 7.5, 2M NaCl) for larger fragments. The eluted DNA was concentrated by ethanol precipitation.

12. DNA ligation:
Reactions for ligating cohesive ends contained: 1 ug DNA, 1 x AA buffer (see step 8, above), 1mM ATP and 2~~ units of T4 DNA ligase (BRL) in a 20 ul final reaction volume. The ligation was allowed to proceed for 15-18 hr at 15°C or 1-2 hr at room temperature. For blunt-ended ligations the reactions contained 1 ug DNA, 25mM Tris-HCl pH 7.5, 5mM MgCl2, 5mM DTT, 0.25mM spermidine, 200mg BSA, 1mM hexamine cobalt chloride (HCC), 0.5mM ATP and X00 units T4 DNA ligase (NEB) in a 20 ul reaction volume. The ligation was allowed to proceed for 30 min to 1 hr at room temperature.

Bacterial Transformation Methods
1. Preparation of transformation-competent E. coli cells:
A culture of 200 ml of sterile L broth was inoculated with a small loopful of E. coli cells. This was incubated with shaking at 37°C until the OD600 was approximately 0.5. The culture was placed on ice for min and centrifuged at 6,000xg for 10 min. The cell pellet was resuspended in 100 ml of ice-cold 0.1M MgCl2, kept on ice for 30-40 min and centrifuged again. The pellet was resuspended in 2 ml of ice-cold 100mM CaCl2, transferred to a sterile test tube and incubated on ice for 24 hr. The competent cells were then aliquoted and stored at -70°C.
2. Transformation of E. coli:
An aliquot of frozen competent cells was thawed on ice. To 50 ul of cells 0.1 to 1 ug of DNA was added and the mixture was incubated on ice for 30 min. The tube was removed from ice and placed in a 42°C bath for 2 min. L broth (1 ml) was added and the transformation mix incubated with shaking at the desired temperature (usually 30°C or 37°C) for 2 hr. Then one-tenth of the transformation was plated on L broth plates containing the appropriate antibiotic and, when necessary, XGAL and IPTG were added.
3. DNA transformation of B. subtilis:
B. subtilis cells were grown to early stationary phase (change in Klett units of ≤5% in 15 min). Transformation followed established procedures (Anagnostopoulos et al., 1981) (ref. 8). Cells (0.45 ml) were incubated with 1-10 ug of DNA at 37°C for 80 min with shaking, and then plated on TBAB agar plates with an appropriate antibiotic.

4. Isolation of plasmid DNA from B. subtilis:
Plasmid DNA from B. subtilis was obtained by a method similar to the alkaline-lysis method except that pelleted cells were resuspended in 8 ml of solution 1 (50mM glucose, 10mM EDTA, 25mM Tris-HCl (pH 8.0), 10 mg/ml lysozyme) and incubated at room temperature for 30 min. Then 16 ml of solution 2 (0.2N NaOH, 1% (w/v) SDS) was added and incubated on ice for 10 min. Finally, 12 ml of 3M potassium acetate (pH 4.8) was added and incubated an additional 20 min on ice. The lysed cells were centrifuged 15 min at 15,000 rpm in a Sorvall SS-34 rotor. The DNA was precipitated by adding an equal volume of isopropyl alcohol and centrifuged at 7,000 rpm. The pellet was resuspended in 5 ml of 10mM Tris-HCl (pH 7.5), 1mM EDTA (TE). The solution was phenol extracted once and chloroform extracted. DNA was precipitated with ethanol and resuspended in 3 ml of TE. The volume was adjusted to 5.2 ml by adding 4.2g CsCl, X100 ul of ethidium bromide at 10 mg/ml and TE. The solution was transferred to a Beckman quick-seal polyallomer centrifuge tube and centrifuged at 45,000 rpm in a Beckman VTi65 rotor for 18 hr.
Antibody Production, Protein Chemistry and Electrophoresis of Proteins
1. Preparation of antibody to artificially synthesized peptides:
Synthetic peptide of sequence (GAGAGS)8GAAGY was coupled to BSA using the glutaraldehyde procedure of Kagen and Glick (1979). The degree of coupling was monitored using trace amounts of radioactive iodinated synthetic peptide. Peptide conjugates at a concentration of 1 mg/ml in complete Freund's adjuvant were used to immunize rabbits at day 0. Animals were re-injected with antigen in Freund's incomplete adjuvant at day 30 and titered at day 60. Positive sera were detected using a microtiter RIA using the synthetic peptide as antigen. Kagen and Glick (1979), in Methods of Radioimmunoassay, Jaffe and Berman (eds.), Academic Press, p 328.
A peptide of 53 amino acids corresponding to the SlpIII sequence was prepared on an Applied Biosystems peptide synthesizer. The yield of this material, which has a molecular weight of 3640, was approximately 0.5 grams. The peptide was coupled to bovine serum albumin. The material was sent to Antibodies, Inc. for preparation of antibodies in rabbits. Antisera were obtained that reacted with synthetic peptides of both the SlpI and SlpIII sequences. These antisera have been useful for the detection of fusion peptides containing gly-ala sequences.
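The stated length and molecular weight can be checked with a short calculation. This is a sketch under an assumption the text does not state outright: that the 53-residue peptide is the (GAGAGS)8GAAGY sequence named at the start of this subsection. The residue masses are rounded average values, and the names `RESIDUE_MASS` and `average_mass` are illustrative only.

```python
# Approximate average residue masses in daltons (rounded to two decimals).
RESIDUE_MASS = {"G": 57.05, "A": 71.08, "S": 87.08, "Y": 163.18}
WATER = 18.02  # one water is added for the free N- and C-termini


def average_mass(sequence):
    """Average molecular mass of a linear, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


# Assumed sequence: eight GAGAGS repeats followed by GAAGY.
peptide = "GAGAGS" * 8 + "GAAGY"
mass = average_mass(peptide)
print(len(peptide), round(mass))
```

Under that assumption the peptide has 53 residues and a mass close to the 3640 quoted above.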
Following the procedure described above, an antigen was synthesized having the formula (V-P-G-V-G)8, which was coupled to keyhole limpet hemocyanin. Polyclonal antisera were then prepared as described above which bound to the ELP peptide.
2. Polyacrylamide gel electrophoresis of proteins:
Approximately 10^9 E. coli cells from growing cultures were pelleted by centrifugation at 10,000xg for 5 min. The cell pellets were resuspended in 100 to 500 ul of 2X sample buffer (100mM Tris-HCl pH 6.8, 4% SDS, 10% B-mercaptoethanol, 60% glycerol or sucrose) and sonicated for 30 sec using a Tekmar sonic disruptor. Samples were boiled for approximately 5 min and 20 to 100 ul of the cell lysates were loaded on an SDS-polyacrylamide gel (7.5 to 16% w/v). The gels were prepared following the procedure of Laemmli (Nature, 227:680-685 (1970)). The proteins in the gels were stained with 2% Coomassie brilliant blue in 10% methanol, 7.5% acetic acid for 1 hr and destained in 10% methanol, 7.5% acetic acid overnight.

3. Immunoblotting of proteins in gels:
After protein electrophoresis, one of the flanking glass plates was removed from the polyacrylamide gel. The gel surface was wetted with transfer buffer (25mM Tris-HCl, 192mM glycine, 20% methanol). A piece of nitrocellulose paper (Sartorius, SM11307) was saturated with transfer buffer and laid on the gel. Air bubbles between the filter and the gel were removed. The gel and nitrocellulose filter were placed in the transfer unit as specified by the manufacturer (Bio-Rad). Transfer was allowed to proceed at 200 mA for 3-4 hr. Then the nitrocellulose filter was removed and stained with Amido-Schwartz for 3 min (0.05% Amido black, 45% deionized H2O, 45% methanol, 10% acetic acid) and destained in H2O. The filter was incubated for at least 10 min at room temperature in "BLOTTO" (5% w/v nonfat dry milk, 50mM Tris-HCl pH 7.4, 0.9% w/v NaCl, 0.2% w/v sodium azide). The filter was placed in serum appropriately diluted (1:50 to 1:500) in 0.5X Blotto* (2.5% nonfat dry milk, 50mM Tris-HCl pH 7.4, 0.9% NaCl, 0.2% sodium azide) and was gently agitated for approximately 16 hr at room temperature. The filter was washed for 1 hr with 5 changes of TSA (50mM Tris-HCl pH 7.4, 0.9% NaCl, 0.2% sodium azide). The blot was placed in 15 ml of 0.5X BLOTTO solution containing 1x10^7 cpm of the 125I-protein A and gently agitated for 2 hr at room temperature. The filter was washed for ~' hr with a minimum of 7 changes of TSA, rinsed once with deionized H2O and air dried. The blot was covered with Saran* wrap and autoradiographed.

4. Amino Acid Analysis:
Amino acid compositions are determined by the PITC derivatization procedure of Henrickson and Meredith (1984). Protein samples were hydrolysed with 5.7 N constant boiling HCl at 108°C for 24 hours in vacuo. After reaction with PITC, amino acid derivatives were detected at 254 nm by HPLC reverse phase chromatography using a Hewlett Packard 1090 system and a Supelco C18 column (4.6 mm x 25 cm) with a linear gradient of 0-50% acetonitrile in 0.1M NH4OAc pH 6.78 as a mobile phase. Henrickson, R.L. and Meredith, S.C. (1984) Amino Acid Analysis by Reverse Phase High Performance Liquid Chromatography. Anal. Biochem. 137:65-74.
5. Amino Acid Sequence Analysis:
The N-terminal amino acid sequence was determined by automated Edman degradation using an Applied Biosystems Model 470A gas phase protein sequenator. The PTH amino acid derivatives were analyzed by reverse phase HPLC using a Hewlett Packard 1090* system and an Altex C18 column (2 mm x 25 cm) with a complex gradient buffer system.
6. Peptide Synthesis:
Synthetic peptides were prepared by solid phase synthesis on an Applied Biosystems Model 430A Peptide Synthesizer using the standard symmetric anhydride chemistry as provided by the manufacturer. The coupling yield at each step was determined by the quantitative ninhydrin procedure of Sarin et al. (1981). The synthetic peptide was cleaved from the solid support and amino acid blocking groups were removed using anhydrous HF (Stewart and Young, 1984). Crude peptides were desalted by chromatography over Sephadex* G-50. Sarin, Y.K., Kent, S.B.H., Tam, J.P. and Merrifield, R.B. (1981). Anal. Biochem. 237:927-936. Stewart, J.M. and Young, J.D. (1984). Solid Phase Peptide Synthesis, Pierce Chemical Company, Rockford, IL. pp 85-89.
Synthetic DNA Methods
1. In vitro DNA synthesis:
The N,N-diisopropylphosphoramidites, controlled-pore glass columns and all synthesis reagents were obtained from Applied Biosystems, Foster City, California.
Synthetic oligonucleotides were prepared by the phosphite triester method with an Applied Biosystems Model 380A DNA synthesizer using a 10-fold excess of protected phosphoramidites and 1 umole of nucleotide bound to the synthesis support column. The chemistries used for synthesis are the standard protocols recommended for use with the synthesizer and have been described (Matteucci et al., Journal Amer. Chem. Soc., 103:3185-3319 (1981)). Deprotection and cleavage of the oligomers from the solid support were performed according to standard procedures as described by McBride et al., Tetrahedron Letters, 24:245-248 (1983). The repetitive yield of the synthesis, as measured by the optical density of the removed protecting group as recommended by Applied Biosystems (1984), was greater than 97.5%.
The crude oligonucleotide mixture was purified by preparative gel electrophoresis as described by the Applied Biosystems protocols of November 9, 1984 (User Bulletin No. 13). The acrylamide gel concentration varied from 10 to 20% depending upon the length of the oligomer. The purified oligomer was identified by UV shadowing, excised from the gel and extracted by the crush and soak procedure (Smith, Methods in Enzymology, 65:371-379 (1980)).
2. Sequencing of DNA:
DNA sequences were determined by the following methods. Fragments containing the region of interest were cloned into the multiple cloning site of M13mp18 or M13mp19 (Maniatis et al., 1982, and Norrander et al., 1983). Single-stranded DNA was prepared and sequenced by the primer extension method (Sanger et al., 1977 and Biggin et al., 1983) using 35S-deoxyadenosine 5'-(alpha-thio)-triphosphate (New England Nuclear) as label. In some cases, reverse transcriptase (Molecular Genetics) was used to extend the primer, using the dideoxy:deoxynucleoside triphosphate ratios utilized by Zagursky et al. (Gene Anal. Techn. (1985) 2:89-94). Deoxyadenosine triphosphate labeled with either 32P or 35S was used in these reactions. Compression artifacts which appeared in some G-C rich sequences were overcome by eliminating deoxyguanosine triphosphate from the G reaction, and using deoxyinosine triphosphate (P-L Biochemicals) at a final concentration of 37.5 uM instead. In the other mixes, the concentration of dideoxyGTP in the G reaction was 0.5 mM. All sequences were run on 6 or 8% polyacrylamide gels containing 8 M urea (Sanger et al. 1978). Primers used for sequencing were purchased from P-L Biochemicals. Storage and analysis of data utilized software from both DNAstar and International Biotechnologies, Inc.
3. In vitro mutagenesis of cloned DNA:
Plasmid DNA (1 ug) containing the sequence to be mutated was digested in two separate reactions. One reaction contained either one or two restriction endonucleases which cleave at sites immediately flanking the region of interest. In the second reaction, the DNA was digested with a restriction endonuclease which cleaves only once at a site distant from the sequence to be mutated. The DNA fragments generated in the first reaction were separated by agarose gel electrophoresis and the large fragment which lacks the sequence to be mutated was excised and purified. DNA from the second reaction, the large fragment of DNA from the first reaction, and a synthetic oligodeoxynucleotide of 20-30 bases in length containing the mutant sequence were mixed in a molar ratio of 1:1:250. The mixture was denatured by heating at 100°C for 3 min in 25 to 100 ul of 100mM NaCl, 6.5mM Tris-HCl pH 7.5, 8mM MgCl2, and 1mM B-mercaptoethanol. The denatured mixture was reannealed by gradually lowering the temperature as follows: 37°C for 30 min, 4°C for 30 min, and 0°C for 10 min. The reaction was supplemented with 0.5mM deoxyribonucleotide triphosphates, 1mM ATP, X400 units of T4 DNA ligase and 5 units of E. coli DNA polymerase large fragment and incubated at 15°C for 12-16 hr. The reaction mixture was then transformed into E. coli and antibiotic-resistant colonies were selected.
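The 1:1:250 molar ratio can be converted into a mass of oligonucleotide with a rough calculation. The sketch below is illustrative only: it assumes average masses of about 650 g/mol per base pair of double-stranded DNA and about 330 g/mol per nucleotide of single-stranded DNA, and the plasmid amount, plasmid size and oligonucleotide length in the example call are hypothetical values, not figures from the protocol.

```python
AVG_MASS_PER_BP = 650.0   # g/mol per base pair of dsDNA (approximation)
AVG_MASS_PER_NT = 330.0   # g/mol per nucleotide of ssDNA (approximation)


def oligo_ng_for_molar_excess(plasmid_ug, plasmid_bp, oligo_nt, fold=250):
    """Mass of oligonucleotide (ng) giving a `fold` molar excess over plasmid."""
    plasmid_pmol = plasmid_ug * 1e6 / (plasmid_bp * AVG_MASS_PER_BP)
    oligo_pmol = fold * plasmid_pmol
    return oligo_pmol * oligo_nt * AVG_MASS_PER_NT / 1000.0  # pg -> ng


# Hypothetical example: 1 ug of a 4000 bp plasmid and a 25-base oligonucleotide.
needed_ng = oligo_ng_for_molar_excess(1.0, 4000, 25)
print(round(needed_ng, 2), "ng of oligonucleotide")
```

Because the oligonucleotide is so much shorter than the plasmid, a 250-fold molar excess still corresponds to less than the plasmid mass in this example.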
Fermentation Conditions
The fermentor is a 15 L Chemap, 10 L working volume. The culture conditions are: temperature = 30°C, pH = 6.8; NaOH 2.5 M is used for pH regulation. The headspace pressure is below 0.1 bar. The dissolved oxygen is regulated at 50%. The air flow varies from 0.5 L/min to 20 L/min. The agitation rate varies between 200 and 1500 rpm.
The fermentor is inoculated with a 10% (v/v) inoculum grown in medium A for 15 hours at 30°C under agitation.
Medium B was the fermentor medium. The starting volume was 5 L.
When the glucose concentration reached 1%, a concentrated solution (5x) of medium B was added to the fermentor in order to keep the glucose concentration approximately at 1%. When the culture reached an OD600 of 60.0, the temperature was increased to 42°C for 10 min, then lowered to 39°C for 2.5 hours. The cells were then harvested by centrifugation and frozen at -70°C until processed.

Table 1

Medium A: LB Medium
Constituent        g/L
NaCl               10
tryptone           10
yeast extract      5
kanamycin          5x10^-3

Medium B
Constituent        g/L
NH4Cl              4.5
KH2PO4             0.76
MgSO4·7H2O         0.18
K2SO4              0.09
CaCl2              24x10^-3
FeSO4·7H2O         7.6x10^-3
TE                 0.5 ml
casamino acids     25
yeast extract      5
glucose            20
kanamycin          5x10^-3

Example 2
Assembly and Expression of the SlpI Gene

1. Summary of the scheme for assembling the SlpI gene:
An 18 bp DNA sequence that codes for the most frequent repeating oligopeptide in the silk fibroin protein made by Bombyx mori [Lucas, F. and K.M. Rudall (1986) Extracellular Fibrous Proteins: The Silks. p 475-558, in Comprehensive Biochemistry, vol. 26, part B., M. Florkin and E.H. Stotz (eds.) Elsevier, Amsterdam] was synthesized in vitro. Two single strands were synthesized and annealed together, and the resulting double-stranded segments were then multimerized head-to-tail to generate concatamers of up to and exceeding 13 repeats. The structural gene for silk I that we proceeded to work with had 13 repeats that coded for the oligopeptide gagags, where g = glycine, a = alanine and s = serine. We refer to this structural gene as the "monomer". We constructed "dimeric, trimeric, tetrameric, pentameric and hexameric" SlpI genes containing 26 (SlpI-2), 39 (SlpI-3), 52 (SlpI-4), 65 (SlpI-5) and 78 (SlpI-6) repeats. There is a short intervening sequence between each monomer unit. The assembly is pictured as follows:

Repeating DNA sequence
5'-GGTGCGGGCGCAGGAAGT
      CGCCCGCGTCCTTCACCA-5'

[Diagram: repeating units 1, 2, 3, 4, 5, ... joined head-to-tail into multimers of n units]
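The head-to-tail behaviour of the two synthetic strands follows from their 3-base stagger, which can be checked directly. The following is a minimal sketch using the two strands as printed above; the function names are illustrative, and the small codon dictionary covers only the six codons that occur in the repeat.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TOP_5_TO_3 = "GGTGCGGGCGCAGGAAGT"      # upper strand, written 5' to 3'
BOTTOM_3_TO_5 = "CGCCCGCGTCCTTCACCA"   # lower strand, written 3' to 5' as printed

# Only the six codons present in the repeat (lower-case one-letter code as in the text).
SIX_CODONS = {"GGT": "g", "GCG": "a", "GGC": "g", "GCA": "a", "GGA": "g", "AGT": "s"}


def stagger(top, bottom_3_to_5):
    """Offset (in bases) at which the lower strand pairs along the upper strand."""
    paired = bottom_3_to_5.translate(COMPLEMENT)  # bases the lower strand pairs with
    for offset in range(len(top)):
        if top[offset:] == paired[:len(top) - offset]:
            return offset
    return None


def overhangs_compatible(top, bottom_3_to_5, offset):
    """True if the unpaired end of one duplex can pair with the next duplex head-to-tail."""
    paired = bottom_3_to_5.translate(COMPLEMENT)
    return top[:offset] == paired[len(top) - offset:]


def translate_repeat(dna):
    """Translate a sequence built only from the six repeat codons."""
    return "".join(SIX_CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


offset = stagger(TOP_5_TO_3, BOTTOM_3_TO_5)
print("stagger:", offset, "bases")
print("head-to-tail compatible:", overhangs_compatible(TOP_5_TO_3, BOTTOM_3_TO_5, offset))
print("one repeat encodes:", translate_repeat(TOP_5_TO_3))
```

Since each duplex carries the same 3-base overhang at both ends in the same orientation, ligation extends the upper strand as a simple tandem repeat of the 18-base unit, so the reading frame is preserved across every junction.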

2. Assembly of the "monomeric" SlpI structural gene:
The two single strands shown above were synthesized as previously described. The strands were separately purified by gel electrophoresis, phosphorylated using T4 polynucleotide kinase and then mixed together and allowed to anneal. This resulted in the double-stranded segments aligning spontaneously head-to-tail in long concatamers. The phosphodiester bonds between segments were formed with T4 DNA ligase. The reaction was stopped by filling in the terminal cohesive ends using the Klenow fragment of DNA polymerase I. The blunt-ended repeating DNA was then ligated to the HincII REN site in plasmid vector pUC12 (Vieira, et al., Gene 19:259-268 (1982)). The ligated DNA was transformed into E. coli HB101 and transformants were selected for their ability to grow in the presence of ampicillin. The DNA of potential clones was analyzed for size and orientation by REN digestion and gel electrophoresis. DNA sequences were determined for isolates with large inserts that were oriented properly. The "monomer" clone selected for subsequent multimerization had 13 repeats coding for the oligopeptide agagsg, and was named pSY708. The DNA sequence, deduced amino acid sequence and REN sites of the SlpI insert and flanking regions of pSY708 are shown in Table 2.

Table 2

AAGCTTGGGCTGCAGGTCACCCGGGCGGGCGCAGGAAGTGGTGCGGGCGCAGGAAGTGGT  60
TTCGAACCCGACGTCCAGTGGGCCCGCCCGCGTCCTTCACCACGCCCGCGTCCTTCACCA
 k  l  g  l  q  v  t  r  a  g  a  g  s  g  a  g  a  g  s  g

GCGGGCGCAGGAAGTGGTGCGGGCGCAGGAAGTGGTGCGGGCGCAGGAAGTGGTGCGGGC  120
CGCCCGCGTCCTTCACCACGCCCGCGTCCTTCACCACGCCCGCGTCCTTCACCACGCCCG
 a  g  a  g  s  g  a  g  a  g  s  g  a  g  a  g  s  g  a  g

GCAGGAAGTGGTGCGGGCGCAGGAAGTGGTGCGGGCGCAGGAAGTGGTGCGGGCGCAGGA  180
CGTCCTTCACCACGCCCGCGTCCTTCACCACGCCCGCGTCCTTCACCACGCCCGCGTCCT
 a  g  s  g  a  g  a  g  s  g  a  g  a  g  s  g  a  g  a  g

AGTGGTGCGGGCGCAGGAAGTGGTGCGGGCGCAGGAAGTGGTGCGGGCGCAGGAAGTGGT  240
TCACCACGCCCGCGTCCTTCACCACGCCCGCGTCCTTCACCACGCCCGCGTCCTTCACCA
 s  g  a  g  a  g  s  g  a  g  a  g  s  g  a  g  a  g  s  g

GCGGGCGCAGGAAGTGGGACTCTAGAGGATCCCCGGGCGAGCTCGAATTC  290
CGCCCGCGTCCTTCACCCTGAGATCTCCTAGGGGCCCGCTCGAGCTTAAG
 a  g  a  g  s  g  t  l  e  d  p  r  a  s  s  n  s

REN site labels legible in the source (3' flanking region, in order): XbaI, BamHI, AvaI, SmaI, EcoRI.
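The Table 2 sequence can be checked computationally. The sketch below rebuilds the 290-base upper strand from its flanking regions and the 18-base repeat as read from the table, translates it in the frame shown, and locates several restriction sites. The recognition sequences listed are the standard ones for those enzymes; the helper names and the decomposition into `FLANK_5`, `REPEATS` and `FLANK_3` are illustrative assumptions rather than anything defined in the text.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Upper strand of the pSY708 insert and flanks as read from Table 2.
FLANK_5 = "AAGCTTGGGCTGCAGGTCACCCGG"
REPEATS = "GCGGGCGCAGGAAGTGGT" * 12 + "GCGGGCGCAGGAAGTGGG"
FLANK_3 = "ACTCTAGAGGATCCCCGGGCGAGCTCGAATTC"
INSERT = FLANK_5 + REPEATS + FLANK_3

# Standard recognition sequences for some of the enzymes named in the text.
SITES = {
    "HindIII": "AAGCTT", "PstI": "CTGCAG", "SmaI": "CCCGGG",
    "XbaI": "TCTAGA", "BamHI": "GGATCC", "EcoRI": "GAATTC",
}


def translate(dna):
    """Translate complete codons from position 1; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def site_positions(dna, site):
    """1-based start positions of every occurrence of a recognition sequence."""
    return [i + 1 for i in range(len(dna) - len(site) + 1) if dna.startswith(site, i)]


protein = translate(INSERT)
print(len(INSERT), "bases;", protein.count("AGAGSG"), "AGAGSG repeats")
for enzyme, site in SITES.items():
    print(enzyme, site_positions(INSERT, site))
```

The two SmaI sites bracket the repeat region, which is what allows the repeat block to be excised as a blunt-ended fragment in the later multimerization step.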

3. Construction of the expression vector, pSY701:
Plasmid pSP65 (10 ug, Boehringer Mannheim) was digested with AatII REN, phenol extracted and ethanol precipitated. The DNA was resuspended in 10 ul of H2O. One-half of this DNA was digested with exonuclease III in the following mix: 5 ug DNA, 10 ul 10X exonuclease III buffer (600 mM Tris-HCl pH 8.0, 6.6 mM MgCl2, 10 mM β-mercaptoethanol) and 9 units of exonuclease III in a total volume of 200 ul. Samples of 20 ul were taken at 0, 1, 2.5, 5 and 7.5 min and diluted immediately in 100 ul of the following buffer (30 mM sodium acetate, pH 4.5, 0.25 M NaCl, 1 mM ZnSO4) containing 5 ug tRNA and 36 units of S1 nuclease. Incubation was at 30°C for 45 min and then the reaction was terminated by the addition of 15 ul of stop buffer (0.5 M Tris pH 9.0, 125 mM EDTA, 1% w/v SDS, 200 ug/ml tRNA). The samples were phenol extracted and ethanol precipitated. The resuspended DNA was digested with SmaI REN and electrophoresed through a 1% gel of low melting point agarose. The gel band corresponding to the DNA fragment carrying the β-lactamase gene, the plasmid origin and the β-galactosidase promoter was excised from the gel and melted at 65°C. One volume of H2O was added. The DNA in each sample (timepoint) was recircularized by ligation in the presence of agarose. The reaction included 8 ul melted gel, 2 ul of ligation buffer (100 mM Tris-HCl pH 7.5, 50 mM MgCl2, 50 mM DTT, 1 mM ATP), 10 units T4 DNA ligase and was incubated at 15°C for 3 hr. Competent cells of JM101 were transformed with the ligated DNA and transformants were selected by growth on L broth plates containing ampicillin (40 ug/ml). Plasmid DNA was prepared from four transformants. The DNA was digested with BamHI REN, labeled with 32P-dGTP using the Klenow fragment of DNA polymerase I, digested with PvuI and then the smallest fragment was gel purified. The fragment from one transformant was sequenced using the Maxam and Gilbert technique. The fragments of the other three plasmids were further digested with TaqI and electrophoresed on the same gel. The sequenced plasmid had a fusion between the multiple cloning site and a position upstream from the N-terminal ATG of β-lactamase. The size of the BamHI-TaqI fragment of two of the other plasmids indicated a fusion between the multiple cloning site and the 4th amino acid of the β-lactamase gene. The DNA and corresponding amino acid sequences of the N-terminal region of the altered β-lactamase are given below, along with a circular map of REN sites for pSY701 (see Figure 1). The amino acid sequence of Figure 1 is met-thr-met-ile-thr-pro-ser-leu-gly-cys-arg-ser-thr-leu-glu-asp-pro-his-phe-arg-val-ala-leu-ile-pro-phe-phe-ala-ala-phe-cys-leu-pro-val-phe-ala-his.
4. Insertion of "monomer" SlpI from pSY708 into pSY701:
Plasmid pSY708 was digested with HindIII, the cohesive ends were filled in using the Klenow fragment of DNA polymerase I and then digested with BamHI. Plasmid pSY701 was digested with XbaI, filled in as above and then digested with BamHI. The DNA fragment from pSY708 and the backbone of pSY701 were then purified by electrophoresis through a low melting temperature agarose gel and purified with NACS (BRL) columns. The appropriate fragments were mixed, ligated, and then transformed into E. coli JM109. Transformed cells were selected by growth on L plates containing ampicillin (40 ug/ml), IPTG (5x10^-4 M) and XGAL (20 mg/ml). Transformants were analyzed for plasmid contents and one (pSY756) was selected for further study since it carried the insert of the monomer SlpI-1 sequences in the proper orientation, as determined by mapping of REN sites. Although the entire DNA sequence was not determined for pSY756, the junctions between the insert and vector were verified as correct restriction sequences for XbaI, upstream and BamHI, downstream.
5. Multimerization of the SlpI gene of pSY756:
Plasmid pSY708 was digested with the REN SmaI and the DNA fragment carrying the coding sequence for the polypeptide arg(ala-gly-ala-gly-ser-gly)13thr-leu-glu-asp-pro (R(AGAGSG)13TLEDP) was purified as in 4 above. Plasmid pSY756 was digested with SmaI, deproteinized and then ligated with the purified DNA fragment from pSY708. Transformants of E. coli JM109 were selected on medium containing ampicillin. Clones were found to contain 2 units (dimer pSY882), 3 units (trimer pSY883), and 4 units (tetramer pSY915) of the original monomer sequence of the pSY708 clone. Similarly, pentamers and hexamers have also been constructed. All of these plasmids are genetically stable and produce the gly-ala peptide as a fusion with β-lactamase.
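Why SmaI-excised monomer units can be stacked without disturbing the reading frame can be illustrated with a short check. This sketch assumes the pSY708 insert sequence as read from Table 2 and the standard SmaI specificity (blunt cut between CCC and GGG); the function names are hypothetical.

```python
SMA_I = "CCCGGG"  # SmaI cuts between CCC and GGG, leaving blunt ends

# Upper strand of the pSY708 insert and flanks as read from Table 2.
INSERT = (
    "AAGCTTGGGCTGCAGGTCACCCGG"
    + "GCGGGCGCAGGAAGTGGT" * 12 + "GCGGGCGCAGGAAGTGGG"
    + "ACTCTAGAGGATCCCCGGGCGAGCTCGAATTC"
)

REPEAT_CORE = "GCGGGCGCAGGAAGTGG"  # first 17 bases shared by every 18-base repeat


def smai_fragments(dna):
    """Cut a linear sequence at every SmaI site and return the pieces in order."""
    pieces, start = [], 0
    pos = dna.find(SMA_I)
    while pos != -1:
        cut = pos + 3                 # blunt cut after CCC
        pieces.append(dna[start:cut])
        start = cut
        pos = dna.find(SMA_I, pos + 1)
    pieces.append(dna[start:])
    return pieces


def repeats_in_multimer(units, repeats_per_unit=13):
    """Number of hexapeptide repeats in a construct carrying `units` monomer units."""
    return units * repeats_per_unit


fragment = smai_fragments(INSERT)[1]   # the internal SmaI-SmaI fragment
print(len(fragment), "bases =", len(fragment) // 3, "codons")
print("repeats in fragment:", fragment.count(REPEAT_CORE))
print([repeats_in_multimer(n) for n in range(2, 7)])
```

The internal fragment is a whole number of codons long (84, matching the 84 residues of R(AGAGSG)13TLEDP), so each copy inserted in the correct orientation at a SmaI site keeps downstream sequence in frame and contributes the short TLEDP-R linker between monomer units.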
6. Expression of the SlpI gene fusion to the β-lactamase protein:
Synthesis in E. coli cells of the SlpI peptide as a fusion protein with β-lactamase was detected by immunoblotting (Western) analysis. Anti-"Slp" antibodies were raised against a synthetic silk peptide. Fusions between β-lactamase and SlpI were also detected with antibodies raised against the E. coli β-lactamase. As shown in Figure 2, this antibody reacts with dimers and trimers of SlpI fused to the E. coli β-lactamase. The SlpI insert precedes the fifth amino acid of the signal sequence for this enzyme. The β-lactamase antibody (Figure 2A) detects both the unprocessed fusion proteins and the processed mature enzyme, which appears as the major antigenic band in this figure, at about the 28 kD position. The mobilities of all Slp-containing polypeptides are anomalously slow and the proteins are not as large as they appear on the gels.
The anti-Slp antibody is useful in detecting these fusion products. Lanes 2-5 of Figure 2B represent 4 separate clones that contain dimer fusions of SlpI with β-lactamase, while lanes 6 and 7 are from two clones containing trimer fusions. As can be seen, the antigenicity of the trimer is considerably greater than that of the dimer. It is known from prior experiments that fusion proteins containing only a monomer of SlpI are not detected at all with this anti-Slp antibody. The increased antigenicity of the trimer peptide allows it to be detected as a processed fusion with the β-lactamase signal peptide. The processed form is seen at about the 33 kD position in lanes 6 and 7 of Figure 2B. The appearance of normally processed β-lactamase mature enzyme (detected with β-lactamase antibody) as well as a peptide corresponding to the fusion between the SlpI-3 trimer and the signal peptide of β-lactamase (detected with gly-ala antibody) suggests that despite the insertion of SlpI sequences within the signal sequence, normal proteolytic processing of the enzyme occurs in E. coli.
7.a. Expression of the SlpI gene by fusion to T7 genes:
The SlpI sequence has also been expressed as a fusion protein with both the gene 9 and gene 10 proteins from bacteriophage T7 in E. coli. The construction is diagrammed in Figure 3. Plasmid pSY915 (containing the SlpI-4 tetramer) was digested to completion with REN SalI and partially with BamHI. The DNA fragment containing the SlpI-4 tetramer was purified and then cloned in plasmid pSY114 (pGEM2 of Promega Biotech) which had been digested with RENs SalI and BamHI. From this intermediate plasmid, named pGEM2/915, the tetramer insert of SlpI was removed with the RENs AccI and EcoRI. This fragment was then cloned in pSY633 (pBR322 containing the complete T7 gene 9 sequence; pAR441 of Studier, et al. (1986)) which was digested with EcoRI and AsuII. In the resulting plasmid the SlpI tetramer is fused to the gene 9 translational reading frame near the C-terminus of gene 9. This plasmid was named pG9/silkI (also named pG9/SlpI-4; containing 4 units of SlpI). This plasmid was then used to transform E. coli strain 0-48 (strain HMS174 (λDE3) of Studier, et al., 1986) which contains the T7 RNA polymerase gene inserted into the chromosome under transcriptional control of the IPTG-inducible β-galactosidase promoter. In this configuration, expression of the SlpI-4 sequence is dependent upon production of the T7 RNA polymerase which itself is controlled by the IPTG-inducible β-galactosidase promoter. As shown in Figure 4B and 4C, when these cells are induced with IPTG a protein product of the gene 9/SlpI-4 fusion gene is synthesized and is detected with antibody to the synthetic Slp peptide. The fusion product migrates in the gel as if it was 82 kD in size. The size expected is only 65 kD. The anomalous mobility is characteristic of the unusual amino acid composition (rich in glycine and alanine) and is seen for all Slp-containing products.
In like manner, plasmid pSY638 (pAR2113 of Studier) containing the promoter region and the first 13 amino acids of the T7 gene 10 protein, was digested with REN BamHI, filled in with the Klenow fragment of DNA polymerase and then digested with REN EcoRI. Into this linearized plasmid was cloned the HincII-EcoRI fragment of pGEM2/915, containing the SlpI-4 tetramer. The resulting plasmid was named pG10/silkI (also named pG10/SlpI-4; containing 4 units of SlpI). This ligation creates an in-frame fusion of the silk tetramer following the thirteenth amino acid of T7 gene 10. The latter fusion product may be used for spinning without further processing since the N-terminal 13 amino acids are only a small part of the large SlpI protein. Although the fusion product is about 30 kD in size, it has an anomalous mobility and migrates as if it was larger, 50 kD. This is shown in Figure 4A.

The plasmids pG9/SlpI-4 and pG10/SlpI-4 were further improved by inserting a kanamycin-resistance gene in the β-lactamase gene in the orientation opposite to the T7 expression system. Thus, any low level expression from the T7 system does not lead to elevated β-lactamase activity. Such activity eliminated the ampicillin in the medium that was added to select for maintenance of the plasmid. When the ampicillin was depleted the plasmids were lost from the culture. The kanamycin-resistance gene circumvents this problem and represents a significant improvement in the T7 expression system, especially for large scale cultures. The kanamycin-resistance gene (originally from Tn903) was isolated from plasmid pUC4K (Vieira, J. and J. Messing, 1982. Gene 19:259-268) as a HincII fragment. The so modified pG10/SlpI-4 plasmid is designated pSY977.
7.b. Fermentation and purification of SlpI-4:
E. coli strain 0-48 carrying pSY997 was grown at 37°C, using a Chemap or a Braun fermentor, in 10 L of LB to an OD (Klett units) of 300 (3x10^9 cells/ml). The T7 system was then induced with the addition of 3.5 mM IPTG. After 150 min the cells were concentrated 10x using a Millipore filter unit (Pellicon cassette system, 100,000 molecular weight cut off filter). The cell suspension was then frozen at -70°C until processing.
The cell suspension was thawed in a water bath at 42°C and lysed in a French press, and the lysate was spun at 125,000xg for 1 hour at 25°C. The cleared supernatant was treated with DNAase (250 ug/ml) for [...] min at room temperature, then filtered through a 0.45 um sterile filter. The filtrate volume was measured and the filtrate was incubated in ice with slow stirring. Then 231 mg of ammonium sulphate were added for each ml of filtrate over a period of 45 min. One ml of NaOH for each 10 g of ammonium sulphate was added to neutralize the pH. After 2 hours of continuous stirring the mixture was spun at 9,000xg for 10 min. The pellet was resuspended in 1/10 of the original filtrate volume using distilled water. The centrifugation and resuspension was repeated three times. The pellet was resuspended in 1/10 of the original filtrate volume in distilled water. Samples were analyzed for protein concentration, amino acid composition and protein sequence by standard methods. This is one of several methods for obtaining the product. This method results in an SlpI-4 product that is greater than 90% pure. The amino acid composition is almost entirely gly, ala and ser, as expected, and the N-terminal amino acid sequence is that of the gene 10 leader.
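The reagent quantities in the ammonium sulphate step scale linearly with the measured filtrate volume. A small helper such as the following (the function and key names are illustrative, and the 500 ml filtrate in the example is a hypothetical volume) applies the ratios stated above: 231 mg of ammonium sulphate per ml of filtrate, 1 ml of NaOH per 10 g of ammonium sulphate, and resuspension in 1/10 of the filtrate volume.

```python
def ammonium_sulphate_plan(filtrate_ml):
    """Scale the precipitation step to a measured filtrate volume (ml)."""
    salt_g = filtrate_ml * 0.231      # 231 mg ammonium sulphate per ml of filtrate
    naoh_ml = salt_g / 10.0           # 1 ml NaOH per 10 g ammonium sulphate
    resuspension_ml = filtrate_ml / 10.0   # pellet taken up in 1/10 of filtrate volume
    return {
        "ammonium_sulphate_g": salt_g,
        "naoh_ml": naoh_ml,
        "resuspension_ml": resuspension_ml,
    }


# Hypothetical example: 500 ml of filtrate.
for name, value in ammonium_sulphate_plan(500.0).items():
    print(f"{name}: {value:.2f}")
```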
8. Controlled expression of the T7 RNA polymerase gene in Bacillus subtilis:
The coding sequence of the T7 RNA polymerase gene (T7 gene 1, T7 nucleotides 3128 to 5845) from plasmid pSY558 (pAR1151 of Studier, et al., 1986) was modified by in vitro mutagenesis of cloned DNA. We inserted the recognition sequence for the restriction endonuclease NdeI at position 3171. Using an oligodeoxynucleotide which was synthesized as previously described, the T7 gene 1 sequence was changed from its natural sequence, TAAATG, to the modified sequence, CATATG.
Similarly, the upstream regulatory sequence of the Bacillus subtilis gene spoVG, obtained from plasmid pCB1291 (Rosenblum, et al., J. Bacteriology, 148:341-351 (1981)), was modified by in vitro mutagenesis at position 85 (Johnson, et al., Nature, 302:800-804 (1983)) such that it also includes an NdeI cleavage site. The upstream regulatory sequences of the spoVG gene were then ligated with the coding sequence of the T7 RNA polymerase gene via these novel NdeI cleavage sites. After transformation of E. coli HB101, the plasmid contents of individual ampicillin-resistant isolates were checked by restriction mapping. The correct construction was named pSY649.
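The sequence change that introduces the NdeI site is small, and a few lines make it explicit. This sketch compares the natural and modified hexamers quoted above; CATATG is the standard NdeI recognition sequence, and the helper name is illustrative.

```python
NDE_I = "CATATG"       # NdeI recognition sequence; it contains an ATG

NATURAL = "TAAATG"     # natural T7 gene 1 sequence quoted in the text
MODIFIED = "CATATG"    # sequence after in vitro mutagenesis


def changed_positions(before, after):
    """0-based positions at which two equal-length sequences differ."""
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]


positions = changed_positions(NATURAL, MODIFIED)
print("bases changed:", len(positions), "at positions", positions)
print("creates NdeI site:", NDE_I in MODIFIED)
print("ATG preserved:", NATURAL[-3:] == MODIFIED[-3:] == "ATG")
```

Only the two bases upstream of the ATG are altered, so the initiation codon is untouched; placing an NdeI site over the ATG on both the spoVG regulatory region and the T7 gene 1 coding sequence lets the two be joined precisely at the start codon.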
Plasmid DNA containing the spoVG:T7 RNA polymerase fusion gene (pSY649) was further modified to include a chloramphenicol-resistance gene that functions in B. subtilis. First the NdeI to SalI fragment of about 1200 base pairs from plasmid pGR71-P43 (Goldfarb, et al., Nature, 293:309-311 (1981)) was isolated. This fragment carries the P43 promoter of B. subtilis and an adjacent chloramphenicol acetyltransferase gene from Tn9. After filling in all the cohesive ends using the Klenow DNA polymerase reaction, this fragment was inserted into the XbaI site within the multiple-cloning site of pUC13 (Vieira, et al., Gene, 19:259-268 (1982)). Ampicillin and chloramphenicol-resistant transformants were selected for further use. The correct plasmid construction was named pSY630. The SmaI to HincII endonuclease cleavage fragment from plasmid pSY630 containing the chloramphenicol acetyltransferase gene fused to the P43 promoter sequence was gel purified and blunt-end ligated to the PvuI site of plasmid pSY649 that had been treated first with T4 DNA polymerase. The resulting plasmid, pSY856, was then transformed into B. subtilis I168. Because plasmid pSY856 is unable to replicate autonomously in B. subtilis, stable transformants resistant to chloramphenicol must result from the integration of the plasmid into the B. subtilis chromosome (Ferrari, et al., J. Bacteriology, 154:1513-1515 (1983)). The integration event, facilitated by homologous recombination, most likely occurred at either the spoVG or the P43 loci of the bacterial chromosome (pSY856 contains DNA sequences homologous to the B. subtilis chromosome at only these two sites). The resulting strain, "BIPoL", was grown both in the presence and absence of chloramphenicol in order to determine the stability of the selectable marker. Expression of the T7 polymerase was obtained and this has no apparent effect on the growth or viability of this strain.
9.a. Expression of a plasmid-borne target gene (kanamycin-resistance) in B. subtilis strain BIPoL:
The Staphylococcus aureus plasmid pUB110 (Lacey, et al., J. Med. Microbiology, 7:285-297, 1974) which contains the gene coding for resistance to the antibiotic kanamycin was used to test the expression of the growth-regulated spoVG:T7 RNA polymerase gene of strain BIPoL. An EcoRI-BamHI fragment of phage T7 DNA (positions 21,402 to 22,858) containing the T7 gene 9 promoter sequence was purified from plasmid pAR441 (Studier, et al., 1986). This DNA fragment was ligated into pUB110 between the EcoRI and BamHI restriction endonuclease sites. The resulting plasmid, pSY952, contains the T7-specific promoter in the same orientation as the kanamycin-resistance gene. Plasmid pSY952 was transformed into B. subtilis I168 and BIPoL and these strains were analyzed for the level of expression of the polypeptide encoded by the plasmid-derived kanamycin-resistance gene. Approximately 10^9 cells from growing cultures of I168, I168 containing pUB110, I168 containing pSY952, BIPoL, BIPoL containing pUB110, and BIPoL containing pSY952 were obtained at several times during the growth and sporulation cycle. The proteins in these cell samples were processed and analyzed by polyacrylamide gel electrophoresis.
Because the rate of transcription from the spoVG promoter increases as a function of cell density and reaches a maximum during early sporulation, an accelerated accumulation of the target protein is expected in the BIPoL strain containing pSY952 during growth as the culture enters sporulation. The results show that a protein of molecular weight 34 kilodaltons increases in abundance as the culture approaches and enters stationary phase. The size of the protein is in agreement with the predicted size of the kanamycin-resistance gene product (Sadaie, et al., J. Bacteriology, 141:1178-1182 (1980)) encoded in pSY952. This protein is not present in BIPoL, or in I168 containing pSY952, which lacks the spoVG-regulated T7 RNA polymerase gene, or in BIPoL containing pUB110, which lacks the T7 promoter sequence. The maximum accumulated level of target protein after 24 hours of growth in BIPoL containing pSY952 was 20% of the total cellular protein as determined by densitometry.
9.b. Expression of SlpI-4 in B. subtilis:
Plasmid pG10SlpI-4 was digested with EcoRI REN. After filling in the cohesive ends using the Klenow DNA polymerase reaction, the DNA was digested with BglII REN. Plasmid pSY662 (a shuttle vector from E. coli to B. subtilis derived from pUC13 and pBC16) was digested with SmaI and BamHI RENs. The two plasmids were then purified by electrophoresis through a low melting temperature agarose gel and purified with NACS (BRL) columns. The DNA fragment of pG10SlpI-4 was ligated to the backbone of pSY662 and transformed into E. coli, with selection on medium containing ampicillin (40 ug/ml). Transformants were analyzed for plasmid contents and one (pSY662/G10/SlpI-4) was selected for further study.
Competent cells of B. subtilis BIPoL were transformed with pSY662/G10/SlpI-4 and incubated at 37°C with shaking for 90 min. The transformation mixture was then diluted 1:100 in fresh LB containing 10 ug/ml of tetracycline and incubated at 37°C with shaking. Samples were taken and equal numbers of cells were lysed and loaded on gels for separation by SDS-PAGE. Immunoblot analysis was performed using anti-Slp antibodies to detect the synthesis of the gene 10/SlpI-4 fusion protein.

The expression of the SlpI-4 polypeptide in B. subtilis was detected by its seroreactivity with anti-Slp antibody, after transfer of the cellular proteins from the polyacrylamide gel to a nitrocellulose filter. We verified that the seroreactive protein was the product of the SlpI-4 gene by exhaustively treating the cellular proteins with CNBr. This should cleave after methionine residues, but since SlpI-4 lacks methionine it will remain intact. The CNBr treatment eliminated greater than 98% of the proteins stainable with Coomassie blue dye. As expected for a protein lacking methionine, SlpI-4 remained intact and still reacted with anti-Slp serum.
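The logic of the CNBr test can be shown with a toy model. The sketch below treats cleavage simply as a split after every methionine in a one-letter sequence; the methionine-containing example peptide is hypothetical, and the (AGAGSG)13 block stands in for the methionine-free silk-like repeat region rather than for the complete fusion protein.

```python
def cnbr_fragments(protein):
    """Split a one-letter protein sequence after every methionine (M)."""
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


slp_block = "AGAGSG" * 13            # silk-like repeat block, no methionine
typical_peptide = "MKTAYMAGLLMRS"    # hypothetical methionine-containing peptide

print(len(cnbr_fragments(slp_block)), "fragment from the repeat block")
print(cnbr_fragments(typical_peptide))
```

A sequence without methionine comes through as a single piece, whereas a typical cellular protein is reduced to several shorter peptides, which is why the seroreactive band survives a treatment that removes most other stainable protein.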
Example 3
Assembly and Expression of the SlpIII Gene

1. Summary of the scheme for assembling the SlpIII gene:
The synthetic SlpIII gene codes for a protein similar to that encoded by the SlpI gene and to the crystalline region of the silk fibroin protein made by the silkworm, Bombyx mori. SlpIII more closely resembles the silk fibroin molecule because it includes the amino acid tyrosine at regular intervals (about 50 residues), whereas multimers of SlpI do not. The SlpIII gene was assembled from smaller parts. First, three double-stranded sections of DNA of about 60 bp in length were chemically synthesized. Each section was cloned by insertion into bacteriophage M13 and the DNA sequence was verified. These sections were then removed from the vector and linked together in a specific order. This linkage of about 180 bp is named the SlpIII "monomer". "Monomers" were then linked in a specific order to yield dimers, trimers, tetramers, etc., of SlpIII. The multimers were then cloned either directly into plasmid expression vectors to detect the SlpIII protein or initially into an adapter plasmid. Insertion of the SlpIII DNA into the adapter allows for further gene manipulation and is further described later. The assembly scheme is pictured as follows:

Synthesis of double-stranded DNA sections --> Section 1, Section 2, Section 3
Assembly of "monomer" --> 1-2-3
Multimerization --> 123-123-123 ...

The DNA and corresponding amino acid sequences of the three sections of the SlpIII gene are shown in Table 3.

[Table 3: DNA sequences, deduced amino acid sequences and REN sites of SlpIII sections 1, 2 and 3; the sequence data are illegible in the source.]
The double-stranded DNA sequence is shown in the 5' to 3' direction. The amino acids (g = glycine, a = alanine, s = serine, y = tyrosine) coded by the sequence are shown immediately below each section. Recognition sequences for cleavage by restriction endonucleases are shown above each section.
The above six single strands were synthesized. After synthesis, the strands of DNA were purified and the homologous strands were annealed. About 1 ul (0.5 ug) of each strand was mixed with 2 ul of 10X AA (description) buffer and 16 ul of sterilized deionized H2O in a 1.5 ml polypropylene Eppendorf tube. The tube was placed in a boiling water bath (500 ml in a 1 liter beaker) for 10 min and then the beaker was removed from the hot plate and allowed to cool on the bench to room temperature. This required about 1-2 hr.
Each of the three double-stranded sections was cloned separately into M13mp18. Section 1 was ligated between the SmaI and BamHI restriction sites of the multiple-cloning site. Section 2 was ligated between the BamHI and PstI sites. Section 3 was inserted between the PstI and HindIII sites. The respective clones are: M13mp18.1, M13mp18.2, M13mp18.3. The DNA sequence was determined for each cloned section. One representative of each section that had the correct DNA sequence was recovered and became the material for the next step: assembly of the "monomer".
3. Assembly of the "monomer" of SlpIII:
The DNA sections 2 and 3 were isolated by digestion of the M13 clones with restriction enzymes: for section 2, M13mp18.2 was digested with BamHI and PstI; for section 3, M13mp18.3 was digested with PstI and HindIII. The two sections were purified and mixed together in equal molar amounts with M13mp18.1 that had been first digested with BamHI and HindIII. T4 DNA ligase was added to link the homologous overlapping ends in the order 1-2-3. Due to the hybridization specificity of the cohesive ends, the three sections are efficiently linked in only this order. The DNA sequence of the cloned "monomer" in the assembly named M13mp18.1.2.3 was determined to be correct and as shown in 2 above.
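The statement that the cohesive ends permit only the order 1-2-3 can be illustrated by enumerating the possible arrangements. In this simplified sketch each section is represented only by the enzymes that generate its left and right ends, as stated in the text; the function name is illustrative, and inverted fragments are not modelled, since a section cut with two different enzymes presents mismatched ends when flipped.

```python
from itertools import permutations

# Left and right ends of each cloned section, named by the enzyme that generates them.
SECTIONS = {
    "1": ("SmaI", "BamHI"),
    "2": ("BamHI", "PstI"),
    "3": ("PstI", "HindIII"),
}


def valid_orders(gap_left, gap_right, names):
    """Orders of the named sections whose ends match each other and the vector gap."""
    orders = []
    for order in permutations(names):
        ends = [SECTIONS[name] for name in order]
        fits_vector = ends[0][0] == gap_left and ends[-1][1] == gap_right
        joins_match = all(ends[i][1] == ends[i + 1][0] for i in range(len(ends) - 1))
        if fits_vector and joins_match:
            orders.append(order)
    return orders


# M13mp18.1 opened with BamHI and HindIII leaves a BamHI...HindIII gap after section 1.
print(valid_orders("BamHI", "HindIII", ["2", "3"]))
```

Of the two possible arrangements of sections 2 and 3 in the BamHI-HindIII gap, only 2 followed by 3 has matching ends at every junction, giving the 1-2-3 "monomer".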
4. Multimerization of the "monomer" of SlpIII:
In order to prepare large amounts of the "monomer" structural gene we first subcloned the "monomer" into the plasmid vector pUC12. M13mp18.1.2.3 was digested with EcoRI and HindIII restriction enzymes. The SlpIII "monomer" was gel purified and ligated into pUC12 digested with EcoRI and HindIII. The resulting plasmid DNA was prepared, the "monomer" was released from the vector by digestion with BanI REN and the fragment was gel purified.
To create multimers, "monomer" DNA with BanI ends were linked by ligation. The nonpalindromic terminal BanI recognition sequence allows linkage only in a head-to-tail order. The extent of multimerization is monitored by gel electrophoresis and staining the DNA with ethidium bromide. Multimers of more than 20 units have been obtained by this method.
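Why a nonpalindromic BanI site enforces head-to-tail ligation can be checked by looking at the overhangs it leaves. BanI recognizes the degenerate sequence GGYRCC and cuts after the first G, leaving a 4-base overhang (GYRC). The sketch below enumerates the four concrete variants of the site; which variant flanks the SlpIII "monomer" is not stated in this section, so no particular one is assumed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def bani_sites():
    """The four concrete sequences matching GGYRCC (Y = C or T, R = A or G)."""
    return ["GG" + y + r + "CC" for y in "CT" for r in "AG"]


def overhang(site):
    """4-base overhang left by cutting G^GYRCC."""
    return site[1:5]


def directional(site):
    """True if the overhang is not self-complementary, so ends join only head-to-tail."""
    end = overhang(site)
    return end != reverse_complement(end)


for site in bani_sites():
    print(site, overhang(site), "directional" if directional(site) else "palindromic")
```

With a self-complementary overhang a fragment could ligate in either orientation; with a nonpalindromic one, an inverted fragment would present an overhang that cannot pair, so every unit in the multimer ends up in the same orientation.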
5. Cloning of the multimers of SlpIII:
Plasmid pCQV2 (Queen, et al., J. Appl. Mol. Gen., 2:1-10 (1983)) was digested with EcoRI and BamHI restriction endonucleases and a fragment of about 900 bp was purified. This DNA fragment contains the bacteriophage lambda cI857 repressor gene, the closely linked rightward promoter, PR, and the beginning of the cro gene. Plasmid pSY335 (described as pJF751 in Ferrari, et al., J. Bacteriology, 161:556-562 (1985)) was digested with EcoRI and BamHI restriction enzymes and subsequently ligated to the DNA fragment of approximately 900 bp of pCQV2. The plasmid obtained from this construction, pSY751, expresses the β-galactosidase gene at 37°C and 42°C, but not at 30°C (Figure 8).
In this approach the SlpIII gene is first cloned into an "adapter" sequence in an intermediate plasmid and then subcloned to the expression systems. The adapter sequence has the following useful features: a unique central BanI REN site, three unique REN sites to either side of BanI, information coding for protein cleavage at either methionine, aspartate-proline or arginine amino acids, and small size. The BanI site is the point of insertion for the SlpIII multimers with BanI ends.
The adapter was synthesized with the Applied Biosystems 380A Synthesizer, cloned in M13mp18 and the DNA sequence verified. The adapter was then subcloned into a specially-constructed plasmid vector that lacked BanI REN sites. The recipient plasmid was made as follows. Plasmid pJH101 (Ferrari, et al., 1983) was partially digested with AhaIII restriction enzyme and religated. Transformants of E. coli HB101 were selected on medium containing chloramphenicol (12.5 mg/ml). After restriction analysis of several isolates one plasmid was chosen, pSY325 (Figure 7). This plasmid contains only the chloramphenicol-resistance gene and the replication origin (from pBR322) of pJH101. After digestion to completion with XhoII, pSY325 was ligated with the gel-purified adapter. The result was the adapter-plasmid, pSY937 (Figure 7), and its new REN sites were verified.
The SlpIII multimers were cloned into the BanI site of pSY937 (Figure 7). Positive clones were identified by colony hybridization with the lower strand of section 1 of SlpIII as the DNA probe for hybridization (probe sequence shown in Table 2). Positive clones were characterized by gel electrophoresis for the size of the inserted multimer. Finally, the SlpIII sequences were subcloned using the REN site in the flanking adapter regions to specific locations of expression plasmids.
The SlpIII protein had the following amino acid composition:
SlpIII 1178 AA MW 83,000
(fm) DPVVLQRRDWENPGVTQLNRLAAHPPFASDPM
GAGS (GAGAGS)6 GAAGY
[(GAGAGS)9 GAAGY]18
GAGAGSGAGAGSGAGAMDPGRYQLSAGRYHYQLVWCQK
(fm) intends the initiation codon
SlpIII Expression Vector
Plasmid DNA pSY1086 is a pSY937 derivative containing 19 repeats of SlpIII (3.5 kb). This plasmid DNA was digested with NruI and PvuII and the fragments separated by agarose gel electrophoresis. The purified SlpIII multimer was then cloned in plasmid pSY751 digested with PvuII REN. Several clones were analyzed and one (pSY1008) was chosen to be used in expression experiments and SlpIII purification.
The ampicillin drug resistance gene of pSY1008 was substituted with the kanamycin marker from pSY1010 (produced by digestion of pSY633 with DraI and SspI and insertion of KanR obtained by HincII digestion of pUC4K) and the resulting plasmid was called pSY1186.
By removing the SlpIII portion of plasmid pSY1186 with BanI, a new plasmid, pSY1262, was generated. This plasmid contains a unique BanI site which allows for the direct ligation of fragments containing BanI ends obtained by polymerization of monomers. This plasmid has been used to generate plasmids containing inserts for the following proteins: SELP1, SELP2, SELP3, and SLP4.

Production and Purification of SlpIII
Cell Culture
E. coli are cultured in the following medium:
Constituent         g/l
yeast extract       20
casamino acids      20
peptone             20
gelatin peptone     20
KH2PO4              2
K2HPO4              2
Na2HPO4·7H2O        2
glucose             2
ampicillin          0.1
An overnight culture (500 ml - 1 L) which had been grown at 30°C was used to inoculate 375 L of media contained in a 500 L fermentor. Fermentor conditions include a tachometer reading of 100 rpm, vessel back pressure of 5 psi and an air flow of 170 l/min in order to maintain dissolved O2 at greater than 50%.
Glucose (1 g/l) and ampicillin (0.05 g/l) were added to the fermentation when the culture reached an OD650 of 1.0 and again at 2.0. When the culture reached an OD650 of 2.0 the temperature was increased to 42°C for 10 min and then lowered to 38°C for 2 hours. The culture was then chilled to 10°C and cells were harvested by centrifugation in a continuous centrifuge and frozen at -70°C until processed. Yields from two separate fermentations were 7.3 kg and 5.2 kg wet weight of cells.
It should be noted that other media can be used and, with different plasmids, various selection conditions can be imposed (e.g., substitution of kanamycin selection for ampicillin). These conditions have been used in laboratory scale fermentations (10 L volumes).
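As a worked example of the recipe above, the per-litre quantities can be scaled to a chosen working volume. This is only a minimal sketch, assuming the medium table is read as grams per litre; the names MEDIUM_G_PER_L and batch_quantities are illustrative and are not part of the original procedure.

```python
# Medium composition in g/l, as read from the medium table (assumed reading).
MEDIUM_G_PER_L = {
    "yeast extract": 20,
    "casamino acids": 20,
    "peptone": 20,
    "gelatin peptone": 20,
    "KH2PO4": 2,
    "K2HPO4": 2,
    "Na2HPO4.7H2O": 2,
    "glucose": 2,
    "ampicillin": 0.1,
}


def batch_quantities(volume_l):
    """Return the grams of each constituent needed for volume_l litres."""
    return {name: g_per_l * volume_l for name, g_per_l in MEDIUM_G_PER_L.items()}


if __name__ == "__main__":
    # 375 L is the fermentor charge and 10 L the laboratory scale named in the text.
    for volume in (375, 10):
        print(f"--- {volume} L batch ---")
        for name, grams in batch_quantities(volume).items():
            print(f"{name:16s} {grams:8.1f} g")
```

The same table can be reused for any other working volume by changing the argument.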

Cell Lysis
Method 1. Cells are thawed and suspended to a concentration of 1 kg wet weight/6 l in 50 mM Tris-HCl pH 7.0, 1 mM EDTA and broken by 2 passages through an APV Gaulin cell disrupter at 8000 psi. During this lysis procedure the cells are kept cold with an ice bath. The cell lysate is then centrifuged at 26,000xg with a continuous centrifuge, such as the T2-28 rotor in a Sorvall RC5B refrigerated centrifuge operated at 4°C. Under these conditions greater than 90% of the SlpIII produced can be found in the pellet. The supernate does contain some product which can be recovered by (NH4)2SO4 precipitation as described below. The pellet is extracted with LiBr as described below.
Method 2. Frozen cells are thawed and resuspended to a concentration of 1 kg wet weight/6 l in 50 mM Tris-HCl pH 7.0, 10 mM EDTA, and 5 mM PMSF to inhibit protease activity. Cells are stirred in this buffer at room temperature for 0.5 to 2 hours, then lysozyme is added to a concentration of 1 g/l and incubation is continued for 20 min. β-Mercaptoethanol is then added to 70 mM and the detergent NP40 is then added to a final concentration of 1% for 20 min while continuously stirring the cell suspension. Then MgCl2 is added to 50 mM followed by DNAse at a concentration of 1 mg/l and incubation is continued at room temperature for 20 min. The cell lysate is then centrifuged as in method 1 at 26,000xg in a continuous centrifuge and the supernatant is collected and passed through the continuous centrifuge a second time at 26,000xg. The supernate resulting from this second centrifugation contains <5% of the total SlpIII, but what is there can be recovered with (NH4)2SO4 as described below. The pellets resulting from the 1st and 2nd 26,000xg centrifugations are combined and extracted with LiBr as described below.

Method 3. For this method, a strain of E. coli is used that contains a second plasmid which encodes the T7 phage lysozyme. This plasmid is compatible with the plasmid encoding the SlpIII gene and the drug resistance determinant. The strain is grown in the same medium and under the same conditions as in the first two methods. However, due to the production of the T7 lysozyme inside the cells, their cell wall is weakened and they can be easily lysed at the completion of the fermentation by the addition of EDTA to >100 mM and NP40 to a concentration of from 0.5 to 1.0% v/v. Lysis can also be achieved by the addition of chloroform (20 ml per liter of fermentation broth) instead of NP40. Alternatively, cells may be collected by centrifugation prior to lysis, resuspended to 1 kg wet weight/6 l in Tris-EDTA as described in the first two methods and then lysed by the addition of NP40 or chloroform. Following cell lysis by either method the lysate is centrifuged in a continuous rotor at 26,000xg as described in the first two methods. As with those methods, LiBr extraction of the pellet and (NH4)2SO4 precipitation of the supernate are used to recover the product.
Purification of SlpIII
The pellet obtained by centrifugation of the cell lysate at 26,000xg as described above is extracted with an equal volume of 9M LiBr. The salt solution is added and the pellet is evenly suspended by stirring at room temperature (RT). The mixture is stirred for 1 hour at RT after an even suspension is obtained. The mixture is then centrifuged at 26,000xg in a continuous rotor at 4°C or at RT to generate a pellet and a supernatant fraction. The supernate is saved and the pellet is re-extracted with another equal volume of 9M LiBr as above. After mixing for 1 hour the mixture is centrifuged at 26,000xg and the supernate from this centrifugation is combined with the supernate from the first LiBr extraction and allowed to stand at 4°C overnight. Approximately 90% of the SlpIII contained in the cell lysate 26,000xg pellet is extracted by LiBr using this procedure.
After the LiBr extract stands overnight at 4°C a precipitate forms, is removed by centrifugation at 26,000xg and is discarded. The supernate is then placed in dialysis bags and dialyzed against several changes of dH2O for 2 days. As the LiBr is removed by dialysis the SlpIII product precipitates in the dialysis bags. The precipitate is collected by centrifugation and washed 2-3 times with dH2O. The final washed product is centrifuged and dried by lyophilization.
For the recovery of SlpIII from the 26,000xg supernatant fractions, (NH4)2SO4 precipitation is used. Solid (NH4)2SO4 is slowly added to the sample, which is maintained at 4°C, until 38% saturation is achieved (231 g/l). The mixture is then stirred at 4°C for 2-3 hours. The precipitate is recovered by centrifugation in a continuous flow centrifuge and washed 4-5 times with an equal volume of distilled H2O or with 0.5% SDS in H2O. After each wash the precipitate is recovered by continuous centrifugation. The pellet becomes increasingly white with successive washes as contaminating protein is removed. SlpIII is recovered as a washed pellet and can be dried by lyophilization.
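The salt quantity quoted for 38% saturation can be sanity-checked against a commonly quoted rule of thumb for solid ammonium sulfate additions, g/l = 533 (S2 - S1) / (100 - 0.3 S2), where S1 and S2 are the starting and target percent saturation. The sketch below is an approximation only: the formula is an assumption brought in for illustration (it is usually given for roughly room-temperature saturation tables), and the function name is hypothetical.

```python
def ammonium_sulfate_g_per_l(target_pct, start_pct=0.0):
    """Approximate grams of solid ammonium sulfate to add per litre.

    Uses the rule of thumb 533 * (S2 - S1) / (100 - 0.3 * S2); this is an
    assumed approximation, not a value taken from the procedure itself.
    """
    if not 0.0 <= start_pct < target_pct <= 100.0:
        raise ValueError("need 0 <= start_pct < target_pct <= 100")
    return 533.0 * (target_pct - start_pct) / (100.0 - 0.3 * target_pct)


if __name__ == "__main__":
    for pct in (20, 38):
        grams = ammonium_sulfate_g_per_l(pct)
        print(f"0 -> {pct}% saturation: about {grams:.0f} g/l")
```

Under this approximation the 38% case comes out within a few grams per litre of the 231 g/l figure stated above.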
Trypsin Treatment Step of SlpIII
SlpIII was suspended in 50 mM Tris HCl, pH 8.0, 0.1 M NaCl buffer, and was placed in a 37°C water bath, and TPCK treated trypsin solution was mixed into the suspension. The final trypsin concentration was 0.1%. After 3 hours, the solution was centrifuged at 16,000xg for 15 min., the pellet was washed with a half equal volume of 0.5% SDS in H2O first, then with distilled water. After each wash the pellet was recovered by centrifugation. The final product was resuspended in water and kept at 4°C for further analysis.
With the trypsin treatment, SlpIII was purified to 99.4% purity.
Physical Measurements of SlpIII
Physical measurements of the purified silk-like proteins have been compared with those of Bombyx mori silk in order to establish that the repetitive amino acid polymers produced microbiologically accurately mimic the properties of naturally occurring polymers. Physical measurements were performed to confirm the model of anti-parallel chain pleated sheet conformation for the crystalline regions of Bombyx mori silk fibroin (Marsh, Corey and Pauling, Biochim. Biophys. Acta (1955) 16; Pauling and Corey, Proc. Natl. Acad. Sci. USA (1953) 39:247). Preliminary analysis of x-ray diffraction patterns obtained from Slp films is consistent with those described by Fraser, MacRae, and Stewart (1966) (Table 4). Circular dichroic (CD) and Fourier transform infrared (FTIR) spectroscopic analyses of SlpIII are consistent with a high degree of extended β and β-turn conformations. Comparisons of the spectra obtained from SlpIII with that of naturally occurring silk fibroin in various solvents (Iizuka and Yang, Proc. Natl. Acad. Sci. USA (1966) 55:1175) indicate that SlpIII in solution consists of a mixture of the random and highly ordered structures seen in silk fibroins.

Table 4
Material          a (Å)   b (Å)   c (Å)
(AG)n             9.42    6.95    8.87
(AGAGSG)n         9.39    6.85    9.05
CTP fraction      9.38    6.87    9.13
Native fibroin    9.40    6.97    9.20
                  9.44    6.95    9.30
SlpIII            9.38    6.94    8.97
Referenced in Fraser et al., J. Mol. Biol. (1966) 19:580.

Example 4 Elastromeric Biological Sequence Protein (EBSI) Gene Construction (the protein is an elastin and silk block copolymer) .

Six oligonucleotide strands were synthesized and purified as described previously.
(HindIII) BanII StuI
i. 5'AGCTGGGCTCTGGAGTAGGCCTG3'
ii. 5'AATTCAGGCCTACTCCAGAGCCC3'
(EcoRI) StuI BanII
(HindIII) BanI
iii. 5'AGCTTGGTGCCAGGTGTAGGAGTTCCGGGTGTAGGCGTTCCGGGAGTTGG
TGTACCTGGAGTGGGTGTTCCAGGCGTAGGTGTGC3'
(XmaI)
iv. 5'CCGGGCACACCTACGCCTGGAACACCCACTCCAGGTACACCAACTCCCGGA
ACGCCTACACCCGGAACTCCTACACCTGGCACCA3'
BanI
(XmaI) AhaII
v. 5'CCGGGGTAGGAGTACCAGGGGTAGGCGTCCCTGGAGCGGGTGCTGGTAG
CGGCGCAGGCGCGGGCTCCGGAGTAGGGGTGCCG3'
BanII BanI
(EcoRI) BanI BanII
vi. 5'AATTCGGCACCCCTACTCCGGAGCCCGCGCCTGCGCCGCTACCAGCACCCG
CTCCAGGGACGCCTACCCCTGGTACTCCTACC3'
AhaII
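Paired synthetic strands such as (i) and (ii) are designed to anneal over their common core while leaving 4-nucleotide 5' overhangs compatible with the HindIII and EcoRI ends of the digested vector. The check below is a minimal sketch of that design rule, assuming the two sequences are read from the listing as AGCTGGGCTCTGGAGTAGGCCTG and AATTCAGGCCTACTCCAGAGCCC; the function names are illustrative.

```python
# Strands (i) and (ii), written 5'->3', as read from the listing (assumed reading).
OLIGO_I = "AGCTGGGCTCTGGAGTAGGCCTG"
OLIGO_II = "AATTCAGGCCTACTCCAGAGCCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def anneals_with_overhangs(strand_a, strand_b, overhang=4):
    """True if the strands pair everywhere except their 5' overhangs."""
    return reverse_complement(strand_a[overhang:]) == strand_b[overhang:]


if __name__ == "__main__":
    print("core pairs:", anneals_with_overhangs(OLIGO_I, OLIGO_II))
    print("5' overhangs:", OLIGO_I[:4], "and", OLIGO_II[:4])
```

The AGCT and AATT overhangs are the single-stranded ends left by HindIII and EcoRI, respectively, which is why the annealed pair can be ligated directly into a vector cut with those two enzymes.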

Oligonucleotide strands (iii), (iv), (v) and (vi) were annealed and ligated with the DNA of plasmid pBS m13(+) (Stratagene) which had been digested with HindIII and EcoRI. The products of this ligation reaction were transformed into E. coli strain JM109. Transformant colonies were selected for resistance to ampicillin. Colonies were screened for their hybridization with 32P-labelled oligonucleotides (iii), (v). Plasmid DNA from several positively hybridizing clones was purified and sequenced. Two of the plasmids, pSY1292 and pSY1293, contained the sequence shown for oligonucleotides (iii), (v) and (iv), (vi). These sequences contained all of the nucleotides present in the synthetic oligonucleotides except one. A G:C basepair was missing at position 7 (iii). The lack of the basepair obstructed one of the BanI sites. In order to introduce a second BanII site at the 5' end of the gene fragment, oligonucleotides (i) and (ii) were annealed and ligated with plasmid pBS m13(+) which had been digested with HindIII and EcoRI. Plasmid DNA from the transformant colonies resistant to ampicillin was purified. Two plasmids, pSY1295 and pSY1296, which were digestible with StuI, a unique site contained in the oligonucleotide sequence, were sequenced. They were both shown to contain the sequence shown for oligonucleotides (i) and (ii). Plasmid DNA from pSY1292 was digested sequentially with HindIII, S1 nuclease, and EcoRI. The digestion products were separated by electrophoresis in an agarose gel and the DNA fragment of approximately 150 basepairs was excised from the gel. This DNA fragment was ligated with plasmid DNA pSY1296 which had been digested with StuI and EcoRI. The products of this ligation reaction were transformed into E. coli strain JM109 and were selected for resistance to ampicillin. Colonies were screened for hybridization to 32P-labelled oligonucleotide (v). The plasmid DNA from two positively hybridizing clones was purified and sequenced. These plasmids were named pSY1297 and pSY1298. They contained the following sequence:

(HindIII) BanII
AGCTGGGCTCTGGAGTAGGTGTGCCAGGTGTAGGAGTTCCGGGTGTAGGCGTTCCGGGAG 60
TCGACCCGAGACCTCATCCACACGGTCCACATCCTCAAGGCCCACATCCGCAAGGCCCTC
XmaI
TTGGTGTACCTGGAGTGGGTGTTCCAGGCGTAGGTGTGCCCGGGGTAGGAGTACCAGGGG 120
AACCACATGGACCTCACCCACAAGGTCCGCATCCACACGGGCCCCATCCTCATGGTCCCC
BanII
TAGGCGTCCCTGGAGCGGGTGCTGGTAGCGGCGCAGGCGCGGGCTCCGGAGTAGGGGTGC 180
ATCCGCAGGGACCTCGCCCACGACCATCGCCGCGTCCGCGCCCGAGGCCTCATCCCCACG
EcoRI
CGAATTC
GCTTAAG
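The two BanII sites annotated on the sequence above define the monomer that is later excised and self-ligated. A short scan with the degenerate BanII recognition sequence (GRGCYC, R = A or G, Y = C or T) locates them; this is a minimal sketch that assumes the top strand is read as transcribed in the TOP_STRAND constant, and the helper names are illustrative.

```python
import re

# Top strand of the pSY1297/pSY1298 insert, 5'->3', as read from the listing
# (assumed reading of the printed sequence).
TOP_STRAND = (
    "AGCTGGGCTCTGGAGTAGGTGTGCCAGGTGTAGGAGTTCCGGGTGTAGGCGTTCCGGGAG"
    "TTGGTGTACCTGGAGTGGGTGTTCCAGGCGTAGGTGTGCCCGGGGTAGGAGTACCAGGGG"
    "TAGGCGTCCCTGGAGCGGGTGCTGGTAGCGGCGCAGGCGCGGGCTCCGGAGTAGGGGTGC"
    "CGAATTC"
)

# Regex classes for the IUPAC codes that occur in the recognition sequence.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def find_sites(seq, recognition):
    """Return 0-based start positions of every match to a degenerate site."""
    pattern = "".join(IUPAC[base] for base in recognition)
    # A lookahead is used so that overlapping matches are also reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]


if __name__ == "__main__":
    banii = find_sites(TOP_STRAND, "GRGCYC")
    print("BanII sites start at:", banii)
    # Both sites are cut at the same offset, so the distance between the
    # site starts equals the length of the excised fragment.
    print("BanII fragment length:", banii[-1] - banii[0], "bp")
```

With this reading the fragment between the two BanII sites is 156 bp, in line with the "approximately 150 base pairs" fragment described in the text.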
EBSI Multimer Gene Assembly:
The BanI acceptor plasmid pSY937 was modified in order to accept BanII terminal cohesive DNA fragments. Two oligonucleotides were synthesized for this purpose.
(BamHI) DraI SspI NruI BanII
vii. 5'GATCCTATGTTTAAATATTCTCGCAACGTTTTTGTATGGGCTCGATGTGT
TACCGTGCGCATGGATATCAGCTG3'
FspI EcoRV PvuII
(BamHI) PvuII EcoRV FspI BanII
viii. 5'GATCCAGCTGATATCCATGCGCACGGTAACACATCGAGCCCATACAAAAA
CGTTCGCGAGAATATTTAAACATAG3'
NruI SspI DraI

Oligonucleotides (vii) and (viii) were annealed and ligated with plasmid DNA pSY937 which was digested with BamHI. The products of this ligation were transformed into E. coli strain JM109 and colonies were selected for resistance to chloramphenicol. Transformant colonies were screened by hybridization to 32P-labelled oligonucleotide (vii). Plasmid DNA from two positively hybridizing clones, pSY1299 and pSY1300, contained the sequence shown for oligonucleotides (vii) and (viii), as determined by DNA sequencing.
Plasmid DNA pSY1298 was digested with BanII and the digestion fragments separated by agarose gel electrophoresis. The EBSI gene fragment, approximately 150 base pairs, was excised and purified by electroelution and ethanol precipitation. Approximately 1 µg of purified fragment was self-ligated in order to produce multimers ranging in size from 450 bp to 6,000 bp. The products of the self-ligation were then ligated with plasmid DNA pSY1299 which had been digested with BanII. The products of this ligation reaction were transformed into E. coli strain HB101. Transformants were selected for resistance to chloramphenicol. Plasmid DNA from individual transformants was purified and analyzed for increased size due to EBSI multimer DNA insertions. Ten clones (pSY1240-1249) with inserts ranging in size from 1.5 Kbp to 4.4 Kbp were obtained.
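Because the multimers are built by head-to-tail ligation of one monomer, the insert size seen on a gel translates directly into an approximate repeat number. The following is a rough sketch, assuming a nominal 150 bp monomer as stated above; estimated_repeats is an illustrative name, and gel sizing is itself only approximate.

```python
def estimated_repeats(insert_bp, monomer_bp=150):
    """Estimate how many monomer copies an insert of insert_bp contains."""
    if insert_bp <= 0 or monomer_bp <= 0:
        raise ValueError("sizes must be positive")
    return round(insert_bp / monomer_bp)


if __name__ == "__main__":
    # Size limits quoted for the self-ligation products and the cloned inserts.
    for size_bp in (450, 6000, 1500, 4400):
        print(f"{size_bp:5d} bp -> about {estimated_repeats(size_bp)} repeats")
```

On this estimate the self-ligation products span roughly 3 to 40 monomers and the ten cloned inserts roughly 10 to 29.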
Expression of EBSI Multimer Gene:
One of these clones, pSY1248, which contained a 4 Kb EBSI multimer gene was recloned in the λPR expression vector, pSY751. Plasmid DNA from pSY1248 was digested with NruI and PvuII, separated by agarose gel electrophoresis, and the DNA band corresponding to the EBSI multimer gene was excised and purified by NACS purification. DNA from plasmid pSY751 was digested with PvuII and ligated with the NruI-PvuII fragment from pSY1248. The products of this ligation were transformed into E. coli HB101, and the transformants selected for resistance to ampicillin. Two clones were isolated containing the new plasmid pSY1280. E. coli cells containing pSY1280 were grown at 30°C to an OD600 of 0.7 and then shifted to 42°C for 1.5 hours. The proteins produced by these cells were analyzed by SDS-PAGE. The separated proteins were transferred to nitrocellulose paper and detected by immunoreactivity with anti-ELP rabbit serum. A strongly reactive protein band was observed with an apparent molecular weight of 120 kD.
The ampicillin drug resistance gene of pSY1280 was substituted with the kanamycin marker and the resulting plasmid was called pSY1332. This plasmid was used in fermentation for the purification of EBSI. (See Methods)
pSY1332/pSY1280 EBSI Protein 1413 AA MW 113,159
MDPVVLQRRDWENPGVTQLNRLAAHPPFASERFCMGS
[(GVGVP)8 (GAGAGSGAGAGS)1]26
MCYRAHGYQLSAGRYHYQLVWCQK
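The size of the EBSI product can be cross-checked from its repeat structure. The sketch below assumes the block formula is read as 26 copies of (GVGVP)8 followed by GAGAGSGAGAGS between the two flanking sequences, and uses standard average residue masses; the variable names are illustrative and the mass is an estimate, not a measured value.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water is added back for the free chain termini

HEAD = "MDPVVLQRRDWENPGVTQLNRLAAHPPFASERFCMGS"
REPEAT = "GVGVP" * 8 + "GAGAGSGAGAGS"   # one 52-residue block
TAIL = "MCYRAHGYQLSAGRYHYQLVWCQK"
N_REPEATS = 26                          # assumed reading of the formula

protein = HEAD + REPEAT * N_REPEATS + TAIL
length = len(protein)
mass = sum(RESIDUE_MASS[aa] for aa in protein) + WATER
gene_bp = 3 * len(REPEAT) * N_REPEATS   # coding length of the repeats only

if __name__ == "__main__":
    print("residues:", length)
    print("estimated mass (Da):", round(mass))
    print("repeat-coding DNA (bp):", gene_bp)
```

Under these assumptions the chain comes to 1413 residues with an estimated mass close to the stated 113,159, and the repeats alone account for about 4 kb of coding DNA, which matches the size given for the multimer gene in pSY1248.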
Purification of EBSI Protein:
E. coli strain HB101 containing plasmid pSY1280 was fermented in 10 L volume. The cells were concentrated by filtration and further harvested by centrifugation. Pelleted cells were stored frozen at -70°C until processed. Frozen cells were thawed on ice and suspended in 4 ml of 50 mM Tris-HCl pH 7.0, 10 mM EDTA, 5 mM PMSF per gram wet weight of cells. The cells were broken by French pressing twice at 15,000 psi and then cooled to 0°C. The crude lysate was cleared by centrifugation at 26,000xg for 20 minutes. The supernatant proteins were precipitated by addition of solid ammonium sulfate to 20% of saturation (114 g/l). The precipitate was collected by centrifugation at 10,000xg for 10 minutes. The pellet was resuspended in 10 ml of H2O and dialyzed against 10 mM Tris pH 8.0, 0.15 M NaCl at 4°C. The dialyzed solution was digested with 0.1% trypsin (Sigma) for 1.5 hours at room temperature, and reprecipitated with 20% ammonium sulfate. The precipitated protein was resuspended in H2O and dialyzed against 10 mM Tris pH 7.0, 1 mM EDTA at 4°C. The protein purity of this sample was analyzed by amino acid composition and determined to be 83%.
Elastic Properties of EBSI Protein:
The soluble preparation of semi-purified EBSI protein described above was incubated at 37°C for 30 minutes and centrifuged at 10,000xg for 10 minutes at room temperature. This treatment caused the EBSI protein to aggregate, become insoluble, and pellet into a translucent solid. The solid was resistant to mechanical disruption either by vortexing or by maceration using a glass rod. The solid could be cut with a razor blade into strips which exhibited a high degree of elasticity. They fully retained their shape after repeated extensions and relaxations. They resisted compression with no apparent irreversible deformation of structure.
EBSI Purification
EBSI sample (~70% pure) was dialyzed in 50 mM Tris HCl, 50 mM NaCl, pH 8.0 at 4°C overnight with one change of buffer. If precipitation was observed, the sample was centrifuged at 27,000xg for 15 min at 4°C. All remaining steps were performed at 4°C. The supernatant was applied to a DEAE-Sephacel* column which had been equilibrated with 50 mM Tris HCl, 50 mM NaCl, pH 8.0. The flow through fractions which contained EBSI were collected and pooled. NaCl was added to the pooled fractions from the DEAE-Sephacel column to make a final concentration of 2 M NaCl in the sample. Insoluble material was removed by centrifugation at 27,000xg for 20 min. The supernatant was then loaded onto a Phenyl-Sepharose column which was equilibrated with 50 mM sodium phosphate buffer, pH 7.0, with 2 M NaCl. The column was washed extensively with buffer until no eluting protein was detected by A280. The column was then eluted stepwise with 50 mM sodium phosphate buffer, pH 7.0 and finally with water. The EBSI active fractions were pooled and stored at 4°C for further analysis.
With the addition of these steps to the previous procedures, 100% pure EBSI was obtained.
*Trade-mark
Example 5
ELPI (elastin-like protein) Construction and Expression
Two oligonucleotide strands were synthesized and purified as described in the Methods section.
(EcoRI) BanI SmaI
i) 5'-AATTCGGTGCCCGGTGTAGGAGTTCCGGGTGTAGGCGTTCCCGGGGTAG
GCGTTCCGGGAGTAGGGGTGCCA-3'
BanI
BanI SmaI
ii) 3'-GCCACGGGCCACATCCTCAAGGCCCACATCCGCAAGGGCCCCATCCGCA
AGGCCCTCATCCCCACGGTTCGA-5'
BanI (HindIII)
The two oligonucleotide strands were annealed and ligated with the DNA of plasmid pBS m13(+) (Stratagene) which had been digested with RENs HindIII and EcoRI.
The products of this ligation reaction were transformed into E. coli strain JM109. Transformant colonies were screened for their hybridization with 32P-labeled oligonucleotide (i). Plasmid DNA from positively hybridizing clones was purified and sequenced. One plasmid, pSY1287, contained the sequence shown for oligonucleotides (i) and (ii).
Plasmid DNA from pSY1287 was digested with BanI REN and the digestion fragments were separated by agarose gel electrophoresis. The ELPI gene fragment, approximately 60 bp, was excised and purified by NACS column. Approximately 1 µg of purified fragment was self-ligated in order to produce multimers ranging in size from 300 bp to 5000 bp.
The products of the self-ligation were then ligated with plasmid DNA pSY937 which had been digested with REN BanI. The product of this ligation reaction was transformed into E. coli strain HB101. Transformants were selected for resistance to chloramphenicol. Plasmid DNA from individual transformants was purified and analyzed for increased size due to ELPI multiple DNA insertions. Four clones (pSY1388-1391) with inserts ranging in size from 1.0 kbp to 2.5 kbp were obtained. These clones were recloned in the λPR expression vector pSY751. The clones obtained (pSY1392-1395) were used for expression of ELPI.
The ELPI protein had the following amino acid composition:
pSY1395 ELPI Protein 859 AA MW 72,555
MDPVVLQRRDWENPGVTQLNRLAAHPPFARNILAIRW
[(VPGVG)4]40 VPWTRVDLSAGRYHYQLVWCQK

SELP1 Gene Construction and Expression
Two oligonucleotide strands were synthesized and purified as described in the Methods section.
FspI PvuII SnaBI (PstI)
(i) 5'-GTGCGCAGCTGGTACGTAGCTGCA-3'
(PstI) PvuII
(ii) 3'-ACGTCACGCGTCGACCATGCATCG-5'
FspI SnaBI
These oligonucleotide strands were annealed and ligated with plasmid pSY1304 which had been digested with PstI REN (pSY1304 differs from pSY857 by having a monomeric unit in place of the trimeric unit of pSY857). Plasmid DNA from transformant colonies resistant to chloramphenicol was purified. One plasmid, pSY1365, which was digestible with REN SnaBI, was sequenced and proven to be correct.
The ELPI gene fragment, purified as described (ELPI construction and expression), was treated with Mung Bean Nuclease as described by the supplier (Stratagene). The mixture of DNA fragments was then ligated with plasmid DNA pSY1365 which had been digested sequentially with RENs FspI and SnaBI and treated with calf intestinal phosphatase. The products of this ligation reaction were transformed into E. coli strain HB101 and were selected for resistance to chloramphenicol. Plasmid DNA from individual transformants was purified and analyzed for the ELPI monomer DNA insertion. Two plasmids, pSY1366 A and B, were sequenced. They were both shown to contain the ELPI DNA sequence in the correct orientation.
Plasmid DNA pSY1366 was digested with REN BanI and the DNA fragment containing the SELP1 monomer was gel purified. To create multimers, 1 µg of the SELP1 DNA fragment was self-ligated. Multimers were obtained ranging in size from 500 bp to 10 kbp. The SELP1 multimers were cloned into the BanI site of pSY1262. Positive clones were characterized by gel electrophoresis for the size of the inserted multimer and used for expression and protein analysis.
pSY1396 SELP1 Protein 2025 AA MW 148,212
MDPVVLQRRDWENPGVTQLNRLAAHPPFASDPMGAGS (GAGAGS)6
[GAA(VPGVG)4 VAAGY (GAGAGS)9]23
GAA(VPGVG)4 VAAGY (GAGAGS)2 GAGAMDPGRYQLSAGRYHYQLVWCQK
SELP2 - Monomer Construction
Plasmid DNA pSY1298 was digested with BanII REN and the EBSI gene fragment was purified as described previously. The EBSI monomer fragment was ligated into pSY1304 (pSY937 containing a monomer of SlpIII, constructed as pSY857) which had been digested with BanII REN and treated with calf intestinal phosphatase.
The products of the ligation mixture were transformed in E. coli strain HB101. Transformants were selected for resistance to chloramphenicol. After restriction analysis of several isolates, one plasmid was chosen, pSY1301, containing a DNA fragment corresponding to the EBSI monomer gene.
SELP2 - Multiple Gene Assembly and Expression
Plasmid DNA pSY1301 was digested with REN BanI and the DNA fragment containing the SELP2 "monomer" was gel purified. To create multimers, 1 µg of the SELP2 DNA fragment was self-ligated. Multimers were obtained greater than 12 kb in size.
The SELP2 multimers were cloned into the BanI site of pSY1262. Positive clones were characterized by gel electrophoresis for the size of the inserted multimer. The clones with inserts ranging in size from 1.5 kb to 11 kb were selected. Plasmid DNA pSY1372 containing an insert of 6 kb (18 repeats) was used for further analysis and protein purification.
SELP2 - Protein Purification
E. coli strain HB101 containing plasmid pSY1372 was fermented according to the procedure described in Methods for fermentation. The cells were harvested by centrifugation. Pelleted cells were stored frozen at -70°C until processed. Frozen cells were thawed on ice and suspended in 4 ml of 50 mM Tris-HCl, pH 7.0, 10 mM EDTA, 5 mM PMSF per gram wet weight of cells. The cells were broken by passing through a Gaulin cell disrupter at 8,000 psi. The crude lysate was cleared by centrifugation at 26,000xg for 20 min. The supernatant, which contained >75% of the SELP2 protein, was precipitated by addition of 20% ammonium sulfate (114 g/L). The precipitate was collected by centrifugation at 10,000xg for 10 min. The pellet was resuspended in 10 ml of H2O and dialyzed against 10 mM Tris pH 8.0, 0.15 M NaCl at 4°C. The dialyzed material was centrifuged at 26,000xg for 15 min in order to collect the insoluble fraction of protein which contained approximately 10% of the SELP2 protein. This insoluble protein pellet was washed twice in 0.2% SDS at 50°C for 30 min with occasional shaking. The insoluble protein was collected each time by centrifugation at 26,000xg for 15 min, followed by a wash of 50% ethanol. The final protein pellet was resuspended in water and analyzed by Western blot analysis and amino acid composition. By Western blot the SELP2 protein appears to be homogeneous in size consistent with its large molecular weight (>150 kd). By amino acid composition the SELP2 preparation is approximately 80% pure and the observed molar ratio of amino acids (Ser:Gly:Ala:Pro:Val:Tyr) agrees very closely with the expected composition as predicted from the SELP2 sequence present in pSY1372.
pSY1372 SELP2 Protein 2055 AA MW 152,354
MDPVVLQRRDWENPGVTQLNRLAAHPPFASDPMGAGS (GAGAGS)2 (GVGVP)8
[(GAGAGS)6 GAAGY (GAGAGS)5 (GVGVP)8]17
(GAGAGS)6 GAAGY (GAGAGS)2 GAGAMDPGRYQLSAGRYHYQLVWCQK
SELP3 - Construction and Expression
Plasmid DNA pSY1301 was partially digested with REN HaeII and the digestion fragments separated by agarose gel electrophoresis. The larger DNA fragments were excised and purified by NACS column. The purified fragments were self-ligated, the ligation reaction was heated at 70°C for 15 min to inactivate the T4 DNA ligase and subsequently digested with REN PstI. The digestion mixture was then transformed into E. coli strain JM109. Transformants were selected for resistance to chloramphenicol. Plasmid DNA from individual transformants was purified and analyzed for: (1) resistance to REN PstI; and (2) deletion of the 60 bp HaeII fragment contained within the SELP2 gene fragment. One clone (pSY1377) satisfied both requirements. Plasmid DNA pSY1377 was digested with REN BanI and the DNA fragment containing the SELP3 monomer was gel purified. To create multimers, 1 µg of the SELP3 DNA fragment was self-ligated. Multimers were obtained ranging in size from 500 bp to 10 kbp. The SELP3 multimers were cloned into the BanI site of pSY1262. Positive clones were characterized by gel electrophoresis for the size of the inserted multimer and used for expression and protein analysis.

pSY1397 SELP3 Protein 2257 AA MW 168,535
MDPVVLQRRDWENPGVTQLNRLAAHPPFASDPMGAGS (GAGAGS)2
[(GVGVP)8 (GAGAGS)8]24
(GVGVP)8 (GAGAGS)5 GAGAMDPGRYQLSAGRYHYQLVWCQK
SLP4 - Construction and Expression
Plasmid DNA pSY1304 (pSY857 with a single monomeric unit as distinct from the trimeric unit of pSY857) was partially digested with REN HaeII and the digestion fragments separated by agarose gel electrophoresis. The larger DNA fragments were excised and purified by NACS column. The purified fragments were self-ligated, the ligation reaction was heated at 70°C for 15 min to inactivate the T4 DNA ligase and subsequently digested with REN PstI. The digestion mixture was then transformed into E. coli strain JM109. Transformants were selected for resistance to chloramphenicol. Plasmid DNA from individual transformants was purified and analyzed for: (1) resistance to REN PstI; and (2) deletion of the 60 bp HaeII fragment contained within the SELP2 gene fragment. One clone (pSY1378) satisfied both requirements. Plasmid DNA pSY1378 was digested with REN BanI and the DNA fragment containing the SLP4 monomer was gel purified. To create multimers, 1 µg of SLP4 DNA was self-ligated. Multimers were obtained ranging in size from 300 bp to 6 kbp. The SLP4 multimers were cloned into the BanI site of pSY1262. Positive clones were characterized by gel electrophoresis for the size of the inserted multimer and used for expression and protein analysis.
pSY1398 SLP4 Protein 1101 AA MW 76,231
MDPVVLQRRDWENPGVTQLNRLAAHPPFASDPMGAGS [(GAGAGS)6]27
(GAGAGS)11 GAGAMDPGRYQLSAGRYHYQLVWCQK
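The block notation used for the protein sequences in this section, with (UNIT)n for a repeated unit and [ ... ]n for a repeated group, can be expanded mechanically to confirm the residue counts quoted for each construct. The following is a minimal sketch, assuming the SELP2 and SELP3 formulas are read as written in the two constants below; the expand function is an illustrative helper.

```python
import re

_PAREN = re.compile(r"\(([A-Z]+)\)(\d+)")    # e.g. (GAGAGS)6
_BRACKET = re.compile(r"\[([A-Z]+)\](\d+)")  # e.g. [ ... ]17 once expanded inside


def _repeat(match):
    """Replace a matched 'unit + count' with the unit written out count times."""
    return match.group(1) * int(match.group(2))


def expand(formula):
    """Expand block notation into a plain one-letter amino acid string."""
    seq = formula.replace(" ", "")
    previous = None
    while previous != seq:                   # repeat until nothing changes
        previous = seq
        seq = _PAREN.sub(_repeat, seq)       # innermost units first
        seq = _BRACKET.sub(_repeat, seq)     # then fully expanded groups
    return seq


# Formulas as read from the listings (assumed readings).
SELP2 = ("MDPVVLQRRDWENPGVTQLNRLAAHPPFASDPMGAGS (GAGAGS)2 (GVGVP)8 "
         "[(GAGAGS)6 GAAGY (GAGAGS)5 (GVGVP)8]17 "
         "(GAGAGS)6 GAAGY (GAGAGS)2 GAGAMDPGRYQLSAGRYHYQLVWCQK")
SELP3 = ("MDPVVLQRRDWENPGVTQLNRLAAHPPFASDPMGAGS (GAGAGS)2 "
         "[(GVGVP)8 (GAGAGS)8]24 "
         "(GVGVP)8 (GAGAGS)5 GAGAMDPGRYQLSAGRYHYQLVWCQK")

if __name__ == "__main__":
    for name, formula in (("SELP2", SELP2), ("SELP3", SELP3)):
        print(name, len(expand(formula)), "residues")
```

With these readings the expanded chains have 2055 and 2257 residues, the totals stated for SELP2 and SELP3; the same helper can be applied to the other constructs.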

As is evident from the above results, highly repetitive sequences can be prepared, cloned, and used for expression to produce a wide variety of products which may mimic natural products, such as silk and other proteins and antigens. In addition, novel systems are provided for controlling the expression of the peptide under inducible conditions in a variety of hosts. In this manner, new proteinaceous products can be provided which provide for new properties or may closely mimic the properties of naturally occurring products.
Bibliography
1. Maniatis, T., Fritsch, E.F. and Sambrook, J. 1982. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
2. Laemmli, U.K. 1970. Nature (London), 227:680-685.
3. Applied Biosystems User Bulletin. 1984. No. 13.
4. Matteucci, M.D. and Caruthers, M.H. 1981. Journal Amer. Chem. Soc., 103:3185-3191.
5. McBride, L.J. and Caruthers, M.H. 1983. Tetrahedron Letters, 24:245-248.
6. Smith, 1980. Methods in Enzymology, 65:371-379.
7. Vieira, J. and Messing, J. 1982. Gene, 19:259-268.
8. Anagnostopoulos, C. and Spizizen, J. 1961. J. Bacteriol., 81:741-746.
9. Davanloo, P., Rosenberg, A.H., Dunn, J.J. and Studier, F.W. 1984. Proc. Natl. Acad. Sci. USA, 81:2035-2039.
10. Rosenbluh, A., Banner, C.D.B., Losick, R. and Fitz-James, P.C. 1981. J. Bacteriol., 148:341-351.
11. Sadaie, Y., Burtis, K.C. and Doi, R. 1980. J. Bacteriol., 141:1178-1182.
12. Queen, C. 1983. J. Applied Molecular Genetics, 2:1-10.
13. Ferrari, F.A., Trach, K. and Hoch, J.A. 1985. J. Bacteriol., 161:556-562.
14. Johnson, W.C., Moran, C.P. and Losick, T.R. 1983. Nature (London), 302:800-804.
15. Studier, W.F. and Moffat, B.A. 1986. J. Mol. Biol., 189:113-130.
16. Goldfarb, D.S., Doi, R.H. and Rodriguez, R.L. 1981. Nature (London), 293:309-311.
17. Ferrari, F.A., Nguyen, A., Lang, D. and Hoch, J.A. 1983. J. Bacteriol., 154:1513-1515.
18. Lacey, R.W. and Chopra, I. 1974. J. Med. Microbiology, 7:285-297.
19. Norrander, J., Kempe, T. and Messing, J. 1983. Gene, 26:101-106.
20. Sanger, F., Nicklen, S. and Coulson, A.R. 1977. Proc. Natl. Acad. Sci. USA, 74:5463-5467.
21. Biggin, M.D., Gibson, T.J. and Hong, G.F. 1983. Proc. Natl. Acad. Sci. USA, 80:3963-3965.
22. Zagursky, R.J., Baumeister, K., Lomax, N. and Berman, M.L. 1985. Gene Anal. Techn., 2:89-94.
23. Sanger, F. and Coulson, A.R. 1978. FEBS Letters, 87:107-110.
24. Sadler, J.R., Techlenburg, M. and Betz, J.L. 1980. Plasmids containing many tandem copies of a synthetic lactose operator. Gene, 8:279-300.
The invention now being fully described, it will be apparent to one of ordinary skill in the art 25 that many ch;3nges and modifications can be made thereto without departing from the spirit or scope of the appended claims.

Claims

1. A recombinant DNA sequence encoding a protein of at least about 10 kDal containing at least one oligopeptide repeating unit, which repeating unit is characterized by containing at least three different amino acids and a total of from 4 to 30 amino acids, there being at least two repeating units in said protein and at least two identical amino acids in each repeating unit and wherein said units are optionally joined by an amino acid bridge of from about 1 to 15 amino acids, said DNA sequence having the following formula:
K_k (W M X_x N Y_y)_i L_l
wherein:
the K and L units are each DNA sequences encoding an amino acid sequence of from about 1 to 100 amino acids, the K and L units together being fewer than about 20% of the total number of amino acids;
k and l are 0 or 1;
the W unit is of the formula:
[(A)_n (B)_p]_q
the A unit is a DNA sequence coding for said oligopeptide repeating unit where A will contain from about 12 to 90nt, at least two codons coding for said identical amino acid in said repeating units being different, where there will be at least two different A units differing by at least one nucleotide;
the B unit is a DNA sequence different from the A unit coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45nt, where the B units may be the same or different;
n is an integer in the range of 1 to 100;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90nt;
the M and N units are the same or different and are a DNA sequence of codons and are of 0 to 18nt in reading frame with the W and X units respectively;
the X unit is the same as or different from the W unit and is of the formula:
[(A1)_n1 (B1)_p1]_q1
the Y unit is the same as or different from the W unit and is of the formula:
[(A2)_n2 (B2)_p2]_q2
wherein:
all of the symbols come within the definitions of their letter counterparts;
x and y are 0 or 1;
i is 1 to 100; and
wherein q1 is at least 1, when x=1; and wherein q2 is at least 1, when y=1; the total of q, q1 and q2 is not greater than about 50.

2. A DNA sequence according to Claim 1, wherein said DNA sequence is not greater than about 10knt.

3. A DNA sequence according to Claim 2, wherein the M and N units are 0 nt. and at least one of x and y is 0.

4. A recombinant DNA sequence encoding a protein of at least about 10 kDal containing at least one oligopeptide repeating unit, which repeating unit is characterized by containing at least three different amino acids and a total of from 4 to 30 amino acids, there being at least two repeating units in said protein and at least two identical amino acids in each repeating unit and wherein said units are optionally joined by an amino acid bridge of from about 1 to 15 amino acids, said DNA sequence having the following formula:
K_k [(A)_n (B)_p]_q L_l
wherein:
the K and L units are each DNA sequences encoding an amino acid sequence of from about 1 to 100 amino acids, the K and L units together being fewer than about 20% of the total number of amino acids;
k and l are 0 or 1;
the A unit is a DNA sequence coding for said oligopeptide repeating unit where A will contain from about 12 to 90nt, at least two codons coding for said identical amino acid in said repeating units being different, where there will be at least two different A units differing by at least one nucleotide;
the B unit is a DNA sequence different from the A unit coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45nt, where the B units may be the same or different;
n is an integer in the range of 1 to 100;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90nt.
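
By way of illustration only, and not as part of the claims, the numeric limits recited for the [(A)n(B)p]q core of the formula can be sketched as a small consistency check. The sketch below is a simplification under stated assumptions: the recited "about" ranges are treated as hard limits, every unit is assumed to be a whole number of codons, the K and L units are omitted, and the names `Block`, `check_and_assemble` and `A_VARIANT` are hypothetical. The first A unit is borrowed from claim 9 and the tyrosine codon B unit from claim 8.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """One (A)n(B)p block: n A units, optionally followed by one B unit."""
    a_units: List[str]   # the n A units in order (codon usage may differ)
    b_unit: str = ""     # empty string stands for p = 0


def check_and_assemble(blocks: List[Block]) -> str:
    """Join q blocks into one DNA string, enforcing the recited ranges.

    Assumption: "about" ranges are applied as strict bounds.
    """
    if len(blocks) < 1:
        raise ValueError("q must be at least 1")
    parts = []
    for block in blocks:
        if not 1 <= len(block.a_units) <= 100:
            raise ValueError("n must be in the range 1 to 100")
        for a in block.a_units:
            if not 12 <= len(a) <= 90 or len(a) % 3 != 0:
                raise ValueError("A unit must be 12 to 90 nt of whole codons")
        b = block.b_unit
        if b and (not 3 <= len(b) <= 45 or len(b) % 3 != 0):
            raise ValueError("B unit must be 3 to 45 nt of whole codons")
        parts.extend(block.a_units)
        parts.append(b)
    distinct_a = {a for block in blocks for a in block.a_units}
    if len(distinct_a) < 2:
        raise ValueError("at least two different A units are required")
    sequence = "".join(parts)
    if len(sequence) < 90:
        raise ValueError("assembled sequence must be at least 90 nt")
    return sequence


A_CLAIM_9 = "GGTGCGGGCGCAGGAAGT"  # A unit recited in claim 9 (GAGAGS)
A_VARIANT = "GGCGCAGGTGCTGGAAGC"  # hypothetical A unit: same peptide, other codons
block = Block(a_units=[A_CLAIM_9, A_VARIANT, A_CLAIM_9], b_unit="TAT")
sequence = check_and_assemble([block, block])  # q = 2, n = 3, p = 1
print(len(sequence))  # 114
```

A single block holding only two 18 nt A units would be rejected by this check, since 36 nt falls short of the 90 nt minimum.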

5. A DNA sequence according to Claim 4, wherein said A unit codes for the same amino acid at least three times with at least two different codons.

6. A DNA sequence according to Claim 4, wherein said identical amino acid is glycine.

7. A recombinant DNA sequence encoding a protein of at least about 10 kD containing at least one oligopeptide repeating unit, which repeating unit is characterized by comprising the amino acid sequence GAGAGS or GVGVP, said DNA sequence having the following formula:
[(A)_n (B)_p]_q
wherein:
the A unit is a DNA sequence coding for said oligopeptide repeating unit wherein the codons for at least two glycine residues in the same or different A units are different, where there will be at least two different A units differing by at least one nucleotide;
the B unit is a DNA sequence different from the A unit coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45nt, where the B units may be the same or different;
n is an integer in the range of 5 to 25;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90nt.

8. A DNA sequence according to Claim 7, wherein said oligopeptide repeating unit is GAGAGS and the B unit comprises a codon for tyrosine.

9. A DNA sequence according to Claim 7, wherein an A unit is of the formula:
GGTGCGGGCGCAGGAAGT.

10. A DNA sequence according to Claim 8, wherein an A unit comprises the sequence GGT GCC GGC AGC GGT GCA GGA GCC GGT TCT GGA GCT GGC GCG
GGC TCT GGC GCG GGC GCA GGA.

11. A DNA sequence according to Claim 7, wherein said oligopeptide repeating unit is GVGVP.

12. A DNA sequence according to Claim 11, wherein one A unit is of the formula:
GGTGTAGGCGTTCCG.
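
For illustration only, the A units recited in claims 9 and 12 can be translated with the standard genetic code to confirm that they encode GAGAGS and GVGVP respectively, and that the glycine codons within each unit differ. The codon table below is a partial subset of the standard code covering only the residues involved; the function names are hypothetical.

```python
# Partial standard genetic code: only Gly, Ala, Val, Pro and Ser codons.
CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "GTT": "V", "GTC": "V", "GTA": "V", "GTG": "V",
    "CCT": "P", "CCC": "P", "CCA": "P", "CCG": "P",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S",
    "AGT": "S", "AGC": "S",
}


def codons(dna: str) -> list:
    """Split a DNA string (spaces allowed) into consecutive codons."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna: str) -> str:
    """Translate DNA into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def codons_for(dna: str, residue: str) -> set:
    """Return the distinct codons used for one residue in the sequence."""
    return {c for c in codons(dna) if CODON_TABLE[c] == residue}


CLAIM_9_UNIT = "GGTGCGGGCGCAGGAAGT"
CLAIM_12_UNIT = "GGTGTAGGCGTTCCG"

print(translate(CLAIM_9_UNIT))                 # GAGAGS
print(translate(CLAIM_12_UNIT))                # GVGVP
print(sorted(codons_for(CLAIM_9_UNIT, "G")))   # ['GGA', 'GGC', 'GGT']
print(sorted(codons_for(CLAIM_12_UNIT, "G")))  # ['GGC', 'GGT']
```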

13. A DNA sequence according to Claim 1, wherein the units WMX comprise[s] the sequence:
GGA GTA GGT GTG CCA GGT GTA GGA GTT CGC GGT GTA GGC GTT
CCG GGA GTT GGT GTA CCT GGA GTG GGT GTT CCA GGC GTA GGT
GTG CCC GGG GTA GGA GTA CCA GGG GTA GGC GTC CCT GGA GCG
GGT GCT GGT AGC GGC GCA GGC GCG GGC TCC.

14. A polypeptide comprising the recombinant DNA expression product of a DNA sequence encoding a protein of at least about 10 kDal having repeating units of at least one oligopeptide, which oligopeptide repeating unit is characterized by having at least three different amino acids and a total of from 4 to 30 amino acids, there being at least two repeating units in said protein and at least two identical amino acids in each repeating unit and wherein said units are optionally joined by an amino acid bridge of from about 1 to 15 amino acids, said DNA sequence having the following formula:
K_k (W M X_x N Y_y)_i L_l
wherein:
the K and L units are each DNA sequences encoding an amino acid sequence of from about 1 to 100 amino acids, the K and L units together being fewer than about 20% of the total number of amino acids;
k and l are 0 or 1;
the W unit is of the formula:
[(A)_n (B)_p]_q
the A unit is a DNA sequence coding for said oligopeptide repeating unit where A will contain from about 12 to 90nt, at least two codons coding for said identical amino acid in said repeating units being different, where there will be at least two different A units differing by at least one nucleotide;
the B unit is a DNA sequence different from the A unit coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45nt, where the B units may be the same or different;
n is an integer in the range of 1 to 100;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90nt;
the M and N units are the same or different and are a DNA sequence of codons and are of 0 to 18nt in reading frame with the W and X units respectively;
the X unit is the same as or different from the W unit and is of the formula:
[(A1)_n1 (B1)_p1]_q1
the Y unit is the same as or different from the W unit and is of the formula:
[(A2)_n2 (B2)_p2]_q2
wherein:
all of the symbols come within the definitions of their letter counterparts;
x and y are 0 or 1;
i is 1 to 100; and
wherein q1 is at least 1, when x=1; and wherein q2 is at least 1, when y=1; the total of q, q1 and q2 is not greater than about 50.

15. A polypeptide according to Claim 14, wherein said polypeptide will include as a repeating unit at least one of GAGAGS and GVGVP.

16. A polypeptide comprising the recombinant DNA expression product of a DNA sequence encoding a protein of at least about 10 kDal having repeating units of at least one oligopeptide, which oligopeptide repeating unit is characterized by having at least three different amino acids and a total of from 4 to 30 amino acids, there being at least two repeating units in said protein and at least two identical amino acids in each repeating unit and wherein said units are optionally joined by an amino acid bridge of from about 1 to 15 amino acids, said DNA sequence having the following formula:
K_k [(A)_n (B)_p]_q L_l
wherein:
the K and L units are each DNA sequences encoding an amino acid sequence of from about 1 to 100 amino acids, the K and L units together being fewer than about 20% of the total number of amino acids;
k and l are 0 or 1;
the A unit is a DNA sequence coding for said oligopeptide repeating unit where A will contain from about 12 to 90nt, at least two codons coding for said identical amino acids in said repeating unit being different, where there will be at least two different A units differing by at least one nucleotide;
the B unit is a DNA sequence different from the A unit coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45nt, where the B units may be the same or different;
n is an integer in the range of 1 to 100;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90nt.

17. A polypeptide according to Claim 16, wherein the A unit codes for GAGAGS or GVGVP.

18. A method for producing a polypeptide comprising the step of:
providing a prokaryotic host comprising:
(1) a DNA sequence which comprises a structural gene encoding an RNA polymerase exogenous to said host which is capable of transcribing DNA to messenger RNA and which is under the transcriptional control of an inducible promoter; and
(2) a DNA sequence according to any one of claims 1, 4 or 7 under the transcriptional control of a promoter which is not functional with the endogenous RNA polymerase of said host, but is functional with said exogenous RNA polymerase.

19. A method according to claim 18, wherein said inducible promoter is the beta-galactosidase promoter or the spoVG promoter.

20. A prokaryotic host comprising a DNA sequence according to claim 1.

21. A method for producing a polypeptide of at least about 10 kDal comprising the recombinant DNA expression product of a DNA sequence encoding a protein having repeating units of at least one oligopeptide, which oligopeptide repeating unit is characterized by containing at least three different amino acids and a total of from 4 to 30 amino acids, there being at least two repeating units in said protein and at least two identical amino acids in each repeating unit and wherein said units are optionally joined by an amino acid bridge of from about 1 to 15 amino acids, said DNA sequence having the following formula:
K_k (W M X_x N Y_y)_i L_l
wherein:
the K and L units are each DNA sequences encoding an amino acid sequence of from about 1 to 100 amino acids, the K and L units being fewer than about 20% of the total number of amino acids;
k and l are 0 or 1;
the W unit is of the formula:
[(A)_n (B)_p]_q
the A unit is a DNA sequence coding for said oligopeptide repeating unit where A will contain from about 12 to 90nt, at least two codons coding for said identical amino acid in said repeating units being different, where there will be at least two different A units differing by at least one nucleotide;
the B unit is a DNA sequence different from the A unit coding for other than the oligopeptide unit coded by the A unit and having from about 3 to 45nt, where the B units may be the same or different;
n is an integer in the range of 1 to 100;
each p is independently 0 or 1; and
q is at least 1 and is selected so as to provide a DNA sequence of at least 90nt;
the M and N units are the same or different and are a DNA sequence of codons and are of 0 to 18 nt in reading frame with the W and X units respectively;
the X unit is the same as or different from the W unit and is of the formula:
[(A1)_n1 (B1)_p1]_q1
the Y unit is the same as or different from the W unit and is of the formula:
[(A2)_n2 (B2)_p2]_q2
wherein:
all of the symbols come within the definitions of their letter counterparts;
x and y are 0 or 1;
i is 1 to 100; and
wherein q1 is at least 1, when x=1; and wherein q2 is at least 1, when y=1; the total of q, q1 and q2 is not greater than about 50;
said method comprising:
growing a prokaryotic host cell according to Claim 20, wherein said DNA sequence is under the transcriptional and translational regulation of initiation and termination regulatory regions functional in said host, whereby said polypeptide is expressed; and
isolating the expression product.

22. A method according to Claim 21, wherein said polypeptide comprises at least one of the repeating units GAGAGS and GVGVP.

23. A method of preparing a synthetic DNA sequence having at least about 80% of the synthetic DNA sequence encoding repeating units of from 4 to 8 amino acids, said repeating units being varied as to nucleotide sequence utilizing codon redundancy and encoding a protein of at least 10 kDal, said method comprising:
(1) synthesizing a DNA monomer encoding not greater than 200 amino acids,
(2) cloning said monomer in a cloning vector,
(3) sequencing said monomer either in portions or in its entirety,
(4) excising said monomer from said cloning vector, and
(5) oligomerizing said monomer to provide at least one multimer comprising at least two monomers;
wherein two or more different multimers encoding different amino acid units may be joined together to form a block copolymer and wherein the sequences of said monomer and vector are selected to permit insertion of said segments and excision of said monomer by restriction enzyme digestion.

24. A method of preparing a synthetic DNA sequence having at least about 80% of the synthetic DNA sequence encoding repeating units of from 4 to 8 amino acids, said repeating units being varied as to nucleotide sequence utilizing codon redundancy and encoding a protein of at least 10 kDal, said method comprising:
in a first step synthesizing a monomer by:
(A) (1) synthesizing at least two different double stranded sections of DNA of from about 12 to 120 bases, (2) cloning said sections of DNA in a cloning vector to form a prior segment, (3) sequencing said prior segment to ensure the fidelity of replication of said segment, (4) sequentially adding one or more additional DNA segments comprising said sections of DNA 3' or 5' of the prior segment in reading frame with the prior segment by repeatedly cloning said sections of DNA into a cloning vector with the prior segment to form a monomer, and sequencing the prior segment and additional segments to ensure fidelity of replication, and (5) excising the monomer from the cloning vector;
or (B) (1) synthesizing at least two different pairs of single strands of DNA of from about 12 to 120 bases, wherein each of the strands of a pair overlap, (2) hybridizing each of said pairs of single strands to provide successive segments, (3) cloning said segments in a cloning vector to form a monomer, (4) sequencing said monomer to ensure the fidelity of replication of each of said segments, and (5) excising the monomer from the cloning vector;
or (C) (1) synthesizing a DNA monomer encoding not greater than 200 amino acids, (2) cloning said monomer in a cloning vector, and (3) excising the monomer from the cloning vector;
and oligomerizing said monomer to provide at least one multimer comprising at least two monomers; wherein two or more different multimers encoding different amino acid units may be joined together to form a block copolymer, and wherein the sequences of said segments, monomer and vector are selected to permit insertion of said segments and excision of said monomer by restriction enzyme digestion.

25. A method of preparing a synthetic DNA sequence comprising repeating units of from 4 to 30 codons and encoding a protein of at least about 5 kDal, said method comprising oligomerizing a monomer DNA sequence to provide said synthetic DNA sequence, wherein said monomer DNA sequence comprises at least two different nucleotide sequences which encode the same repeating amino acid sequence.
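
For illustration only, one way to obtain a monomer in which different nucleotide sequences encode the same repeating amino acid sequence is to rotate through synonymous codons while writing out successive copies of the repeating unit, and then to repeat the monomer head to tail. The sketch below assumes a simple round-robin choice of codons, which is not taken from the specification; the names `varied_units` and `oligomerize` are hypothetical, and no account is taken of host codon preference or restriction sites.

```python
from itertools import cycle

# Synonymous codons (standard genetic code) for the residues of interest.
SYNONYMS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "S": ["TCT", "TCC", "AGT", "AGC"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
}


def varied_units(peptide_unit: str, copies: int) -> list:
    """Encode `copies` of one peptide unit, rotating synonymous codons."""
    pickers = {res: cycle(options) for res, options in SYNONYMS.items()}
    units = []
    for _ in range(copies):
        # Each residue takes the next codon in its own rotation.
        units.append("".join(next(pickers[res]) for res in peptide_unit))
    return units


def oligomerize(monomer: str, count: int) -> str:
    """Model head-to-tail oligomerization as simple repetition."""
    return monomer * count


units = varied_units("GAGAGS", 3)
monomer = "".join(units)
multimer = oligomerize(monomer, 4)

print(units)          # three 18 nt units, all encoding GAGAGS
print(len(multimer))  # 216
```

Because each copy draws the next codon in rotation, the three units differ from one another in nucleotide sequence while encoding the same six residues.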

26. The method according to Claim 25, wherein prior to said step of oligomerizing said monomer DNA sequence, the method comprises:
(a) preparing said monomer DNA sequence having an open reading frame in a cloning vector; and
(b) excising said monomer DNA sequence from said cloning vector by restriction enzyme digestion.

27. The method according to Claim 26, wherein said step of preparing said monomer DNA sequence comprises combining a plurality of DNA segments comprising said repeating units in a cloning vector to provide a monomer having an open reading frame.

28. A method according to Claim 25, wherein at least a portion of said monomer DNA sequence is sequenced prior to oligomerizing said monomer DNA sequence.

29. The method according to Claim 27, wherein said step of combining comprises:
(a) synthesizing different pairs of single stranded oligomers, wherein each of the oligomers of a pair overlap except as to any protruding ends;
(b) hybridizing each pair of single stranded oligomers to provide double stranded segments; and
(c) combining said segments or cloned copies thereof in a cloning vector to form said monomer DNA sequence, where the combined segments are in reading frame.

30. The method according to Claim 29, wherein said double stranded segments are sequenced prior to combining said segments in a cloning vector.

31. The method according to Claim 27, wherein said step of combining comprises either:
(a) adding each successive segment in reading frame to prior segments to provide a monomer DNA sequence and determining the fidelity of the sequence of each successive segment; or
(b) cloning each successive segment in a cloning vector and analyzing each successive segment to determine the fidelity of the sequence and combining said segments or copies thereof in a cloning vector to form a monomer DNA sequence, where the combined segments are in reading frame.

32. A method according to Claim 31, wherein each DNA segment has a 3' terminus which specifically hybridizes to the 5' terminus of the next successive DNA segment and not to its own 5' terminus or has a 5' terminus which specifically hybridizes to the 3' terminus of the next successive DNA segment and not to its own 3' terminus.

33. A method according to Claim 25, wherein said monomer DNA sequence has protruding termini which are complementary to each other.
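
For illustration only, the requirement that protruding termini be complementary to each other, and the stricter condition that a terminus hybridize to the next segment but not to itself, can be expressed as a reverse-complement test on the single-stranded overhangs. The sketch assumes each overhang is written 5' to 3' on its own strand; the overhang sequences used are hypothetical examples and are not taken from the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def ends_anneal(overhang_a: str, overhang_b: str) -> bool:
    """True if two overhangs (each written 5'->3') can base-pair fully."""
    return overhang_a.upper() == reverse_complement(overhang_b)


def is_self_complementary(overhang: str) -> bool:
    """True if an overhang can pair with a second copy of itself."""
    return ends_anneal(overhang, overhang)


def joins_head_to_tail_only(left: str, right: str) -> bool:
    """True if the two ends pair with each other but neither pairs with itself."""
    return (
        ends_anneal(left, right)
        and not is_self_complementary(left)
        and not is_self_complementary(right)
    )


# Hypothetical non-palindromic overhangs: ends pair only head to tail.
print(joins_head_to_tail_only("GTGC", "GCAC"))  # True
# Hypothetical palindromic overhang: each end can also pair with itself.
print(joins_head_to_tail_only("AATT", "AATT"))  # False
```

With a palindromic overhang the termini are still complementary to each other, but monomers may also join head to head or tail to tail, so the orientation of units in the multimer is no longer fixed.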

34. The method according to Claim 25, wherein said repeating units comprise a DNA sequence encoding the amino acid sequence GVGVP, VPGVG, SGAGAG or GAGAGS.

35. The method according to Claim 25, wherein said protein is at least about 30 kDal.
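
For illustration only, the size thresholds recited in the claims (about 5, 10 and 30 kDal) can be related to a number of repeating units by simple arithmetic. The sketch below uses rounded average residue masses and assumes a protein made solely of the repeating unit, ignoring any K, L or B sequences; the function names are hypothetical.

```python
import math

# Rounded average residue masses in daltons (amino acid minus water).
RESIDUE_MASS = {"G": 57.05, "A": 71.08, "S": 87.08, "V": 99.13, "P": 97.12}
WATER = 18.02  # one water is added back for the free chain termini


def unit_mass(unit: str) -> float:
    """Mass contributed by one copy of the repeating unit."""
    return sum(RESIDUE_MASS[residue] for residue in unit)


def polymer_mass(unit: str, repeats: int) -> float:
    """Approximate mass of a chain of `repeats` copies of `unit`."""
    return repeats * unit_mass(unit) + WATER


def min_repeats(unit: str, target_da: float) -> int:
    """Smallest number of repeats whose chain mass reaches the target."""
    return math.ceil((target_da - WATER) / unit_mass(unit))


for unit in ("GAGAGS", "GVGVP"):
    for target in (5_000, 10_000, 30_000):
        n = min_repeats(unit, target)
        nt = n * len(unit) * 3  # coding length in nucleotides
        print(unit, target, n, nt)
```

On these assumptions a GAGAGS unit contributes roughly 400 daltons, so about 25 repeats (some 450 nt of coding sequence) are needed to reach 10 kDal.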

36. A method of preparing a protein of at least about 5 kDal and comprising repeating units of from 4 to 30 codons, said method comprising:
(a) inserting a synthetic DNA sequence prepared according to the method of Claim 25 into an expression vector functional for expression in an expression host;
(b) introducing said expression vector comprising said synthetic DNA sequence into said expression host; and
(c) growing said expression host, whereby said protein encoded by said synthetic DNA sequence is expressed.