
CA2501585A1 - Robo: a family of polypeptides and nucleic acids involved in nerve guidance - Google Patents

Robo: a family of polypeptides and nucleic acids involved in nerve guidance

Info

Publication number
CA2501585A1
CA2501585A1 CA002501585A CA2501585A CA2501585A1 CA 2501585 A1 CA2501585 A1 CA 2501585A1 CA 002501585 A CA002501585 A CA 002501585A CA 2501585 A CA2501585 A CA 2501585A CA 2501585 A1 CA2501585 A1 CA 2501585A1
Authority
CA
Canada
Prior art keywords
seq
robo
residues
ser
pro
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002501585A
Other languages
French (fr)
Inventor
Corey S. Goodman
Thomas Kidd
Kevin J. Mitchell
Guy Tear
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of California San Diego UCSD
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority claimed from CA002304926A (CA2304926C)
Publication of CA2501585A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705 Receptors; Cell surface antigens; Cell surface determinants
    • C07K14/70503 Immunoglobulin superfamily

Landscapes

  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Immunology (AREA)
  • Organic Chemistry (AREA)
  • Biochemistry (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Zoology (AREA)
  • Toxicology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Cell Biology (AREA)
  • Peptides Or Proteins (AREA)

Abstract

Robo1 and Robo2 polypeptides may be produced recombinantly from host cells transformed with the disclosed Robo-encoding nucleic acids, or purified from human cells.
The invention provides isolated Robo hybridization probes and primers capable of specifically hybridizing with the disclosed Robo genes, Robo-specific binding agents such as specific antibodies, and methods of making and using the subject compositions in diagnosis, therapy and in the biopharmaceutical industry.

Description

ROBO: A FAMILY OF POLYPEPTIDES AND NUCLEIC ACIDS INVOLVED IN NERVE GUIDANCE
Inventors: Corey S. Goodman, Thomas Kidd, Kevin J. Mitchell and Guy Tear
The research carried out in the subject application was supported in part by NIH grant NS 18366. The US government may have rights in any patent issuing on this application.
INTRODUCTION
Field of the Invention
The field of this invention is proteins involved in nerve cell guidance.
Background
Bilaterally symmetric nervous systems, such as those found in insects and vertebrates, have special midline structures that establish a partition between the two mirror image halves. Axons that link the two sides of the nervous system project toward and across the midline, forming axon commissures. These commissural axons project toward the midline, at least in part, by responding to long-range chemoattractants emanating from the midline. One important class of midline chemoattractants is the netrins (Serafini et al., 1994; Kennedy et al., 1994), guidance signals whose structure, function, and midline expression are evolutionarily conserved from nematodes and fruit flies to vertebrates (Hedgecock et al., 1990; Wadsworth et al., 1996; Mitchell et al., 1996; Harris et al., 1996). The attractive actions of netrins appear to be mediated by growth cone receptors of the DCC subfamily of the immunoglobulin (Ig) superfamily (Keino-Masu et al., 1996; Chan et al., 1996; Kolodziej et al., 1996).
The midline also provides important short-range guidance signals. This is best illustrated by considering the different classes of axon projections in the spinal cord of vertebrates or the nerve cord of insects. Although some growth cones extend away from the midline, most extend towards or along the midline during some segment of their trajectory. Certain classes of growth cones either extend towards the midline or longitudinally along it, and yet never cross it. Most growth cones (~90% in the Drosophila CNS), however, do cross the midline. After crossing, the majority of these growth cones turn to project longitudinally, growing along or near the midline. Interestingly, these axons never cross the midline again, despite navigating in the vicinity of other axons that continue to cross.
What midline signals and growth cone receptors control whether growth cones do or do not cross the midline? After crossing once, what mechanism prevents these growth cones from crossing again? Studies in the chick (Stoeckli and Landmesser, 1995; Stoeckli et al., 1997) and grasshopper (Myers and Bastiani, 1993) embryos have led to the suggestion that the midline contains a contact-mediated repellent, and that commissural growth cones must overcome this repellent to cross the midline. For example, the notion that the midline can be repulsive even to growth cones that cross it is supported by time-lapse imaging of the first commissural growth cone in the grasshopper embryo. On contacting the midline, this growth cone often abruptly retracts, although ultimately it overcomes the repulsion and crosses the midline.
One approach to find the genes encoding the components of such a midline guidance system is to screen for mutations in which either too many or too few axons cross the midline.
Such a large-scale mutant screen was previously conducted in Drosophila and led to the identification of two key mutations: commissureless (comm) and roundabout (robo) (Seeger et al., 1993; reviewed by Tear et al., 1993). In comm mutant embryos, commissural growth cones initially orient toward the midline but then fail to cross it and instead recoil and extend on their own side. comm encodes a novel surface protein expressed on midline cells. As commissural growth cones contact and traverse the CNS midline, Comm protein is apparently transferred from midline cells to commissural axons (Tear et al., 1996). In robo mutant embryos, many growth cones that normally extend only on their own side instead project across the midline, and axons that normally cross the midline only once instead appear to cross and recross multiple times (Seeger et al., 1993; Kidd et al., 1997).
Double mutants of comm and robo display a robo-like phenotype.
Here we disclose the characterization of robo across animal species. robo encodes a new class of guidance receptor with 5 Ig domains, 3 fibronectin (FN) type III domains, a transmembrane domain, and a long cytoplasmic domain. Robo defines a new subfamily of Ig superfamily proteins that is highly conserved from fruit flies to mammals. The results of protein expression and transgenic rescue experiments indicate that Robo functions as the gatekeeper controlling midline crossing and that Robo responds to an unknown midline repellent.
SUMMARY OF THE INVENTION
The invention provides methods and compositions relating to Robo1 and Robo2, collectively Robo polypeptides, related nucleic acids, polypeptide domains thereof having Robo-specific structure and activity, and modulators of Robo function. Robo polypeptides can regulate cell, especially nerve cell, function and morphology. The polypeptides may be produced recombinantly from transformed host cells from the subject Robo polypeptide encoding nucleic acids or purified from mammalian cells.
The invention provides isolated Robo hybridization probes and primers capable of specifically hybridizing with natural Robo genes, Robo-specific binding agents such as specific antibodies, and methods of making and using the subject compositions in diagnosis (e.g. genetic hybridization screens for Robo transcripts), therapy (e.g. Robo inhibitors to promote nerve cell growth) and in the biopharmaceutical industry (e.g. as immunogens, reagents for isolating Robo genes and polypeptides, reagents for screening chemical libraries for lead pharmacological agents, etc.).
According to a first aspect of the invention, there is provided an isolated Robo polypeptide comprising a polymer of amino acids of at least 25 consecutive residues of any of SEQ ID NO: 2, 4, 8, 10 or 12, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
According to a second aspect of the invention, there is provided an isolated Robo polypeptide comprising a polymer of amino acids of at least 50 consecutive residues of any of SEQ ID NO: 2, 4, 8, 10 or 12, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.

According to a third aspect of the invention, there is provided an isolated Robo polypeptide having at least 95% sequence identity to any of SEQ ID NO: 2, 4, 8, 10 or 12, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
According to a fourth aspect of the invention, there is provided an isolated polypeptide comprising SEQ ID NO: 12, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
According to a fifth aspect of the invention, there is provided an isolated immunogenic Robo polypeptide, capable of eliciting a Robo-specific antibody, selected from the group of a polypeptide comprising the amino acid sequence set out in SEQ ID NO:2, 4, 8 or 10 or an immunogenic polypeptide fragment thereof; an immunogenic polypeptide of SEQ ID NO:2 selected from the group of residues 68-77, 79-94, 95-103, 122-129, 165-176, 181-191, 193-204, 244-251, 274-290, 322-331, 339-347, 407-417, 441-451, 453-474, 502-516, 541-553 and 617-629 of SEQ ID NO:2; an immunogenic polypeptide of SEQ ID NO:8 selected from the group of residues 1-12, 18-28, 31-40, 45-65, 106-116, 13?-145, 174-184, 214-230, 274-286, 314-324, 399-412, 496-507, 548-565, 599-611, 660-671, 717-730, 780-791, 835-847, 877-891, 930-942, 981-998, 1040-1051, 1080-1090, 1154-1168, 1215-1231, 1278-1302, 1378-1400, 1460-1469, 1497-1519, 1606-1626 and 1639-1651 of SEQ ID NO:8; and an immunogenic polypeptide of SEQ ID NO:10 selected from the group of residues 5-16, 38-47, 83-94, 112-125, 168-180, 195-209, 222-235 and 241-254 of SEQ ID NO:10.
According to a sixth aspect of the invention, there is provided a soluble form of a Robo polypeptide which comprises one or more Robo immunoglobulin domains of SEQ ID NO: 2, 4, 6, 8 or 10, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.

According to a seventh aspect of the invention, there is provided a soluble form of a Robo polypeptide which comprises two or more Robo immunoglobulin domains of SEQ ID NO: 2, 4, 6, 8 or 10, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
According to an eighth aspect of the invention, there is provided a soluble form of a Robo polypeptide which is a human Robo polypeptide, said polypeptide being capable of modulating Robo-ligand binding and/or Robo-mediated signaling, comprising a sequence selected from: (a) residues 1-67 of SEQ ID NO:8; (b) residues 68-167 of SEQ ID NO:8; (c) residues 168-258 of SEQ ID NO:8; (d) residues 259-350 of SEQ ID NO:8; (e) residues 351-450 of SEQ ID NO:8; (f) residues 451-546 of SEQ ID NO:8; (g) residues 547-644 of SEQ ID NO:8; (h) residues 645-761 of SEQ ID NO:8; (i) residues 762-862 of SEQ ID NO:8; (j) residues 1-167 of SEQ ID NO:8; (k) residues 1-259 of SEQ ID NO:8; (l) residues 1-350 of SEQ ID NO:8; (m) residues 1-451 of SEQ ID NO:8; (n) residues 68-259 of SEQ ID NO:8; (o) residues 1-67 joined to residues 168-258 of SEQ ID NO:8; (p) residues 1-67 joined to residues 259-450 of SEQ ID NO:8; (q) residues 68-167 joined to residues 168-258 of SEQ ID NO:8; (r) residues 1-91 of SEQ ID NO:10; (s) residues 92-185 of SEQ ID NO:10; and (t) residues 186-282 of SEQ ID NO:10.
According to a ninth aspect of the invention, there is provided a soluble form of a human Robo polypeptide of SEQ ID NO: 8 or 10 which lacks a transmembrane domain and cytoplasmic motif and having Robo-ligand binding activity.
According to a tenth aspect of the invention, there is provided a soluble form of a human Robo polypeptide of SEQ ID NO: 8 or 10 which lacks a transmembrane domain and cytoplasmic motif and is capable of modulating Robo-mediated signaling.

According to an eleventh aspect of the invention, there is provided an isolated Robo polypeptide which is a deletion mutant comprising one or more Robo fibronectin or cytoplasmic motif domains of any of SEQ ID NOS:2, 4, 8, 10 or 12, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
According to a twelfth aspect of the invention, there is provided a fusion product of the Robo polypeptide as described above, wherein said Robo polypeptide is fused with another peptide or polypeptide.
According to a thirteenth aspect of the invention, there is provided an isolated antibody specific for a polypeptide as discussed above.
According to a fourteenth aspect of the invention, there is provided an isolated antibody specific for the Robo polypeptide fusion product described above.
According to a fifteenth aspect of the invention, there is provided a pharmaceutical composition comprising the antibody of claim 21, further comprising a pharmaceutically acceptable excipient.
According to a sixteenth aspect of the invention, there is provided a pharmaceutical composition comprising a polypeptide as described above, further comprising a pharmaceutically acceptable excipient.
According to a seventeenth aspect of the invention, there is provided an immunogenic composition comprising a polypeptide as discussed above, further comprising an adjuvant.
According to an eighteenth aspect of the invention, there is provided an isolated recombinant nucleic acid comprising a coding strand encoding the polypeptide as described above.
According to a nineteenth aspect of the invention, there is provided an isolated cell comprising a nucleic acid as described above.
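The sequence-based criteria in the aspects above (a minimum run of consecutive residues, and a minimum percent identity) can be illustrated with a short sketch. The text does not name an alignment program or an identity formula, so the definitions below are assumptions: identity is counted over pre-aligned columns in which neither sequence has a gap, and the consecutive-residue check is a plain longest-common-substring search. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the two strings are already aligned (equal length).
    Other denominators (e.g. full alignment length) are also in common use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest stretch of consecutive reference residues
    that also occurs in the candidate (longest common substring)."""
    best = 0
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        cur = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def meets_consecutive_threshold(candidate: str, reference: str, minimum: int) -> bool:
    """True if the candidate contains at least `minimum` consecutive reference residues."""
    return longest_shared_run(candidate, reference) >= minimum
```

With a real reference sequence, `meets_consecutive_threshold(candidate, reference, 25)` would correspond to the 25-residue criterion and `percent_identity(...) >= 95.0` to the identity criterion; the functional requirement (modulating Robo-ligand binding or signaling) is experimental and is not captured here.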

BRIEF DESCRIPTION OF THE FIGURES
Figure 1 Organization of the roundabout Genomic Locus
(A) Cosmid chromosome walk through the 58F/59A region of the 2nd chromosome. The positions of deficiency breakpoints within the cosmids used are shown in the top two rows. Identified transcripts from the walk are shown below the cosmids. The 12-1 transcript corresponds to the robo gene; the direction of transcription is distal to proximal. The location of the 16 kb XbaI genomic rescue fragment is indicated below.
(B) Position and size of introns within the robo transcript. Coding sequence is indicated by the thicker part of the line. Introns are represented by gaps. The transcript is shown 3'-5' to reflect its orientation in (A).
Figure 2 Structure of Robo Proteins
Schematic of the structure of Drosophila Robo protein. The positions of the immunoglobulin (Ig), fibronectin (FN) and transmembrane (TM) domains and the amino acid substitution in robo6 are shown. Percent amino acid identity between Drosophila Robo 1 and Human Robo 1 is indicated for each domain.
DETAILED DESCRIPTION OF THE INVENTION
The nucleotide sequences of exemplary natural cDNAs encoding drosophila 1, drosophila 2, C. elegans, human 1, human 2 and mouse 1 Robo polypeptides are shown as SEQ ID NOS:1, 3, 5, 7, 9 and 11, respectively, and the full conceptual translates are shown as SEQ ID NOS:2, 4, 6, 8, 10 and 12. The Robo polypeptides of the invention include incomplete translates of SEQ ID NOS:1, 3, 5, 7, 9 and 11 and deletion mutants of SEQ ID NOS:2, 4, 6, 8, 10 and 12, which translates and deletion mutants have Robo-specific amino acid sequence, binding specificity or function. Preferred translates/deletion mutants comprise at least a 6, preferably at least an 8, more preferably at least a 32, most preferably at least a 64 residue domain of the translates. In a particular embodiment, the deletion mutants comprise one or more structural/functional Robo immunoglobulin, fibronectin or cytoplasmic motif domains described herein. For example, soluble forms of the disclosed Robo polypeptides which comprise one or more Robo IG domains, and especially fusions of two or more Robo IG domains, particularly fusions of IG #1 and #2, provide competitive inhibitors of Robo-mediated signaling. Exemplary such deletion mutants and recombined deletion mutant fusions include human Robo 1 (SEQ ID NO:8) residues 1-67; 68-167; 168-259; 260-350; 351-451; 1-167; 1-259; 1-350; 1-451; 68-259; 1-67 joined to 168-259; and 1-67 joined to 260-451.
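As an illustration of how the residue ranges above define deletion mutants and "joined" fusions, the following sketch slices a sequence by 1-based, inclusive residue numbers and concatenates the pieces. The helper names are hypothetical, and the sequence used is a synthetic placeholder, not SEQ ID NO:8; only the range arithmetic is being shown.

```python
from typing import Iterable, Tuple


def residues(seq: str, start: int, end: int) -> str:
    """Return residues start..end, 1-based and inclusive, as ranges are written in the text."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} is outside a sequence of length {len(seq)}")
    return seq[start - 1:end]


def join_fragments(seq: str, ranges: Iterable[Tuple[int, int]]) -> str:
    """Concatenate several residue ranges, e.g. '1-67 joined to 168-259'."""
    return "".join(residues(seq, start, end) for start, end in ranges)


# Synthetic placeholder sequence (assumption: NOT a real Robo sequence).
placeholder = "".join(chr(ord("A") + i % 26) for i in range(500))

first_segment = residues(placeholder, 1, 67)                   # 67 residues
second_segment = residues(placeholder, 68, 167)                # 100 residues
fusion = join_fragments(placeholder, [(1, 67), (168, 259)])    # 67 + 92 residues
```

Substituting the actual SEQ ID NO:8 translate for `placeholder` would yield the corresponding fragments; whether a given fragment folds or retains activity is an experimental question outside this sketch.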
Other deletion mutants provide Robo-specific antigens and/or immunogens, especially when coupled to carrier proteins as described below. Generic Robo-specific peptides are readily apparent as conserved regions in the aligned Robo polypeptide sequences of Table 1.
Table 1. Sequence Alignment of Robo Family Members: The complete amino acid alignment of the predicted Robo proteins encoded by drosophila robo 1 (D1, SEQ ID NO:2) and Human robo 1 (H1, SEQ ID NO:8) is shown. The extracellular domain of C. elegans robo (CE, SEQ ID NO:6; Sax-3; Zallen et al., 1997), the extracellular domain of Drosophila robo 2 (D2, SEQ ID NO:4), and partial sequence of Human robo 2 (H2, SEQ ID NO:10) are also aligned. The D2 sequence was predicted by the gene-finder program Grail. The positions of immunoglobulin domains (Ig), fibronectin domains (FN), the transmembrane domain (TM), and conserved cytoplasmic motifs are indicated. The extracellular domain of rat robo 1 is nearly identical to H1.
[Table 1 alignment of the D1, D2, CE, H1 and H2 sequences: the residue-level alignment is not recoverable from this copy. Features marked on the alignment, in order: Ig #1, Ig #2, Ig #3, Ig #4, Ig #5, FN #1, FN #2, FN #3, TM, cytoplasmic motif #1, cytoplasmic motif #2. Final residue numbers shown: D1 1395, H1 1651.]
Exemplary such Robo-specific immunogenic and/or antigenic peptides are shown in Table 2.
Table 2. Immunogenic Robo polypeptides eliciting Robo-specific rabbit polyclonal antibody: Robo polypeptide-KLH conjugates immunized per protocol described below.
Robo Polypeptide, Sequence           Immunogenicity
SEQ ID NO:2, residues 68-77          +++
SEQ ID NO:2, residues 79-94          +++
SEQ ID NO:2, residues 95-103
SEQ ID NO:2, residues 122-129
SEQ ID NO:2, residues 165-176
SEQ ID NO:2, residues 181-191        +++
SEQ ID NO:2, residues 193-204        +++
SEQ ID NO:2, residues 244-251        +++
SEQ ID NO:2, residues 274-290        +++
SEQ ID NO:2, residues 322-331        +++
SEQ ID NO:2, residues 339-347        +++
SEQ ID NO:2, residues 407-417        +++
SEQ ID NO:2, residues 441-451        +++
SEQ ID NO:2, residues 453-474        +++
SEQ ID NO:2, residues 502-516        +++
SEQ ID NO:2, residues 541-553        +++
SEQ ID NO:2, residues 617-629        +++


In addition, species-specific antigenic and/or immunogenic peptides are readily apparent as diverged extracellular or cytosolic regions in Table 1. Exemplary such human specific peptides are shown in Table 3.
Table 3. Immunogenic Robo polypeptides eliciting human Robo-specific rabbit polyclonal antibody: Robo polypeptide-KLH conjugates immunized per protocol described below (some antibodies show cross-reactivity with corresponding mouse/rat Robo polypeptides).
Robo Polypeptide, Sequence           Immunogenicity
SEQ ID NO:8, residues 1-12           +++
SEQ ID NO:8, residues 18-28          +++
SEQ ID NO:8, residues 31-40          +++
SEQ ID NO:8, residues 45-65          +++
SEQ ID NO:8, residues 106-116        +++
SEQ ID NO:8, residues 13?-145        +++
SEQ ID NO:8, residues 174-184        +++
SEQ ID NO:8, residues 214-230        +++
SEQ ID NO:8, residues 274-286        +++
SEQ ID NO:8, residues 314-324        +++
SEQ ID NO:8, residues 399-412        +++
SEQ ID NO:8, residues 496-507        +++
SEQ ID NO:8, residues 548-565        +++
SEQ ID NO:8, residues 599-611        +++
SEQ ID NO:8, residues 660-671        +++
SEQ ID NO:8, residues 717-730        +++
SEQ ID NO:8, residues 780-791        +++
SEQ ID NO:8, residues 835-847        +++
SEQ ID NO:8, residues 877-891        +++
SEQ ID NO:8, residues 930-942        +++
SEQ ID NO:8, residues 981-998        +++
SEQ ID NO:8, residues 1040-1051      +++
SEQ ID NO:8, residues 1080-1090      +++
SEQ ID NO:8, residues 1154-1168      +++
SEQ ID NO:8, residues 1215-1231      +++
SEQ ID NO:8, residues 1278-1302      +++
SEQ ID NO:8, residues 1378-1400      +++
SEQ ID NO:8, residues 1460-1469      +++
SEQ ID NO:8, residues 1497-1519      +++
SEQ ID NO:8, residues 1606-1626      +++
SEQ ID NO:8, residues 1639-1651      +++
SEQ ID NO:10, residues 5-16          +++
SEQ ID NO:10, residues 38-47         +++
SEQ ID NO:10, residues 83-94         +++
SEQ ID NO:10, residues 112-125
SEQ ID NO:10, residues 168-180       +++
SEQ ID NO:10, residues 195-209       +++
SEQ ID NO:10, residues 222-235       +++
SEQ ID NO:10, residues 241-254       +++

In a particular embodiment, expressed sequence tags EST:yu23d11, Accession #H77734 and EST:yq76e12, Accession #H52936, as well as peptides conceptually encoded thereby, are not within the scope of the present invention (Tables 4 and 5).
In a particular embodiment, the subject Robo polypeptides exclude the corresponding regions of the disclosed natural human Robo 1 polypeptide, i.e. SEQ ID NO:8, residues 168-217 and SEQ ID NO:8, residues 1316-1485.
Table 4 EST:yu23d11 sequences compared to H-Robo1. yu23d11 refers to the fragment of DNA which was sequenced. The fragment was sequenced from both ends generating the following two sequences: H77734 and H77733. yu23d11 is an unspliced cDNA. Only bases 59-215 match the coding sequence of H-Robo1 (502-651). The remaining bases are intronic. No bases of H77733 match the coding sequence of H-Robo1.
LRDDFRQNPSDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDER H-Robo1
There is an error in the sequence, a T to G change which results in the amino acid N being replaced by K. The sequence is shown below and has been reversed for clarity:
TACTTCGGGATGACTTCAGACAAAAACCTTCGGATGTCATGGTTGCAGTA H-Robo1
L R D D F R Q K P S D V M V A V
              N
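The N-to-K difference described for Table 4 can be reproduced by translating the nucleotide stretch printed above and comparing it with the first 16 residues of the H-Robo1 line. This is a minimal sketch using the standard genetic code; the reading-frame offset of 2 is an assumption chosen because it reproduces the translation printed in the text, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate complete codons of `dna`, starting at 0-based position `offset`."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


def mismatches(reference: str, other: str):
    """List (1-based position, reference residue, other residue) where the two differ."""
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(reference, other))
        if a != b
    ]


# Nucleotide stretch as printed in Table 4.
table4_dna = "TACTTCGGGATGACTTCAGACAAAAACCTTCGGATGTCATGGTTGCAGTA"
# First 16 residues of the H-Robo1 line in Table 4.
h_robo1 = "LRDDFRQNPSDVMVAV"

table4_peptide = translate(table4_dna, offset=2)
differences = mismatches(h_robo1, table4_peptide)
```

Here `table4_peptide` matches the translation line in the table, and `differences` holds the single position at which H-Robo1 has N and the printed sequence encodes K.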
Table 5 EST:yq76e12 sequences compared to H-Robo1. yq76e12 refers to the fragment of DNA which was sequenced. The fragment was sequenced from both ends generating the following two sequences: H52936 and H52937 (the latter has been reversed for clarity). The sequences can be seen to overlap in the middle. A gap indicates a frameshift error. Note that errors only occur in one sequence at any one position.
GPLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSV H-Robo1
GPLVSDMDTDAPEEEEDEADMEVAKMQT.RLLLRGLEQTPASSV EST H52936
GDLESSVTGSMINGWGSASBEDNISSGRSSVSSSDGSFFTDADF H-Robo1
AQAVAAA AEYAGLKVARRQMQDA AGR RHFH AS QC PRPT H-Robo1
?AAT A?YAGLKVAR.RQMRDA AGR RHFH AS QC PRPT EST H52937
SPVSTDSNMSAAVMQKTRPAKKLKHQPGHLRRETYTDDLPPPPV H-Robo1
PPPAIKSPTAQSKTQLEVRPVWPKLPSMDARTDK H-Robo1
The subject domains provide Robo domain-specific activity or function, such as Robo-specific cell, especially neuron, modulating or modulating inhibitory activity, Robo-ligand-binding or binding inhibitory activity. Robo-specific activity or function may be determined by convenient in vitro, cell-based, or in vivo assays: e.g. in vitro binding assays, cell culture assays, in animals (e.g. gene therapy, transgenics, etc.), etc.
Binding assays encompass any assay where the molecular interaction of a Robo polypeptide with a binding target is evaluated. The binding target may be a natural intracellular binding target, a Robo regulating protein or other regulator that directly modulates Robo activity or its localization; or a non-natural binding target such as a specific immune protein such as an antibody, or a Robo-specific agent such as those identified in screening assays such as described below. Robo-binding specificity may be assayed by binding equilibrium constants (usually at least about 10^7 M^-1, preferably at least about 10^8 M^-1, more preferably at least about 10^9 M^-1), by the ability of the subject polypeptide to function as negative mutants in Robo-expressing cells, to elicit Robo-specific antibody in a heterologous host (e.g. a rodent or rabbit), etc.
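For readers more used to dissociation constants, the relationship between an association (binding equilibrium) constant in M^-1 and the corresponding K_d can be sketched as follows. This assumes simple reversible 1:1 binding, which the text does not state; the function names are hypothetical, and 10^7 M^-1 is used only as an illustrative input.

```python
def kd_from_ka(ka_per_molar: float) -> float:
    """Dissociation constant K_d (in M) for an association constant K_a (in M^-1)."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_molar


def fraction_bound(free_ligand_molar: float, ka_per_molar: float) -> float:
    """Fractional occupancy for 1:1 binding: theta = K_a[L] / (1 + K_a[L])."""
    x = ka_per_molar * free_ligand_molar
    return x / (1.0 + x)


# Illustrative input: K_a = 1e7 M^-1 corresponds to K_d = 1e-7 M (100 nM).
example_kd = kd_from_ka(1e7)
# At a free ligand concentration equal to K_d, half of the sites are occupied.
example_occupancy = fraction_bound(example_kd, 1e7)
```

A larger association constant therefore means a smaller K_d and tighter binding; each tenfold increase in K_a lowers K_d tenfold.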
The claimed Robo polypeptides are isolated or pure: an "isolated" polypeptide is unaccompanied by at least some of the material with which it is associated in its natural state, preferably constituting at least about 0.5%, and more preferably at least about 5% by weight of the total polypeptide in a given sample, and a pure polypeptide constitutes at least about 90%, and preferably at least about 99% by weight of the total polypeptide in a given sample.
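The weight-percent thresholds in this definition can be written out as a small calculation. This is only a sketch: the text says "at least about", so treating the figures as hard cutoffs is a simplifying assumption, and the function and key names are hypothetical.

```python
def weight_percent(target_mass: float, total_polypeptide_mass: float) -> float:
    """Target polypeptide as a percentage by weight of total polypeptide in the sample."""
    if total_polypeptide_mass <= 0:
        raise ValueError("total polypeptide mass must be positive")
    return 100.0 * target_mass / total_polypeptide_mass


def purity_labels(percent: float) -> dict:
    """Compare a weight percentage against the thresholds quoted in the text."""
    return {
        "isolated_preferred": percent >= 0.5,        # "at least about 0.5%"
        "isolated_more_preferred": percent >= 5.0,   # "at least about 5%"
        "pure": percent >= 90.0,                     # "at least about 90%"
        "pure_preferred": percent >= 99.0,           # "at least about 99%"
    }


# Example: 2 mg of Robo polypeptide in 100 mg of total polypeptide is 2% by weight.
labels = purity_labels(weight_percent(2.0, 100.0))
```

Both masses must be in the same unit; the ratio is what matters.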
A polypeptide, as used herein, is a polymer of amino acids, generally at least 6 residues, preferably at least about 10 residues, more preferably at least about 25 residues, most preferably at least about 50 residues in length. The Robo polypeptides and polypeptide domains may be synthesized, produced by recombinant technology, or purified from mammalian, preferably human cells. A wide variety of molecular and biochemical methods are available for biochemical synthesis, molecular expression and purification of the subject compositions, see e.g. Molecular Cloning, A Laboratory Manual (Sambrook, et al., Cold Spring Harbor Laboratory), Current Protocols in Molecular Biology (Eds. Ausubel, et al., Greene Publ. Assoc., Wiley-Interscience, NY) or that are otherwise known in the art.
The invention provides binding agents specific to the claimed Robo polypeptides, including natural intracellular binding targets, etc., methods of identifying and making such agents, and their use in diagnosis, therapy and pharmaceutical development. For example, specific binding agents are useful in a variety of diagnostic and therapeutic applications, especially where pathology, wound repair incompetency or prognosis is associated with improper or undesirable axon outgrowth, orientation or inhibition thereof.
Novel Robo-specific binding agents include Robo-specific receptors, such as somatically recombined polypeptide receptors like specific antibodies or T-cell antigen receptors (see, e.g. Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory), natural intracellular binding agents identified with assays such as one-, two- and three-hybrid screens, non-natural intracellular binding agents identified in screens of chemical libraries such as described below, etc. Agents of particular interest modulate Robo function.
In a particular embodiment, the subject polypeptides are used to generate Robo- or human Robo-specific antibodies. For example, the Robo- and human Robo-specific peptides described above are covalently coupled to keyhole limpet antigen (KLH) and the conjugate is emulsified in Freund's complete adjuvant. Laboratory rabbits are immunized according to conventional protocol and bled. The presence of Robo-specific antibodies is assayed by solid phase immunosorbent assays using immobilized Robo polypeptides of SEQ ID NO:2, 4, 6, 8, or 12. Human Robo-specific antibodies are characterized as uncross-reactive with non-human Robo polypeptides (SEQ ID NOS:2, 4, 6 and 12).
Accordingly, the invention provides methods for modulating cell function comprising the step of modulating Robo activity, e.g. by contacting the cell with a Robo inhibitor, e.g. inhibitory Robo deletion mutants, Robo-specific antibodies, etc. (supra). The target cell may reside in culture or in situ, i.e. within the natural host. The inhibitor may be provided in any convenient way, including by (i) intracellular expression from a recombinant nucleic acid or (ii) exogenous contacting of the cell. For many in situ applications, the compositions are added to a retained physiological fluid such as blood or synovial fluid. For CNS administration, a variety of techniques are available for promoting transfer of the therapeutic across the blood brain barrier including disruption by surgery or injection, drugs which transiently open adhesion contact between CNS vasculature endothelial cells, and compounds which facilitate translocation through such cells. Robo polypeptide inhibitors may also be amenable to direct injection or infusion, topical, intratracheal/nasal administration e.g. through aerosol, intraocularly, or within/on implants e.g. fibers e.g. collagen, osmotic pumps, grafts comprising appropriately transformed cells, etc. A particular method of administration involves coating, embedding or derivatizing fibers, such as collagen fibers, protein polymers, etc. with therapeutic proteins. Other useful approaches are described in Otto et al. (1989) J Neuroscience Research 22, 83-91 and Otto and Unsicker (1990) J Neuroscience 10, 1912-1921. Generally, the amount administered will be empirically determined, typically in the range of about 10 to 1000 µg/kg of the recipient and the concentration will generally be in the range of about 50 to 500 µg/ml in the dose administered. Other additives, such as stabilizers, bactericides, etc., may be included in conventional amounts. For diagnostic uses, the inhibitors or other Robo binding agents are frequently labeled, such as with fluorescent, radioactive, chemiluminescent, or other easily detectable molecules, either conjugated directly to the binding agent or conjugated to a probe specific for the binding agent.
The amino acid sequences of the disclosed Robo polypeptides are used to back-translate Robo polypeptide-encoding nucleic acids optimized for selected expression systems (Holler et al. (1993) Gene 136, 323-328; Martin et al. (1995) Gene 154, 150-166) or used to generate degenerate oligonucleotide primers and probes for use in the isolation of natural Robo-encoding nucleic acid sequences ("GCG" software, Genetics Computer Group, Inc, Madison WI). Robo-encoding nucleic acids are used in Robo-expression vectors and incorporated into recombinant host cells, e.g. for expression and screening, transgenic animals, e.g. for functional studies such as the efficacy of candidate drugs for disease associated with Robo-modulated cell function, etc.
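As a rough illustration of the back-translation step described above, the following sketch maps a peptide to one encoding DNA sequence by choosing a single codon per amino acid. The codon choices in `PREFERRED_CODON` are placeholders assumed for illustration only and are not a measured codon-usage table for any expression system; `back_translate` is a hypothetical helper name. The example peptide is the PTPYATT core of conserved cytoplasmic motif #1.

```python
# Hypothetical preferred-codon table: one codon per amino acid.
# These choices are illustrative assumptions; a real table would be derived
# from codon-usage data for the selected expression system.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(peptide):
    """Return one DNA sequence encoding `peptide` using PREFERRED_CODON."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in peptide.upper())
    except KeyError as exc:
        # Ambiguity codes and gap characters are rejected rather than guessed.
        raise ValueError(f"unsupported residue: {exc.args[0]!r}") from None


if __name__ == "__main__":
    print(back_translate("PTPYATT"))
```

A degenerate primer design would instead keep every synonymous codon at each position; the single-codon choice here corresponds only to the expression-optimized case.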
The invention also provides nucleic acid hybridization probes (Tables 6, 7) and replication / amplification primers (Tables 7, 8) having a Robo cDNA specific sequence comprising SEQ ID NO:1, 3, 5, 7, 9 or 11 and sufficient to effect specific hybridization thereto (i.e. specifically hybridize with SEQ ID NO:1, 3, 5, 7, 9 or 11, respectively, in the presence of CDO cDNA).
Table 5. Hybridisation Probes for Human Roundabout 1
Immunoglobulin Domain #1 CCACCT CQCATTC3TTCiAACACCCTTCAOACCTGATT~iTTCTCAAAAG(~A~3AAC
GAAQGCC(3CCCCACACCCACTATT3AATQGTACAAAOQC3GOAGAOAt3AOT~A0ACA0ACAAAflATf3ACCCTCGC

TCACACCC3AATCiT3'GCTt3CC~A(i'~GOATCTTTATTTTTCTTACCiTATAf3TACATQGAC(i(3AAAA(iTAC
3ACCTCiAT
GAA~C~AGTCTATCiTCTCiT6TAC3CRAf3QAATTACCTTCiOAGACiGCTCiTt3A(3CCACAATOCATCGCTGGAA
C3TAaCC
ATA
Immunoglobulin Domain#2 CTTCGaGATQACTTCAGACAAAACCCTTCGf3ATCiTCAT'CaGTTGCAaTA~C30AaAC3CCT~3CA0TAATaCiAAT
TC~CCAA
CCTCCACCiAC~CCATCCTCiAf3CCCA~CCATTTCATCadfAAC3HAAGAT~CTCTCCACTC3GATGATAAAQATl3A
AA~iA
ATAACTATACC3AQ~ACiGAAAf3CTCATBATCACTTACACCCt3TAAAA(iTl3ACGCTOGCAAATATCiTTI~(iTC

ACCAATAT3CiTTQQG(3AACGTGAOACiTGAAGTAf3CCGAQCT(~ACT(~1'C!T
Immunoglobulin Domain #3
AG~AaAL3ACCATCATTTGTC~AA~ACCCAGTAACTTOaCAtiTTAACZ'(3T03ATGACAGTflCAtiAATTTARATa TaA
G3CCCOACi~TOACCCTOTACCTACAOTACGATQtiAaaAAAf~iATGATCd3AGAtiCT3CCCAAATCCAOATATGAA
AT
CCC3AClATC9ATCATACCTT~3AAAiIT'TAGaAAOQTOACAGCTGOT(~ACAT~TTCATACACTTCiTOTTGCAC3A
AAA
TAT~Tf3O3CAAAC~1CTGAAaCATCTaCTACTCTt3ACTCiTTCAAC~AACC
Immunoglobulin Domain #4 CCACATTTT(iTTaT(iAAACCCC(3TCiAGCAG(iTTOTTGCTTTt3GC~ACt~iACTt3TAACTTiTCA3TGTC~AA

GaOAAATCCTCAACCA(iCTA TCTACTTTTCTCATATCAACCACCACAC3 .' TCATCCAC~1CCOATTTTCAGTCTCCCAOACTO(~CaACCTCACAATTACTAATCiTCCAL~COATCTaATGTT(iaTT
AT
TACATCTGCCAiGACTTTAAAT3TTr3CT(3GRAGCATCATCACAAACif3CATATTTl3GAAC~TTACAGAT(iTC~A
TTC~CA
Immunoglobulin Domain #5
(~ATCt~GCCTCCCCCA3TTATTCaACARaQTCCTCiTGAATCA~iACTCiTAf3CC(~TQGRTG~3CACTZTC<iTCC
TCAaC
TGT3Ti~CCACA(i4CA~OTCCAbTLiCCCACCATTCTCiTC~OA(3AAAadATa(~ACiTCCTCGTTTCAACCCAA(~
ACTCT
CQAATCAAACAGTTCiCiAGAATOGAGTACTGCAGATCCCiATATaCTAAGCTCi~TOATACTLiCiTCGGTACACCTC
iC
ATTaCATCAACCCCCAGTaCiTaAAGCAACATOQAGTGCTTACAT3'aAA~OTTCAAI~AATTTCi
Fibronectin Domain #1 aAGTTCCAGTTClUOCCTCCAAaACCTACTCiACCCA11ATTTAATCCCTA~iCCCCATCAAAACCTGA~1GTGACAG
ATGTCACdCACiAAATACACiTCACATTATCGTGGCAACCAAATTIbAATTCAGGAGCAACTCCAACATCTTATATTA
TAGAAGCCTTCAGCCATGCATCTGGTAGCAGCTGGCAGACCC3TAGCAGAC~AATGTQAAAACAf3AAACATCTGCCA
TTAAAGQACTCAAACCTAATGCAATTTACCTTTTCCTTaTGAGGGCAGCTAATGCATATGGAATTAGTGATC
Fibronectin Domain #2 CAAGCCAAATATCA3ATCCA3TGAAAACACAAOAT3TCCTACCAACAAQTCA3GGC3GT~iGACCACAAGCAGGTCC
AGAGAGAGCTGGGAAAT'GCTGTTCTCiCACCTCCACAACCCCACCGTCCTZ'I'CTTCCTCTTCCATCCiAAGTGCAC
T
GGACAGTAQATCAACAGTCTCAGTATATACAAGGATATAAAATTCTCTATCGOCCATCTGG7~(iCCAACCACGGACi AATCAGACTGCiTTA(3TTTTTGAA(~TbAaGACGCCAGCCAAAAACAGTGTOGTAATCCCTGATCTCAf3AAAGGGAC
i TCAACTATGAAATTAAOGCTCGCCCTTTTTTTAAT3AATT1'CAAGQAaCAQ
Fibronectin Domain #3 ATAGTCiAAATCAAGTTTGCCAAAIICCCTGGAAQ~AAGCACCCAGTGCCCCACCCCAAGGTGTAACTGTATCCAJ1(i A
ATGATGGAAACGGAACTOCAATTCTA~Ci'ITACiTTGGCAGCCACCTCCAGAAGACACTCAAAATGGAATCiGTCCAA
Ci TGGTCATTCCCTTTC'I'TCiTTCCT(dGAATCCGATACACiTGTGOAAGTGGCAGCCAGCACTGGGGCT00(~TCTGG
QG
TAAACi
Transmembrane Domain Af3ATTTCAGATt3TGl3TGAAGCACCCG~iCCTTCATAGCAGGTATTGGAGCALiCCTCiTTQaATCATCCTCAT~3G
TCT
TCA(iCATCTCiGCTTTAfiCGACACCG
Cytoplasmic Motif #1 AATCT~3AAGGATGGGCtiTTTTGTCAATCCATCA000CACiCCTACTCCTTACGCCACCACTCA(3CTCATCCAGTCA
AACCTCAGCAACAACATGAACAATO
Cytoplasmic Motif #2 CCCAAGGTACCAAAACAGGGTGGCATGAACTaGGCA(iACCTGCTTCCTCCTCCCCCAGCACATCCTCCTCCACAC
AGCAATACiCGAAGAGTACAACATTT
Cytoplasmic Motif #3 CCAGCCAGGACATCTGCQCAGAC,SAAACC"PACACAOATGATCTTCCACCACCTCCTf3TGCCGCCACCTGCTATAAA

Table 6. Hybridisation Probes for Human Roundabout 2
Immunoglobulin Domain #4 CAi3ATTCiTTgCTCAAC~GTCCiAACAGTCiACATTTCCCTC~TQRiIACTAAA(iCiAAACCCACAQCCAGCTaTTT
TTTCiG
CAtiAAAQRAOOCAC3CCA0AACCTA~CTT'ITCCCAAACCAACCCCAaCAaCCCAACAGTACiATOCTCA(3TC3TCA
CCA
ACTCiCiAQACCTCACAATCACCAACATTCAAC(3TTCCOACC3C(iQCiTTACTACATCTCiCCAtiQCTTTAACTOT
OQCA
~3QAA~3CATTT'I'AOCAAAAOCTCAACTCiCiAaf3TTACT<3ATCiTTTT(iACA
Immunoglobulin Domain #5 i~ATA(3ACCT'CCACCTATAATTCTACAAC~GCCCA~3CC7~11CCA7WCGCTGC~CA~iTD(iATtif3TACAGCGT
TACTI~AAA
TC~TAAAQCCACTG(3TOATCCTCTTCCn3TAAZTAGCTbOTTAAAQGA~ATTTACTTTTC.'CtiGaTAGACiATCCA

AtiAOCAACAA TTA7~OAATTTAC<iQATTTCT(3ATACTOC3CACTTATACTT3T
OTr3aCTACAAC3TTCAAGiTQaAOAf3GC'1TCC'l"f30AGTt'3CACiTOCTOaATt~T~ACA(3At3TCT
Fibronectin Domain #1 GQAC3CAACAATCA(3TAAAAACTAT3ATTTAAiOTf~AGCTtiCCAO~GCCACCA Tt3TT
ACTAA~f3AACAt3TQTCACCTTCiTCCTO0CAf3CCA~t3GTACCCCTaCiAACCCTTCCAOCAAaT(3CATATATCA
TT(iAC3 (3CTTTCAOCCAATCAOTt3AaCAA CATt3TAAAtiACCACCCTCTATACTGTAA~iA
CiGACT<3C(3GCCCAATACAATCTACTTATTCATC~f3TCA4AGCC3ATCRACCCCAAf3(3TYTCAt3TGACCCAAC
iT
Table 7. Primer Pairs for PCR of Human Roundabout 1 Domains
Immunoglobulin Domain #1
Forward: 5' CCACCTCCaCATniTTaAACACCCTTCAf3AC 3'
Reverse: 5' ATCiGCTACTTCCAGCQAT3CATT~3TC3GCTC 3'
Immunoglobulin Domain #2
Forward: 5' CTTCOG10ATCiACTTCaI~ACA11AACCC~TCG 3'
Reverse: 5' Z74A3ACAOTCAOCTC(~tiCTACTTCACTCTC 3'
Immunoglobulin Domain #3
Forward: 5' 7~~QAGAOACCATCATTTGTiiAAQAaACCCA~ti 3'
Reverse: 5' A~3GZTCTTC~AACAaTCAaR~TAf3CA(iAT~iC 3'
Immunoglobulin Domain #4
Forward: 5' CCACATTTT~3TTi3T(~RAACCCCaTGACCAC3 3'
Reverse: 5' Tt~CAATCACATCTGTAACTTCCAAATATCiC 3'
Immunoglobulin Domain #5
Forward: 5' ATCC3GC.CTCCCCCAC~T'i'ATl'Ci311CAAGl3TC 3'
Reverse: 5' CAAA,TTCTTaAACTTCAAT3TAA~CACTCC 3'
Fibronectin Domain #1
Forward: 5' GAaTTCCAGTTCAC3CCTCCAAt3ACCTACTB 3'
Reverse: 5' TCACTAATTCCATATGCA:TACiCT~CCCTC 3'
Fibronectin Domain #2
Forward: 5' CAA(iCCAAATATCAGATCCAC~TGA7~1ACAC 3'
Reverse: 5' ATCTCiCTCCTTQAAATTCATTAAAAAAA(i(i 3'
Fibronectin Domain #3
Forward: 5' ATAC~TtiAAATCAA~t3TTTC~CCAAAACCCI~Ci 3'
Reverse: 5' CTCTTTACCCCAGi~ICCCAGCCCCAL3Tf3CTG 3'
Transmembrane Domain
Forward: 5' (3CiACCAAGTCA3CCTCCiCTCAC3CACiATiTC 3'
Reverse: 5' ACTA3TAA3TCCGTTTCTCTTCTTCiCO(~TG 3'
Cytoplasmic Motif #1
Forward: 5' CTf3AA~OCiAT(it~GCCiTS"i'TGTCAATCCATC 3'
Reverse: 5' C~TCCCACiTCitiTTTCCA~3TGCTTCTCOCCAG 3'
Cytoplasmic Motif #2
Forward: 5' QCi CCit7~OG 3'
Reverse: 5' ATAOCTTTCATCTACAC~AAATG1TTGTACTC 3'
Cytoplasmic Motif #3
Forward: 5' ACCAGACCAC3CCAAQAAACT(3AAACACCAf3 3'
Reverse: 5' CiTACTTCCACiCTCiTC3TCTTQCiATT~OQCAQ 3'
Table 8. Human Roundabout 2 Primer Pairs
Immunoglobulin Domain #4
Forward: 5' t3TTaCTCAAC3GTCaAACACiTOACATTTCCC 3'
Reverse: 5' TGTCAAAACATCAGTRACCTCCAGTTGAGC 3'
Immunoglobulin Domain #5
Forward: 5' f3ATA3ACCTCCACCTATAATTCTACAAQGC 3'
Reverse: 5' (3ACTCTX3TCACATCCAC3CACTCiCACTCCAO 3'
Fibronectin Domain #1
Forward: 5' CAATCACiTAAAAACTATGATTTAA~ 3'
Reverse: 5' TCt3CTCTbACCATGAATAAC3TAC3ATT0 3'
Such primers or probes are at least 12, preferably at least 24, more preferably at least 36 and most preferably at least 96 bases in length. Demonstrating specific hybridization generally requires stringent conditions, for example, hybridizing in a buffer comprising 30% formamide in 5 x SSPE (0.18 M NaCl, 0.01 M NaPO4, pH 7.7, 0.001 M EDTA) buffer at a temperature of 42°C and remaining bound when subject to washing at 42°C with 0.2 x SSPE; preferably hybridizing in a buffer comprising 50% formamide in 5 x SSPE buffer at a temperature of 42°C and remaining bound when subject to washing at 42°C with 0.2 x SSPE buffer.
Robo nucleic acids can also be distinguished using alignment algorithms, such as BLASTX
(Altschul et al. (1990) Basic Local Alignment Search Tool, J Mol Biol 215, 403-410).
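To make the stringency conditions above concrete, the following sketch scales an SSPE recipe to the hybridization (5 x) and wash (0.2 x) strengths. It assumes that the parenthetical composition given in the text (0.18 M NaCl, 0.01 M NaPO4, 0.001 M EDTA) describes 1 x SSPE; that reading, and the names `SSPE_1X_MM` and `sspe_concentrations`, are assumptions made for illustration.

```python
# Assumed 1 x SSPE composition in millimolar, taken from the parenthetical
# in the text and read here as the 1 x recipe (an assumption).
SSPE_1X_MM = {"NaCl": 180.0, "NaPO4": 10.0, "EDTA": 1.0}


def sspe_concentrations(fold):
    """Return component concentrations (mM) for a `fold` x SSPE buffer."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: mm * fold for name, mm in SSPE_1X_MM.items()}


if __name__ == "__main__":
    for label, fold in (("hybridization, 5 x", 5.0), ("wash, 0.2 x", 0.2)):
        conc = sspe_concentrations(fold)
        print(label, {name: round(mm, 3) for name, mm in conc.items()})
```

Under this assumption the wash buffer carries 25-fold less salt than the hybridization buffer, which is what makes the wash the more stringent step at the same temperature.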
The subject nucleic acids are of synthetic/non-natural sequences and/or are isolated, i.e. unaccompanied by at least some of the material with which it is associated in its natural state, preferably constituting at least about 0.5%, preferably at least about 5% by weight of total nucleic acid present in a given fraction, and usually recombinant, meaning they comprise a non-natural sequence or a natural sequence joined to nucleotide(s) other than that which it is joined to on a natural chromosome. The subject recombinant nucleic acids comprising the nucleotide sequence of SEQ ID NO:1, 3, 5, 7, 9 or 11, or fragments thereof, contain such sequence or fragment at a terminus, immediately flanked by (i.e. contiguous with) a sequence other than that which it is joined to on a natural chromosome, or flanked by a native flanking region fewer than 10 kb, preferably fewer than 2 kb, more preferably fewer than 500 bp, which is at a terminus or is immediately flanked by a sequence other than that which it is joined to on a natural chromosome. While the nucleic acids are usually RNA or DNA, it is often advantageous to use nucleic acids comprising other bases or nucleotide analogs to provide modified stability, etc.
In a particular embodiment, expressed sequence tags EST; yu23d11, Accession #H77734 and EST; yq76e12, Accession #H52936, and deletion mutants thereof, are not within the scope of the present invention. In another embodiment, the subject Robo nucleic acids exclude the corresponding regions of the disclosed natural human Robo I nucleic acids, i.e. SEQ ID NO:7, nucleotides 500-551 and SEQ ID NO:7, nucleotides 3945-4455.
Table 10. Exemplary differences between H52936 and corresponding human Robo I sequences.
(1) At position 86, there is a T instead of an A. The new codon therefore reads TGA (Stop) instead of AGA (R).
(2) There is a missing G at position 286-7, causing a frameshift.
(3) There is an extra G at position 334, causing a frameshift.
(4) There is an extra T at position 344, causing a frameshift.
(5) There is an extra N at position 357, causing a frameshift.
(6) There is a T instead of a C at 362. The new codon reads TTT (F) instead of TCT (S).
(7) There is an extra T at position 364, causing a frameshift.
(8) There is an extra N at position 370, causing a frameshift and a changed amino acid (the codon TTN is ambiguous).
(9) There are two Ts at position 394 and 395 instead of a C, causing a frameshift and amino acid changes.
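The two kinds of difference listed in Table 10 can be checked mechanically. The sketch below, offered only as an illustration, applies a single-base substitution to a codon and reports the amino acid change, and classifies an insertion/deletion as frame-shifting when the net length change is not a multiple of three. The codon table is deliberately restricted to the codons named in Table 10, and `point_change` and `shifts_frame` are hypothetical helper names.

```python
# Minimal codon table covering only the codons named in Table 10
# ("*" denotes a stop codon).
CODON_TO_AA = {"AGA": "R", "TGA": "*", "TCT": "S", "TTT": "F"}


def point_change(codon, offset, new_base):
    """Substitute `new_base` at 0-based `offset` in `codon`.

    Returns (new_codon, old_amino_acid, new_amino_acid).
    """
    new_codon = codon[:offset] + new_base + codon[offset + 1:]
    return new_codon, CODON_TO_AA[codon], CODON_TO_AA[new_codon]


def shifts_frame(inserted, deleted):
    """True when the net change in length is not a multiple of three."""
    return (inserted - deleted) % 3 != 0


if __name__ == "__main__":
    print(point_change("AGA", 0, "T"))        # item (1): A to T at the first base
    print(point_change("TCT", 1, "T"))        # item (6): C to T at the second base
    print(shifts_frame(inserted=2, deleted=1))  # item (9): two Ts replace one C
```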
Table 11. Exemplary differences between H52937 (reverse sequence) and corresponding human Robo I sequences.
(1) There are multiple errors in the first 30 bases.
(2) At position 63, a G replaces an A. The new codon CGG codes for R instead of CAG for Q.
(3) The EST ends by joining to part of the human glycophorin B gene (353-442).
The subject nucleic acids find a wide variety of applications including use as translatable transcripts, hybridization probes, PCR primers, diagnostic nucleic acids, etc.; use in detecting the presence of Robo genes and gene transcripts and in detecting or amplifying nucleic acids encoding additional Robo homologs and structural analogs. In diagnosis, Robo hybridization probes find use in identifying wild-type and mutant Robo alleles in clinical and laboratory samples. Mutant alleles are used to generate allele-specific oligonucleotide (ASO) probes for high-throughput clinical diagnoses. In therapy, therapeutic Robo nucleic acids are used to modulate cellular expression or intracellular concentration or availability of active Robo.
The invention provides efficient methods of identifying agents, compounds or lead compounds for agents active at the level of a Robo modulatable cellular function. Generally, these screening methods involve assaying for compounds which modulate Robo interaction with a natural Robo binding target. A wide variety of assays for binding agents are provided including labeled in vitro protein-protein binding assays, immunoassays, cell based assays, etc. The methods are amenable to automated, cost-effective high throughput screening of chemical libraries for lead compounds. Identified reagents find use in the pharmaceutical industries for animal and human trials; for example, the reagents may be derivatized and rescreened in in vitro and in vivo assays to optimize activity and minimize toxicity for pharmaceutical development.
Cell and animal based neural guidance/repulsion assays are described in detail in the experimental section below. In vitro binding assays employ a mixture of components including a Robo polypeptide, which may be part of a fusion product with another peptide or polypeptide, e.g. a tag for detection or anchoring, etc. The assay mixtures comprise a natural intracellular Robo binding target. While native full-length binding targets may be used, it is frequently preferred to use portions (e.g. peptides) thereof so long as the portion provides binding affinity and avidity to the subject Robo polypeptide conveniently measurable in the assay. The assay mixture also comprises a candidate pharmacological agent.
Candidate agents encompass numerous chemical classes, though typically they are organic compounds, preferably small organic compounds, and are obtained from a wide variety of sources including libraries of synthetic or natural compounds. A variety of other reagents may also be included in the mixture. These include reagents like salts, buffers, neutral proteins (e.g. albumin), detergents, protease inhibitors, nuclease inhibitors, antimicrobial agents, etc.
The resultant mixture is incubated under conditions whereby, but for the presence of the candidate pharmacological agent, the Robo polypeptide specifically binds the cellular binding target, portion or analog with a reference binding affinity. The mixture components can be added in any order that provides for the requisite bindings and incubations may be performed at any temperature which facilitates optimal binding. Incubation periods are likewise selected for optimal binding but also minimized to facilitate rapid, high-throughput screening.
After incubation, the agent-biased binding between the Robo polypeptide and one or more binding targets is detected by any convenient way. Where at least one of the Robo or binding target polypeptide comprises a label, the label may provide for direct detection as radioactivity, luminescence, optical or electron density, etc. or indirect detection such as an epitope tag, etc. A variety of methods may be used to detect the label depending on the nature of the label and other assay components, e.g. through optical or electron density, radiative emissions, nonradiative energy transfers, etc. or indirectly detected with antibody conjugates, etc.
A difference in the binding affinity of the Robo polypeptide to the target in the absence of the agent as compared with the binding affinity in the presence of the agent indicates that the agent modulates the binding of the Robo polypeptide to the Robo binding target. For example, in the cell-based assay also described below, a difference in Robo-dependent modulation of axon outgrowth or orientation in the presence and absence of an agent indicates the agent modulates Robo function. A difference, as used herein, is statistically significant and preferably represents at least a 50%, more preferably at least a 90%
difference.
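The magnitude criterion in the preceding paragraph can be sketched as a simple percent-difference check between a reference binding signal (no agent) and the signal with the agent present. The signal values and the function names below are hypothetical, and the sketch covers only the 50% / 90% thresholds; the statistical significance that the text also requires would have to be established separately from replicate measurements.

```python
def percent_difference(reference, with_agent):
    """Absolute change in binding signal, as a percentage of the reference."""
    if reference <= 0:
        raise ValueError("reference signal must be positive")
    return abs(with_agent - reference) * 100.0 / reference


def meets_threshold(reference, with_agent, threshold_pct=50.0):
    """True when the change is at least `threshold_pct` percent of reference."""
    return percent_difference(reference, with_agent) >= threshold_pct


if __name__ == "__main__":
    # Hypothetical signals in arbitrary units, not measured data.
    print(percent_difference(100.0, 40.0))
    print(meets_threshold(100.0, 40.0))
    print(meets_threshold(100.0, 40.0, threshold_pct=90.0))
```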
The following experimental section and examples are offered by way of illustration and not by way of limitation.
EXPERIMENTAL
Cloning of the roundabout Gene. The robo1 allele was mapped to the plexus-brown interval on the right arm of the second chromosome by recombination mapping;
the numbers of recombinants suggested a map position very close to plexus at 58F/59A. One deficiency [Df(2R)P, which deletes 58E3/F1 through 60D14/E2] fails to complement robo mutations, two other deficiencies [Df(2R)59AB and Df(2R)59AD, which delete 59A1/3 through 59B1/2 and 59A1/3 through 59D1/4 respectively] do complement robo, and a duplication [Dp(2;Y)bw+Y, which duplicates 58F1/59A2 through 60E3/F1] rescues robo mutations. This mapping places robo in the 58F/59A region.

We initiated chromosomal walks from P1 clones mapped to the region, beginning from the distal side using clone DS02204 and from the proximal side using clone DS05609.
We used cosmid clones (Tamkun et al., 1992) to complete a walk of 150 kb. We then looked for RFLPs in the recombinants between the multiple marked chromosome and the robo mutant chromosome. A 6.8 kb EcoRI fragment from cosmid 106-5 identified a HindII RFLP on the mapping chromosome that was present on a single robo mutant recombinant line. This fragment identified a proximal limit for the location of robo. Further deficiencies in this region were then tested (Kerrebrock et al., 1995). Of these deficiencies, Df(2R)X58-5 and Df(2R)X58-12 remove robo while Df(2R)X58-1 does not. Df(2R)X58-12 fails to complement Df(2R)59AB yet complements Df(2R)59AD indicating that Df(2R)59AB extends further proximal; this proximal endpoint provides a distal limit for the location of robo. Probes from the walk were used to identify the breakpoints of these deficiencies (Figure 1A). Df(2R)X58-1 breaks in a 9.6 kb EcoRI/BamHI fragment within cosmid GJ12, whereas Df(2R)59AB breaks in an 8 kb BamHI/EcoRI fragment within cosmid 106-1435. This reduces the location of robo to a 75 kb region bounded by these restriction fragments. Hybridization of 0-16 hr poly-A+ embryonic Northern blots with cosmids GJ12, 106-12, and 106-1435 revealed at least five transcripts. Reverse Northern mapping identified the regions containing these transcripts (Figure 1A). These regions were used as probes to isolate cDNAs. Seven different cDNAs were isolated and analyzed by in situ hybridization. The expression pattern of five of these transcripts allowed us to tentatively discount them as encoding robo since they were not expressed in the embryonic CNS at the appropriate stage. Of the two cDNAs remaining, 12-1 appeared by its size and expression the most likely candidate for robo. A 16 kb XbaI fragment including the 12-1 transcript and a region 5' to the transcript is capable of rescuing the robo mutant.
roundabout Encodes a Member of the Immunoglobulin Superfamily. We recovered and sequenced overlapping cDNA clones corresponding to the 12-1 transcription unit. A single long open reading frame (ORF) that encodes 1395 amino acids was identified (D1 in Table 1). Conceptual translation of the ORF reveals the Robo protein to be a member of the Ig superfamily; Robo's ectodomain contains five immunoglobulin (Ig)-like repeats followed by three fibronectin (Fn) type-III repeats. The predicted ORF also contains a transmembrane domain and a large 457 amino acid (a.a.) cytoplasmic domain. Hydropathy analysis of the Robo sequence indicates a single membrane spanning domain of 25 a.a. (Kyte and Doolittle, 1982) plus a signal sequence with a predicted cleavage site between G51 and Q52 (Nielsen et al., 1997).
We identify the 12-1 transcript as encoding robo based on several criteria. First, the embryonic robo phenotype can be rescued by the 16 kb XbaI genomic fragment containing this cDNA; no other transcripts are contained in this 16 kb XbaI fragment.
Second, we identified a CfoI RFLP associated with the allele robo6. This polymorphism is due to a change of nucleotide 332 of the ORF from G to A, which results in a change of Gly111 to Asp. Gly111 is in the first Ig domain (Figure 2), and is conserved in all Robo homologues identified. The change is specific to the allele robo6 and is not seen in the parental chromosome or in any of the other seven alleles, all of which were generated from the same parental genotype. Third, the production of antibodies (below) which recognize the Robo protein reveals that the alleles robo1, robo2, robo3, robo4 and robo5 do not produce Robo protein (Table 12).
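The correspondence between nucleotide 332 of the ORF and residue 111 follows from simple codon arithmetic, sketched below. The sketch assumes 1-based numbering from the first base of the start codon; `codon_position` and `gly_codons_giving_asp` are hypothetical helper names. It also lists which glycine codons become aspartate codons when their middle G is replaced by A.

```python
GLY_CODONS = {"GGT", "GGC", "GGA", "GGG"}
ASP_CODONS = {"GAT", "GAC"}


def codon_position(nt_pos):
    """Map a 1-based ORF nucleotide position to
    (1-based codon number, 1-based position within that codon)."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def gly_codons_giving_asp():
    """Glycine codons that encode aspartate after a G-to-A change
    at the second codon position."""
    return sorted(c for c in GLY_CODONS if c[0] + "A" + c[2] in ASP_CODONS)


if __name__ == "__main__":
    print(codon_position(332))       # codon number and position within the codon
    print(gly_codons_giving_asp())   # glycine codons compatible with Gly-to-Asp
```

Nucleotide 332 falls at the second position of codon 111, and a G-to-A change there converts GGN to GAN, so the reported Gly-to-Asp change is consistent with a wild-type codon of GGT or GGC.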
Table 12. robo Mutant Alleles
Allele    Synonym    Class
robo1     GA285      Protein null
robo2     GA1112     Protein null
robo3     Z14        Protein null
robo4     Z570       Protein null
robo5     Z1772      Protein null
robo6     Z1757      Protein positive; Gly111 to Asp
robo7     Z2130      Reduced protein levels
robo8     Z3127      Protein positive
All alleles were generated by EMS mutagenesis of FasIII null chromosomes.
Each of these alleles appears to represent a complete, or near complete, loss-of-function phenotype for robo, since the mutant phenotype observed when these alleles are placed over a chromosome deficient for the robo locus [Df(2R)X58-5] is indistinguishable from the homozygous allele.
Finally, transgenic neural expression of robo rescues the midline crossing phenotype of robo mutants (see below).
Developmental Northern blot analysis using both cDNA and genomic probes suggests that robo is encoded by a single transcript of 7500 bp. We sequenced genomic DNA and identified 17 introns within the sequence, of which 14 are only 50-75 bp in length, plus three introns of 843 bp, 236 bp, and 110 bp (Figure 1B). The precise start point of the transcript has not been determined.
A Family of Evolutionarily Conserved Robo-like Proteins. The presence of five Ig and three Fn domains, a transmembrane domain, and a long (452 a.a.) cytoplasmic region indicates that Robo may be a receptor and signaling molecule. The netrin receptor DCC/Frazzled/UNC-40 has a related domain structure, with 6 Ig and 4 Fn domains and a similarly long cytoplasmic region (Keino-Masu et al., 1996; Chan et al., 1996;
Kolodziej et al.,1996). The only currently known protein with a "5 + 3" organization is CDO
(Kang et al., 1997). However, CDO is only distantly related to Robo (15-33% a.a. identity between corresponding Ig and FN domains).
We identified other "5 + 3" proteins in vertebrates whose amino acid identity exceeds that of CDO and represent Robo homologues. A human expressed sequence tag (EST; yu23d11, Accession #H77734) shows high homology to the second Ig domain of robo and was used to probe a human fetal brain cDNA library (Stratagene). The clones recovered correspond to a human gene with five Ig and three Fn domains (Figure 2).
Exemplary functional Robo domains are listed in Tables 13-17 (the corresponding encoding nucleic acids are readily discernable from the corresponding nucleic acid sequences of Sequence Listing).
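As a sketch of how an encoding nucleic acid segment can be located from an amino acid range, the helper below converts 1-based residue positions into 1-based nucleotide positions. The coordinates it returns are relative to the first base of the start codon of the ORF, which is an assumption of this example; numbering within a SEQ ID NO that includes 5' untranslated sequence would require adding the appropriate offset. `aa_range_to_nt_range` is a hypothetical name, and the example range (residues 68-167) is the first immunoglobulin domain of human Robo 1 as listed in Table 13.

```python
def aa_range_to_nt_range(first_aa, last_aa):
    """Convert a 1-based, inclusive amino acid range to the 1-based,
    inclusive nucleotide range of its codons, relative to the ORF start."""
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    return (first_aa - 1) * 3 + 1, last_aa * 3


if __name__ == "__main__":
    start, end = aa_range_to_nt_range(68, 167)
    # Length in nucleotides is three times the number of residues.
    print(start, end, end - start + 1)
```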
Table 13. Exemplary domains of human Robo 1, by amino acid sequence positions
Signal sequence: 6-21
First Immunoglobulin domain: 68-167
Second Immunoglobulin domain: 168-258
Third Immunoglobulin domain: 259-350
Fourth Immunoglobulin domain: 351-450
Fifth Immunoglobulin domain: 451-546
First Fibronectin domain: 547-644
Second Fibronectin domain: 645-761
Third Fibronectin domain: 762-862
Transmembrane domain: 896-917
Cytoplasmic motif #1: 1070-1079
Cytoplasmic motif #2: 1181-1195
Cytoplasmic motif #3: 1481-1488

Table 14. Exemplary domains of human Robo II, by amino acid sequence positions
Fourth Immunoglobulin domain: 1-91
Fifth Immunoglobulin domain: 92-185
First Fibronectin domain: 186-282

Table 15. Exemplary domains of Drosophila Robo 1, by amino acid sequence positions
Signal sequence: 30-46
First Immunoglobulin domain: 56-152
Second Immunoglobulin domain: 153-251
Third Immunoglobulin domain: 252-344
Fourth Immunoglobulin domain: 345-440
Fifth Immunoglobulin domain: 441-535
First Fibronectin domain: 536-635
Second Fibronectin domain: 636-753
Third Fibronectin domain: 754-854
Transmembrane domain: 915-938
Cytoplasmic motif #1: 1037-1046
Cytoplasmic motif #2: 1098-1119
Cytoplasmic motif #3: 1262-1269

Table 16. Exemplary domains of Drosophila Robo II, by amino acid sequence positions
Immunoglobulin domain #1: 4-99
Immunoglobulin domain #2: 100-192
Immunoglobulin domain #3: 193-296
Immunoglobulin domain #4: 297-396
Immunoglobulin domain #5: 397-494
Fibronectin domain #1: 495-595
Fibronectin domain #2: 596-770
Fibronectin domain #3: 771-877
Transmembrane domain: 906-929
Conserved cytoplasmic motif #1: 1075-1084

Table 17. Exemplary domains of C. elegans Robo 1, by amino acid sequence positions
First Immunoglobulin domain: 30-129
Second Immunoglobulin domain: 130-223
Third Immunoglobulin domain: 224-315
Fourth Immunoglobulin domain: 316-453
Fifth Immunoglobulin domain: 454-543
First Fibronectin domain: 544-643
Second Fibronectin domain: 644-766
Third Fibronectin domain: 767-865
Transmembrane domain: 900-922
Cytoplasmic motif #1: 1036-1045
Cytoplasmic motif #2: 1153-1163
Cytoplasmic motif #3: 1065-1074

The homology is particularly high in the first two Ig domains (58% and 48% a.a. identity respectively, compared to 26% and 30% for the same two Ig domains between D-Robo1 and CDO) and together with the overall identity throughout the extracellular region and the presence of three conserved cytoplasmic motifs has led us to designate this as the human roundabout 1 gene (H-robo1). Database searching reveals a nucleotide sequence corresponding to H-robo1 in the database, DUTT1, which differs in the signal sequence suggesting alternative splicing, a 9 bp insertion and some single base pair changes. Five ESTs (see Experimental Procedures) show high sequence similarity to the cytoplasmic domain of H-robo1. Sequencing of cDNAs isolated using one of these ESTs as a probe confirmed a second human roundabout gene (H-robo2).
Degenerate PCR primers based on conserved sequences between H-robo1 and D-robo1 were used to isolate a PCR fragment from a rat embryonic E13 brain cDNA library. The fragment was used to probe an E13 spinal cord cDNA library, resulting in the isolation of a full length rat robo gene (R-robo1). The predicted protein shows high sequence identity (>95%) with H-robo1 over the entire length. The 5' sequences of different R-robo1 cDNA clones indicate that this gene is alternatively spliced in a similar fashion to H-robo1/DUTT1. We used a similar approach to isolate cDNA clones for R-robo2, which is highly homologous to H-robo2.
The mouse EST vi92e02 is highly homologous to the cytoplasmic portion of H-robo1. The C. elegans sax-3 gene is also a robo homologue (Table 1; Zallen et al., 1997). A second Drosophila robo gene (D-robo2) is also predicted from analysis of genomic sequence in the public database. Taken together these data indicate that Robo is the founding member of a new subfamily of Ig superfamily proteins with at least one member in nematode, two in Drosophila, two in rat, and two in human.
The alignment of the Robo family proteins reveals that the first and second Ig domains are the most highly conserved portion of the extracellular domain. The cytoplasmic domains are highly divergent except for the presence of three highly conserved motifs (Table 18).
Table 18. Conserved Cytoplasmic Motifs: Amino acid alignments of the three conserved cytoplasmic motifs are shown below the structure; in C.elegans ratio, motifs #2 and #3 have been switched to provide a better alignment.
Conserved Cytoplasmic Motif #1
PDNPTPYATTMIIGTSS          1050  Drosophila roundabout-I
SGQPTPYATTQLIQSNL          1083  Human roundabout-I
NASPAPYATSSILSPHQ          1088  Drosophila roundabout-II
HDDPSPYATTTLVLSNQ          1049  C. elegans roundabout
PtPYATT.hh....                   Consensus (where h is I, L or V)

Conserved Cytoplasmic Motif #2
INWSE.FLPPPPEHPPPSSTYG.Y   1119  Drosophila roundabout-I
MNWAD.LLPPPPAHPPPHSNSEEY   1202  Human roundabout-I
STWANVPLPPPPVQPLPGTELEHY     31  Human roundabout-II
KTLN)D.FIPPPPSNPPPP.GGHVY  1168  C. elegans roundabout-I
nW...hhPPPP. PPP.s....Y          Consensus (where h is hydrophobic)

Conserved Cytoplasmic Motif #3
PSPMQPPPPVPVPEGW.Y         1273  Drosophila roundabout-I
YTDDLPPPPVPPPAIKSP         1493  Human roundabout-I
YADDLPPPPVPPPAIKSP           90  Mouse roundabout-I
RAPAMPTNPVPPBPpARY         1077  C. elegans roundabout
.....PPPPVPPP....                Consensus

The consensus for the first motif is PtPYATTxhh, where x is any amino acid and h is I, L, or V. The presence of a tyrosine in the center of the motif indicates a site for phosphorylation.
The other two motifs consist of runs of prolines separated by one or two amino acids and are reminiscent of binding sites for SH3 domains. In particular, the LPPP sequence in motif #2 provides a good binding site for the Drosophila Enabled protein or its mammalian homologue Mena (Niebuhr et al., 1997). All three of these conserved sites can function as binding sites for domains (e.g. SH3 domains) of linker/adapter proteins functioning in Robo-mediated signal transduction.
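The motif #1 consensus described above can be illustrated with a short pattern scan. The following is only a sketch: the four sequences are the motif #1 rows as printed in Table 18, while the two regular expressions are assumptions made for illustration. `STRICT` reads the stated consensus PtPYATTxhh literally; `RELAXED` is a hypothetical loosening (any residue at the lowercase position, T or S at the seventh position) chosen so that all four printed rows are accepted.

```python
import re

# Motif #1 rows copied from Table 18 (labels as printed there).
MOTIF1_ROWS = {
    "Drosophila roundabout-I": "PDNPTPYATTMIIGTSS",
    "Human roundabout-I": "SGQPTPYATTQLIQSNL",
    "Drosophila roundabout-II": "NASPAPYATSSILSPHQ",
    "C. elegans roundabout": "HDDPSPYATTTLVLSNQ",
}

# Literal reading of the consensus PtPYATTxhh (x = any residue, h = I, L or V).
STRICT = re.compile(r"PTPYATT.[ILV]{2}")

# Hypothetical relaxed form (an assumption, not taken from the text):
# any residue at position 2, T or S at position 7.
RELAXED = re.compile(r"P.PYAT[TS].[ILV]{2}")


def scan(pattern, rows):
    """Return {label: (start_index, matched_text)} for rows containing the pattern."""
    hits = {}
    for label, seq in rows.items():
        match = pattern.search(seq)
        if match:
            hits[label] = (match.start(), match.group())
    return hits


strict_hits = scan(STRICT, MOTIF1_ROWS)
relaxed_hits = scan(RELAXED, MOTIF1_ROWS)

for label, (start, text) in relaxed_hits.items():
    # The conserved tyrosine is the fourth residue of each match.
    print(f"{label}: {text} (tyrosine at row offset {start + 3})")
```

Under these assumptions the literal pattern accepts only the two roundabout-I rows, which is why a looser pattern would be needed to recover the Drosophila roundabout-II and C. elegans rows as well.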
Robo is Regionally Expressed on Longitudinal Axons in the Drosophila Embryo.
In order to determine the role that robo might play in regulating axon crossing behavior, we examined the robo expression pattern in the embryonic CNS. The in situ hybridization pattern of robo mRNA in Drosophila shows it to have elevated and widespread expression in the CNS. We raised a monoclonal antibody (MAb 13C9) against part of the extracellular portion (amino acids 404-725) of the protein to visualize Robo expression. Robo is first seen in the embryo weakly expressed in lateral stripes during germband extension.
At the onset of germband retraction, Robo expression is observed in the neuroectoderm. By the end of stage 12, as the growth cones first extend, Robo is seen on growth cones which project ipsilaterally, including pCC, aCC, MP1, dMP2, and vMP2. Strikingly, little or no Robo expression is observed on commissural growth cones as they extend towards and across the midline.
However, as these growth cones turn to project longitudinally, their level of Robo expression dramatically increases. Robo is expressed at high levels on all longitudinally-projecting growth cones and axons. In contrast, Robo is expressed at nearly undetectable levels on commissural axons. This is striking since ~90% of axons in the longitudinal tracts also have axon segments crossing in one of the commissures. Thus, Robo expression is regionally restricted. Robo expression is also seen at a low level throughout the epidermis and at a higher level at muscle attachment sites. In stage 16-17 embryos, faint Robo staining can be seen in the commissures but at levels much lower than observed in the longitudinal tracts.
Immunoelectron Microscopy of Robo. We used immunoelectron microscopy to examine Robo localization at higher resolution. In stage 13 embryos, Robo is expressed at higher levels on growth cones and filopodia in the longitudinal tracts than on the longitudinal axons themselves. This localization is consistent with the model that Robo functions as a guidance receptor. The increased sensitivity of immunoelectron microscopy reveals the presence of very low levels of Robo protein on the surface of commissural axons. In addition, Robo-positive vesicles can be seen inside the commissural axons, possibly representing transport of Robo to the growth cone. Finally, by reconstructing the path of single axons by use of serial sections, we confirm that Robo expression is greatly up-regulated after individual axons turn from the commissure into a longitudinal tract. The expression of Robo on non-crossing and post-crossing axons, and its higher level of expression on growth cones and their filopodia, support a model in which Robo functions as an axon guidance receptor for a repulsive midline cue.
Transgenic Expression of Robo. We hypothesized that if Robo is indeed a growth cone receptor for a midline repellent, then pan-neural expression of Robo protein during the early stages of axon outgrowth might lead to a robo gain-of-function phenotype similar to the comm loss-of-function and opposite of the robo loss-of-function. To test this hypothesis, we cloned a robo cDNA containing the complete ORF but lacking most of its untranslated regions (UTRs) downstream of the UAS promoter in the pUAST vector and generated transgenic flies for use in the GAL4 system (Brand and Perrimon, 1993).
Expression of robo in all neurons was achieved by crossing the UAS-robo flies to either the elav-GAL4 or scabrous-GAL4 lines.
Surprisingly, pan-neural expression of robo mRNA did not produce a strong axon scaffold phenotype as assayed with MAb BP102. Staining with anti-Fas II (MAb 1D4) revealed subtle fasciculation defects, but overall the axon scaffold looked quite normal. An insight into why we failed to observe a stronger robo ectopic expression phenotype was provided by staining these embryos with the anti-Robo MAb. Interestingly, the Robo protein, although expressed at higher levels than in wild type, remains restricted as in wild type, i.e., high levels of expression on the longitudinal portions of axons and very low levels on the commissures. This result indicates that there must be strong regulation of Robo expression, probably post-translational, that assures its localization to longitudinal axon segments. Such a mechanism could operate by the regulation of protein translation, transport, insertion, internalization and/or stability.
We used these transgenic flies to rescue robo mutants. Expression of robo by the elav-GAL4 line in both robo3 and robo5 homozygotes rescued the midline crossing of Fas II-positive axons including pCC and other identified neurons.
Robo Appears to Function in a Cell Autonomous Fashion. To test whether Robo can function in a cell autonomous fashion, we used the UAS-robo transgene with the ftzng-GAL4 line (Lin et al., 1994). The ftzng-GAL4 line expresses in a subset of CNS neurons, including many of the earliest neurons to be affected by the robo mutation such as pCC, vMP2, dMP2, and MP1. Expression of robo by the ftzng-GAL4 line is sufficient to rescue these identified neurons in the robo mutant: pCC, which in robo mutants heads towards and crosses the midline, in these rescued embryos now projects ipsilaterally and does not cross the midline.
When the same embryos were stained with the anti-Robo MAb 13C9, we observed that none of the Robo-positive axons crossed the midline. The ftzng-GAL4 line drives expression in many of the axons in the pCC pathway (Lin et al., 1994), a medial longitudinal fascicle. In robo mutants, this axon fascicle freely crosses and circles the midline, joining with its contralateral pathway. When rescued by the ftzng-GAL4 line driving UAS-robo, this pathway now largely remains on its own side of the midline, even though occasionally a few axons cross the midline. These experiments support the notion that Robo can function in a cell autonomous fashion.
Expression of Mammalian robo1 in the Rat Spinal Cord. The isolation of several vertebrate Robo homologues suggests that Robo may play a similar role in orchestrating midline crossing in the vertebrate nervous system as it does in Drosophila. In the vertebrate spinal cord, the ventral midline is comprised of a unique group of cells called the floor plate (for review, Colamarino and Tessier-Lavigne, 1995). As in the Drosophila nervous system, the vertebrate spinal cord contains both crossing and non-crossing axons.
Spinal commissural neurons are born in the dorsal half of the spinal cord; commissural axons project to and cross the floor plate before turning longitudinally in a rostral direction. In contrast, the axons of two other classes of neurons, dorsal association neurons and ventral motor neurons, do not cross the floor plate (Altman and Bayer, 1984).
To address the possibility that Robo may play a role in organizing the projections of these spinal neurons, we examined the expression of rat robo1 by RNA in situ hybridization.
A rat robo1 riboprobe spanning the first three Ig domains was hybridized to transverse sections of E13 rat spinal cord. At E13, when many commissural axons will have already extended across the floor plate (Altman and Bayer, 1984), rat robo1 is expressed at high levels in the dorsal spinal cord, in a pattern corresponding to the cell bodies of commissural neurons. Rat robo1 is also expressed at lower levels in a subpopulation of ventral cells in the region of the developing motor column. Interestingly, this expression pattern is similar to and overlaps partly with the mRNA encoding DCC, another Ig superfamily member which is also expressed on commissural and motor neurons and encodes a receptor for Netrin-1 (Keino-Masu et al., 1996). Rat robo1 is not, however, expressed in either the floor plate or the roof plate of the spinal cord or in the dorsal root ganglia. This is in contrast to rat cdo, which is strongly expressed in the roof plate (KB, MT-L, and R. Krauss).
In the periphery, rat robo1 is also found to be expressed in the myotome and developing limb, in a pattern reminiscent of c-met (Ebens et al., 1996), indicating that rat robo1 may also be expressed by migrating muscle precursor cells. Therefore, like its Drosophila homologue, rat robo1 RNA is expressed by both crossing and non-crossing populations of axons, indicating that it encodes the functional equivalent of D-Robo1.
Genetic Stocks. All eight independent robo alleles were isolated on chromosomes deficient for Fasciclin III as described in Seeger et al., 1993. Subsequent use of a duplication that includes FasIII, and recombination of the robo chromosomes, indicates that the robo phenotype is independent of the absence of FasIII. Deficiencies were obtained from the Drosophila stock center at Bloomington, Indiana.
Cloning and Molecular Analysis of the robo Genes. Start points for a molecular walk to robo were obtained from the Berkeley and Crete Drosophila Genome Projects.
Chromosomal walking was performed using standard techniques to isolate cosmids from the Tamkun library (Tamkun et al., 1992).
cDNAs were isolated from the Zinn 9-12 hour Drosophila embryo λgt11 library (Zinn et al., 1988), and from a human fetal brain library (Stratagene). Northern blots of poly-A+ RNA and reverse Northern blots were hybridized using sensitive Church conditions.
Sequencing of the cDNAs and genomic subclones was performed by the dideoxynucleotide chain termination method using Sequenase™ (USB) following the manufacturer's protocol and with the AutoRead™ kit or AutoCycle™ kit (Pharmacia) or by 33P cycle sequencing.
Reactions were analyzed on Pharmacia LKB or ABI automated laser fluorescent DNA sequencers, respectively. The cDNAs were sequenced completely on both strands. Sequence contigs were compiled using Lasergene™, Intelligenetics™, and AssemblyLIGN™ software (Kodak Eastman). Database searches were performed using BLAST (Altschul et al., 1990).
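Once a cDNA has been sequenced, its open reading frame can be translated conceptually and compared with the conserved motifs. The sketch below is illustrative only: it uses the standard genetic code and two short stretches that are legible in SEQ ID NO:1 (the nine bases ATGCATCCC that open the coding sequence, and 36 bases from the row numbered 3840). The function and variable names are hypothetical and are not part of the software named above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string in frame 0, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Legible stretches of SEQ ID NO:1 (spaces between 10-mers removed).
ORF_START = "ATGCATCCC"
MOTIF3_REGION = "CAACCCCCACCGCCAGTTCCCGTACCCGAGGGCTGG"

print(translate(ORF_START))      # first three residues of the D-Robo1 ORF
print(translate(MOTIF3_REGION))  # overlaps conserved cytoplasmic motif #3
```

The second translation corresponds to the QPPPPVPVPEGW stretch of the Drosophila roundabout-I row of conserved motif #3.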
A full-length D-robo1 cDNA was generated by ligating two partial cDNAs at an internal HpaI site and subcloning into the EcoRI site of pBluescript.SK+. A full-length H-robo1 cDNA was synthesized by ligating an XbaI-SalI fragment from a cDNA and a PCR product coding for the carboxy-terminal 222 amino acids at a SalI site. The PCR product has an EcoRI site introduced at the stop codon. The ligation product was cloned into pBluescript.SK+ digested with XbaI and EcoRI.
To clone the rat robo1 cDNA, degenerate oligonucleotide primers designed against sequences conserved between the 5' ends of D-Robo1 and H-Robo1 were used to amplify a 500 bp fragment from an E13 rat brain cDNA by PCR. This fragment was used to screen an E13 spinal cord library at high stringency, resulting in the isolation of a 4.2 kb cDNA clone comprising all but the last 700 nucleotides. Subsequent screenings of the library with non-overlapping probes from this cDNA led to the isolation of 4 partial and 7 full-length clones.
To clone the rat robo2 cDNA, we screened the same library with a fragment of the H-robo2 cDNA.
Expressed Sequence Tag and Genomic Sequences. The ESTs yu23d11 (#H77734), zr54g12 (#AA236414) and yq76e12 (#H52936, #H52937) code for portions of H-Robo1. The EST yq76e12 is aberrantly spliced to part of the human glycophorin B gene. Five ESTs yn50a07, yg02b06, yg17b06, yn13a04 and ym17g11 code for part of H-robo2. The Drosophila P1 clone DS00329 encodes the genomic sequence of D-robo2. Sequences 1825710 and 1825711 (both: #U88183; locus ZK377) code for the predicted sequence of C. elegans robo. The EST vi62e02 (#AA499193) codes for mouse robo1.
Identification of Molecular Defects in robo Alleles. Southern blots of robo alleles and their parental chromosomes were hybridized with fragments from the genomic cosmid clone 106-1435 or partial cDNA clones to identify restriction fragment length polymorphisms affecting the robo transcription unit. DNA was obtained from homozygous mutant embryos.
Thirty-five cycles of PCR were subsequently performed on the DNA obtained from half an embryo.
The primers used, specific for the region flanking the CfoI polymorphism, were:
ROBO6 (5'-GCATTGGGTCATCTGTAGAG-3') and ROBO23 (5'-AGCTATCTGGAGGGAGGCAT-3'). The PCR products were purified on a Pharmacia H300 spin column and sequenced directly.
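As a quick check on the two primers named above, their basic properties can be computed directly from the printed sequences. This is a minimal sketch, not part of the described protocol: the helper names are hypothetical, the melting temperature uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C) as an assumed approximation, and the cDNA fragment is a legible stretch from the row of SEQ ID NO:1 numbered 480.

```python
# Primer sequences as printed in the text (5' -> 3').
ROBO6 = "GCATTGGGTCATCTGTAGAG"
ROBO23 = "AGCTATCTGGAGGGAGGCAT"

# Legible stretch of SEQ ID NO:1 from the row numbered 480 (spaces removed).
CDNA_FRAGMENT = "GCCATGCCTCCCTCCAGATAGCTGTT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, primer in (("ROBO6", ROBO6), ("ROBO23", ROBO23)):
    print(name, len(primer), gc_fraction(primer), wallace_tm(primer))

# ROBO23 reads antisense relative to the cDNA: its reverse complement
# is found in the sense-strand fragment.
print(reverse_complement(ROBO23) in CDNA_FRAGMENT)
```

The last line shows that ROBO23 is complementary to the sense strand of the D-robo1 cDNA in this region, so it would prime on the opposite strand to the listed sequence.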
Transformation of Drosophila, robo Rescue, and Overexpression. The 16 kb XbaI fragment from cosmid 106-1435 was cloned into the Drosophila transformation vector pCaSpeR3.
Transformant lines were generated and mapped by standard procedures. Four independent lines were shown to rescue robo1,3,5 alleles as judged by MAb 1D4 staining.
PCR amplification of the D-robo ORF using the primers (5'-GAGTGGTGAATTCAACAGCACCAAAACCACAAAATGCATCCC-3') and (5'-CGGCrGAGTCTAGAACACTTCATCCTTAGGTG-3') produced a PCR product with an altered ribosome binding site that more closely matches the Drosophila consensus (Cavener, 1987), and has only 21 bp of 5' UTR and no 3' UTR sequences. The PCR product was digested with EcoRI and XbaI and cloned into pBluescript (Stratagene) and subsequently, pUAST (Brand and Perrimon, 1993).
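The cloning sites built into these two primers can be located with a simple string search. The sketch below is an illustration under stated assumptions: the forward primer is read as a continuous 42-nt string, and because the first bases of the reverse primer are not fully legible in the source, only its legible 3' portion is used. The names `find_sites`, `FORWARD` and `REVERSE_LEGIBLE` are hypothetical.

```python
# Forward primer read as a continuous string (5' -> 3').
FORWARD = "GAGTGGTGAATTCAACAGCACCAAAACCACAAAATGCATCCC"

# Legible 3' portion of the reverse primer only (5' -> 3').
REVERSE_LEGIBLE = "GAGTCTAGAACACTTCATCCTTAGGTG"

# Recognition sequences of the two enzymes used for cloning.
SITES = {"EcoRI": "GAATTC", "XbaI": "TCTAGA"}


def find_sites(seq, sites):
    """Return {enzyme: [0-based start positions]} for each recognition site."""
    return {
        name: [i for i in range(len(seq) - len(site) + 1)
               if seq.startswith(site, i)]
        for name, site in sites.items()
    }


forward_sites = find_sites(FORWARD, SITES)
reverse_sites = find_sites(REVERSE_LEGIBLE, SITES)

# Position of the start codon and the four bases immediately upstream,
# i.e. the region altered to resemble the Drosophila consensus.
atg_index = FORWARD.find("ATG")
upstream4 = FORWARD[atg_index - 4:atg_index]

print(forward_sites, reverse_sites, atg_index, upstream4)
```

Under these assumptions the forward primer carries the EcoRI site and the reverse primer the XbaI site, consistent with the EcoRI/XbaI digestion described above.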
Transformant lines were crossed to elav-GAL4 and sca-GAL4 lines which express GAL4 in all neurons, or ftzng-GAL4 which expresses in a subset of CNS neurons (Lin et al., 1994).
Embryos were assayed by staining with MAbs BP102, 1D4 and 13C9. For ectopic expression in the robo mutant background, the stocks robo3 and robo5 (both protein nulls) were used. Crosses utilized the stocks w; robo/CyO; UAS-robo and w; robo/CyO; elav-GAL4. Due to the difficulty of maintaining a balanced stock, robo/+; ftzng-GAL4/+ males were generated as required.
Generation of Fusion Proteins and Antibodies. A six histidine tagged fusion protein was constructed by cloning amino acids 404-725 of the D-robo protein into the PstI
site of the pQE31 vector (Qiagen). Fusion proteins were purified under denaturing conditions and subsequently dialyzed against PBS. Immunization of mice and MAb production followed standard protocols (Patel, 1994).
RNA Localization and Protein Immunocytochemistry. Digoxigenin-labeled antisense robo transcripts were generated from a subclone of a robo cDNA in Bluescript. In situ tissue hybridization was performed as described in Tear et al., 1996. Immunocytochemistry was performed as described by Patel, 1994. MAb 1D4 was used at a dilution of 1:5 and BP102 at 1:10. For anti-Robo staining, MAb 13C9 was diluted 1:10 in PBS with 0.1% Tween-20™, and the embryos were fixed and cracked so as to minimize exposure to methanol. The presence of Triton and storage of embryos in methanol were both found to destroy the activity of MAb 13C9.
In situ hybridization of rat spinal cords was carried out essentially as described in Fan and Tessier-Lavigne, 1994. E13 embryos were fixed in 4% paraformaldehyde, processed, embedded in OCT, and sectioned to 10 µm. A 1.0 kb 35S antisense rRobo riboprobe spanning the first three immunoglobulin domains was used for hybridization. An additional non-overlapping probe was also used with identical results. DCC transcripts were detected as described in Keino-Masu et al., 1996. Immunohistochemistry against TAG-1 was carried out on 10 µm transverse spinal cord sections using 4D7 monoclonal antibody (Dodd et al., 1988).
Electron Microscopy. Canton S embryos were hand devitellinized, opened dorsally to remove the gut, and prepared for immunoelectron microscopy according to the procedures described previously (Lin et al., 1994), with the following modifications. The fixed embryos were incubated sequentially with MAb 13C9 (1:1) for 1-2 hours, biotinylated goat anti-mouse secondary antibody (1:250) for 1.5 hours, and then streptavidin-conjugated HRP (1:200) for 1.5 hours. Hydrogen peroxide (0.01%) was used instead of glucose oxidase for the HRP-DAB reaction.
References
Altman, J. and Bayer, S.A. (1984) Adv. Anat. Embryol. Cell Biol. 85, 1-164.
Altschul, S.F., et al. (1990) J. Mol. Biol. 215, 403-410.
Bastiani, M.J., et al. (1987) Cell 48, 745-755.
Brand, A.H., and Perrimon, N. (1993) Development 118, 401-415.
Cavener, D. (1987) Nucl. Acids Res. 15, 1353-1361.
Chan, S. et al. (1996) Cell 87, 187-195.
Dodd, J., et al. (1988) Neuron 1, 105-116.
Ebens, A., et al. (1996) Neuron 17, 1157-1172.
Elkins, T., et al. (1990) Cell 60, 565-575.
Fan, C.M. and Tessier-Lavigne, M. (1994) Cell 79, 1175-1186.
Gertler, F.B., et al. (1995) Genes Develop. 9, 521-533.
Harris, R., Sabatelli, L.M., and Seeger, M.A. (1996) Neuron 17, 217-228.
Hedgecock, E.M., Culotti, J.G., and Hall, D.H. (1990) Neuron 4, 61-85.
Kang, J-S., et al. (1997) J. Cell Biol. 138, 203-213.
Keino-Masu, K., et al. (1996) Cell 87, 175-185.
Kennedy, T.E., et al. (1994) Cell 78, 425-435.
Kerrebrock, A.W., et al. (1995) Cell 83, 247-256.
Kidd, T., Russell, C., Goodman, C.S., and Tear, G. (1997). Dosage sensitive and complementary functions of Roundabout and Commissureless control axon crossing of the CNS midline. Neuron, in review.

Kolodziej, P.A., et al. (1996) Cell 87, 197-204.
Kyte, J., and Doolittle, R.F. (1982) J. Mol. Biol. 157, 105-132.
Lin, D.M., et al. (1994) Neuron 13, 1055-1069.
Mitchell, K.J., et al. (1996) Neuron 17, 203-215.
Myers, P.Z., and Bastiani, M.J. (1993) Journal of Neuroscience 13, 127-143.
Niebuhr, K., et al. (1997) EMBO J. 16, 5433-5444.
Nielsen, H., et al. (1997) Protein Engineering 10, 1-6.
Patel, N.H. (1994) In "Methods in Cell Biology, Vol. 44. Drosophila melanogaster: Practical Uses in Cell Biology" (L.S.B. Goldstein and E. Fyrberg, eds) Academic Press, New York.
Seeger, M., Tear, G., Ferres-Marco, D., and Goodman, C.S. (1993) Neuron 10, 409-426.
Serafini, T., et al. (1994) Cell 78, 409-424.
Stoeckli, E.T., and Landmesser, L.T. (1995) Neuron 14, 1165-1179.
Stoeckli, et al. (1997) Neuron 18, 209-221.
Tamkun, J.W., et al. (1992) Cell 68, 561-572.
Tear, G., et al. (1993) Perspectives on Developmental Neurobiology 1, 183-194.
Tear, G., et al. (1996) Neuron 16, 501-514.
Tessier-Lavigne, M., and Goodman, C.S. (1996) Science 274, 1123-1133.
Wadsworth, W.G., Bhatt, H., and Hedgecock, E.M. (1996) Neuron 16, 35-46.
Zallen, J., Yi, A., and Bargmann, C. (1997). The conserved immunoglobulin superfamily member SAX-3/Robo directs multiple aspects of axon guidance in C. elegans. Cell, in review.
Zinn, K., McAllister, L., and Goodman, C.S. (1988) Cell 53, 577-587.
Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be readily apparent to those of ordinary skill in the art in light of the teachings of this invention that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.

SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: Goodman, Corey S.
Kidd, Thomas
Mitchell, Kevin
Tear, Guy
(ii) TITLE OF INVENTION: Robo: A Novel Family of Polypeptides and Nucleic Acids
(iii) NUMBER OF SEQUENCES: 12
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP
(B) STREET: 75 DENISE DRIVE
(C) CITY: HILLSBOROUGH
(D) STATE: CALIFORNIA
(E) COUNTRY: USA
(F) ZIP: 94010
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: OSMAN, RICHARD A
(B) REGISTRATION NUMBER: 36,6x7
(C) REFERENCE/DOCKET NUMBER: 89'8-006
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (650) 343-4341
(B) TELEFAX: (650) 343-4342
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4188 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CACTAATAAC

CCTCGTCCTG

TATCATCGAG

CAAAGTGGAG

GGCAAGCCOG AACCCACCAT TGAGTQGTTT AAG6AT0<iCa AACCCGTCAG300 CACCAACGAA

AAGAAATCGC ACCGC(iTCCA GTTCAAGGAC GGGGCCCTCT TCTTTTACAG360 GACAATGCAA

AGTGGGCCAG

OCCGTTAGTC GCCATGCCTC CCTCCAGATA GCTGTTTniC GCGACGATTT480 TCGCGTGGAG

CCCAAAGACA CGCGACiTGGC CAAA,GGCGAG ACGGCTCTGC TGGAGTGTGG540 GCCGCCCAAA

CGACCTGAAA

CCTGCTGATC

TCTGGTAGGC

TATGAAGGAG

AGTGGGCGGT

GATCCGCCGC CGAAAGns'TT GTGGAAAAAG GAGGAGGGCA ATATTCCGGT900 GTCCAGAGCG

CGATGAGGGC

ACCTATGTCT GCGAGGCACA CAACAATGTC GCiTCAGATCA GCGCTAGGOC1020 TTCTCTTATA

ACTAAATGGG

CTGGACCAAG

TGTGGCTGCC

GATGGAACTC TGCA~G1ATTAC GGATGTGCGG CAGGAAGACG AAGQCTACTA1260 TGTGTGTTCC

GCTTTCAGTG TAGTCGATTC CTCTACAGTA CGGCiTTTTCC TGCRAGTCAG1320 CTCGGTAGAC

CAAGGGATCA

GTTGCTACTT TACCCTGTCG GGCCACTGCiA AATCCCAGTC CCCGTATCAA1440 GTGGTTCCAC

CTCACTGAGA

GTCGATGACC TTCAACTAAG TGACTCTGGT ACCTACAC~T GCACTGCATC1560 TGGCGAACGA

GGAGAAACTT CCTGGGC1'GC CACACTAACG GTGGAAAA11C CCGGTTCTAC1620 ATCTCTTCAC

CCTGAATGTC

AGTCGCACCA GCATTAGTCT TCCiTTGQGCT AAAAGCCAAG AOAAACCCGG1740 AGCT(iTGGGC

CCAATCATTCi GATACACTGT AGAGTACTTC AGTCCGGATC TGCAAACTGG1800 TTGGATTGTG

GCTGCCCATC GAGTCGQCGA CACTCAACiTC ACTATCTCGG GTCTCACTCC1860 TGGCACTTCG

TATC~PGTTCC TAGTTAGAGC TGAGAATACT CAGGCiTATTT CTGTGCCTTC1920 CGGCTTATCA

TTTGTCAGCA

TATCAATGCT

CGTAGAGGGC

CTGCGCATAC ACTATAMiGA 1GCCAGTGTA CCATCCGCAC AiCiTATCACTC2160 GATCACTfi'1"I' ATGGATGCCT CTGCAGAATC GTTTG'1'GGTG GGAAACCTTA AGAAGTACAC2220 CAAGTATGAG

TTCTTCCTAA CACCCTTTTT TGALiACAATT GAAQaACAGC CCAGTAACTC2280 CAAGACAGCC

CATGTACAAC

TGGCAATTTG

CAATATGACT

TGTGTACAGC

G4'CiAGGTTGA ACTCCTTTAC CAAQGCAGGA GATGGACCTT ACTCCAAACC2580 GATATCACTA

CACCCATGAT

ACCTGGCGAC

GCTAATGGTG

CrGGTCPGCA TCGT'i'CTTCT AGTCCTGGTT ATTTCGGCGG CTATTTCGAT2820 GGTCTACTTC

TGACAACGAA

TCGTGGATGG

ATCCCACGTT

AGAAGTTGAC

ACCCGTAACC TTACCACC1'T CTACAATTGT CGCAAGAGCC CCGATAATCC3120 CACGCCGTAC

AACATCTATA

CGGTCAGGTG

CCAGCGGTTC CTGTTGTCAA ATCCAACfAT CTTCAGTATC CGGTTGAACC3300 GATCAACTGG

TCAfiACiTTTC TACCCCCGCC GCCAGAACAC CCACCTCCGT CTTCTACCTA3360 TGGATACGCA

CAAGCiATCTC CTGAATCTTC GCGGAAGACiC TCCAAAAGCG CAGGTTCCGG3420 CATTTCTACA

AATCAAAGCA TTCTGAACGC ATCCATACAC AGCAGCTCGT CGGGCGGCi'T3480 TTCAGCTTGG

C~AGTATCGC CCCAATATGC TGTCGCCTGT CCACCGGAAA ACGTTTATAG3540 CAATCCGCTG

2'CGGCAGTGG C1'GGCGGCAC CCAGAACCGC TATCAGATAA CGCCCACAAA3600 CCAACATCCG

ACCCAACCAC

CTGCCATTTG CCACACAGCG TCA'ibCAGCC AGCGAGTACC AGOCTGGACP3720 GAATGCAGCG

CTCGCCCATG

CAACCCCCAC CGCCAGTTCC CGTACCCGAG GGCTGGTAtC AACCGGTGCA3840 TCCCAATAGC

CACCCGATGC RCCCGACC1'C CTCCAACCAC CAGATCPACC AGTGCTCCTC3900 CGAGTGCTCG

GATCACTCGA GGAGCTCGCA GAGTCACAAG CGGCACiCTGC AGCrCCiAGGA3960 GCACGGCAGC

AGTGCCAAAC AACGCGGAQG ACACCACCGT CGAC<'s'ACiC:CC 4020 CGGTGGTGCA GCCGTGCATG

GAGAGCGAGA ACGA(iAACAT GCTGGCGGAG TACGACiCAGC GCCAGTACAC4080 CAGCGATTGC

TGCAATAGCT CCCGCGAQGG CGACACCTGC TCCTGCAGCG Af3GGATCCTG4140 TCTTTACGCC

GAGGCGGGCG AGCCGGCGCC TCGTCAAATG ACTGCTAAGA ACACCTAA 4188
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1395 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Hfa Pro Met His Pro Glu Asn His Ala Ile Ala Arg Ser Thr Ser Thr Thr Asn Asn Pro Ser Arg Ser Arg Ser Ser Arg Met Trp Leu Leu Pro Ala Trp Leu Leu Leu Val Leu Val Ala Ser Asn Gly Leu Pro Ala Val Arg Gly Gln Tyr Gln Ser Pro Arg Ile Ile Glu His Pro Thr Asp Leu Val Val Lys Lys Asn Glu Pro Ala Thr Leu Asn Cys Lys Val Glu Gly Lys Pro Glu Pro Thr Ile Glu Trp Phe Lys Asp Gly Glu Pro Val Ser Thr Asn Glu Lys Lys Ser His Arg Val Gln Phe Lys Asp Gly Ala Leu Phe Phe Tyr Arg Thr Met Gln Gly Lys Lys Glu Gln Asp Gly Gly Glu Tyr Trp Cys Val Ala Lys Aen Arg Val Gly Gln Ala Val Ser Arg His Ala Ser Leu Gln Ile Ala Val Leu Arg Asp Asp Phe Arg Val Glu Pro Lya Asp Thr Arg Val Ala Lys Gly Glu Thr Ala Leu Leu Glu Cys 165 1?0 175 Gly Pro Pro Lya Gly Ile Pro Glu Pro wcu Leu ile Trp Ile Lys Asp Gly Val Pro Leu Asp Asp Leu Lys Ala Met Ser Phe Gly Ala Ser Ser Arg Val Arg Ile Val Asp Gly Gly Asn Leu Leu Ile Ser Aan Val Glu Pro Ile Asp Glu Gly Asn Tyr Lys Cys Ile Ala Gln Asn Leu Val Gly Thr Arg Glu Ser Ser Tyr Ala Lya Leu Ile Val Gln Val Lys Pro Tyr Phe Met Lys Glu Pro Lys Asp Gln Val Met Leu Tyr Gly Gln Thr Ala Thr Phe His Cys Ser Val Gly Gly Asp Pro Pro Pro Lys Val Leu Trp Lys Lys Glu Glu Gly Asn Ile Pro Val Ser Arg Ala Arg Ile Leu Hia Asp Glu Lys Ser Leu Glu Ile Ser Asn Ile Thr Pro Thr Asp Glu Gly Thr Tyr Val Cys Glu Ala His Asn Asn Val Gly Gln Ile Ser Ala Arg Ala Ser Leu Ile Val His Ala Pro Pro Asn Phe Thr Lys Arg Pro Ser Asn Lys Lye Val Gly Leu Asn Gly Val Val Gln Leu Pro Cys Met Ala 355 360 3fi5 Ser Gly Asn Pro Pro Pro Ser Val Phe Trp Thr Lya Glu Gly Val Ser 370 3?5 380 Thr Leu list Phe Pro Asn Ser Ser His Gly Arg Gln Tyr Val Ala Ala Asp Gly Thr Leu Gln Ile Thr Asp Val Arg Gln Glu Asp Glu Gly Tyr Tyr Val Cya Ser Ala Phe Ser Val Val Asp Ser Ser Thr Val Arg Val Phe Leu Gln Val Ser Sar Val Asp Glu Arg Pro Pro Pro Ile Ile Gln d35 440 445 Ile Gly Pro Ala Asn Gln Thr Leu Pro Lya Gly Ser Val Ala Thr Leu Pro Cya Arg Ala Thr Gly Asn Pro Ser Pro Arff Ile Lys Trp Phe His Asp Gly His Ala Val Gln Ala 
Gly Asn Arg Tyr Ser Ile Ile Gln Gly Ser Ser Leu Arg Vah Asp Asp Leu Gln Leu Ser Asp Ser Gly Thr Tyr Thr Cys Thr Ala Ser Gly Glu Arg Gly Glu Thr Ser Trp Ala Ala Thr Leu Thr Val Glu Lys Pro Gly Ser Thr Ser Leu His Arg Ala Ala Asp Pro Ser Thr Tyr Pro Ala Pro Pro Gly Thr Pro Lys Val Leu Asn Val Ser Arg Thr Ser Ile Ser Leu Asr~ Trp Ala Lys Ser Gln Glu Lys Pro Gly Ala Val Gly Pro Ile Ile Gly Tyr Thr Val Glu Tyr Phe Ser Pro Asp Leu Gla Thr Gly Trp Ile Val Ala Ala His Arg Val Gly Asp Thr Gln Val Thr Ile Ser Gly Leu Thr Pro Gly Thr Ser Tyr Val Phe Leu Val ArQ Ala Glu Asn Thr Gln Gly Ile Ser Val Pro Ser Gly iwu Ser Asn Val Ile Lys Thr Ile Glu Ala Asp Phe Aap Ala Ala Ser Ala Asn Asp Leu Ser Ala Ala ArQ Thr Leu Leu Thr Gly Lys Ser Val Glu Leu Ile Aap Ala Ser Ala Ile Asn Ala Ser Ala Val Ar~r Leu Glu Trp Met Leu His Val Ser Ala Asp Glu Lys Tyr Val Glu Gly Leu Arg Ile His Tyr Lys Asp Ala Ser Val Pro Ser Ala Gln Tyr His Sar Ile Thr Val 705 710 715 7a0 Net Asp Ala Ser Ala Glu Ser Phe Val Val Gly Asn Leu Lys Lys Tyr 7a5 730 735 Thr Lys Tyr Glu Phe Phe Leu Thr Pro Phe Phe Glu Thr Ile Glu Gly Gln Pro Ser Asn Ser Lys Thr Ala Leu Thr Tyr Glu Asp Val Pro Ser Ala Pro Pro Asp Asn Ile Gln ile Gly Met Tyr Asn Gln Thr Ala Gly Trp Val ArQ Trp Thr Pro Pro Pro Sar Gln His His Asn Gly Asn Leu Tyr Gly Tyr Lys Ile Glu Val Ser Ala Gly Asn Thr Net Lys Val Leu 805 ' 810 815 Ala Asn Met Thr Leu Asn Ala Thr Thr Thr Ser Val Leu Leu Asn Asn Leu Thr Thr Gly Ala Val Tyr Ser Val Arg Leu Asn Ser Phe Thr Lys Ala Gly Asp Gly Pro Tyr Ser Lys pro Ile Ser Leu Phe Met Asp Pro Thr His His Val His Pro Pro Arg Ala His Pro Ser Gly Thr His Asp Gly Arg His Glu Gly Gln Asp Leu Thr Tyr His Asn Asn Gly Asn Ile Pro Pro Gly Asp Ile Asn Pro Thr Thr His Lys Lys Thr Thr Asp Tyr Leu Ser Gly Pro Trp Leu Met Val Leu Val Cya Ile Val Leu Leu Val Leu Val Ile Ser Ala Ala Ile Ser Met Val Tyr Phe Lys Arg Lys His Gln Met Thr Lys Glu Lsu Gly His Lsu Ser Val Val Ser Asp Asn Glu Ile Thr Ala Leu Asn Ile Asn Ser Lys Glu Ser Leu Trp Ile Asp His 
His Arg Gly Trp Arg Thr Ala Asp Thr Asp Lys Asp Ser Gly Leu Ser Glu Ser Lye Leu Leu Ser His Val Asn Ssr Ser Gln Ser Asn Tyr Asn Asn Ser Asp Gly Gly Thr Asp Tyr Ala Glu Val Asp Thr Arg Asn Leu Thr Thr Phe Tyr Asn Cys Arg Lys Ser Pro Asp Asn Pro Thr Pro Tyr Ala Thr Thr Met Its Ile Gly Thr Ser Ser Ser Glu Thr Cys Thr Lys Thr Thr Ser Ile Ser Ala Asp Lys Asp Ser Gly Thr His Ser Pro Tyr Ser Asp Ala Phe Ala Gly Gln Val Pro Ala Val Pro Val Val Lys Ser Asn Tyr Leu Gln Tyr Pro Val Glu Pro Ile Asn Trp Ser Glu Phe Leu Pro Pro Pro Pro Glu His Pro Pro Pro Ser Ser Thr Tyr Gly Tyr Ala Gln Gly Ser Pro Glu Ser Ser Arg Lya Ser Ser Lys Ser Ala Gly Ser Gly Its Ssr Thr Asn Gln Ser Ile Leu Asn Ala Ser Ile His Ser Ser Ser Ser Gly Gly Phe Ser Ala Trp Gly Val Ser Pro Gln Tyr Ala Val Ala Cya Pro Pro Glu Asn Val Tyr Ssr Asn Fro Leu Ser Ala Val Ala Gly Gly Thr Gln Asn ArQ Tyr Gln Ile Thr Pro Thr Asn Gln His Pro Pro Gla Leu Pro Ala Tyr Phe Ala Thr Thr Gly Pro Gly Gly Ala Val Pro Pro Asn His Leu Pro Pha Ala Thr Gln Arg His Ala Ala Ser Glu Tyr Gln Ala Gly Leu Asn Ala Ala Arg Cys Ala Gln Ser Arg Ala Cys Asn Ssr Cys Aap Ala Lsu Ala Thr Pro Ser Pro Mst Gln Pro Pro Pro Pro Val Pro Val Pro Glu Gly Trp '1yr Gln Pro Yal His Pro Asn Ser His Pro Met His Pro Thr Ser Ser Asn His Gln Ile Tyr Gln Cys Ser lae5 1290 ~ a 9s Ser Glu Cys Ser Asp His Ser Arg Ser Ser Gln Ser His Lya ArQ Gln Leu Gln Leu Glu Glu His Gly Ser Ser Ala Lys Gln Arg Gly Gly His His Ark Arg ArQ Ala Pro Val Val Gln Pro Cys Met Glu Ser Glu Aan Glu Asn Met Lsu Ala Glu Tyr Glu Gln ArQ Gln Tyr Thr Ser Asp Cys Cys Asn Ser Ser Ark Glu Gly Asp Thr Cys Ser Cys Ser Glu Gly Ser Cys Leu Tyr Ala Glu Ala Gly Glu Pro Al~ Pro Arp Gln Met Thr Ala Lys Asn Thr t2) INFORMATION FOA SEQ ID N0:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4146 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GGTGAAAATC CACGCATCAT CGAGCATCCC ATGGACACGA CGG'1'GCCAAA60 AAATGATCCA

GTTTAAGGAC

CGGGGGTCTA

CTGGTGCGAG

GCCAAAAACG AGTTTGGAGT GGCACCtiTCC AGGAATGCAA CGTTGCAAGT300 GGCAGTTCTC

GGTGGCCCTG

CAAGAACGGC

CAGACCCTGA ATCTTGTCGG GAACAJ1GCGG ATTCGCATTG TCGACGG'1'GG480 CAATCTGGCC

GAATGTGGTT

CCTCATCCGA

GCiACCCCAGA ATCAGACGGC GGTGGTGGGC AGCTCGGTOG TCTTCCAGTG660 CCGCATCGGA

GGCGATCCCC TGCCTGATGT CCTCiTGGCGA CGCArTGCCT CCGOCGGCAA720 TATGCCACTG

CGTAAGTTTT CTTGGCTTCA TTCAGCTTCA I~TCGTGTGC ACGTACTTGA780 GGACCGCAGT

CTGAAGCTGG ACGACGTTAC TCTaGA(~GAC ATGGGCGAGT ACACTTGCGA840 GGCGGACAAT

CAAATTTGTG

ATACGCCCCA ACiAATCAGCT GGTGGAGATC GGTGATGAAG TGCTGTTCGA960 GTGCCAAGCG

CCTGCTGCTC

CTCGGTGCTC

CGCCCTGAAC

CGAGCTGCCA

AATTGTGGTT

GGATGGCATA

CTTAACCATT

CAATCGCAAC

GAATATCAAG

TTCTTCAGAG CCCCAGAACT TTCCACCTAC CCAGGC;CCGC CAGGAAAACC1560 GCAAATGGTG

GGGCGGCTCC
CTGGGTGGCT

GTGGGCACTA GOCiTGCAAAA TACCACGTTT ACCCAAACGG GTCTGCt"GCC1740 GGGTGTGAAT

TCCGATGTCG

GAACCCATTA CGGT'OQGAAC GCGCTACTTC AATAGTGGTC TGGATCTGAG1860 CGAGGCTCGT

GCCAGTCTr'aC TGTCCGGAGA TGTTGTGGAG CTGAGCAACG CCAGTGTGGT1920 GGACTCCACT

CTATGTCTAT

GC:GAGACAGT TGCCAAATCC AATAGTCAAC AATCCGGCGC CCGTTACTAG2040 CAATACCAAT

GGCATTGATT

CCAGAGTGGA

CAATGGCGGT

GGCGCCTCAT CCTaCACCAT CACC(obGCTC GTCCAGTACA CGCTGTATGA2284 ATTTTTCATC

GTGCCATTTT ACAAATCCGT CGAGf'aGCAAG CCGTCGAATT CGCGCATCGC2340 TCGCACCCTT

GAAGATGTTC CCTCTGAGGC ACCATATGGA ATGCiAGGCTC TGCTGTTGAA2400 CTCCTCCGCG

GTCTTCCTCA AAZ'GGA7~GGC ACCAGAACTC AAGGATCGGC ATGGTGTTCT2460 CTTGAACTAT

CATGTTATAG TCCGAGGTAT TGACACTCiCC CACAATTTCT CACGCATTTT2520 GACAAATGTC

ACCATCGATG CCGCTTCGCC TACTCTGGTT TTQGCCAATC TCACCGAA~GG2580 CGTCATGTAC

CCCAGCTACT

TTGCGTTTV'G ATCCCATCAC AAAGCGACTC GATCCGTTCA TCAATCAGCG2700 GGACCATGTT

AACGATGTGC Z'C3ACGCAGCC CTGGTTCATA ATACTCCTGG GCGCCATCCT2760 GGCCGTTCTT

GAAGCAGTCG

GAGTCTATCG

GTGGCGTCCC

TCGCCCGGCG GCGACTCGCT GGACiATGCRA AAGGliTCACA TCGCCGACTA3000 TGCGCCGGTC

CGGTGGCGCG

GGCAGCGGTG CCAGCGGCGG CGATGACATT CATGGiAGGAC ACGGCAGCGA3120 ACGCAATCAG

GTCCAGTTTT

GGCAA~GGCAC CCAGCGAGTA TGGTGQGCAT OGCAACGCCT CCCCGGCCCC3240 TTATGCCACC

TCTTCGATCC TGAGTCCCCA CCAGCA<iCAA CAGCAGCAGC AGCCGCGTTA3300 TCAACAGCGA

GCAGCAGCAT

GCACCAGCAA

CCCCACGAAC

CAAGCAGAGA

CACATCCACA TCACCGAGAA CRAGCTGAGC AACTGCCACA CCTATGAGCiC3600 GGCTCCTGGC

GCAGCTGCCG

GGATCAGGGC

CGGTCTGGCA

CGAGGACGAG

CACGCGCTGT ACCACAC.'GGC GGATOGGGAT CTGGACG~ TGGAACGACT3900 GTACGTCAAG

CCCACAGCAT

CCGGCGGAAG GTCACCTGCA GTCCTGQCGG AATCA6AGCA CGCGC'sAGCAG4020 TCGGAAGAAC

CGTGGCCAGC

GAACC~GA~GCC TCCTCAGCAA CTCGGGTAGC GGCACCAGCA GCCAGCCAGCd140 TGGCCACAAT

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1381 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Gly Glu Asn Pro Ary Ile Ile Glu His Pro Met Asp Thr Thr Val Pro Lys Asn Aap Pro Phe Thr Phe Asn Cya Gln Ala Glu Gly Asn Pro Thr Pro Thr Ile Gln Trp Phe Lys Asp Gly ArQ Glu Leu Lys Thr Asp Thr Gly Ser His ArQ Ile Met Leu Pro Ala Gly Gly Leu Phe Phe Leu Lys Val Ile His Ser Arqr ArQ Glu Ses ,Asp Ala Gly Thr Tyr Trp Cys Glu Ala Lys Asn Glu Phe Gly Val Ala ArQ Ser Arg Asn Ala Thr Leu Gln 85 90 . 95 Val Ala Val Leu ArQ Asp Glu Phs ArQ Leu Glu Pro Ala Asn Thr ArQ

Val Ala Gln Gly Glu Val Ala Leu Diet Glu Cys Gly Ala Pro ArQ Gly 115 lao lay Ser Pro Glu Pro Gln ile Ser Trp ArQ Lys Asn Gly Oln Thr Leu Asn Leu Val Gly Asn Lys Arg Ile ArQ Ile Val Asp Gly Gly Asn Leu Ala Its Gin Glu Ala Ary Gln Ser Asp Asp Gly ArQ Tyr Gln Cys Val Val Lys Asn Val Val Gly Thr ArQ Glu 8er Ala Thr Ala Phe Leu Lys Val His Val ArQ Pro Phe Leu Ile ArQ Gly Pro Gln Asn Gln Thr Ala Val Val Gly Ser Ser Val Val Phe Gln Cys ArQ Its Gly Gly Asp Pro Leu Pro Asp Val Leu Trp ArQ ArS~ Thr Ala Ser Gly Gly Asn Met Pro Leu ArQ Lys Phe Ser Trp Leu Hfa Ser Ala Ser Gly Arq Val His Val Lsu Glu Asp Ar8 Ser Leu Lys Leu Asp Asp Val Thr Lsu Glu Asp Met Gly aso 2s5 a7o Glu Tyr Thr Cya Glu Ala Asp Asn Ala Val Gly Gly IIe Thr Ala Thr Gly Ile Leu Thr Val His Ala Pro Pro Lys Phe Val Ile Arg Pro Lys Aan Gln Leu Val Glu Ile Gly Asp Glu Val Leu Phe Glu Cys Gln Ala Asn Gly Hia Pro Arq Pro Thr Leu Tyr Trp Ser Val Glu Gly Aan Ser Ser Leu Leu Leu Pro Gly Tyr ArQ Aap Gly Arg Met Glu Val Thr Leu Thr Pro Glu Gly ArQ Ser Val Leu Ser Ile Ala Arg Phe Ala Arg Glu Aap Ser Gly Lys Val Val Thr Cys Aan Ala Leu Asn Ala Val Gly Ser Val Ser Ser ArQ Thr Val Val Ser Val Asp Thr Gln Phe Glu Leu Pro Pro Pro Ile Ile Glu Gln Gly Pro Val Aan Gln Thr Leu Pro Val Lys Ser Ile Val Val Leu Pro Cya ArQ Thr Leu Gly Thr Pro Val Pro Gln Val Ser Trp Tyr Leu Asp Gly Ile Pro Ile Asp Val Gln Glu His Glu Arg ArQ Asn Leu Ser Asp Ala Gly Ala Leu Thr Ile Ser Asp Lau Gln Arg His Glu Asp Glu Gly Leu Tyr Thr Cys Val Ala Ser Aan ArQ Asn Gly Lya Ser Ser Trp Ser Gly Tyr Leu Arg Leu Asp Thr Pro Thr Asn 485 d90 495 Pro Aan ile Lys Phe Phe Arg Ala Pro Glu Leu Ser Thr Tyr Pro Gly Pro Pro Gly Lys Pro Gln Met Val GIu Lya Gly Glu Aan Ser Val Thr Leu Ser Trp Tar ArQ Ser Aan Lys Val Gly Gly Ser Ser Leu Val Gly Tyr Val Ile Glu Met Phe Gly Lys Aan Glu Thr Aap Gly Trp Val Ala Val Gly Thr ArQ Val Gln Asn Thr Thr Phe Thr Gln Thr Gly Leu Leu Pro Gly Val Aan Tyr Phs Phs Lsu Ile Arg Ala Glu Asn Ser His Gly Leu Ser Leu Pro Ser Pro Met Ser Glu Pro 
Ile Thr Val Gly Thr Arg Tyr Phs Asn Ser Gly Leu Asp Leu Ser Glu Ala Arg Ala Ser Leu Leu Ser Gly Asp Val Val Glu Leu Ser Asn Ala Ssr Val Val Asp Ser Thr Ser Met Lys Leu Thr Trp Gln Ile Ile Asn Gly Lys Tyr Val Glu Gly Phe Tyr Val Tyr Ala Arg Gln Leu Pro Asn Pro Ile Val Asn Asn Pro Ala Pro Val Thr Ser Asn Thr Asn fro Leu Leu Gly Ser Thr Ser Thr Ser Ala Ser Ala Ser Ala Ser Ala Ser Ala Leu Ile Ser Thr Lys Pro Asn Ile Ala Ala Ala Gly Lys Arg Asp Gly Glu Thr Asn Gln Ser Gly Gly Gly Ala Pro Thr Pro Leu Asn Thr Lys Tyr Arg Met Leu Thr Ile Leu Asn Gly Gly Gly Ala Ser Ser Cys Thr Its Thr Gly Leu Val Gln Tyr Thr Leu Tyr Glu Phe Phe Ile Val Pro Phe Tyr Lys Ser Val Glu Gly Lys Pro Ser Asn Ser Arg Ile Ala Arg Thr Leu Glu Asp Val Pro Ser Glu Ala Pro Tyr Gly Met Glu Ala Leu Leu Leu Asn Ser Ser Ala Val Phe Leu Lys Trp Lys Ala Pro Glu 'beu Lys Asp Arg His Gly Val 805 810 8i5 Lsu Leu Asn Tyr His Val Ile Val Arg Gly Ile Asp Thr Ala His Asn Phe Ser Arg ile Leu Thr Asn Val Thr Ile Asp Ala Ala Ser Pro Thr Lsu Val Leu Ala Asn Leu Thr Glu Gly Val Met Tyr Thr Val Gly Val Ala Ala Gly Asn Asn Ala Gly Val Gly Pro Tyr Cys Val Pro Ala Thr Leu Arg Leu Asp Pro Ile Thr Lys Arg Leu Aap Pro Phe Ile Asn Gln Arg Asp His Val Asn Asp Val Leu Thr Gln Pro Trp Phe Ile Ile Leu Leu Gly Ala Ile Leu Ala Val Leu Met Leu Ser Phe Gly Ala Met Val Phe Val Lya Arg Lys Hfa Met Met Met Lya Gln Ser Ala Leu Asn Thr Met Arg Gly Aan His Thr Ser Asp Val Leu Lys Met Pro Ser Leu Ser Ala Arg Asn Gly Aan Gly Tyr Trp Leu Aap Ser Ser Thr Gly Gly Met Val Trp Arg Pro Ser Pro Gly Gly Asp Ser Leu Glu Met Gln Lya Asp 980 ~ 985 .990 His Ile Ala Asp Tyr Ala Pro Val Cys Gly Ala Pro Gly Ser Pro Ala Gly Gly Gly Thr Ser Ser Gly Gly Ser Gly Gly Ala Gly Ser Gly Ala Ser Gly Gly Asp Aap Ile His Gly Gly His Gly Ser Glu Arg Aan Gln Gln Arg Tyr Val Gly Glu Tyr Ser Aan Ile Pro Thr Asp Tyr Ala Glu Val Ser Ser Phe Gly Lys Ala Pro Ser Glu Tyr Gly Arg His Gly Asn Ala Ser Pro Ala Pro Tyr Ala Thr Ser Ser Ile Leu Ser Pro His Gln Gln Gln Gln Gln Gln Gln Pro 
Arg Tyr Gln Gln Arg Pro Val Pro Gly Tyr Gly Leu Gln Arg Pro Met His Pro His Tyr Gln Gln Gln Gln His Gln Gln Gln Gln Ala Gln Gln Thr His Gln Gln His Gln Ala Leu Gln Gln His Gln Gln Leu Pro Pro Ser Asn Ile Tyr Gln Gln Met Ser Thr Thr Ser Glu Ile Tyr Pro Thr Asn Thr Gly Pro Ser Arg Ser Val Tyr Ser Glu Gln Tyr Tyr Tyr Pro Lys Asp Lys Gln Arg His Ile His Ile Thr Glu Asn Lys Leu Ser Asn Cys His Thr Tyr Glu Ala Ala Pro Gly Ala Lys Gln Ser Ser Pro Ile Ser Ser Gln Phe Ala Ser Val Arg Arg Gln Gln Leu Pro Pro Asn Cys Ser Ile Gly Arg Glu Ser Ala Arg Phe Lys Val Leu Asn Thr Asp Gln Gly Lys Asn Gln Gln Asn Leu Leu Asp Leu Asp Gly Ser Ser Met Cys Tyr Asn Gly Leu Ala Asp Ser Gly Cys Gly Gly Ser Pro Ser Pro Met Ala Met Leu Met Ser His Glu Asp Glu His Ala Leu Tyr His Thr Ala Asp Gly Asp Leu Asp Asp Met Glu Arg Leu Tyr Val Lys Val Asp Glu Gln Gln Pro Pro Gln Gln Gln Gln Gln Leu Ile Pro Leu Val Pro Gln His Pro Ala Glu Gly His Leu Gln Ser Trp Arg Asn Gln Ser Thr Arg Ser Ser Arg Lys Asn Gly Gln Glu Cys Ile Lys Glu Pro Ser Glu Leu Ile Tyr Ala Pro Gly Ser Val Ala Ser Glu Arg Ser Leu Leu Ser Asn Ser Gly Ser Gly Thr Ser Ser Gln Pro Ala Gly His Asn Val
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3894 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
ACATGGTACA AGOAT('GACA GCCCGTAATC ACGAATAAGG AGCAAQi'GAA240 CAGCCACCGG

ATTGTTCTCG ACACGGGATC CCn3TTTCTT CZ~AAAGTGA ATAGTGGAAA300 AAACaGAAAA

GACAGCGATG CG(~aA~GCGTA CTATTGT~ GCCAQC~IACG AGCACOGAGA3 AACGAAGGAT CGTTAlIIlATT f3QCGATGCTT CGCGAAGACT 420 TTCGAGTTCG GCCAACiAACA

GTTCAGGCTC TTGGT'GGAGA GATGGCCGTT CTGGAATGCA GTCCGCCACG480 TGGATTCCCG

GAGCCGCa'I~1'G TGAGC"!'OGCG GA11A~GACGAC AAAaAGCTCC540 GAATTCAJIGA CATGCCACGA

CGATTCTGGT

CGCAA~,GATTG

AGTCiTCTTTG AGAAACCW1A G'i'TTGAf~C~IA CiAACCCAAGG720 ACATGACGGT CGACGTCGGA

TACGTGGAAA

TCGGGGGTTG

AGRATrGAAA GAGTTCAACC ATCAGACOAA GCiTG~IATACG 900 TTTGCTATGC ACGAAATCCA

GCGGGAACTC TTGAAGCATC TGCACATCTT C(~1~TCCAGG CACC'TCCATC960 CTTCCAGACA

AAACCAGCAG ACCAGTCAGT TCCAGCTGGA QOCACGOCAA CTTTZ'C'sAATG1020 CACCTTGGTC

TCTTTTCCCA

IiUGTTATG1'fs't' CCGCTGATGG TAGAACGAAA GTTTCACCAA1140 CTtiGAACATT GACAATTGAG

GGCAGGAAGC

TCGTTGAGCA AGGCAGCTTT GAAAGCAACA TTTGAAJvCCA AAGGCCOTGT1260 AA(iAGCAAAA TGGGCA11ACA GAAACAAAAA AATGTTCAAT 1320 CAATTATCAA ATATTTAATT

TGGTCATCAA

CGGAAAACCA

TAGTCGTATC

CACCGGAGTT

TACACTTGCA TTGCGAAGAA CGAGGATOGA GAGT'CMCAT GGTCQGCATC1620 TCTGACTGTT

CTTCCCGTCT

TCTCCAACGC AACCCATTAT TaTCAIITGTC ACTGATACCG AAGTAGAGCT1740 CCACTOGAAT

GCTCCCTCCA CATCTGGCGC AGGACCAATC ACTGGZ'1'ATA 1800 TCATTCAGTA CTACAGTCCA

GACCTCGGAC AGACGTGaTT TAACATTCCA GACTACCiTGG CATCTACTGA1860 AAGGC~PCTGA AACCATCTCA CTCGTATATG TTTG'rGAfiTC 1920 G11GCAGAAAA Z'GAGAAAGaT

ATTL;GAACGC CG11GT~TGTC GTCC~CiCTCTC GTTACCACTA 1980 GCAAOCCAGC AGCTCAAGTT

CACTTCGGAA

TTTGTTCTGG

A71GAAG1IGf'sA AAC'1'TGAAGA GCTGATT~iAT G('sTTACT11CA2160 TGAGTGGAG AGGGCCTCCA

CTATGTTGTT

TCAAATTTAA TGCCATTCAC CAACTATGAG TT'TTTCQTGA TTCCTTATCA2280 TTCCQGAGTT

AGCTCCACCT

TCGTATCTCT

AATTGTTATT

aTTGGTCAAa cacccAACAA cAAZrGGAAC ATr.ACTACAA AcaACAGAacz52o Tc~ccAGTaTT

ACTCZ~'CC ATTTAC~'~GAC TGaAATGACa TATAAJlATTC GT<iTAOCaaC2580 TA~aAAQCAAT

GGTGGAGTTG GAGTCTCACA TGCiAlICGAGT CiAAaTCATCA 2640 TaAATCAA,CiA CACGCTaaAA

AAACACCTTG CTGCTCAACA AGAAAACGAA TCATTTTTGT ATOGGCTCiAT2700 CAATAAATCT

CATGTTCCTG TGATTGTCAT TGTTGCAATT CTaATTATTT TCaTAGTCAT2760 CATTATAGCC

TATTaI"1'ACT GGAaaAATAG CAGAAACAGT GATaGAAIIGG 2820 ATCGAAGTTT TATAAAGATC

AATGATGGAA aTat'TCATAT GGCTTCGAaIT AATCTTTGaG ATGTTGCACA2880 AAATCCGAAT

CAGAATCCAA TaTACAACAC TGCTGGAAaA ATaACTATGA ACAATAGAAA2940 TOGCCAaGCT

CTACAGTGGA

ACaAZ'aCACA GACCAQQATC CGA~C~CATCAC TATCATTATG 3060 CTCAACTaAC lbaCGGACCT

TCCATATaCC

ACCACAACAC TGGTCC1~T~ GAACCAACAA CCAGC'M'OGC TCAATGACAA3180 AATGCTTCGC

GCGCCAGCAA TGCCAACAAA TCCCGTGCCA CCA~AGCCAC CGGCGCGATA3240 TdCAGATCAT

ACCGCTGGAA GACGATCTCa ATCGAGCCaT GCATCCGATG GGAGAGGAAC3300 TCTGAATaGC

GGACTCCATC ACCGGACTAG CGGAAGTCAA CGGT'CGGATA GTCCACCTCA3360 CACAGATGTG

AGCTATGTTC AGCTTCACTC ATCCGATGaA ACTaGTAaTA GTAAf'~GAAAG3420 AACTOCbQAG

CGCiAGAACAC CACCGAATAA GACTCTGATG GACTTTATTC CGCCACCACC3480 TTCCAATCCA

CCACCACCTa GAGGGCAGGT TTATGACACA GCAACTAaOC GTCAGTTaAA3540 TCGTGGAAGT

ACTCCACGAG AAGACACCTA CGATTCGGTC AGTi3ACGGAG CTTTTGCTCG3600 GaTTGATGTG

AATGCAAGGC CAACGAGTCCi GAATCGGAAT TTGpGAGGAA GGCCGCTaAR3660 AGGGAAACGA

GACGACGATA GTCAGCGGTC TTCGTTGATa ATQaACGATG ATaGTGGATC3720 TTCTGAAaCT

QACGf~GGAGiA AC'i'CTGAAGG AGACGTTCCG CGTaGAGGTG 3780 TTAGAAAAGC AGTTCCTCGA

ATGGaTATCT CTGCAAGTAC GCTGGCTCAT AGTTGTTACG GGACAAACaG3840 CACTGCTCAA

CGATTCCGGT CAATTL~CACG TAACAATaaA ATCGTCACAC AAaAACAAAC3894 TTaA

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1297 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Met Tyr Tyr Leu Gly Phe Tyr His Thr His Thr His Thr His Thr Tyr Ile Asn Phe Asp Lys Ile Pro Asn Ala Ser Asn Leu Ala Pro Val Ile Ile Glu His Pro Ile Asp Val Val Val Ser RrQ Gly Ser Pro Ala Thr Leu Asn Cys Gly Ala Lys Pro Ser Thr Ala Lys Ile Thr Trp Tyr Lys Asp Gly Gln Pro Val Ile Thr Asn Lys Glu Gln Val Asn Ser His Arg s5 70 75 eo Ile Val Leu Asp Thr Gly Ser Leu Phe Leu Leu Lys Val Asn Ser Gly Lys Asn Gly Lys Asp Sar Asp Ala Gly Ala Tyr Tyr Cys Val Ala Ser Asn Glu His Gly Glu Val Lys Ser Aan Glu Gly Ser Leu Lys Lau Ala Mst Leu Arg Glu Asp Phe Arg Val Arg Pro Arg Thr Val Gln Ala Leu Gly Gly Glu Met Ala Val Leu Glu Cys Ser Pro Pro ArQ Gly Phe Pro Glu Pro Val Val Ser Trp ArQ Lys Asp Asp Lys Glu Leu ArQ Ile Gln Asp Met Pro Arg Tyr Thr Leu His Ser Asp Gly Asn Leu Ile Ile Asp Pro Val Aep Arg Ser Asp Ser Gly Thr Tyr Gln Cya Val Ala Asn Asn Met Val Gly Glu Asv Val Ser Asn Pro Ala Arg Leu Ser Val Phe Glu alo als a2o Lys Pro Lys Phe Glu Gln Glu Pxo Lys Asp Met Thr Val Asp Val Gly Ala Ala Val Leu Phe Asp Cys Arg Val Thr Gly Asp Pro Gln Pro Gln 245 250 ~ 255 Ile Thr Trp Lys ArQ Lys Asn Glu Pro Met Pro Val Thr Arg Ala Tyr ile Ala Lys Asp Asn Arg Gly Leu ArQ ile Glu Arg Val Gln Pro Ser Asp Glu Gly Glu Tyr Val Cys Tyr Ala Arg Asn Pro Ala Gly Thr Leu Glu Ala Ser Ala His Leu Ary Val Gln Ala Pro Pro Ser Phe Gln Thr Lys Pro Ala Asp Gln Ser Val Pro Ala Gly Gly Thr Ala Thr Phe Glu Cys Thr Leu Val Gly Gln Pro Ser Pro Ala Tyr Phe Trp Ser Lys Glu Gly Gln Gln Asp Leu Leu Phe Pro Ser Tyr Val Ser Ala Asp Gly Arg Thr Lys Val Ser Pro Thr Gly Thr Leu Thr Ile Glu Glu Val Ary Gln Val Asp Glu Gly Ala Tyr Val Cys Ala Gly Met Asn Ser Ala Gly Ser Ser Leu Ser Lys Ala Ala Leu Lys Ala Thr Phe Glu Thr Lys Gly ArQ

Val Gln Lys Lys Lys Ser Lys Met Gly Lys Gln Lys Gln Lys Asn Val Gln Ser Ile Ile Lys Tyr Leu Ile Ser Ala Val Thr Gly Asn Thr Pro Ala Lys Pro Pro Pro Thr Ile Glu His Gly His Gln Asn Gln Thr Leu Met Val Gly Ser Ser Ala Ile Leu Pro Cys Gln Ala Ser Gly Lye Pro Thr Pro Gly Ile Ser Trp Leu ArQ Asp Gly Leu Pro Ile Asp Ile Thr Asp Ser ArQ Ile Ser Gln His Ser Thr Gly Ser Leu His Ile Ala Asp Leu Lys Lys Pro Asp Thr Gly Val Tyr Thr Cys Ile Ala Lys Asn Glu Asp Gly Glu Ser Thr Trp Sar Ala Ser Leu Thr Val Glu Asp His Thr Ser Asn Ala Gln Phe Val Arg Met Pro Asp Pro Ser Asn Phe Pro Ser Ser Pro Thr Gln Pro Ile Ile Val Asn Val Thr Asp Thr Glu Val Glu 565 5"TO 575 Leu His Trp Asn Ala Pro Ser Thr Ser Gly Ala Gly Pro Ile Thr Gly Tyr Ile Ile Gln Tyr Tyr Ser Pro Asp Leu Gly Gln Thr Trp Phe Asn Ile Pro Asp Tyr Val Ala Ser Thr Glu Tyr Arg Ile Lys Gly Leu Lys Pro Ser His Ser Tyr Met Phe Val Ile Arg Ala Glu Asn Glu Lya Gly Ile Gly Thr Pro Ser Val Ser Ser Ala Leu Val Thr Thr Sar Lys Pro Ala Ala Gln Val Ala Leu Ser Aap Lys Asa Lys Met Asp Met Ala Ile Ala Glu Lys Arg Leu Thr Ser Glu Gln Leu Ile Lys Leu Glu Glu Val Lys Thr ile Asn Ser Thr Ala Val Arg Leu Phe Trp Lys Lys Arg Lys Leu Glu Glu Leu Ile Asp Gly Tyr Tyr ile Lys Trp Arg Gly Pro Pro Arg Thr Asn Asp Asn Gln Tyr Val A:rs Val Thr Ser Pro Ser Thr Glu Asn Tyr Val Val Ser Asn Leu Met Pro Phe Thr Asn Tyr Glu Phe Phe 740 '145 ?50 Val Ile Pro Tyr His Ser Gly Val His Ser Ile His Gly Ala Pro Ser Asn Ser Met Asp Val Leu Thr Ala Glu Ala Pro Pro Ser Leu Pro Pro Glu Asp Val Arg Ile Arg Met Leu Asn Leu Thr Thr Leu Arg Ile Ser Trp Lys Ala Pro Lys Ala Asp Gly Ile Asn Gly Ile Leu Lys Gly Phe Gln Ile Val Ile Val Gly Gln Ala Pro Asn Asn Aan Arg Asn Ile Thr Thr Aan Glu Arg Ala Ala Ser Val Thr Leu Phe His Leu Val Thr Gly Met Thr Tyr Lys Ile Arg Val Ala Ala Arg Ser Asn Gly Gly Val Gly Val Ser His Gly Thr Ser Glu Vai Ile Met Asn Gln Asp Thr Leu Glu Lys His Leu Ala Ala Gln Gln Glu Asn Glu Ser Phe Leu Tyr Gly Leu Ile As~i Lye Ser His Val Pro Val Ile Val Ile Val Ala 
Ile Leu Ile Ile Phe Val Val Ile Ile Ile Ala Tyr Cys Tyr Trp Arg Asn Ser Arg Asn Ser Asp Gly Lys Asp Arg Ser Phe Ile Lys Ile Asn Asp Gly Ser Val His Met Ala Ser Asn Asn Leu Trp Asp Val Ala Gln Asn Pro Asn Gln Asn Pro Met Tyr Asn Thr Ala Gly Arg Met Thr Met Asn Asn Arg Asn Gly Gln Ala Leu Tyr Ser Leu Thr Pro Asn Ala Gln Asp Phe Phe Asn Asn Cys Asp Asp Tyr Ser Gly Thr Met His Arg Pro Gly Ser Glu His His Tyr His Tyr Ala Gln Leu Thr Gly Gly Pro Gly Asn Ala Met Ser Thr Phe Tyr Gly Asn Gln Tyr His Asp Asp Pro Ser Pro Tyr Ala Thr Thr Thr Leu Val Leu Ser Asn Gln Gln Pro Ala Trp Leu Asn Asp Lys Met Leu Arg Ala Pro Ala Met Pro Thr Asn Pro Val Pro Pro Glu Pro Pro Ala Arg Tyr Ala Asp His Thr Ala Gly Arg Arg Ser Arg Ser Ser Arg Ala Ser Asp Gly Arg Gly Thr Leu Asn Gly Gly Leu His His Arg Thr Ser Gly Ser Gln Arg Ser Asp Ser Pro Pro His Thr Asp Val Ser Tyr Val Gln Leu His Ser Ser Asp Gly Thr Gly Ser Ser Lys Glu Arg Thr Gly Glu Arg Arg Thr Pro Pro Asn Lys Thr Leu Met Asp Phe Ile Pro Pro Pro Pro Ser Asn Pro Pro Pro Pro Gly Gly His Val Tyr Asp Thr Ala Thr Arg Arg Gln Leu Asn Arg Gly Ser Thr Pro Arg Glu Asp Thr Tyr Asp Ser Val Ser Asp Gly Ala Phe Ala Arg Val Asp Val Asn Ala Arg Pro Thr Ser Arg Asn Arg Asn Leu Gly Gly Arg Pro Leu Lys Gly Lys Arg Asp Asp Asp Ser Gln Arg Ser Ser Leu Met Met Asp Asp Asp Gly Gly Ser Ser Glu Ala Asp Gly Glu Asn Ser Glu Gly Asp Val Pro Arg Gly Gly Val Arg Lys Ala Val Pro Arg Met Gly Ile Ser Ala Ser Thr Leu Ala His Ser Cys Tyr Gly Thr Asn Gly Thr Ala Gln Arg Phe Arg Ser Ile Pro Arg Asn Asn Gly Ile Val Thr Gln Glu Gln Thr
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4956 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

ATGAAATGGA AACATGTTCC T't'TTTTGGTC ATGATATCAC TCCTCAGCTT60 ATCCCCAAAT

GAACGACCAC

AGGCTCCCGT

GATTGTCTCA

AAAG'GAGAAC CTGCAACTTT GAACTOCAAA GCTG7~AC'.GCC 300 ACACCGAATG

TTGCTGCCGA GTGGATCTTT A't'TTTTCTTA CGTATAGTAC ATGGACGGAA420 AAGTAGACCT

GAGCCACAAT

GGATGTCATG

TCCTGAGCCC

ACCATTTCAT GGAAGAAA,GA TGGCTCTCCA CTGGATGATA AAGATGRAAG660 AATAACTATA

CGAGGACiGAA AGCTCATGAT CACTTACACC CGTAAAAGTG ACGCTGGCAA720 ATATGTTTGT

GTTGGTACCA ATATGGTIGCi GGAACGTGAG AGTGAAGTAG CCGAGCTGAC780 TGTCTTAGAG

AGACCATCAT TTGTGAAGAG ACCCACiTAAC TTGQCAGTA''~A 840 CTGTGGATGA CAGTGCAGAA

AGATGATGGA

GA0CIbCCCA AATCCAGATA TGAAATCCGA GATGATCATA CCTTGAAAAT960 TAGGAAGGTG

ACAQCTGGTG ACATGGG'1'TC ATACACTTGT GTTaCAGAI1A 1020 ATATGGTGGG CAAAGCTGAA

CCGTGACCAG

GTT'(3'!'TGCTT TGGGACOGAC TGTAACTTTT CAGTGTGAAG 1140 CAACCGGAAA TCCTCAACCA

GCTATTTTCT GGAGGAGAGA AGGGACiTCAG AATCTACTTT TCTCATATCA1200 ACCACCACAG

TGTCCAGCGA

TCTGATGTTG GTTATTACAT CTGCCAGACT TTAAATC~I~S CTOGAAGCAT1320 CATCACAAAG

TCGACAAGGT

ccTaTG~ATC AaACZwrAOC eoTC~G~Tacc ACrrT~aTCC Tc~aCTaraT1440 GcccACA~ooc AGTCCAGTGC CCACCATTCT GTGGAGAAACf GAT3GaGTCC TCGTTTCAAC1500 CCAAGACTCT

CGAATCAA71C AGTTGtiAGAA TaGJuGTACni CAGATCCGAT 1560 A'1GCTAA~3CT GGGTGATACT

TGCTTACATT

G~IA~tSTTCA11G AATTTGGJ1~GT TCCAGTTCAG CCTCCAAGAC 1680 CTACTGACCC AAATTTAATC

CCTAGTGCCC CATCAAAACC TGA~IG'TGACA GATGTCAGCA GAAATACAGT1740 CACATTATCG

TGGCAACCAA ATTTGAATTC AGGM~CAA~CT CCAACATCTT ATATTATAGA1800 AGCCTTCAGC

CATGCATCTG GTAQCAOCTB GCAG7~CCGTA GCAGAG)IATG TaA7~A11CAGA1860 AACATCTGCC

ATTAAAQQAC TCAAACCTAA TGCAATTTAC CTTTTCC1'TG TGAGQGCAGC1920 TAATGCATAT

GGAATTAOTG ATCCAAGCCA AATATCAGAT CCAGTGAAAA CACAACiATGT1980 CCTACCAACA

AGTCACTOGACCACAA GCAGdTCCACi AGAG111GCTGG GAAATG(:TGT2040 TCTGCACCTC

AGATCAACAG

TCTCAGTATA TACAAaGATA TAAIWTTCTC TATC'C~3CCAT C.Tt~GAGCCAA216 CCACOGAC~AA 0 TCAGACTGGT TAG'TT'1'TTGA AGTGAGGACG CCAQCCAAAA 2220 ACAGTGTGGT AATCCCTGAT

ATTTCAAGGA

CCCACCCCAA

GGTGTAACTG TATCCAAGAA TGATGGAAAC GGAACTGC'.AA TTCTAGTTAG2400 TTGGCAGCCA

CCTCCAGAAG ACACTCAAAA TGGiIATGGTC CAAGACiTATA AGGTTTGGTG2460 TCTGGOCAAT

QGTCATTCCC

TTTCr'1'GTTC CrGGAATCCG ATACAt~~1'G GAAtiT00CAG 2580 CCAGCACTGG GOCTGGGTCT

GGOCiTAAAQA GTGAaCCTCA GTTCATCCAG CTGGATGCCC ATGGAAACCC2640 TGTGTCACCT

GAGGACCAAl3 TCMiCCTCQC TCA1GCAGATT TCAGATGTGG TGA~IIG(:AQCC2700 OGCCTTCATA

GCAGGTATTG GAGCAGCCTG TTGGATCATC CTCATOGZ'CT TCAGCATCTG2760 GCTTTATCGA

CACCG('.AAGA AGAaAAACGG ACTT11CTA(iT ACCTACGCQQ 2820 GTATCAGAAA AGTCCCGTCT

CAGTGGAQGG

A~1OCCTGG~1C TTCTCAACAT CAG~KiAACCT GCCGCQCAGC 2940 CATGGCTi~C AQJ4CACGTGG

AGGCAATGGA

AACAGCGACA GCA1~CCTCAC TACCTACAGT CGCCC11GCTG ATTGTATAGC3060 AAATTATAAC

AACCAA,CTGG ATAACAAACA AACAAATCTG AT13CTCCC1'(t~ 312 GZ'OGi~CCTTA GTAACAAAAT CAATGAGATG A71AACC1"1'CA 3180 ATAGCCCAliA TCTGA71C3GAT

GGGCGTTTTG TCAATCCATC AGGGCAGCCT ACTCCl"TACG CCACCACTCA3240 GCrCATCCAG

TCAAACCTCA GCAAC11AC',AT GAACAATGGC AGCGOOGalCT 3300 CTC~CiCGA~f'i7~A GCAGTGGAAA

CCACTL~GAC AGCAGAAACA AO;AA~t3TGGCA CCAGTTCAGT 3360 ACAACATCGT GGAGCAAAAC

ATACAACCAA

TCATACGACC AGAACAC'.AOG AGGATCCTAC AACAGCTCAG ACCGOGGCAG3480 TAGTACATCT

GGGAGTCAGG GGCACAACiRA AQGGOCAAGA ACACCCAAGG TACCAAAA,CA3540 GGGTGQCATG

AAt:TOGGCAG ACCTGCTTCC TCCTCCCCCA GCACATCCTC CTCCACACAG3600 CAATAGCGAA

GAGTACAACA T'ITC~GTAGA TGAAAGCTAT GACCAAGAAA TGCCATGTCC3660 CGTGCCACCA

GCAAGQATGT ATTTGCAACA AGATGAATTA GAAC'aAGGAOG AAGA'1'GAACG3720 AGGCCCCACT

CCCCCTdTTC GGGGA~GCAGC TTCTTCTCCA GCTGCCQTGT CCTATAGCCA3780 TCAGTCCACT

TTGTCCAGAG

GAGACTGGCC ACATGCAGCA CCAGCCCGAC AGGAGrACGGC AGCCTGTGAG3900 TCCTCCTCCA

CCACCACGGC CGATCTCCCC TCCACATACC TATOGCTACA TTTCAGCiACC3960 CCTGGTCTCA

GGTAGCCAAG

CAGTGTTGGG

GACCTGGAGA GCTCTGTCAC GGGaTCCATG ATCAACGGCT GGGGCTCAGC4140 CTCAGAGGAG

TTTCACTGAT

AGTAGCACGA

CGGCAAATGC AGGATGCTGC TGGCCGTCGA CATTT'TCATG CGTCTCAGTG4320 CCCTAGGCCC

AACCAGACCA

GCCAAGAAAC TGAAACACCA GCCAGGACAT CTCiCGCAGAG AAACCTACAC4440 AGATGATCTT

CAAGACACAG

CTGGAAGTAC GACCTGTAGT GGTGCCA~1AA CTCCCTTCTA TQGATGCAAG4560 AACAGACAGA

TCATCAGACA GAAAAGGAAG CAGTTACAAG GGGAGAGAAG TCiTTGGATGG4620 AAGACAGGTT

GTTGACATGC GAACAAATCC ACiGTGATCCC AGAGAAGCAC AGGAACAGCA4680 AAATGACGGG

AAAGCiACGTG GAAACAAGGC AGCAAAACGA GACCTTCCAC CAGCAAAGAC4740 TCATCTCATC

CAAGAGGATA TTCTACCTTA TTCiTAGACCT ACTTTTCCAA CATCAAATAA4800 TCCCAGAGAT

CCCAGTTCCT CAAGCTCAAT GTCA2'CAAGA GGATCAGGAA GCAGACAAAG4860 AGAACAAGCA

AAGAGGACiAA

GATAATAATG AAGAATTAGA GGAAACTGAA AGCTGA 4956
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1651 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
Met Lys Trp Lys His Val Pro Phe Leu Val Met Ile Ser Leu Leu Ser Leu Ser Pro Asn His Leu Phe Leu Ala Gln Leu Ile Pro Asp Pro Glu Asp Val Glu ArQ Gly Asn Aop Hi: Gly Thr Pro Ile Pro Thr Ser Asp Asn Asp Asp Asn Ser Leu Gly Tyr Thr Gly Ser Az~ Leu ArQ Gln Glu Asp Phe Pro Pro Arg Ile Val Glu Hfs Pro Ser Asp Leu Ile Val Ser Lys Gly Glu Pro Ala Thr Leu Aan Cys Lya Ala Glu Gly Arg Pro Thr pro Thr Ile Glu Trp Tyr Lys Gly Gly Glu Arg Val Glu Thr Asp Lys Asp Asp Pro Arg Ser His Arg Met Leu Leu Pro Ser Gly Ser Leu Phe Phe Leu Arg Ile Val His Gly Arg Lys Ser Arg Pro Asp Glu Gly Val Tyr Val Cys Val Ala Arg Asn Tyr Leu Gly Glu Ala Val Ser His Asn Ala Ser Leu Glu Val Ala Ile Leu Arg Asp Asp Phe Arg Gln Asn Pro Ser Asp Val Met Val Ala Val Gly Glu Pro Ala Val Met Glu Cys Gln Pro Pro Arg Gly His Pro Glu Pro Thr Ile Ser Trp Lys Lys Asp Gly Ser Pro Leu Asp Asp Lys Asp Glu Arg Ile Thr Ile Arg Gly Gly Lys 210 215 ~ 220 Leu Met Ile Thr Tyr Thr Arg Lys Ser Asp Ala Gly Lys Tyr Val Cys Val Gly Thr Asn Met Val Gly Glu Arg Glu Ser Glu Val Ala Glu Leu Thr Val Leu Glu ArQ Pro Ser Phe Val Lys Arg Pro Ser Asn Leu Ala Val Thr Val Rsp Asp Ser Ala Glu Phe Lys Cys Glu Ala Arg Gly Asp Pro Val Pro Thr Val Arg Trp Arg Lys Asp Asp Gly Glu Leu Pro Lys Ser Arg Tyr Glu Ile Arg Asp Asp His Thr Leu Lys Ile Arg Lys Val Thr Ala Gly Asp Met Gly Sex Tyr Thr Cys Val Ala Glu Aaa Mat Val Gly Lya Ala Glu Ala Ser Ala Thr Leu Thr Val Gln Glu Pro Pro His Phe Val Val Lys Pro Arg Asp Gln Val Val Ala Leu Gly Arg Thr Val Thr Phe Gln Cys Glu Ala Thr Gly Asn Pro Gln Pro Ala Ile Phe Trp Arg Ary Glu Gly Ser Gln Asn Lsu Leu Phs Ser Tyr Gla Pro Pro Gln Ser Ser Ser Arg Phe Ser Val Ser Gln Thr Gly Asp Leu Thr ITe Thr Asn Val Gln Arg Ser Asp Val Gly Tyr Tyr Ile Care Gln Thr Leu Asn 4a0 4a5 d30 Val Ala Gly Ser Ile Its Thr Lys Ala Tyr Leu Olu Val Thr Asp Val ~35 440 445 Ile Ala Asp Arg Pro Pro Pro Val Ile Arg Gln Gly Pro Val Asn Gln Thr Val Ala Val Asp Gly Thr Phe Val Leu Ssr Cys Val Ala Thr Gly Ser Pro Val Pro Thr Ile Leu Trp Arg Lys 
Asp Gly Val Leu Val Ser Thr Gln Asp Ser Arg ile Lys Gln Leu Glu Asn Gly Val Leu Gln Ile Arg Tyr Ala Lya Leu Gly Asp Thr Gly Arg Tyr Thr Cys Ila Ala Ser Thr Pro Ser Gly Glu Ala Thr Trp Ssr Ala Tyr ile Glu Val Oln Glu Phe Gly Val Pro Val Gln Pro Pro ArQ Pro Thr Asp Pro Aan Leu Ile Pro Ser Ala Pro Ser Lys Pro Glu Val Thr Asp Val Ser Arg Asri Thr Val Thr Lsu Ser Trp Gln Pro Asn Leu Asn Ser Gly Ala Thr Pro Thr Ser Tyr Ile Ile Glu Ala Phe Sex His Ala Ser Gly Ssr Ser Trp Gln Thr Val Ala Glu Asn Val Lys Thr Glu Thr Ser Ala Ile Lys Gly Leu Lys Pro Asn Ala Ile Tyr Leu Phe Leu Val Arg Ala Ala Asn Ala Tyr Gly Its Ser Aap Pro Ser Gln Ile Ser Asp Pro Val Lys Thr Gln Asp Val Leu Pro Thr Ser Gln Gly Val Asp His Lys Gln Val Gln Arg Glu sso ss5 s7o Leu Gly Asn Ala Val Lsu His Leu His Asn Pro Thr Val Leu Ser Ser Ser Ser Ile Glu Val His Trp Thr Val Asp Gln Gln Ser Gln Tyr Ile Gln Gly Tyr Lys Ile Lsu Tyr Arq Pro Ser Gly Ala Asn His Gly Glu Ser Asp Trp Leu Val Phe Glu Val Arg Thr Pro Ala Lys Asn Ser Val Val Ile Pro Asp Leu Arg Lys Gly Val Asn Tyr Glu Ile Lys Ala Arg Pro Phe Phe Asn Glu Phe Gln Gly Ala Asp Ssr Glu Ile Lys Phs Ala Lys Thr Leu Glu Glu Ala Pro Ser Ala Pro Pro Gln Gly Val Thr Val Ser Lys Asn Asp Gly Asn Gly Thr Ala Its Leu Val Ser Trp Gln Pro Pro Pro Glu Asp Thr Gln Asn Gly Met Val Gln Glu Tyr Lys Val Trp Cys Leu Gly Asn Glu Thr Ark Tyr His Ile Asn Lys Thr Val Asp Gly Ser Thr Phe Ssr Val Val Ile Pro Phs Leu Val Pro Gly Ile Arg Tyr Ser Val Glu Val Ala Ala Ser Thr Gly Ala Gly Ser Gly Val Lys Ser Glu Pro Gln Phe Ile Gln Leu Asp Ala His Gly Asn Pro Val Ser Pro Glu Asp Oln Val Ser Leu Ala~Gln Gln Its Ser Asp Val Val Lys Gln Pro Ala Phe Ile Ala Gly Ile Gly Ala Ala Cys Trp Ile Ile Leu Mst Val Phe Ser Ile Trp Leu Tyr ArQ His ArQ Lys Lys Arq Asn Gly Leu Thr Ser Thr Tyr Ala Gly Ila ArQ Lys Val Pro Ser Pha Thr Phe Thr Pro Thr Val Thr Tyr Gln ArQ Gly Gly Glu Ala Val Ser Ser Gly Gly Arg Pro Gly Lsu Leu Asn Ile Ser Glu Pro Ala Ala Gln Pro Trp Lsu Ala Asp Thr Trp Pro Asn Thr Gly Asn Asn 
His Asn Asp Cys Ser Ile Ser Cys Cya Thr Ala Gly Asn Gly Asn Ser Asp Ser Asn Leu Thr Thr Tyr Ser Arg Pro Ala Asp Cys Ile Ala Asn Tyr Asn Asn Gln Leu Asp Aan Lys Oln Thr Asn Leu Met Leu Pro Glu Ser Thr Val Tyr Gly Asp Val Asp Leu Ser Asn Lye Ile Asn Glu Mat Lys Thr Phe Asn Ser Pro Asn Leu Lys Asp Gly Arg Phe Val Asn Pro Ser Gly Gln Pro Thr Pro Tyr Ala Thr Thr G Ile Gln Ser Asn Leu Ser Asn Asn Met Asn Asn Gly Ser Gly Asp Ser Gly Glu Lys His Trp Lys Pro Leu Gly Gln Gln Lys Gln Glu Val Ala Pro Val Gin Tyr Asn Ile Val Glu Gln Aan Lys Leu Asn,Lys Asp Tyr Arg Ala Asn Asp Thr Val Pro Pro Thr ile Pro Tyr Asn Gln Ssr Tyr Asp Gln Asn Thr Gly Gly Ser Tyr Asn Ser Ser Asp Arg Gly Ser Ser Thr Ser Gly Ser Gln Gly His Lys Lys Gly Ala Arg Thr Pro Lys Val Pro Lya Gln Gly G1y Met Asn Trp Ala Asp Leu Leu Pro Pro Pro Pro Ala His Pro Pro Pro His Ser Asn Ser Glu 1185 1190 1195 . 1200 Glu Tyr Asn Ile Ser.Val Asp Glu Ser Tyr Asp Gln Glu Mst Pro Cys Pro Val Pro Pro Ala Arg Met Tyr Leu Gln Oln Asy Glu Leu Glu Glu Glu Glu Asp Glu Arg Gly Pro Thr Pro Pro Val Arg Gly Ala Ala Ssr Ser Pro Ala Ala Val Ser Tyr Ser His Gln Ser Thr Ala Thr Leu Thr Pro Ser Pro Gln Glu Glu Leu Gln Pro Mst Lsu Gln Asp Cys Pro Glu Glu Thr Gly His Met Gln His,Gln Pro Asp Arg Arg Arg Gln Pro Val Ser Pro Pro Pro Pro Pro Arg Pro ile Ser Pro Pro His Thr Tyr Gly Tyr Ile Ser Gly Pro Leu Val Ser Asp Met Asp Thr Asp Ala Pro Glu Glu Glu Glu Asp Glu Ala Asp Met Glu Val Ala Lys Met Gln Thr Arg Arg Leu Leu Leu Arg Gly Leu Glu Gln Thr Pro Ala Ser Ser Val Gly Asp Leu Glu Ser Ser Val Thr Gly Ser Met Ile Asn Gly Trp Gly Ser Ala Ser Glu Glu Asp Aan Ile Ser Ser Gly Arg Ser Ser Val Ser Ser Ser Asp Gly Ser Phe Phe Thr Asp Ala Asp Phe Ala Gln Ala Val Ala Ala Ala Ala Glu Tyr Ala Gly Leu Lya Val Ala Arg Arg Gln Met Gln Aep Ala Ala Gly Arg Arg His Phe His Ala Ser Gln Cys Pro Arg Pro Thr Ser Pro Val Ser Thr Asp Ser Asn Met Ser Ala Ala Val Met Gln Lys Thr Arg Pro Ala Lys Lya Leu Lys His Gln Pro Gly His Leu Arg Arg Glu Thr Tyr Thr Asp Asp Leu Pro Pro 
Pro Pro Val Pro Pro Pro Ala Ile Lys Ser Pro Thr Ala Gln Ser Lys Thr Gln Leu Glu Val Arg Pro Val Val Val Pro Lys Leu Pro Ser Met Asp Ala Arg Thr Asp Arg Ser Ser Asp Arg Lys Gly Ser Ser Tyr Lys Gly Arg Glu Val Leu Asp Gly Arg Gln Val Val Asp Met Arg Thr Asn Pro Gly Asp Pro Arg Glu Ala Gln Glu Gln Gln Asn Asp Gly Lys Gly Arg Gly Asn Lys Ala Ala Lys Arg Asp Leu Pro Pro Ala Lys Thr His Leu Ile Gln Glu Asp Ile Leu Pro Tyr Cys Arg Pro Thr Phe Pro Thr Ser Asn Asn Pro Arg Asp Pro Ser Ser Ser Ser Ser Met Ser Ser Arg Gly Ser Gly Ser Arg Gln Arg Glu Gln Ala Asn Val Gly Arg Arg Asn Ile Ala Glu Met Gln Val Leu Gly Gly Tyr Glu Arg Gly Glu Asp Asn Asn Glu Glu Leu Glu Glu Thr Glu Ser
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1300 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 855..1187
(D) OTHER INFORMATION: /note= 'N signifies gap in sequence'
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
CAGATTGTTG CTCAAGGTCG AACAGTGACA TTTCCCTGTG AAACTAAAGG AAACCCACAG 60

CCAACCCCAG

CAQCCCAACA GTAGATGCTC AGTGTCACCA ACTGaAGACC 1'CACAATCAC180 CAACATTCAA

CGTTCCGACG CGGGTTACTA CATCTC,1CCAG GCTTTAACTG TGGCAGGAAG140 CATTTTAGCA

AAAGCTCAAC TOGAGGTTAC TGATLiTTTTG ACAGATAG1CC CTCCACCTAT300 AATTCTACAA

TAAAGCCACT

GGTGATCCTC TTCCTGTAAT TAGCTGGTTA AAG<iMsGGAT TTACTTTTCC4Z0 GGGTAGAGAT

CCAAGAaCAA CAATTCAAGA GCAAGQCACA CTCiCAGATTA AGAATTTACGdB0 GATTTCTtiAT

ACTOGCACTT ATACTTG'1'GT Q6CTACAAGT TCAAG~'GGAG AGQCTTCCTG540 GAGTGCAGTG

CTGG)1TGTGA C~1'CTGG ApCAACAATC AGTAAAAACT ATGATTTAAG600 TGACCTGCG

GGGCCACCAT CCAAACCGCA AGTCACZ'CiAT GTTACTAAGA ACAGZbTCAC660 CTTGTCCTGG

CAGCCAGGTA CCCCTt3GAAC CCTTCCAGCA AGTGCATATA TCATTGAGGC720 TTTCAI~sCCAA

CTATACTGTA

AGAGGACTGC GGCCCAATAC AATCTACTTA TTCATGG'I'CA GAGCGATCAA840 CCCCAAGGTY

TCA~3ACCC AAtff~TAAACC ACAGAAAAAC AATGGATCCA CTTaOt?CCAA900 TGTCCCTCTA

CCTCCCCCCC CAOTCCAGCC CCTTCCTGGC ACGGAGCTG(i AACACTATGC960 AGTGGAACAA

CAM~~1AJ1ATG GCTAT<iACAG TGATAGCTOG TGCCCACCAT 1020 TGCCAGTACA AACTTACTTA

CACCAAQGTC TGGAAGATGA ACTGGAM~AA GATGATGATA GGGTCCCAAC1080 ACCTCCTGTT

CGAGGCG'fGG CTTCTTCTCC TGCTATCTCC TT"1'GGACAGC 1140 AGTCCACTGC AACTCTTACT

CCA'fCCCCAC GOGAAOAGAT OCAACCCATG CTOCAGGC'fT CACCTNTTTA1200 CCTCCTCTCA

AAaACCTCGA CCTACCAGCC CATTTTCTAC TGACAGTAAC ACCA~3TGCAG1260 CCCTGAGTCA

AAGTCA~GJ~G CCTCGOCCCA CTAAAAAACA C7lAt~C;l~C,iOG 1300
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 434 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 285..396
(D) OTHER INFORMATION: /note= 'Xaa signifies gap in sequence'
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Gln Ile Val Ala Gln Gly Arg Thr Val Thr Phe Pro Cys Glu Thr Lys Gly Asn Pro Gln Pro Ala Val Phe Trp Gln Lys Glu Gly Ser Gln Asn Leu Leu Phe Pro Asn Gln Pro Gln Gln Pro Asn Ser Arg Cys Ser Val Ser Pro Thr Gly Asp Leu Thr Ile Thr Asn Ile Gln Arg Ser Asp Ala Gly Tyr Tyr Ile Cys Gln Ala Leu Thr Val Ala Gly Ser Ile Leu Ala Lys Ala Gln Leu Glu Val Thr Asp Val Leu Thr Asp Arg Pro Pro Pro Ile Ile Leu Gln Gly Pro Ala Asn Gln Thr Leu Ala Val Asp Gly Thr Ala Leu Leu Lys Cys Lys Ala Thr Gly Asp Pro Leu Pro Val Ile Ser Trp Leu Lys Glu Gly Phe Thr Phe Pro Gly Arg Asp Pro Arg Ala Thr Ile Gln Glu Gln Gly Thr Leu Gln Ile Lys Asn Leu Arg Ile Ser Asp Thr Gly Thr Tyr Thr Cys Val Ala Thr Ser Ser Ser Gly Glu Ala Ser Trp Ser Ala Val Leu Asp Val Thr Glu Ser Gly Ala Thr Ile Ser Lys Asn Tyr Asp Leu Ser Asp Leu Pro Gly Pro Pro Ser Lys Pro Gln Val Thr Asp Val Thr Lys Asn Ser Val Thr Leu Ser Trp Gln Pro Gly Thr Pro Gly Thr Leu Pro Ala Ser Ala Tyr Ile Ile Glu Ala Phe Ser Gln Ser Val Ser Asn Ser Trp Gln Thr Val Ala Asn His Val Lys Thr Thr Leu Tyr Thr Val Arg Gly Leu Arg Pro Asn Thr Ile Tyr Leu Phe Met Val Arg Ala Ile Asn Pro Lys Val Ser Val Thr Gln Xaa Lys Pro Gln Lys Asn Asn Gly Ser Thr Trp Ala Asn Val Pro Leu Pro Pro Pro Pro Val Gln Pro Leu Pro Gly Thr Glu Leu Glu His Tyr Ala Val Glu Gln Gln Glu Asn Gly Tyr Asp Ser Asp Ser Trp Cys Pro Pro Leu Pro Val Gln Thr Tyr Leu His Gln Gly Leu Glu Asp Glu Leu Glu Glu Asp Asp Asp Arg Val Pro Thr Pro Pro Val Arg Gly Val Ala Ser Ser Pro Ala Ile Ser Phe Gly Gln Gln Ser Thr Ala Thr Leu Thr Pro Ser Pro Arg Glu Glu Met Gln Pro Met Leu Gln Ala Ser Pro Xaa Phe Thr Ser Ser Gln Arg Pro Arg Pro Thr Ser Pro Phe Ser Thr Asp Ser Asn Thr Ser Ala Ala Leu Ser Gln Ser Gln Arg Pro Arg Pro Thr Lys Lys His Lys Gly Gly
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 448 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GCCCAGGCAG TTGCTGCAGC TOC(~CiAGTAT GCGOOCCTGA 60 AI~Ci'!'(~CSCTCG CCGCCAAATG

CAAGATGCTG CTGOCCGCCG CCACTTCCAT GCCTCTCA~CiT 120 GCCCAAGGCC CACGAGTCCT

CGCCAACiAAG

CAGA~iACI'~1CC AGCCAOGACA TCTGCGCAGG GAAGGCTACG 240 CAGATGATCT TCCACCCCCT

CCAGTGCCAC CACCTGCTAT AAAATCOCCC ACTC~1~CCAGT 300 CCAAGGCACA GCTGGAGGTA

CGQCCTGTCA TGGTGCCAAA ACTCGCGTCT ATAC~CAA GGACAGATAG360 ATCGTCAGAC

AGAAAAGOAG GCAGTTACRA GGOGAGAGAA GCTCTGGATCi GAAGACAAGT4a0 CACTGACCTG

(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 148 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Ala Gln Ala Val Ala Ala Ala Ala Glu Tyr Ala Gly Leu Lys Val Ala Arg Arg Gln Met Gln Asp Ala Ala Gly Arg Arg His Phe His Ala Ser Gln Cys Pro Arg Pro Thr Ser Pro Val Ser Thr Asp Ser Asn Met Ser Ala Val Val Ile Gln Lys Ala Arg Pro Ala Lys Lys Gln Lys His Gln Pro Gly His Leu Arg Arg Glu Ala Tyr Ala Asp Asp Leu Pro Pro Pro Pro Val Pro Pro Pro Ala Ile Lys Ser Pro Thr Val Gln Ser Lys Ala
Gln Leu Glu Val Arg Pro Val Met Val Pro Lys Leu Ala Ser Ile Glu Ala Arg Thr Asp Arg Ser Ser Asp Arg Lys Gly Gly Ser Tyr Lys Gly Arg Glu Ala Leu Asp Gly Arg Gln Val Thr Asp Leu Arg Thr Asn Pro Ser Asp Pro Arg

Claims (35)

1. An isolated Robo polypeptide comprising a polymer of amino acids of at least 25 consecutive residues of any of SEQ ID NO: 2, 4, 8, 10 or 12, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
2. An isolated Robo polypeptide comprising a polymer of amino acids of at least 50 consecutive residues of any of SEQ ID NO: 2, 4, 8, 10 or 12, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
3. An isolated Robo polypeptide having at least 95% sequence identity to any of SEQ ID NO: 2, 4, 8, 10 or 12, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
4. An isolated polypeptide comprising SEQ ID NO: 12, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
5. An isolated immunogenic Robo polypeptide, capable of eliciting a Robo-specific antibody, selected from the group of:
(a) a polypeptide comprising the amino acid sequence set out in SEQ ID NO:2, 4, 8 or 10 or an immunogenic polypeptide fragment thereof;
(b) an immunogenic polypeptide of SEQ ID NO:2 selected from the group of residues 68-77, 79-94, 95-103, 122-129, 165-176, 181-191, 193-204, 244-251, 274-290, 322-331, 339-347, 407-417, 441-451, 453-474, 502-516, 541-553 and 617-629 of SEQ ID NO:2;
(c) an immunogenic polypeptide of SEQ ID NO:8 selected from the group of residues 1-12, 18-28, 31-40, 45-65, 106-116, 137-145, 174-184, 214-230, 274-286, 314-324, 399-412, 496-507, 548-565, 599-611, 660-671, 717-730, 780-791, 835-847, 877-891, 930-942, 981-998, 1040-1051, 1080-1090, 1154-1168, 1215-1231, 1278-1302, 1378-1400, 1460-1469, 1497-1519, 1606-1626 and 1639-1651 of SEQ ID NO:8; and
(d) an immunogenic polypeptide of SEQ ID NO:10 selected from the group of residues 5-16, 38-47, 83-94, 112-125, 168-180, 195-209, 222-235 and 241-254 of SEQ ID NO:10.
6. The immunogenic polypeptide of claim 5 coupled to a carrier protein.
7. A soluble form of a Robo polypeptide which comprises one or more Robo immunoglobulin domain of SEQ ID NO: 2, 4, 6, 8 or 10, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
8. The polypeptide of claim 7 which is a human Robo polypeptide comprising one or more immunoglobulin domain sequence selected from:
(a) residues 68-167 of SEQ ID NO:8;
(b) residues 168-258 of SEQ ID NO:8;
(c) residues 259-350 of SEQ ID NO:8;
(d) residues 351-450 of SEQ ID NO:8;
(e) residues 451-546 of SEQ ID NO:8;
(f) residues 1-91 of SEQ ID NO:10; and
(g) residues 92-185 of SEQ ID NO:10.
9. A soluble form of a Robo polypeptide which comprises two or more Robo immunoglobulin domains of SEQ ID NO: 2, 4, 6, 8 or 10, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
10. The polypeptide of claim 9 which is a human Robo polypeptide comprising two or more immunoglobulin domain sequences selected from:
(a) residues 68-167 of SEQ ID NO:8;
(b) residues 168-258 of SEQ ID NO:8;
(c) residues 259-350 of SEQ ID NO:8;
(d) residues 351-450 of SEQ ID NO:8;
(e) residues 451-546 of SEQ ID NO:8;
(f) residues 1-67 joined to residues 168-258 of SEQ ID NO:8;
(g) residues 1-67 joined to residues 259-450 of SEQ ID NO:8;
(h) residues 68-167 joined to residues 168-258 of SEQ ID NO:8;
(i) residues 1-91 of SEQ ID NO:10; and
(j) residues 92-185 of SEQ ID NO:10.
11. A soluble form of a Robo polypeptide which is a human Robo polypeptide, said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling, comprising a sequence selected from:
(a) residues 1-67 of SEQ ID NO:8;
(b) residues 68-167 of SEQ ID NO:8;
(c) residues 168-258 of SEQ ID NO:8;
(d) residues 259-350 of SEQ ID NO:8;
(e) residues 351-450 of SEQ ID NO:8;
(f) residues 451-546 of SEQ ID NO:8;
(g) residues 547-644 of SEQ ID NO:8;
(h) residues 645-761 of SEQ ID NO:8;
(i) residues 762-862 of SEQ ID NO:8;
(j) residues 1-167 of SEQ ID NO:8;
(k) residues 1-259 of SEQ ID NO:8;
(l) residues 1-350 of SEQ ID NO:8;
(m) residues 1-451 of SEQ ID NO:8;
(n) residues 68-259 of SEQ ID NO:8;
(o) residues 1-67 joined to residues 168-258 of SEQ ID NO:8;
(p) residues 1-67 joined to residues 259-450 of SEQ ID NO:8;
(q) residues 68-167 joined to residues 168-258 of SEQ ID NO:8;
(r) residues 1-91 of SEQ ID NO:10;
(s) residues 92-185 of SEQ ID NO:10; and
(t) residues 186-282 of SEQ ID NO:10.
12. A soluble form of a human Robo polypeptide of SEQ ID NO: 8 or 10 which lacks a transmembrane domain and cytoplasmic motif and having Robo-ligand binding activity.
13. A soluble form of a human Robo polypeptide of SEQ ID NO: 8 or 10 which lacks a transmembrane domain and cytoplasmic motif and is capable of modulating Robo-mediated signaling.
14. The soluble Robo polypeptide of claim 12 or 13 which comprises amino acids 1-895 of SEQ ID NO: 8.
15. The soluble Robo polypeptide of claim 12 or 13 having at least 25 consecutive residues of SEQ ID NO: 8 or 10.
16. The soluble Robo polypeptide of claim 12 or 13 having at least 50 consecutive residues of SEQ ID NO: 8 or 10.
17. An isolated Robo polypeptide which is a deletion mutant comprising one or more Robo fibronectin or cytoplasmic motif domains of any of SEQ ID NOS:2, 4, 8, 10 or 12, wherein said polypeptide is capable of modulating Robo-ligand binding and/or Robo-mediated signaling.
18. The Robo polypeptide of claim 17, wherein said one or more fibronectin or cytoplasmic motif domains is selected from the group of:
(a) residues 536-635, 636-753, 754-854, 1037-1046, 1098-1119 and 1262-1269 of SEQ ID NO:2;
(b) residues 495-595, 596-770, 771-877 and 1075-1084 of SEQ ID NO:4;
(c) residues 544-643, 644-766, 767-865, 1036-1045, 1153-1163 and 1065-1074 of SEQ ID NO:6;
(d) residues 547-644, 645-761, 762-862, 1070-1079, 1181-1195 and 1481-1488 of SEQ ID NO:8;
(e) residues 182-282 of SEQ ID NO:10; and
(f) residues 73-90 of SEQ ID NO:12.
19. A fusion product of the Robo polypeptide of any of claims 1-18, wherein said Robo polypeptide is fused with another peptide or polypeptide.
20. The fusion product of claim 19 wherein the another peptide or polypeptide is a tag for detection or anchoring, or a his tag.
21. An isolated antibody specific for a polypeptide according to any of claims 1-18.
22. The antibody of claim 21 which is a polyclonal antibody.
23. The antibody of claim 21 which is a monoclonal antibody.
24. The antibody of claim 21 which is a human Robo-specific antibody.
25. The antibody of claim 21, wherein the antibody is produced using an immunogenic fragment of SEQ ID NO:8 or SEQ ID NO: 10.
26. The antibody of claim 25, wherein the immunogenic fragment of SEQ ID NO:8 is selected from the group of residues 1-12, 18-28, 31-40, 45-65, 106-116, 137-145, 174-184, 214-230, 274-286, 314-324, 399-412, 496-507, 548-565, 599-611, 660-671, 717-730, 780-791, 835-847, 877-891, 930-942, 981-998, 1040-1051, 1080-1090, 1154-1168, 1215-1231, 1278-1302, 1378-1400, 1460-1469, 1497-1519, 1606-1626 and 1639-1651 of SEQ ID NO:8.
27. The antibody of claim 25, wherein the immunogenic fragment of SEQ ID NO:10 is selected from the group of residues 5-16, 38-47, 83-94, 112-125, 168-180, 195-209, 222-235 and 241-254 of SEQ ID NO:10.
28. An isolated antibody specific for the Robo polypeptide fusion product of claim 19 or 20.
29. A pharmaceutical composition comprising the antibody of claim 21, further comprising a pharmaceutically acceptable excipient.
30. The pharmaceutical composition of claim 29 wherein the antibody is a polyclonal antibody.
31. The pharmaceutical composition of claim 29 wherein the antibody is a monoclonal antibody.
32. A pharmaceutical composition comprising a polypeptide according to any of claims 1-18, further comprising a pharmaceutically acceptable excipient.
33. An immunogenic composition comprising a polypeptide according to any of claims 1-18, further comprising an adjuvant.
34. An isolated recombinant nucleic acid comprising a coding strand encoding the polypeptide according to any of claims 1-18.
35. An isolated cell comprising a nucleic acid according to claim 34.
CA002501585A 1998-10-20 1998-10-20 Robo: a family of polypeptides and nucleic acids involved in nerve guidance Abandoned CA2501585A1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CA002304926A CA2304926C (en) 1997-10-20 1998-10-20 Robo: a family of polypeptides and nucleic acids involved in nerve guidance

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CA002304926A Division CA2304926C (en) 1997-10-20 1998-10-20 Robo: a family of polypeptides and nucleic acids involved in nerve guidance

Publications (1)

Publication Number Publication Date
CA2501585A1 (en) 1999-04-29

Family

ID=34596801

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002501585A Abandoned CA2501585A1 (en) 1998-10-20 1998-10-20 Robo: a family of polypeptides and nucleic acids involved in nerve guidance

Country Status (1)

Country Link
CA (1) CA2501585A1 (en)


Legal Events

Date Code Title Description
EEER Examination request
FZDE Dead