CN114958758B - Construction method and application of breast cancer model pig - Google Patents
- Publication number
- CN114958758B (application number CN202110187956.7A)
- Authority
- CN
- China
- Prior art keywords
- seq
- pig
- nucleotide sequence
- safe harbor
- pymt
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 206010006187 Breast cancer Diseases 0.000 title claims abstract description 49
- 208000026310 Breast neoplasm Diseases 0.000 title claims abstract description 48
- 238000010276 construction Methods 0.000 title claims abstract description 35
- 239000002773 nucleotide Substances 0.000 claims abstract description 132
- 125000003729 nucleotide group Chemical group 0.000 claims abstract description 132
- 210000004027 cell Anatomy 0.000 claims abstract description 120
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 72
- 102100033601 Collagen alpha-1(I) chain Human genes 0.000 claims abstract description 55
- 108010029483 alpha 1 Chain Collagen Type I Proteins 0.000 claims abstract description 55
- 101001000998 Homo sapiens Protein phosphatase 1 regulatory subunit 12C Proteins 0.000 claims abstract description 31
- 102100035620 Protein phosphatase 1 regulatory subunit 12C Human genes 0.000 claims abstract description 31
- 229920001184 polypeptide Polymers 0.000 claims abstract description 11
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 11
- 102000004196 processed proteins & peptides Human genes 0.000 claims abstract description 11
- 239000003814 drug Substances 0.000 claims abstract description 9
- 239000013598 vector Substances 0.000 claims description 82
- 210000002950 fibroblast Anatomy 0.000 claims description 36
- 238000003780 insertion Methods 0.000 claims description 34
- 230000037431 insertion Effects 0.000 claims description 34
- 238000012216 screening Methods 0.000 claims description 32
- 108091033409 CRISPR Proteins 0.000 claims description 31
- 102000004169 proteins and genes Human genes 0.000 claims description 30
- 108010077850 Nuclear Localization Signals Proteins 0.000 claims description 29
- 108091027544 Subgenomic mRNA Proteins 0.000 claims description 29
- 238000000034 method Methods 0.000 claims description 24
- 108010048367 enhanced green fluorescent protein Proteins 0.000 claims description 12
- 101100118093 Drosophila melanogaster eEF1alpha2 gene Proteins 0.000 claims description 11
- 108091093126 WHP Posttranscriptional Response Element Proteins 0.000 claims description 11
- 230000008506 pathogenesis Effects 0.000 claims description 11
- 230000008685 targeting Effects 0.000 claims description 10
- 210000000287 oocyte Anatomy 0.000 claims description 8
- 238000011144 upstream manufacturing Methods 0.000 claims description 8
- 238000010171 animal model Methods 0.000 claims description 7
- 239000003623 enhancer Substances 0.000 claims description 7
- 210000000481 breast Anatomy 0.000 claims description 6
- 229940079593 drug Drugs 0.000 claims description 6
- 230000001105 regulatory effect Effects 0.000 claims description 6
- 108010076504 Protein Sorting Signals Proteins 0.000 claims description 5
- 238000013456 study Methods 0.000 claims description 5
- 230000030648 nucleus localization Effects 0.000 claims description 4
- 238000002360 preparation method Methods 0.000 claims description 4
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 2
- 102000007999 Nuclear Proteins Human genes 0.000 claims 1
- 108010089610 Nuclear Proteins Proteins 0.000 claims 1
- 230000014509 gene expression Effects 0.000 abstract description 51
- 238000010362 genome editing Methods 0.000 abstract description 14
- 238000005516 engineering process Methods 0.000 abstract description 11
- 210000001082 somatic cell Anatomy 0.000 abstract description 6
- 238000011160 research Methods 0.000 abstract description 4
- 238000010370 cell cloning Methods 0.000 abstract description 3
- 239000013612 plasmid Substances 0.000 description 128
- 241000282898 Sus scrofa Species 0.000 description 107
- 108020004414 DNA Proteins 0.000 description 58
- RXWNCPJZOCPEPQ-NVWDDTSBSA-N puromycin Chemical compound C1=CC(OC)=CC=C1C[C@H](N)C(=O)N[C@H]1[C@@H](O)[C@H](N2C3=NC=NC(=C3N=C2)N(C)C)O[C@@H]1CO RXWNCPJZOCPEPQ-NVWDDTSBSA-N 0.000 description 38
- 238000003776 cleavage reaction Methods 0.000 description 25
- 230000007017 scission Effects 0.000 description 23
- 229950010131 puromycin Drugs 0.000 description 19
- 239000002609 medium Substances 0.000 description 18
- 108020005004 Guide RNA Proteins 0.000 description 17
- 208000009869 Neu-Laxova syndrome Diseases 0.000 description 17
- 108091034057 RNA (poly(A)) Proteins 0.000 description 17
- 241001465754 Metazoa Species 0.000 description 15
- 238000010586 diagram Methods 0.000 description 14
- 210000001519 tissue Anatomy 0.000 description 14
- 241000282414 Homo sapiens Species 0.000 description 13
- 241000282887 Suidae Species 0.000 description 13
- 239000000243 solution Substances 0.000 description 13
- 238000001890 transfection Methods 0.000 description 13
- 210000005075 mammary gland Anatomy 0.000 description 12
- 206010028980 Neoplasm Diseases 0.000 description 11
- 125000003275 alpha amino acid group Chemical group 0.000 description 11
- 238000001514 detection method Methods 0.000 description 11
- 125000000539 amino acid group Chemical group 0.000 description 10
- 230000004927 fusion Effects 0.000 description 10
- 239000012212 insulator Substances 0.000 description 10
- 238000010374 somatic cell nuclear transfer Methods 0.000 description 10
- 101150066002 GFP gene Proteins 0.000 description 9
- 238000010367 cloning Methods 0.000 description 9
- 229910052754 neon Inorganic materials 0.000 description 9
- GKAOGPIIYCISHV-UHFFFAOYSA-N neon atom Chemical compound [Ne] GKAOGPIIYCISHV-UHFFFAOYSA-N 0.000 description 9
- 238000013518 transcription Methods 0.000 description 9
- 230000035897 transcription Effects 0.000 description 9
- 108700020796 Oncogene Proteins 0.000 description 8
- 102000004142 Trypsin Human genes 0.000 description 8
- 108090000631 Trypsin Proteins 0.000 description 8
- 238000001962 electrophoresis Methods 0.000 description 8
- 238000003753 real-time PCR Methods 0.000 description 8
- 239000012588 trypsin Substances 0.000 description 8
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 7
- 230000000694 effects Effects 0.000 description 7
- 210000002919 epithelial cell Anatomy 0.000 description 7
- 108020001507 fusion proteins Proteins 0.000 description 7
- 102000037865 fusion proteins Human genes 0.000 description 7
- 230000001965 increasing effect Effects 0.000 description 7
- 238000012163 sequencing technique Methods 0.000 description 7
- 229920000089 Cyclic olefin copolymer Polymers 0.000 description 6
- 241000282412 Homo Species 0.000 description 6
- 201000011510 cancer Diseases 0.000 description 6
- 238000002659 cell therapy Methods 0.000 description 6
- 238000012761 co-transfection Methods 0.000 description 6
- 201000010099 disease Diseases 0.000 description 6
- 238000001415 gene therapy Methods 0.000 description 6
- 239000007788 liquid Substances 0.000 description 6
- 210000004216 mammary stem cell Anatomy 0.000 description 6
- 210000000056 organ Anatomy 0.000 description 6
- 102000053602 DNA Human genes 0.000 description 5
- 241000288906 Primates Species 0.000 description 5
- 235000011449 Rosa Nutrition 0.000 description 5
- 238000006243 chemical reaction Methods 0.000 description 5
- 239000002299 complementary DNA Substances 0.000 description 5
- 210000002257 embryonic structure Anatomy 0.000 description 5
- 239000001963 growth medium Substances 0.000 description 5
- 239000003550 marker Substances 0.000 description 5
- 239000006228 supernatant Substances 0.000 description 5
- 230000014616 translation Effects 0.000 description 5
- 238000005406 washing Methods 0.000 description 5
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 4
- 102100021519 Hemoglobin subunit beta Human genes 0.000 description 4
- 108091005904 Hemoglobin subunit beta Proteins 0.000 description 4
- 241000699666 Mus <mouse, genus> Species 0.000 description 4
- 241000699670 Mus sp. Species 0.000 description 4
- 241000288667 Tupaia glis Species 0.000 description 4
- 241000700605 Viruses Species 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- 239000006285 cell suspension Substances 0.000 description 4
- 230000029087 digestion Effects 0.000 description 4
- 230000000857 drug effect Effects 0.000 description 4
- 238000007877 drug screening Methods 0.000 description 4
- 238000002474 experimental method Methods 0.000 description 4
- 238000001943 fluorescence-activated cell sorting Methods 0.000 description 4
- 238000011534 incubation Methods 0.000 description 4
- 210000001161 mammalian embryo Anatomy 0.000 description 4
- 238000004519 manufacturing process Methods 0.000 description 4
- 239000000463 material Substances 0.000 description 4
- 238000002156 mixing Methods 0.000 description 4
- 230000004048 modification Effects 0.000 description 4
- 238000012986 modification Methods 0.000 description 4
- 239000000047 product Substances 0.000 description 4
- 210000000130 stem cell Anatomy 0.000 description 4
- 229960005322 streptomycin Drugs 0.000 description 4
- 238000012546 transfer Methods 0.000 description 4
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 3
- 101100328883 Arabidopsis thaliana COL1 gene Proteins 0.000 description 3
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 3
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 3
- 101000834253 Gallus gallus Actin, cytoplasmic 1 Proteins 0.000 description 3
- 239000007995 HEPES buffer Substances 0.000 description 3
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 3
- 108091028043 Nucleic acid sequence Proteins 0.000 description 3
- 238000012408 PCR amplification Methods 0.000 description 3
- 230000009286 beneficial effect Effects 0.000 description 3
- 101150038500 cas9 gene Proteins 0.000 description 3
- 238000004113 cell culture Methods 0.000 description 3
- 238000010790 dilution Methods 0.000 description 3
- 239000012895 dilution Substances 0.000 description 3
- 238000011156 evaluation Methods 0.000 description 3
- 239000013604 expression vector Substances 0.000 description 3
- 201000007741 female breast cancer Diseases 0.000 description 3
- 201000002276 female breast carcinoma Diseases 0.000 description 3
- 239000012091 fetal bovine serum Substances 0.000 description 3
- 238000000338 in vitro Methods 0.000 description 3
- 108020004999 messenger RNA Proteins 0.000 description 3
- 239000011259 mixed solution Substances 0.000 description 3
- 230000001575 pathological effect Effects 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 2
- 238000010354 CRISPR gene editing Methods 0.000 description 2
- 102100031181 Glyceraldehyde-3-phosphate dehydrogenase Human genes 0.000 description 2
- -1 H11 Proteins 0.000 description 2
- 108010003272 Hyaluronate lyase Proteins 0.000 description 2
- 102000001974 Hyaluronidases Human genes 0.000 description 2
- 241000713666 Lentivirus Species 0.000 description 2
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 2
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 2
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 2
- HOMFINRJHIIZNJ-HOCLYGCPSA-N Leu-Trp-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O HOMFINRJHIIZNJ-HOCLYGCPSA-N 0.000 description 2
- 206010027476 Metastases Diseases 0.000 description 2
- 241000713333 Mouse mammary tumor virus Species 0.000 description 2
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 2
- 102000002488 Nucleoplasmin Human genes 0.000 description 2
- 102000043276 Oncogene Human genes 0.000 description 2
- PKHDJFHFMGQMPS-RCWTZXSCSA-N Pro-Thr-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PKHDJFHFMGQMPS-RCWTZXSCSA-N 0.000 description 2
- 238000002123 RNA extraction Methods 0.000 description 2
- 241000700159 Rattus Species 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 230000015572 biosynthetic process Effects 0.000 description 2
- 239000007853 buffer solution Substances 0.000 description 2
- 210000004899 c-terminal region Anatomy 0.000 description 2
- 238000010804 cDNA synthesis Methods 0.000 description 2
- 239000012295 chemical reaction liquid Substances 0.000 description 2
- 239000003153 chemical reaction reagent Substances 0.000 description 2
- 210000000349 chromosome Anatomy 0.000 description 2
- 238000012258 culturing Methods 0.000 description 2
- 210000001771 cumulus cell Anatomy 0.000 description 2
- 238000007405 data analysis Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 230000002708 enhancing effect Effects 0.000 description 2
- 210000003527 eukaryotic cell Anatomy 0.000 description 2
- MHMNJMPURVTYEJ-UHFFFAOYSA-N fluorescein-5-isothiocyanate Chemical compound O1C(=O)C2=CC(N=C=S)=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 MHMNJMPURVTYEJ-UHFFFAOYSA-N 0.000 description 2
- 239000012634 fragment Substances 0.000 description 2
- 108020004445 glyceraldehyde-3-phosphate dehydrogenase Proteins 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 108010085325 histidylproline Proteins 0.000 description 2
- 230000006801 homologous recombination Effects 0.000 description 2
- 238000002744 homologous recombination Methods 0.000 description 2
- 229960002773 hyaluronidase Drugs 0.000 description 2
- 230000003834 intracellular effect Effects 0.000 description 2
- 108010057821 leucylproline Proteins 0.000 description 2
- 230000035800 maturation Effects 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 230000009401 metastasis Effects 0.000 description 2
- 238000010172 mouse model Methods 0.000 description 2
- 210000002445 nipple Anatomy 0.000 description 2
- 108060005597 nucleoplasmin Proteins 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000007170 pathology Effects 0.000 description 2
- 230000035479 physiological effects, processes and functions Effects 0.000 description 2
- 230000035790 physiological processes and functions Effects 0.000 description 2
- 108091033319 polynucleotide Proteins 0.000 description 2
- 230000035935 pregnancy Effects 0.000 description 2
- 108010090894 prolylleucine Proteins 0.000 description 2
- 101150013400 rag1 gene Proteins 0.000 description 2
- 230000006798 recombination Effects 0.000 description 2
- 238000005215 recombination Methods 0.000 description 2
- 238000012827 research and development Methods 0.000 description 2
- 229920006395 saturated elastomer Polymers 0.000 description 2
- 238000007789 sealing Methods 0.000 description 2
- 230000001568 sexual effect Effects 0.000 description 2
- 239000000600 sorbitol Substances 0.000 description 2
- 238000007619 statistical method Methods 0.000 description 2
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 229940126585 therapeutic drug Drugs 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- 238000002054 transplantation Methods 0.000 description 2
- 238000012795 verification Methods 0.000 description 2
- FKXYRDYBKXIDDI-AGQURRGHSA-N (2s)-2-[[(2s)-2-[[(2s)-1-[(2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-amino-4-carboxybutanoyl]amino]-4-carboxybutanoyl]amino]-4-carboxybutanoyl]amino]-4-carboxybutanoyl]amino]-3-(4-hydroxyphenyl)propanoyl]amino]-4-methylsulfanylbutanoyl]pyrrolidin Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCSC)NC(=O)[C@@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O)CC1=CC=C(O)C=C1 FKXYRDYBKXIDDI-AGQURRGHSA-N 0.000 description 1
- 102000007469 Actins Human genes 0.000 description 1
- 108010085238 Actins Proteins 0.000 description 1
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 1
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 1
- MKZCBYZBCINNJN-DLOVCJGASA-N Ala-Asp-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MKZCBYZBCINNJN-DLOVCJGASA-N 0.000 description 1
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 1
- KTXKIYXZQFWJKB-VZFHVOOUSA-N Ala-Thr-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O KTXKIYXZQFWJKB-VZFHVOOUSA-N 0.000 description 1
- XAXMJQUMRJAFCH-CQDKDKBSSA-N Ala-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 XAXMJQUMRJAFCH-CQDKDKBSSA-N 0.000 description 1
- 102000000412 Annexin Human genes 0.000 description 1
- 108050008874 Annexin Proteins 0.000 description 1
- UISQLSIBJKEJSS-GUBZILKMSA-N Arg-Arg-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(O)=O UISQLSIBJKEJSS-GUBZILKMSA-N 0.000 description 1
- IGULQRCJLQQPSM-DCAQKATOSA-N Arg-Cys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O IGULQRCJLQQPSM-DCAQKATOSA-N 0.000 description 1
- NKBQZKVMKJJDLX-SRVKXCTJSA-N Arg-Glu-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NKBQZKVMKJJDLX-SRVKXCTJSA-N 0.000 description 1
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 1
- WMEVEPXNCMKNGH-IHRRRGAJSA-N Arg-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WMEVEPXNCMKNGH-IHRRRGAJSA-N 0.000 description 1
- HIMXTOIXVXWHTB-DCAQKATOSA-N Arg-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N HIMXTOIXVXWHTB-DCAQKATOSA-N 0.000 description 1
- LCBSSOCDWUTQQV-SDDRHHMPSA-N Arg-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LCBSSOCDWUTQQV-SDDRHHMPSA-N 0.000 description 1
- BSGSDLYGGHGMND-IHRRRGAJSA-N Arg-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N BSGSDLYGGHGMND-IHRRRGAJSA-N 0.000 description 1
- VENMDXUVHSKEIN-GUBZILKMSA-N Arg-Ser-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VENMDXUVHSKEIN-GUBZILKMSA-N 0.000 description 1
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 1
- NLRJGXZWTKXRHP-DCAQKATOSA-N Asn-Leu-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NLRJGXZWTKXRHP-DCAQKATOSA-N 0.000 description 1
- TZFQICWZWFNIKU-KKUMJFAQSA-N Asn-Leu-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 TZFQICWZWFNIKU-KKUMJFAQSA-N 0.000 description 1
- IDUUACUJKUXKKD-VEVYYDQMSA-N Asn-Pro-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O IDUUACUJKUXKKD-VEVYYDQMSA-N 0.000 description 1
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 1
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 1
- QJHOOKBAHRJPPX-QWRGUYRKSA-N Asp-Phe-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 QJHOOKBAHRJPPX-QWRGUYRKSA-N 0.000 description 1
- GXHDGYOXPNQCKM-XVSYOHENSA-N Asp-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O GXHDGYOXPNQCKM-XVSYOHENSA-N 0.000 description 1
- GIKOVDMXBAFXDF-NHCYSSNCSA-N Asp-Val-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GIKOVDMXBAFXDF-NHCYSSNCSA-N 0.000 description 1
- 101150018129 CSF2 gene Proteins 0.000 description 1
- 101150069031 CSN2 gene Proteins 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 102000029816 Collagenase Human genes 0.000 description 1
- 108060005980 Collagenase Proteins 0.000 description 1
- VNLYIYOYUNGURO-ZLUOBGJFSA-N Cys-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N VNLYIYOYUNGURO-ZLUOBGJFSA-N 0.000 description 1
- OXFOKRAFNYSREH-BJDJZHNGSA-N Cys-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CS)N OXFOKRAFNYSREH-BJDJZHNGSA-N 0.000 description 1
- WVLZTXGTNGHPBO-SRVKXCTJSA-N Cys-Leu-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O WVLZTXGTNGHPBO-SRVKXCTJSA-N 0.000 description 1
- ZHCCYSDALWJITB-SRVKXCTJSA-N Cys-Phe-Cys Chemical compound N[C@@H](CS)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CS)C(O)=O ZHCCYSDALWJITB-SRVKXCTJSA-N 0.000 description 1
- 108010090461 DFG peptide Proteins 0.000 description 1
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 1
- 206010017472 Fumbling Diseases 0.000 description 1
- 108010010803 Gelatin Proteins 0.000 description 1
- RKAQZCDMSUQTSS-FXQIFTODSA-N Gln-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RKAQZCDMSUQTSS-FXQIFTODSA-N 0.000 description 1
- RBWKVOSARCFSQQ-FXQIFTODSA-N Gln-Gln-Ser Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O RBWKVOSARCFSQQ-FXQIFTODSA-N 0.000 description 1
- XKBASPWPBXNVLQ-WDSKDSINSA-N Gln-Gly-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XKBASPWPBXNVLQ-WDSKDSINSA-N 0.000 description 1
- GIVHPCWYVWUUSG-HVTMNAMFSA-N Gln-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N GIVHPCWYVWUUSG-HVTMNAMFSA-N 0.000 description 1
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 1
- VYOILACOFPPNQH-UMNHJUIQSA-N Gln-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N VYOILACOFPPNQH-UMNHJUIQSA-N 0.000 description 1
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 1
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 1
- MRWYPDWDZSLWJM-ACZMJKKPSA-N Glu-Ser-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O MRWYPDWDZSLWJM-ACZMJKKPSA-N 0.000 description 1
- QRWPTXLWHHTOCO-DZKIICNBSA-N Glu-Val-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QRWPTXLWHHTOCO-DZKIICNBSA-N 0.000 description 1
- IWAXHBCACVWNHT-BQBZGAKWSA-N Gly-Asp-Arg Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IWAXHBCACVWNHT-BQBZGAKWSA-N 0.000 description 1
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 1
- UQJNXZSSGQIPIQ-FBCQKBJTSA-N Gly-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)CN UQJNXZSSGQIPIQ-FBCQKBJTSA-N 0.000 description 1
- YYXJFBMCOUSYSF-RYUDHWBXSA-N Gly-Phe-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYXJFBMCOUSYSF-RYUDHWBXSA-N 0.000 description 1
- MYXNLWDWWOTERK-BHNWBGBOSA-N Gly-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN)O MYXNLWDWWOTERK-BHNWBGBOSA-N 0.000 description 1
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 1
- VSLXGYMEHVAJBH-DLOVCJGASA-N His-Ala-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O VSLXGYMEHVAJBH-DLOVCJGASA-N 0.000 description 1
- KAXZXLSXFWSNNZ-XVYDVKMFSA-N His-Ser-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KAXZXLSXFWSNNZ-XVYDVKMFSA-N 0.000 description 1
- JGFWUKYIQAEYAH-DCAQKATOSA-N His-Ser-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O JGFWUKYIQAEYAH-DCAQKATOSA-N 0.000 description 1
- HDOYNXLPTRQLAD-JBDRJPRFSA-N Ile-Ala-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)O)N HDOYNXLPTRQLAD-JBDRJPRFSA-N 0.000 description 1
- SVZFKLBRCYCIIY-CYDGBPFRSA-N Ile-Pro-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SVZFKLBRCYCIIY-CYDGBPFRSA-N 0.000 description 1
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 1
- OGCQGUIWMSBHRZ-CIUDSAMLSA-N Leu-Asn-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OGCQGUIWMSBHRZ-CIUDSAMLSA-N 0.000 description 1
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 1
- KAFOIVJDVSZUMD-UHFFFAOYSA-N Leu-Gln-Gln Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)NC(CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-UHFFFAOYSA-N 0.000 description 1
- WMTOVWLLDGQGCV-GUBZILKMSA-N Leu-Glu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N WMTOVWLLDGQGCV-GUBZILKMSA-N 0.000 description 1
- OGUUKPXUTHOIAV-SDDRHHMPSA-N Leu-Glu-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N OGUUKPXUTHOIAV-SDDRHHMPSA-N 0.000 description 1
- KGCLIYGPQXUNLO-IUCAKERBSA-N Leu-Gly-Glu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O KGCLIYGPQXUNLO-IUCAKERBSA-N 0.000 description 1
- KYIIALJHAOIAHF-KKUMJFAQSA-N Leu-Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KYIIALJHAOIAHF-KKUMJFAQSA-N 0.000 description 1
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 1
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 1
- LINKCQUOMUDLKN-KATARQTJSA-N Leu-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(C)C)N)O LINKCQUOMUDLKN-KATARQTJSA-N 0.000 description 1
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 1
- AXVIGSRGTMNSJU-YESZJQIVSA-N Leu-Tyr-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N AXVIGSRGTMNSJU-YESZJQIVSA-N 0.000 description 1
- 206010058467 Lung neoplasm malignant Diseases 0.000 description 1
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 1
- LXNPMPIQDNSMTA-AVGNSLFASA-N Lys-Gln-His Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 LXNPMPIQDNSMTA-AVGNSLFASA-N 0.000 description 1
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 1
- AHZNUGRZHMZGFL-GUBZILKMSA-N Met-Arg-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CCCNC(N)=N AHZNUGRZHMZGFL-GUBZILKMSA-N 0.000 description 1
- YNOVBMBQSQTLFM-DCAQKATOSA-N Met-Asn-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O YNOVBMBQSQTLFM-DCAQKATOSA-N 0.000 description 1
- FWTBMGAKKPSTBT-GUBZILKMSA-N Met-Gln-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FWTBMGAKKPSTBT-GUBZILKMSA-N 0.000 description 1
- UYAKZHGIPRCGPF-CIUDSAMLSA-N Met-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCSC)N UYAKZHGIPRCGPF-CIUDSAMLSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N5/00—Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
- C12N5/06—Animal cells or tissues; Human cells or tissues
- C12N5/0602—Vertebrate cells
- C12N5/0652—Cells of skeletal and connective tissues; Mesenchyme
- C12N5/0656—Adult fibroblasts
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K67/00—Rearing or breeding animals, not otherwise provided for; New or modified breeds of animals
- A01K67/027—New or modified breeds of vertebrates
- A01K67/0275—Genetically modified vertebrates, e.g. transgenic
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K49/00—Preparations for testing in vivo
- A61K49/0004—Screening or testing of compounds for diagnosis of disorders, assessment of conditions, e.g. renal clearance, gastric emptying, testing for diabetes, allergy, rheuma, pancreas functions
- A61K49/0008—Screening agents using (non-human) animal models or transgenic animal models or chimeric hosts, e.g. Alzheimer disease animal model, transgenic model for heart failure
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/005—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/8509—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
- C12N15/907—Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N5/00—Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
- C12N5/06—Animal cells or tissues; Human cells or tissues
- C12N5/0602—Vertebrate cells
- C12N5/0625—Epidermal cells, skin cells; Cells of the oral mucosa
- C12N5/0631—Mammary cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/5005—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells
- G01N33/5008—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells for testing or evaluating the effect of chemical or biological compounds, e.g. drugs, cosmetics
- G01N33/5011—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells for testing or evaluating the effect of chemical or biological compounds, e.g. drugs, cosmetics for testing antineoplastic activity
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2217/00—Genetically modified animals
- A01K2217/07—Animals genetically altered by homologous recombination
- A01K2217/072—Animals genetically altered by homologous recombination maintaining or altering function, i.e. knock in
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2227/00—Animals characterised by species
- A01K2227/10—Mammal
- A01K2227/108—Swine
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2267/00—Animals characterised by purpose
- A01K2267/03—Animal model, e.g. for test or diseases
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2503/00—Use of cells in diagnostics
- C12N2503/02—Drug screening
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2510/00—Genetically modified cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2710/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
- C12N2710/00011—Details
- C12N2710/22011—Polyomaviridae, e.g. polyoma, SV40, JC
- C12N2710/22022—New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/10—Plasmid DNA
- C12N2800/106—Plasmid DNA for vertebrates
- C12N2800/107—Plasmid DNA for vertebrates for mammalian
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
- C12N2830/008—Vector systems having a special element relevant for transcription cell type or tissue specific enhancer/promoter combination
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N2500/00—Screening for compounds of potential therapeutic value
- G01N2500/10—Screening for compounds of potential therapeutic value involving cells
Abstract
The application provides a pig cell expressing PyMT, a breast cancer model pig obtained from the pig cell by somatic cell cloning, a method for constructing the pig cell, and applications of the pig cell in the biomedical field. The construction method comprises inserting a nucleotide sequence encoding PyMT into a safe harbor site of a pig to obtain a pig cell expressing the polypeptide of SEQ ID NO: 14 and, from that cell, a breast cancer model pig, wherein the pig safe harbor site is selected from the group consisting of the pig ROSA26, AAVS1, H11 and COL1A1 safe harbor sites. The application offers a research subject of good applicability, a high expression level of the target gene in pig cells, and high gene editing efficiency.
Description
Technical Field
The invention relates to the technical field of gene editing, and in particular to a recombinant pig cell, constructed with the CRISPR/Cas9 system and homologous recombination, in which the PyMT oncogene is integrated at a specific position in the genome and its expression is driven by the mammary gland-specific promoter MMTV-LTR. The recombinant pig cell is used to produce a breast cancer model pig by cloning, and the model pig can subsequently be used in biomedical fields such as drug screening and efficacy evaluation, gene and cell therapy, and research into the pathogenesis of breast cancer.
Background
According to the latest global cancer burden data issued in 2020 by the International Agency for Research on Cancer (IARC) of the World Health Organization, the 10 most common tumor types accounted for 60% of new cancer cases in 2020. Female breast cancer accounted for 11.7% of new cases, exceeding lung cancer (11.4%) for the first time and becoming the most commonly diagnosed cancer worldwide. It is the leading threat to women's health and one of the most urgent health problems today. Although modern medicine has brought advances in cancer diagnosis, treatment and survival, mortality has not improved substantially. The lack of understanding of the natural history of the disease is the main reason for this limitation, and it is currently unclear at the molecular level which changes in breast tumors lead to invasion and metastasis.
MMTV is an important virus that causes mouse mammary tumors, and its tissue-specific viral promoters, such as the mouse mammary tumor virus long terminal repeat promoter (MMTV-LTR), can be used to drive high expression of oncogenes in the mouse mammary gland so that breast cancer develops. The polyoma middle T antigen (PyMT; encoded amino acid sequence shown in SEQ ID NO: 14) is a membrane-associated protein encoded by the small DNA polyomavirus. PyMT has been found to induce tumor formation; the induced tumors are prone to metastasis, and their biomarker expression is consistent with that associated with a poor prognosis in humans. PyMT acts as a powerful oncogene whose product binds to and activates several signal transduction pathways, including the Src family kinase, Ras and PI3K pathways, all of which are altered in human breast cancer.
Both the study of tumor development mechanisms and treatment research must be carried out on corresponding animal models. The animal models commonly used at present are mouse models. For example, patent TW201536324A discloses a method for increasing the effectiveness of medical therapy, whose embodiments mention in general terms injecting MMTV-PyMT cells into p16-3MR transgenic mice and then applying treatment to obtain a breast cancer model. However, mice differ greatly from humans in body size, organ size, physiology, pathology and so on, and cannot faithfully simulate normal human physiological and pathological conditions. Moreover, a mouse breast cancer model obtained by injecting MMTV-PyMT cells is not stably inherited. Patent CN103173496B discloses a tree shrew breast cancer model established by injecting lentivirus into the nipple: a viral vector is first constructed, the virus is then packaged and its titer determined, and finally the lentivirus is injected into the nipple of the tree shrew. However, the tree shrew is not a commonly used disease model animal, its body size and physiological functions bear little similarity to those of humans, and it likewise cannot faithfully simulate normal human physiological and pathological states. Thus, there remains a need for more, and more suitable, animal models of breast cancer, which calls for further improvement both in the choice of animal and in the method of producing the model. Pigs have long been a major meat animal; they are similar to humans in body size and physiological function, are easy to breed and raise on a large scale, raise fewer concerns regarding ethics and animal protection, and are ideal model animals for human disease.
Therefore, the application uses gene editing technology and the mammary gland-specific promoter MMTV-LTR to construct a recombinant pig cell that expresses the oncogene PyMT specifically in mammary tissue, and then uses the recombinant cell as a nuclear transfer donor cell to produce a breast cancer model pig by cloning. The model pig obtained by the application is intended as a powerful tool for studying the mechanism by which breast cancer arises, for drug development and for preclinical testing.
Disclosure of Invention
The invention provides a method for site-specifically inserting a nucleotide sequence encoding PyMT into a safe harbor site of a pig by gene editing, thereby preparing a pig cell that expresses the PyMT protein. The pig cell can in turn be used, through somatic cell cloning, to produce a breast cancer model pig that expresses the PyMT protein in breast tissue, providing a powerful experimental tool for studying disease pathogenesis and for developing therapeutic drugs.
In a first aspect of the present invention, there is provided a pig cell expressing PyMT, in which a nucleotide sequence encoding PyMT is inserted into a safe harbor site of the pig, yielding a pig cell that expresses the PyMT shown in SEQ ID NO: 14.
Preferably, the inserted nucleotide sequence encoding PyMT may be the CDS sequence or cDNA sequence of PyMT.
Preferably, the PyMT is a polypeptide having the amino acid sequence shown in SEQ ID NO: 14.
In one embodiment of the present invention, the inserted nucleotide sequence encoding PyMT is as set forth in SEQ ID NO: 39.
Preferably, the pig safe harbor site is selected from the pig ROSA26, AAVS1, H11 and COL1A1 safe harbor sites.
In one specific embodiment of the invention, the nucleotide sequence of the ROSA26 safe harbor site region together with 500 bp upstream and 500 bp downstream of it is shown in SEQ ID NO: 40; that of the AAVS1 safe harbor site region together with 500 bp upstream and 500 bp downstream is shown in SEQ ID NO: 41; that of the H11 safe harbor site region together with 500 bp upstream and 500 bp downstream is shown in SEQ ID NO: 42; and that of the COL1A1 safe harbor site region together with 500 bp upstream and 500 bp downstream is shown in SEQ ID NO: 43.
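As an illustration of what "site region plus 500 bp upstream and downstream" means in sequence terms, the following is a minimal Python sketch. The toy chromosome string, the site coordinates and the function name `region_with_flanks` are hypothetical placeholders introduced here for illustration; they are not the sequences of SEQ ID NO: 40 to 43 and are not taken from the patent.

```python
FLANK_BP = 500  # flank length stated in the text


def region_with_flanks(chromosome: str, start: int, end: int,
                       flank: int = FLANK_BP) -> str:
    """Return the site region chromosome[start:end] plus `flank` bases on
    each side. Coordinates are 0-based and end-exclusive."""
    if not (0 <= start < end <= len(chromosome)):
        raise ValueError("site coordinates fall outside the sequence")
    if start < flank or end + flank > len(chromosome):
        raise ValueError("not enough sequence for the requested flanks")
    return chromosome[start - flank:end + flank]


# Hypothetical toy data: a 1600 bp repeat standing in for a chromosome,
# and an arbitrary 200 bp "safe harbor site region".
toy_chromosome = "ACGT" * 400
toy_site = (700, 900)

window = region_with_flanks(toy_chromosome, *toy_site)
print(len(window))  # 200 bp site + 2 x 500 bp flanks = 1200
```

The function refuses to return a truncated window when the site lies too close to either end of the sequence, since a shortened flank would no longer match the 500 bp convention described above.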
Further preferably, the optimal safe harbor site of the pig is a COL1A1 site.
Preferably, the nucleotide sequence encoding the PyMT is regulated in pig cells by an exogenous promoter, and the exogenous promoter is MMTV-LTR.
Mouse Mammary Tumor Virus (MMTV) Long Terminal Repeat (LTR) -driven transgenes allow targeted expression of various oncogenes and growth factors in mammary tumor transformation.
In one embodiment of the invention, the nucleotide sequence encoding PyMT is driven in porcine cells by MMTV-LTR, the nucleotide sequence of MMTV-LTR is as set forth in SEQ ID NO: 15.
Preferably, the porcine cells are somatic cells of a pig. Further preferred are somatic cells of any pig that can be used in somatic cell nuclear transfer technology.
Preferably, the porcine cells may be breast cells, embryonic stem cells, adult stem cells, hematopoietic stem cells, bone marrow mesenchymal stem cells, neural stem cells, hepatic stem cells, muscle satellite cells, skin epidermal stem cells, intestinal epithelial stem cells, retinal stem cells, pancreatic stem cells, somatic cells, fibroblasts, muscle cells, glial cells, adipocytes, germ cells, or the like.
In one embodiment of the invention, the porcine cell is a porcine fibroblast or a mammary gland cell (preferably a mammary gland epithelial cell).
In a second aspect of the present invention, there is provided a method for constructing the pig cell, wherein a nucleotide sequence encoding PyMT is inserted into a safe harbor site of the pig to obtain a pig cell expressing the PyMT shown in SEQ ID NO: 14.
Specifically, gene editing based on homologous recombination, or nuclease-based editing technologies such as ZFN, TALEN and CRISPR/Cas9, can be used.
Preferably, the construction method comprises inserting a nucleotide sequence encoding a PyMT into a pig safe harbor site using a safe harbor site vector comprising a nucleotide sequence encoding a PyMT and a safe harbor site vector backbone comprising a 5 'homology arm and a 3' homology arm of the safe harbor insertion site, wherein the nucleotide sequence encoding a PyMT is located between the 5 'homology arm and the 3' homology arm, and the safe harbor site vector backbone is selected from any one of the following:
A) a ROSA26 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 5 and whose 3' homology arm is shown in SEQ ID NO: 6. Preferably, the nucleotide sequence of the ROSA26 safe harbor site vector backbone is shown in SEQ ID NO: 4.
B) an AAVS1 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 7 and whose 3' homology arm is shown in SEQ ID NO: 8. Preferably, the nucleotide sequence of the AAVS1 safe harbor site vector backbone is obtained by replacing the ROSA26 5' and 3' homology arms in SEQ ID NO: 4 with the AAVS1 5' and 3' homology arms.
C) an H11 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 9 and whose 3' homology arm is shown in SEQ ID NO: 10. Preferably, the nucleotide sequence of the H11 safe harbor site vector backbone is obtained by replacing the ROSA26 5' and 3' homology arms in SEQ ID NO: 4 with the H11 5' and 3' homology arms.
Or D) a COL1A1 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 11 and whose 3' homology arm is shown in SEQ ID NO: 12. Preferably, the nucleotide sequence of the COL1A1 safe harbor site vector backbone is obtained by replacing the ROSA26 5' and 3' homology arms in SEQ ID NO: 4 with the COL1A1 5' and 3' homology arms.
Further preferably, the pig safe harbor site vector backbone is most preferably the COL1A1 safe harbor site vector backbone.
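The derivation of the AAVS1, H11 and COL1A1 backbones from SEQ ID NO: 4 by exchanging homology arms can be sketched in Python as follows. This is only a minimal illustration, assuming the backbone is held as a plain string that begins with the 5' arm and ends with the 3' arm; the function name and the toy sequences are hypothetical and are not the sequences of the sequence listing.

```python
def swap_homology_arms(backbone: str, old_5: str, old_3: str,
                       new_5: str, new_3: str) -> str:
    """Return the backbone with its flanking homology arms replaced.

    Assumption (hypothetical layout): the backbone string starts with the
    old 5' arm and ends with the old 3' arm, as in a linearised donor map.
    """
    if not backbone.startswith(old_5):
        raise ValueError("backbone does not start with the given 5' arm")
    if not backbone.endswith(old_3):
        raise ValueError("backbone does not end with the given 3' arm")
    if len(old_5) + len(old_3) > len(backbone):
        raise ValueError("the two arms overlap in the backbone")
    # Everything between the arms (insulators, reporters, markers) is kept.
    core = backbone[len(old_5):len(backbone) - len(old_3)]
    return new_5 + core + new_3


# Toy placeholder sequences, for illustration only.
rosa26_5, rosa26_3 = "AAAACCCC", "GGGGTTTT"
cassette = "ATGCATGCATGC"
rosa26_backbone = rosa26_5 + cassette + rosa26_3

col1a1_5, col1a1_3 = "ACACACAC", "TGTGTGTG"
col1a1_backbone = swap_homology_arms(
    rosa26_backbone, rosa26_5, rosa26_3, col1a1_5, col1a1_3
)
print(col1a1_backbone)
```

Matching the arms by sequence rather than by fixed coordinates keeps the sketch independent of arm length, which differs between safe harbor sites.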
Preferably, the safe harbor site vector further comprises a promoter, a poly(A) signal, and nucleotide sequences encoding EGFP protein, mCherry protein and puro resistance protein. The promoter is an EF-1α promoter, a PGK promoter and/or a pCAG promoter. The poly(A) signal is an EF-1α poly(A) signal, a bGH poly(A) signal and/or a β-globin poly(A) signal. Further preferably, the vector also comprises an insulator region.
In one embodiment of the present invention, the safe harbor site vector backbone comprises, in order from 5' to 3', a 5' homology arm, an insulator region, an EF-1α poly(A) signal, a nucleotide sequence encoding EGFP, an EF-1α promoter, an insulator region, a PGK promoter, a nucleotide sequence encoding mCherry, a bGH poly(A) signal, a loxP-puro-loxP expression cassette region, an insulator region, a β-globin poly(A) signal, a pCAG promoter, an insulator region, and a 3' homology arm.
In one specific embodiment of the invention, the nucleotide sequence of the COL1A1 safe harbor site vector is shown in SEQ ID NO: 13.
Preferably, construction of porcine cells is performed using an sgRNA vector comprising a sgRNA targeting the ROSA26, AAVS1, H11 or COL1A1 safe harbor site, wherein:
The nucleotide sequence of the sgRNA targeting ROSA26 is shown in SEQ ID NO: 21, the nucleotide sequence of the sgRNA targeting AAVS1 is shown in SEQ ID NO: 22, the nucleotide sequence of the sgRNA targeting H11 is shown in SEQ ID NO: 23, and the nucleotide sequence of the sgRNA targeting COL1A1 is shown in SEQ ID NO: 24.
Preferably, the sgRNA vector further comprises a backbone vector, and the nucleotide sequence of the backbone vector is SEQ ID NO:3.
Preferably, the construction of the pig cell is performed using a Cas vector comprising nucleotide sequences encoding a Cas protein, EGFP and a Puro resistance protein, wherein the Cas vector further comprises an EF1a promoter, a WPRE element and a 3' LTR sequence element. Preferably, the Cas vector comprises, in order from 5' to 3': a CMV enhancer, an EF1a promoter, a nuclear localization signal, a nucleotide sequence encoding a Cas protein, a nuclear localization signal, a nucleotide sequence encoding the self-cleaving polypeptide P2A, a nucleotide sequence encoding EGFP, a nucleotide sequence encoding the self-cleaving polypeptide T2A, a nucleotide sequence encoding a Puro resistance protein, a WPRE sequence element, a 3' LTR sequence element, and a polyA signal sequence element. The Cas protein is selected from Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas5d, Cas5t, Cas5h, Cas5a, Cas6, Cas7, Cas8, Cas9, Cas10, Csy1, Csy2, Csy3, Csy4, Cse1, Cse2, Cse3, Cse4, Cse5e, Csc1, Csc2, Csa5, Csn1, Csn2, Csm1, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx1S, Csf1, Csf2, CsO, Csf4, Csd1, Csd2, Cst1, Cst2, Csh1, Csh2, Csa1, Csa2, Csa3, Csa4, Csa5, C2c1, C2c2, C2c3, Cpf1, CARF, DinG, and homologs or modified forms thereof, and is preferably Cas9.
In a specific embodiment of the invention, the Cas vector has a nucleotide sequence set forth in SEQ ID NO:1 or 2.
In one embodiment of the invention, the construction method comprises co-transfecting the safe harbor site vector, the sgRNA vector, and the Cas vector into porcine cells.
To increase the gene editing capability of the Cas9 vector, the invention modifies the vector pX330-U6-Chimeric_BB-CBh-hSpCas9 (pX330 for short; purchased from Addgene, Plasmid #42230, from the Zhang Feng lab) to obtain pU6gRNA-eEF1a-mNLS-hSpCas9-EGFP-PURO (plasmid pKG-GE3 for short). The map of pX330 is shown in FIG. 1. The modifications are as follows:
1) Removing redundant invalid sequences in the gRNA skeleton of the original vector;
2) Modifying the promoter: the original promoter (chicken β-actin promoter) is replaced with the EF1a promoter, which has higher expression activity, to increase expression of the Cas9 protein;
3) Adding nuclear localization signals: nuclear localization signal (NLS) coding sequences are added at the N-terminus and the C-terminus of Cas9 to increase its nuclear localization capability;
4) Adding dual screening markers: the original vector has no screening marker, which hinders the screening and enrichment of positively transformed cells; P2A-EGFP-T2A-PURO is inserted at the C-terminus of Cas9 to give the vector fluorescence and resistance screening capability;
5) Inserting WPRE, 3' LTR and other sequences that regulate gene expression: inserting WPRE, 3' LTR and other sequences at the end of the gene reading frame enhances the protein translation capacity of the Cas9 gene.
The modification sites of the modified vector pU6gRNA-eEF1a-mNLS-hSpCas9-EGFP-PURO (pKG-GE3 for short) are shown in FIG. 2, and the complete sequence of the plasmid is shown in SEQ ID NO: 2. The main elements of pKG-GE3 are:
1) gRNA expression element: U6 gRNA scaffold;
2) Promoter: EF1a promoter and CMV enhancer;
3) Cas9 gene comprising multiple NLSs: Cas9 gene with multiple nuclear localization signals (NLS) at both the N-terminus and the C-terminus;
4) Screening marker genes: fluorescence and resistance dual selectable marker element P2A-EGFP-T2A-PURO;
5) Elements for enhancing translation: WPRE and 3' LTR enhance the translation efficiency of Cas9 and the selectable marker genes;
6) Transcription termination signal: bGH poly(A) signal;
7) Vector backbone: including the Amp resistance element, the ori replicon, and the like.
The plasmid pKG-GE3 has a specific fusion gene; the specific fusion gene codes for a specific fusion protein;
The specific fusion protein sequentially comprises the following elements from the N end to the C end: two Nuclear Localization Signals (NLS), cas9 protein, two nuclear localization signals, self-cleaving polypeptide P2A, fluorescent reporter protein, self-cleaving polypeptide T2A, resistance selection marker protein;
In the plasmid pKG-GE3, the EF1a promoter is used for promoting the expression of the specific fusion gene;
In the plasmid pKG-GE3, a WPRE sequence element, a 3' LTR sequence element and a bGH poly(A) signal sequence element are located downstream of the specific fusion gene.
The plasmid pKG-GE3 has the following elements in this order: CMV enhancer, EF1a promoter, the specific fusion gene, WPRE sequence element, 3' LTR sequence element, bGH poly (A) signal sequence element.
In the specific fusion protein, two nuclear localization signals at the upstream of the Cas9 protein are SV40 nuclear localization signals, and two nuclear localization signals at the downstream of the Cas9 protein are nucleoplasmin nuclear localization signals.
In the specific fusion protein, the fluorescent reporter protein can be EGFP protein.
In the specific fusion protein, the resistance selection marker protein may specifically be Puromycin resistance protein.
The amino acid sequence of the self-cleaving polypeptide P2A is "ATNFSLLKQAGDVEENPGP" (the cleavage site where self-cleavage occurs is between the first amino acid residue and the second amino acid residue from the C-terminus).
The amino acid sequence of the self-cleaving polypeptide T2A is "EGRGSLLTCGDVEENPGP" (the cleavage site where self-cleavage occurs is between the first amino acid residue and the second amino acid residue from the C-terminus).
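The position of the 2A break described above can be illustrated with a short Python sketch. It assumes the fusion protein is given as a one-letter amino acid string; the function name and the three short "protein" placeholders are hypothetical, and only the P2A and T2A sequences come from the text.

```python
P2A = "ATNFSLLKQAGDVEENPGP"
T2A = "EGRGSLLTCGDVEENPGP"


def split_at_2a(fusion: str, peptides=(P2A, T2A)) -> list[str]:
    """Split a fusion protein at every 2A peptide it contains.

    The break is placed between the last two residues of the 2A peptide
    (first and second residues counted from its C-terminus), so the final
    proline becomes the first residue of the downstream product.
    """
    products = []
    rest = fusion
    while True:
        # Locate the earliest remaining 2A peptide, if any.
        hits = [(rest.find(p), p) for p in peptides if p in rest]
        if not hits:
            break
        pos, pep = min(hits)
        cut = pos + len(pep) - 1
        products.append(rest[:cut])
        rest = rest[cut:]
    products.append(rest)
    return products


# Hypothetical three-part fusion: placeholders stand in for Cas9, EGFP, Puro.
fusion = "MKK" + P2A + "MVS" + T2A + "MTE"
products = split_at_2a(fusion)
for p in products:
    print(p)
```

With one P2A and one T2A in the fusion, the sketch returns three products, mirroring the three separate proteins expected from the specific fusion protein.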
The specific fusion gene is shown as nucleotides 911-6706 of SEQ ID NO: 2.
The CMV enhancer is shown as nucleotides 395-680 of SEQ ID NO: 2.
The EF1a promoter is shown as nucleotides 682-890 of SEQ ID NO: 2.
The WPRE sequence element is shown as nucleotides 6722-7310 of SEQ ID NO: 2.
The 3' LTR sequence element is shown as nucleotides 7382-7615 of SEQ ID NO: 2.
The bGH poly(A) signal sequence element is shown as nucleotides 7647-7871 of SEQ ID NO: 2.
Preferably, the safe harbor site vector, the sgRNA vector or the Cas vector are all circular plasmids.
In a third aspect of the invention there is provided a tissue or organ comprising a pig cell as described above.
Preferably, the tissue is breast tissue, more preferably breast epithelial tissue. Preferably, the organ is a breast.
In a fourth aspect of the present invention, there is provided a method for constructing a model pig expressing PyMT, wherein a nucleotide sequence encoding PyMT is inserted into a safe harbor site of the pig to obtain a model pig expressing the PyMT shown in SEQ ID NO: 14. Preferably, the pig safe harbor site is selected from the pig ROSA26, AAVS1, H11 or COL1A1 safe harbor site. Further preferably, the pig safe harbor site is most preferably the COL1A1 site.
Preferably, the construction method further comprises the step of preparing the pig cells.
Preferably, the construction method comprises transferring the pig cells into enucleated pig oocytes to obtain model pigs. In one embodiment of the invention, the transfer site is the perivitelline space of the enucleated pig oocyte.
In one embodiment of the present invention, the construction method comprises providing the above pig cells, or obtaining pig cells by the above pig cell construction method, and then performing animal cloning by somatic cell nuclear transfer to obtain a model pig expressing the PyMT protein.
In a fifth aspect of the present invention, there is provided a method for constructing a breast cancer model pig, wherein a nucleotide sequence encoding PyMT is inserted into a safe harbor site of the pig to obtain a model pig expressing the PyMT shown in SEQ ID NO: 14. Preferably, the pig safe harbor site is selected from the pig ROSA26, AAVS1, H11 or COL1A1 safe harbor site. Further preferably, the pig safe harbor site is most preferably the COL1A1 site.
In one embodiment of the present invention, the construction method comprises providing the above pig cells, or obtaining pig cells by the above pig cell construction method, and then performing animal cloning by somatic cell nuclear transfer to obtain a breast cancer model pig with a homozygous or heterozygous knock-in of the PyMT gene.
In a sixth aspect of the present invention, there is provided a safe harbor site vector comprising a nucleotide sequence encoding a PyMT and a safe harbor site vector backbone comprising a 5 'homology arm and a 3' homology arm of a safe harbor insertion site, wherein the nucleotide sequence encoding a PyMT is located between the 5 'homology arm and the 3' homology arm, and wherein the safe harbor site vector backbone is selected from any one of the following:
A) a ROSA26 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 5 and whose 3' homology arm is shown in SEQ ID NO: 6. Preferably, the nucleotide sequence of the ROSA26 safe harbor site vector backbone is shown in SEQ ID NO: 4.
B) an AAVS1 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 7 and whose 3' homology arm is shown in SEQ ID NO: 8. Preferably, the nucleotide sequence of the AAVS1 safe harbor site vector backbone is obtained by replacing the ROSA26 5' and 3' homology arms in SEQ ID NO: 4 with the AAVS1 5' and 3' homology arms.
C) an H11 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 9 and whose 3' homology arm is shown in SEQ ID NO: 10. Preferably, the nucleotide sequence of the H11 safe harbor site vector backbone is obtained by replacing the ROSA26 5' and 3' homology arms in SEQ ID NO: 4 with the H11 5' and 3' homology arms.
Or D) a COL1A1 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 11 and whose 3' homology arm is shown in SEQ ID NO: 12. Preferably, the nucleotide sequence of the COL1A1 safe harbor site vector backbone is obtained by replacing the ROSA26 5' and 3' homology arms in SEQ ID NO: 4 with the COL1A1 5' and 3' homology arms.
Further preferably, the pig safe harbor site vector backbone is most preferably the COL1A1 safe harbor site vector backbone.
Preferably, the safe harbor site vector further comprises a promoter, a poly(A) signal, and nucleotide sequences encoding EGFP protein, mCherry protein and puro resistance protein. The promoter is an EF-1α promoter, a PGK promoter and/or a pCAG promoter. The poly(A) signal is an EF-1α poly(A) signal, a bGH poly(A) signal and/or a β-globin poly(A) signal. Further preferably, the vector also comprises an insulator region.
In one embodiment of the present invention, the safe harbor site vector backbone comprises, in order from 5' to 3', a 5' homology arm, an insulator region, an EF-1α poly(A) signal, a nucleotide sequence encoding EGFP, an EF-1α promoter, an insulator region, a PGK promoter, a nucleotide sequence encoding mCherry, a bGH poly(A) signal, a loxP-puro-loxP expression cassette region, an insulator region, a β-globin poly(A) signal, a pCAG promoter, an insulator region, and a 3' homology arm.
In a seventh aspect, the invention provides an application of the safe harbor site vector, the sgRNA vector or the sgRNA in preparation of pig cells, a model pig expressing a PyMT protein or a model pig of breast cancer.
In an eighth aspect, the present invention provides an application of the pig cell, the pig cell obtained by the construction method, and the model pig obtained by the construction method for expressing PyMT in preparing an animal model of breast cancer, or an application in screening a drug for treating breast cancer and evaluating drug efficacy, or an application in gene and cell therapy, or an application in researching pathogenesis of breast cancer.
In a ninth aspect, the present invention provides an application of the above tissue or organ or the model pig obtained by the above construction method in screening medicines for treating breast cancer and evaluating the efficacy of the medicines, or in gene and cell therapy, or in researching pathogenesis of breast cancer.
The term "vector" is a polynucleotide capable of replication under the control of itself in a cell, or a genetic element such as a plasmid, chromosome, virus, transposon, that replicates and/or is expressed by insertion into the chromosome of a host cell. Suitable vectors include, but are not limited to, plasmids, transposons, bacteriophages and cosmids.
The "gRNA", also called guide RNA, described herein is an RNA that is transcribed from a sgRNA vector in a cell, is specific for a target sequence in the cell, and can form a complex with a Cas protein.
Compared with the prior art, the invention has at least the following beneficial effects:
(1) The subject (pig) of the invention has better applicability than other animals (rats, mice, primates).
Rodents such as rats and mice differ greatly from humans in body size, organ size, physiology and pathology, and cannot faithfully simulate normal human physiological and pathological states. Studies have shown that more than 95% of drugs validated in mice are ineffective in human clinical trials. Among large animals, primates have the closest relationship to humans, but they are small, reach sexual maturity late (mating begins at 6-7 years of age) and bear a single offspring per pregnancy, so the population expands extremely slowly and husbandry costs are high. In addition, primate cloning is inefficient, difficult and costly.
Apart from primates, the pig is the animal with the closest relationship to humans. Its body size, weight and organ size are similar to those of humans, and it resembles humans in anatomy, physiology, immunology, nutritional metabolism and disease pathogenesis. Pigs also reach sexual maturity early (4-6 months), are highly fertile and have large litters, so a large herd can be established within 2-3 years. In addition, pig cloning technology is very mature, and cloning and husbandry costs are much lower than for primates. Pigs are therefore very suitable as animal models of human disease.
(2) Compared with the original pX330 vector, the experimentally verified pU6gRNA-eEF1a-mNLS-hSpCas9-EGFP-PURO vector (pKG-GE3 for short) of the invention uses a stronger promoter and adds elements that enhance protein translation, which improves Cas9 expression; it also has more nuclear localization signals, which improves the nuclear localization capability of the Cas9 protein. The vector therefore has higher gene editing efficiency. The invention also adds a fluorescent marker and a resistance marker to the vector, which makes it more convenient to screen and enrich cells positively transformed with the vector. When the high-efficiency Cas9 expression vector modified by the invention is used for gene editing, the editing efficiency is improved by more than 100% compared with the original vector.
(3) The invention explored gene knock-in expression at four safe harbor sites in the pig genome and selected the optimal site for inserting exogenous genes, thereby effectively improving expression of the target gene after knock-in.
(4) The invention uses a mammary gland-specific promoter, the mouse mammary tumor virus long terminal repeat promoter (MMTV-LTR), to drive specific expression of the exogenous oncogene in mammary tissue, so that the oncogene acts specifically in mammary tissue while the effects of high-level overexpression of the exogenous oncogene on the organism as a whole are avoided.
(5) The single-cell clones with homozygous knock-in of the MMTV-PyMT expression cassette obtained by the invention can be used for somatic cell nuclear transfer animal cloning to directly obtain cloned pigs with homozygous knock-in of the MMTV-PyMT expression cassette, and the homozygously inserted gene is stably inherited. These pigs can subsequently be used in biomedical fields such as drug screening, efficacy evaluation, gene and cell therapy, and research on the pathogenesis of breast cancer.
In mouse model production, gene editing material is generally microinjected into fertilized eggs, followed by embryo transfer. The probability of directly obtaining knock-in offspring in this way is very low (less than 1%), and the offspring must then be cross-bred to screen for homozygous knock-in individuals, which is unsuitable for producing models in large animals such as pigs that have a longer gestation period. The invention therefore adopts the technically demanding approach of editing primary cells in vitro and screening for positively edited single-cell clones, and then directly obtains the corresponding model pig by somatic cell nuclear transfer animal cloning. This greatly shortens the production period of the model pig and saves manpower, materials and funds.
The invention obtains the PyMT model pig which is highly similar to the development process of human breast cancer through gene editing and somatic cell cloning technology, is helpful for researching and revealing the pathogenesis of breast cancer, can be used for researching drug screening, drug effect detection, gene and cell therapy and the like, can provide effective experimental data for further clinical application, and further provides a powerful experimental means for preventing and treating human breast cancer. The invention has great application value for research of pathogenesis of human breast cancer, research and development of therapeutic drugs and preclinical experiments.
Drawings
Embodiments of the present invention are described in detail below with reference to the attached drawing figures, wherein:
FIG. 1 is a schematic diagram of the structure of plasmid pX330.
FIG. 2 is a schematic diagram of the structure of plasmid pKG-GE3.
FIG. 3 is a schematic diagram showing the structure of pU6gRNA vector.
FIG. 4 is a schematic representation of the insertion of a DNA molecule of about 20bp (used for transcription to form gRNA capable of binding to the target sequence) into the plasmid pKG-U6 gRNA.
FIG. 5 is a schematic representation of the structure of a fluorescent donor plasmid containing a ROSA26 insertion site.
FIG. 6 is a schematic representation of the structure of a fluorescent donor plasmid containing an AAVS1 insertion site.
FIG. 7 is a schematic representation of the structure of a fluorescent donor plasmid containing an H11 insertion site.
FIG. 8 is a schematic structural diagram of a fluorescent donor plasmid containing COL1A1 insertion site.
FIG. 9 is a schematic structural diagram of pKG-MMTV-PyMT Donor plasmid containing COL1A1 insertion site.
FIG. 10 shows the sequencing results of the plasmid proportioning optimization test.
FIG. 11 shows the sequencing results of the editing effect of plasmid pX330 and plasmid pKG-GE3.
FIG. 12 shows green fluorescent expression patterns of GFP regulated at different safe harbor sites.
FIG. 13 shows the results of fluorescent quantitative PCR for regulating GFP transcription level at different safe harbor sites.
FIG. 14 shows the results of FACS detection of GFP expression at different safe harbor sites.
FIG. 15 is an electropherogram for identifying whether the recombination of the MMTV-PyMT expression cassette at the 5 'end of the safe harbor insertion site of pig COL1A1 is successful, wherein WT is a wild type control, blank is a Blank, sh4 represents the safe harbor site COL1A1, lr represents the 5' homology arm, JDF represents the identification primer F, JDR represents the identification primer R,1414 or 5965 represents the detection site information.
FIG. 16 is an electrophoretogram for identifying whether the recombination of the MMTV-PyMT expression cassette at the 3 '-end of the safety harbor insertion site of porcine COL1A1 is successful, wherein WT is a wild-type control, blank is a Blank, sh4 represents the safety harbor site COL1A1, rr represents the 3' -homology arm, and 282 or 4723 represents the detection site information.
FIG. 17 is an electrophoretogram for identifying whether the MMTV-PyMT expression cassette is homozygous for insertion into the safe harbor site of porcine COL1A1, wherein WT is a wild type control, blank is a Blank, sh4 represents the safe harbor site COL1A1, JDF represents the identification primer F, JDR represents the identification primer R,1085 or 1560 represents the detection site information.
FIG. 18 shows the fluorescence quantitative PCR results for the PyMT transcript level regulated at the pig COL1A1 safe harbor site.
FIG. 19 shows the results of FACS detection of pig COL1A1 safe harbor site-regulated PyMT protein expression, wherein WT represents unmodified cloned pig mammary epithelial cells and PyMT represents mammary epithelial cells of a pig model for PyMT-expressing breast cancer prepared by nuclear transfer technology.
Detailed Description
The following detailed description of the invention is provided in connection with the accompanying drawings that are presented to illustrate the invention and not to limit the scope thereof. The examples provided below are intended as guidelines for further modifications by one of ordinary skill in the art and are not to be construed as limiting the invention in any way.
The experimental methods in the following examples, unless otherwise specified, are conventional methods and are carried out according to techniques or conditions described in the literature in the field or according to the product specifications. Materials, reagents and the like used in the examples described below are commercially available unless otherwise specified. The recombinant plasmids constructed in the examples were all verified by sequencing. Complete culture medium (percentages by volume): 15% fetal bovine serum (Gibco) + 83% DMEM medium (Gibco) + 1% Penicillin-Streptomycin (Gibco) + 1% HEPES (Solarbio). Cell culture conditions: 37 °C constant-temperature incubator with 5% CO2 and 5% O2.
A method of preparing porcine primary fibroblasts: ① Take 0.5 g of pig ear tissue, remove the hair and bone tissue, soak the tissue in 75% alcohol for 30-40 s, wash it 5 times with PBS buffer containing 5% (by volume) Penicillin-Streptomycin (Gibco), and then wash it once with PBS buffer; ② Mince the tissue with scissors, digest it with 5 mL of 0.1% collagenase solution (Sigma) at 37 °C for 1 h, centrifuge at 500 g for 5 min, and discard the supernatant; ③ Resuspend the pellet in 1 mL of complete culture medium, then plate it into a 10-diameter cell culture dish that contains 10 mL of complete culture medium and has been coated with 0.2% gelatin (VWR), and culture until the cells cover about 60% of the dish bottom; ④ After completing step ③, digest the cells with trypsin and collect them, then resuspend them in complete culture medium for the subsequent electrotransformation experiments.
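As a worked example of the complete culture medium recipe, the volume of each component for a given batch can be computed as below. This is a simple sketch; the function name and the 500 mL batch size are illustrative assumptions, and only the percentages come from the text.

```python
# Complete culture medium, percentages by volume as stated in the text.
COMPLETE_MEDIUM = {
    "fetal bovine serum": 15,
    "DMEM": 83,
    "Penicillin-Streptomycin": 1,
    "HEPES": 1,
}


def component_volumes(total_ml: float, recipe=COMPLETE_MEDIUM) -> dict:
    """Return the volume (mL) of each component for a batch of total_ml."""
    if sum(recipe.values()) != 100:
        raise ValueError("recipe percentages must add up to 100")
    return {name: total_ml * pct / 100 for name, pct in recipe.items()}


# Hypothetical batch size of 500 mL.
volumes = component_volumes(500)
for name, ml in volumes.items():
    print(f"{name}: {ml:g} mL")
```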
Example 1 construction of vector
1. Construction of Cas9 efficient expression vector (pKG-GE 3 for short)
The commercial plasmid is pX330-U6-Chimeric_BB-CBh-hSpCas9, abbreviated as plasmid pX330, whose sequence is shown in SEQ ID NO: 1.
Based on plasmid pX330, the plasmid pU6gRNA-eEF1a-mNLS-hSpCas9-EGFP-PURO, abbreviated as plasmid pKG-GE3, was constructed; its sequence is shown in SEQ ID NO: 2.
Plasmid pX330 and plasmid pKG-GE3 are both circular plasmids.
The schematic structure of plasmid pX330 is shown in FIG. 1. In SEQ ID NO: 1, nucleotides 440-725 constitute the CMV enhancer, nucleotides 727-1208 constitute the chicken β-actin promoter, nucleotides 1304-1324 encode the SV40 nuclear localization signal (NLS), nucleotides 1325-5449 encode the Cas9 protein, and nucleotides 5450-5497 encode the nucleoplasmin nuclear localization signal (NLS).
The schematic structure of plasmid pKG-GE3 is shown in FIG. 2. In SEQ ID NO: 2, nucleotides 395-680 constitute the CMV enhancer, nucleotides 682-890 constitute the EF1a promoter, nucleotides 986-1006 encode a nuclear localization signal (NLS), nucleotides 1016-1036 encode a nuclear localization signal (NLS), nucleotides 1037-5161 encode the Cas9 protein, nucleotides 5162-5209 encode a nuclear localization signal (NLS), nucleotides 5219-5266 encode a nuclear localization signal (NLS), nucleotides 5276-5332 encode the self-cleaving polypeptide P2A (amino acid sequence "ATNFSLLKQAGDVEENPGP"; the cleavage site is between the first and second amino acid residues from the C-terminus), nucleotides 5333-6046 encode the EGFP protein, and nucleotides 6056-6109 encode the self-cleaving polypeptide T2A (amino acid sequence "EGRGSLLTCGDVEENPGP"; the cleavage site is between the first and second amino acid residues from the C-terminus). The sequence encoding the Puro resistance protein follows the T2A coding sequence; nucleotides 6722-7310 constitute the WPRE sequence element, nucleotides 7382-7615 constitute the 3' LTR sequence element, and nucleotides 7647-7871 constitute the bGH poly(A) signal sequence element. In SEQ ID NO: 2, nucleotides 911-6706 form a fusion gene that expresses a fusion protein. Owing to the presence of the self-cleaving polypeptides P2A and T2A, the fusion protein spontaneously cleaves into three separate proteins: the Cas9 protein, the EGFP protein and the Puro resistance protein.
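The coordinates quoted for the coding segments of pKG-GE3 can be sanity-checked with a few lines of Python. The sketch below assumes 1-based inclusive coordinates in SEQ ID NO: 2 and that the fusion gene starts at nucleotide 911. It checks that each segment has a whole number of codons, lies in the reading frame of the fusion gene, and, for P2A and T2A, matches the length of the stated peptide. The helper names are hypothetical.

```python
FUSION_START = 911  # first nucleotide of the fusion gene in SEQ ID NO: 2

# 1-based inclusive coordinates as quoted in the text.
CODING_SEGMENTS = {
    "NLS 1": (986, 1006),
    "NLS 2": (1016, 1036),
    "Cas9": (1037, 5161),
    "NLS 3": (5162, 5209),
    "NLS 4": (5219, 5266),
    "P2A": (5276, 5332),
    "EGFP": (5333, 6046),
    "T2A": (6056, 6109),
}

PEPTIDES = {
    "P2A": "ATNFSLLKQAGDVEENPGP",
    "T2A": "EGRGSLLTCGDVEENPGP",
}


def codon_count(start: int, end: int) -> int:
    """Number of codons in a segment; raises if the length is not 3n."""
    length = end - start + 1
    if length % 3:
        raise ValueError(f"segment {start}-{end} is not a multiple of 3")
    return length // 3


def in_frame(start: int, frame_start: int = FUSION_START) -> bool:
    """True if the segment starts on a codon boundary of the fusion gene."""
    return (start - frame_start) % 3 == 0


codons = {name: codon_count(*span) for name, span in CODING_SEGMENTS.items()}
frames = {name: in_frame(span[0]) for name, span in CODING_SEGMENTS.items()}
for name in CODING_SEGMENTS:
    print(name, codons[name], "codons, in frame:", frames[name])
```

A check of this kind helps catch transcription errors in long coordinate lists before they propagate into primer design or annotation files.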
Compared with plasmid pX330, the constructed plasmid pKG-GE3 is modified mainly as follows: ① the residual gRNA backbone sequence (GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTT) is removed to reduce interference; ② the original chicken β-actin promoter is replaced with the EF1a promoter, which has higher expression activity, increasing expression of the Cas9 protein; ③ nuclear localization signal (NLS) coding sequences are added upstream and downstream of the Cas9 gene, increasing the nuclear localization capability of the Cas9 protein; ④ the original plasmid has no eukaryotic selection marker, which hinders the screening and enrichment of positively transformed cells, so the P2A-EGFP-T2A-PURO coding sequence is inserted downstream of the Cas9 gene, providing dual eukaryotic selection markers of fluorescence and puromycin resistance; ⑤ the WPRE element and the 3' LTR sequence element are inserted to enhance the protein translation capacity of the Cas9 gene.
2. Construction of pKG-U6gRNA expression vector
The pKG-U6gRNA vector was constructed using pUC57 as the starting plasmid. Its schematic structure is shown in FIG. 3 and its sequence is shown in SEQ ID NO: 3. In SEQ ID NO: 3, nucleotides 2280 to 2539 constitute the hU6 promoter and nucleotides 2558 to 2637 are transcribed to form the gRNA backbone. In use, a DNA molecule of about 20 bp (transcribed to form the target-sequence binding region of the gRNA) is inserted into plasmid pKG-U6gRNA to form a recombinant plasmid, shown schematically in FIG. 4, and the recombinant plasmid is transcribed in cells to produce the gRNA.
3. Construction of different safe harbor site Donor vectors containing GFP Gene
Plasmids PB-1G 2R 3-puro-ROSA26, PB-1G 2R 3-puro-AAVS1, PB-1G 2R 3-puro-H11 and PB-1G 2R 3-puro-COL1A1 were constructed.
The schematic structure of plasmid PB-1G 2R 3-puro-ROSA26 is shown in FIG. 5. In SEQ ID NO: 4, nucleotides 1 to 345 constitute the porcine genomic region 5' of the ROSA26 safe harbor insertion site (SH1 left arm, shown in SEQ ID NO: 5), nucleotides 9184 to 10195 constitute the porcine genomic region 3' of the ROSA26 safe harbor insertion site (SH1 right arm, shown in SEQ ID NO: 6), nucleotides 346 to 546, 3132 to 3531, 6506 to 6706 and 8975 to 9175 constitute 4 different insulator regions, nucleotides 1954 to 3131 constitute the EF-1 alpha promoter, nucleotides 1216 to 1935 encode the EGFP protein, nucleotides 637 to 1209 constitute the EF-1 alpha poly(A) signal, nucleotides 3543 to 4042 constitute the PGK promoter, nucleotides 4059 to 4769 encode the mCherry protein, nucleotides 4791 to 5015 constitute the bGH poly(A) signal, nucleotides 5054 to 6504 constitute the loxP-ro-loxP region, and nucleotides 7259 to 7269 constitute the poly (pC) signal.
The schematic structure of plasmid PB-1G 2R 3-puro-AAVS1 is shown in FIG. 6. It differs from SEQ ID NO: 4 only in that nucleotides 1-345 are replaced with the porcine genomic region 5' of the AAVS1 safe harbor insertion site (SH2 left arm, see SEQ ID NO: 7) and nucleotides 9184-10195 are replaced with the porcine genomic region 3' of the AAVS1 safe harbor insertion site (SH2 right arm, see SEQ ID NO: 8). All other sequences are identical to SEQ ID NO: 4.
The schematic structure of plasmid PB-1G 2R 3-puro-H11 is shown in FIG. 7. It differs from SEQ ID NO: 4 only in that nucleotides 1-345 are replaced with the porcine genomic region 5' of the H11 safe harbor insertion site (SH3 left arm, see SEQ ID NO: 9) and nucleotides 9184-10195 are replaced with the porcine genomic region 3' of the H11 safe harbor insertion site (SH3 right arm, see SEQ ID NO: 10). All other sequences are identical to SEQ ID NO: 4.
The schematic structure of plasmid PB-1G 2R 3-puro-COL1A1 is shown in FIG. 8. It differs from SEQ ID NO: 4 only in that nucleotides 1-345 are replaced with the porcine genomic region 5' of the COL1A1 safe harbor insertion site (SH4 left arm, see SEQ ID NO: 11) and nucleotides 9184-10195 are replaced with the porcine genomic region 3' of the COL1A1 safe harbor insertion site (SH4 right arm, see SEQ ID NO: 12). All other sequences are identical to SEQ ID NO: 4.
4. Construction of pKG-MMTV-PyMT Donor vector
The structure of plasmid pKG-MMTV-PyMT is shown in FIG. 9. In SEQ ID NO: 13, nucleotides 1-852 are the homologous sequence at the 5' end of the COL1A1 safe harbor insertion site of the pig genome, nucleotides 879-1079 are the sequence of insulator 1 (Insulator 1), nucleotides 1080-2394 are the sequence of the MMTV-LTR promoter (from the pGL4.36[luc2P MMTV Hygro] plasmid, the sequence is shown as SEQ ID NO: 15, purchased from Shanghai N.Y. Biotechnology Co., ltd.), nucleotides 2407-3672 are the coding sequence of PyMT (obtained by whole-gene synthesis; the encoded amino acid sequence is shown as SEQ ID NO: 14), nucleotides 3721-3945 are the sequence of the bGH poly(A), nucleotides 4107-4436 are the sequence of the SV40 promoter, nucleotides 4485-5081 are the coding sequence of the puromycin resistance protein (abbreviated as Puro R protein), nucleotides 5261-5382 are the sequence of the SV40 poly(A), and nucleotide numbers 5431-5460 are sequences of a 5669-563, respectively, and the nucleotide numbers 5469 are sequences of the safety harbor 1-561.
Example 2 comparison of the effects of plasmid pX330 and plasmid pKG-GE3
Selecting a high-efficiency gRNA target located in the RAG1 gene:
target for RAG1-gRNA 4: 5'-AGTTATGGCAGAACTCAGTG-3' (SEQ ID NO: 16).
Primers used to amplify the fragments containing the target were as follows:
RAG1-nF126:5’-CCCCATCCAAAGTTTTTAAAGGA-3’(SEQ ID NO:17);
RAG1-nR525:5’-TGTGGCAGATGTCACAGTTTAGG-3’(SEQ ID NO:18)。
Porcine primary fibroblasts were prepared from ear tissue of a Congjiang Xiang pig (female, blood group AO).
1. Construction of RAG1 Gene gRNA recombinant plasmid
Plasmid pKG-U6gRNA was digested with restriction enzyme BbsI, and the vector backbone (about 3kb linear fragment) was recovered. RAG1-4S and RAG1-4A were synthesized separately, and then mixed and annealed to give a double-stranded DNA molecule having cohesive ends. The double-stranded DNA molecule having a cohesive end and the vector backbone were ligated to obtain plasmid pKG-U6gRNA (RAG 1-gRNA 4).
RAG1-4S:5’-caccgAGTTATGGCAGAACTCAGTG-3’(SEQ ID NO:19);
RAG1-4A:5’-aaacCACTGAGTTCTGCCATAACTc-3’(SEQ ID NO:20)。
RAG1-4S and RAG1-4A are single stranded DNA molecules.
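The relationship between the target (SEQ ID NO: 16) and the two annealed oligonucleotides (SEQ ID NO: 19 and 20) can be sketched in Python as follows. The overhang pattern ("caccg" + target for the sense strand, "aaac" + reverse complement + "c" for the antisense strand) is read off from the RAG1 sequences above; applying it to other targets is an assumption, and the function names are hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_grna_oligos(target: str) -> tuple[str, str]:
    """Build sense/antisense oligos for a 20-nt target.

    Pattern inferred from RAG1-4S / RAG1-4A in the text; lower-case
    letters mark the added cohesive-end bases.
    """
    target = target.upper()
    sense = "caccg" + target
    antisense = "aaac" + reverse_complement(target) + "c"
    return sense, antisense


if __name__ == "__main__":
    s, a = design_grna_oligos("AGTTATGGCAGAACTCAGTG")
    print("RAG1-4S:", s)
    print("RAG1-4A:", a)
```

Run on the RAG1-gRNA4 target, the sketch reproduces the two oligonucleotide sequences listed above.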
2. Plasmid proportioning optimization
1. Plasmid cotransfection of porcine primary fibroblasts
First group: plasmid pKG-U6gRNA (RAG 1-gRNA 4) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.44 μg of plasmid pKG-U6gRNA (RAG 1-gRNA 4) : 1.56 μg of plasmid pKG-GE3, i.e. a molar ratio of plasmid pKG-U6gRNA (RAG 1-gRNA 4) to plasmid pKG-GE3 of 1:1.
Second group: plasmid pKG-U6gRNA (RAG 1-gRNA 4) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.72 μg of plasmid pKG-U6gRNA (RAG 1-gRNA 4) : 1.28 μg of plasmid pKG-GE3, i.e. a molar ratio of plasmid pKG-U6gRNA (RAG 1-gRNA 4) to plasmid pKG-GE3 of 2:1.
Third group: plasmid pKG-U6gRNA (RAG 1-gRNA 4) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.92 μg of plasmid pKG-U6gRNA (RAG 1-gRNA 4) : 1.08 μg of plasmid pKG-GE3, i.e. a molar ratio of plasmid pKG-U6gRNA (RAG 1-gRNA 4) to plasmid pKG-GE3 of 3:1.
Fourth group: plasmid pKG-U6gRNA (RAG 1-gRNA 4) alone was transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : μg of plasmid pKG-U6gRNA (RAG 1-gRNA 4).
Co-transfection was performed by electroporation using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and a Neon™ Transfection System electroporator (parameters: 1450 V, 10 ms, 3 pulses).
2. After step 1 was completed, the cells were cultured in complete medium for 16 to 18 hours, and the medium was then replaced with fresh complete medium. The total culture time was 48 hours.
3. After step 2 was completed, the cells were digested with trypsin and collected, genomic DNA was extracted, PCR amplification was performed using the primer pair RAG1-nF126 and RAG1-nR525, and the products were analyzed by electrophoresis.
The band of interest was recovered after electrophoresis and sequenced, and the sequencing results are shown in FIG. 10.
Editing efficiency was obtained by analyzing the sequencing chromatograms with the Synthego ICE tool. The gene editing efficiencies of the first to third groups were 9%, 53% and 66%, respectively. No gene editing occurred in the fourth group. The results show that the third group had the highest editing efficiency; the optimal molar ratio of the single gRNA plasmid to the Cas9 plasmid was therefore determined to be 3:1, corresponding to actual plasmid amounts of 0.92 μg and 1.08 μg.
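The conversion from a molar ratio to plasmid masses in the three co-transfection groups can be illustrated with a short Python sketch. For a fixed total DNA mass, each plasmid's mass is proportional to its size multiplied by its molar share. The plasmid sizes are not stated in this section, so the relative size of 3.52 for pKG-GE3 versus pKG-U6gRNA used below is an assumption back-calculated from the stated masses, and the function name is hypothetical.

```python
def plasmid_masses(total_ug, relative_sizes, molar_ratio):
    """Split a total DNA mass among plasmids for a given molar ratio.

    Mass of each plasmid is proportional to (size x molar share).
    relative_sizes may be in bp or in any common relative unit.
    """
    weights = [size * share for size, share in zip(relative_sizes, molar_ratio)]
    total_weight = sum(weights)
    return [total_ug * w / total_weight for w in weights]


# Assumption: pKG-GE3 is about 3.52 times the size of pKG-U6gRNA
# (back-fitted from the masses given in the text, not a stated value).
SIZES = (1.0, 3.52)  # (gRNA plasmid, Cas9 plasmid)

if __name__ == "__main__":
    for ratio in ((1, 1), (2, 1), (3, 1)):
        grna, cas9 = plasmid_masses(2.0, SIZES, ratio)
        print(f"molar ratio {ratio[0]}:{ratio[1]} -> "
              f"{grna:.2f} ug gRNA plasmid, {cas9:.2f} ug Cas9 plasmid")
```

With a total of 2 μg of DNA, this assumed size ratio reproduces the mass pairs given for the first to third groups to two decimal places.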
3. Comparison of the effects of plasmid pX330 and plasmid pKG-GE3
1. Co-transfection
RAG1-B group: plasmid pKG-U6gRNA (RAG 1-gRNA 4) alone was transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.92 μg of plasmid pKG-U6gRNA (RAG 1-gRNA 4).
RAG1-330 group: plasmid pKG-U6gRNA (RAG 1-gRNA 4) and plasmid pX330 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.92 μg of plasmid pKG-U6gRNA (RAG 1-gRNA 4) : 1.08 μg of plasmid pX330, i.e. a molar ratio of the two DNAs of 3:1.
RAG1-KG group: plasmid pKG-U6gRNA (RAG 1-gRNA 4) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.92 μg of plasmid pKG-U6gRNA (RAG 1-gRNA 4) : 1.08 μg of plasmid pKG-GE3, i.e. a molar ratio of the two DNAs of 3:1.
Co-transfection was performed by electroporation using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and a Neon™ Transfection System electroporator (parameters: 1450 V, 10 ms, 3 pulses).
2. After step 1 was completed, the cells were cultured in complete medium for 16 to 18 hours, and the medium was then replaced with fresh complete medium. The total culture time was 48 hours.
3. After step 2 is completed, cells are digested and collected by trypsin, genomic DNA is extracted, PCR amplification is carried out by using a primer pair consisting of RAG1-nF126 and RAG1-nR525, and the products are sequenced.
Editing efficiency was obtained by analyzing the sequencing chromatograms with the Synthego ICE tool. No gene editing occurred in the RAG1-B group. The editing efficiencies of the RAG1-330 group and the RAG1-KG group were 28% and 68%, respectively. Exemplary chromatograms of the sequencing results are shown in FIG. 11. The results show that plasmid pKG-GE3 markedly increases gene editing efficiency compared with plasmid pX330.
Example 3 screening of pig genome optimal safe harbor site for site-directed insertion of exogenous Gene
1. Construction of gRNA recombinant vectors for the pig genome ROSA26, AAVS1, H11 and COL1A1 safe harbor sites and screening of high-efficiency cleavage targets
Preliminary screening identified the following high-efficiency cleavage targets for the ROSA26, AAVS1, H11 and COL1A1 safe harbor sites, respectively: sgRNA ROSA26-g3 (cleavage efficiency 38%), sgRNA AAVS1-g4 (cleavage efficiency 30%), sgRNA H11-g1 (cleavage efficiency 60%) and sgRNA COL1A1-g3 (cleavage efficiency 56%). The target sequences are as follows:
sgRNA ROSA26-g3 target: 5'-GAAGGAGCAAACTGACATGG-3' (SEQ ID NO: 21);
sgRNA AAVS1-g4 target: 5'-TGCAGTGGGTCTTTGGGGAC-3' (SEQ ID NO: 22);
sgRNA H11-g1 target: 5'-TTCCAGGAACATAAGAAAGT-3' (SEQ ID NO: 23);
sgRNA COL1A1-g3 target: 5'-GCAGTCTCAGCAACCACTGA-3' (SEQ ID NO: 24).
The gRNA plasmids corresponding to the 4 gRNA targets are pKG-U6gRNA (ROSA 26-g 3), pKG-U6gRNA (AAVS 1-g 4), pKG-U6gRNA (H11-g 1) and pKG-U6gRNA (COL 1A1-g 3), wherein the backbone vectors are pKG-U6gRNA (SEQ ID NO: 3), and the plasmid construction method is the same as in example 2.
2. Co-electroporation of porcine primary fibroblasts with the fluorescent Donor vectors carrying homology arms flanking the different safe harbor insertion sites (i.e., the safe harbor vectors containing the exogenous GFP gene), the sgRNA vectors and the Cas9 vector (pKG-GE3 prepared in Example 1)
Each PB-1G 2R 3-puro safe harbor fluorescent vector was co-transfected into porcine primary fibroblasts together with the corresponding high-efficiency sgRNA vector and the high-efficiency Cas9 vector. Electroporation (parameters: 1450 V, 10 ms, 3 pulses) was performed using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and a Neon™ Transfection System electroporator.
Plasmid combinations and ratios for co-transfection:
First group: plasmid PB-1G 2R 3-puro-ROSA26, plasmid pKG-U6gRNA (ROSA 26-g 3) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 1.26 μg of plasmid PB-1G 2R 3-puro-ROSA26 : 0.82 μg of plasmid pKG-U6gRNA (ROSA 26-g 3) : 0.92 μg of plasmid pKG-GE3, i.e. a molar ratio of the three DNAs of 1:3:1.
Second group: plasmid PB-1G 2R 3-puro-AAVS1, plasmid pKG-U6gRNA (AAVS 1-g 4) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 1.26 μg of plasmid PB-1G 2R 3-puro-AAVS1 : 0.82 μg of plasmid pKG-U6gRNA (AAVS 1-g 4) : 0.92 μg of plasmid pKG-GE3, i.e. a molar ratio of the three DNAs of 1:3:1.
Third group: plasmid PB-1G 2R 3-puro-H11, plasmid pKG-U6gRNA (H11-g 1) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 1.26 μg of plasmid PB-1G 2R 3-puro-H11 : 0.82 μg of plasmid pKG-U6gRNA (H11-g 1) : 0.92 μg of plasmid pKG-GE3, i.e. a molar ratio of the three DNAs of 1:3:1.
Fourth group: plasmid PB-1G 2R 3-puro-COL1A1, plasmid pKG-U6gRNA (COL 1A1-g 3) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 1.26 μg of plasmid PB-1G 2R 3-puro-COL1A1 : 0.82 μg of plasmid pKG-U6gRNA (COL 1A1-g 3) : 0.92 μg of plasmid pKG-GE3, i.e. a molar ratio of the three DNAs of 1:3:1.
Fifth group: porcine primary fibroblasts were electroporated with the same parameters but without any plasmid.
The specific implementation method comprises the following steps:
Cells: before electroporation, the porcine primary fibroblasts were grown to 60% confluence, digested with 0.25% trypsin and counted after trypan blue staining; equal numbers of cells were used for each of the five electroporation groups.
Electroporation of porcine primary cells:
(1) Cells were digested with trypsin, the resulting cell suspension was washed once with PBS (Solarbio) and centrifuged at 600 g for 6 min, the supernatant was discarded, and the cells were resuspended in 58 μL of electroporation base solution Buffer R (11 μL per group), avoiding air bubbles during resuspension;
(2) 10 μL of cell suspension was mixed evenly with the plasmid electroporation reaction solution, avoiding air bubbles during mixing;
(3) The electroporation tube supplied with the kit was placed in the tube holder of the Neon™ Transfection System electroporator, and 3 mL of Buffer E was added;
(4) 10 μL of the mixture obtained in step (2) was aspirated with the electroporation pipette and inserted into the electroporation tube, the electroporation program (1450 V, 10 ms, 3 pulses) was selected, and immediately after electroporation the mixture in the pipette was transferred into a 6-well plate in which each well contained 3 mL of complete medium (15% fetal bovine serum (Gibco) + 83% DMEM medium (Gibco) + 1% P/S (Gibco Penicillin-Streptomycin) + 1% HEPES (Solarbio));
(5) The plate was mixed gently and cultured in a constant-temperature incubator at 37 °C, 5% CO2 and 5% O2;
(6) The medium was replaced 12-24 hours after electroporation, and puromycin selection was started at 48 hours to screen for positive cells.
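The complete medium recipe given in step (4) is stated as volume percentages, so component volumes for any batch size follow directly. A minimal Python sketch is given below; the function name and the batch sizes in the usage lines are illustrative assumptions, not values from the application.

```python
# Complete medium composition from the text, in percent by volume.
COMPLETE_MEDIUM_PERCENT = {
    "fetal bovine serum": 15,
    "DMEM": 83,
    "penicillin-streptomycin": 1,
    "HEPES": 1,
}


def complete_medium_volumes(total_ml: float) -> dict[str, float]:
    """Return the volume (mL) of each component for a given batch size."""
    if sum(COMPLETE_MEDIUM_PERCENT.values()) != 100:
        raise ValueError("component percentages must sum to 100")
    return {name: total_ml * pct / 100
            for name, pct in COMPLETE_MEDIUM_PERCENT.items()}


if __name__ == "__main__":
    # Example: one 6-well plate at 3 mL per well (18 mL in total).
    for name, vol in complete_medium_volumes(6 * 3).items():
        print(f"{name}: {vol:.2f} mL")
```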
3. Puromycin pressurized screening and cell GFP fluorescence intensity detection
Forty-eight hours after plasmid electroporation, 1.5 μg/mL puromycin was added for selection. The medium was replaced every two days with medium containing the same concentration of puromycin, and GFP green fluorescence was photographed at each change. Selection was continued for two weeks and, once the intracellular plasmid had been completely degraded, pressure selection was continued for one more week. The efficiency of exogenous gene expression at each safe harbor site was judged from the intensity of GFP fluorescence.
After one week of puromycin selection, the fluorescence intensity of the ROSA26 and COL1A1 groups was clearly stronger than that of the AAVS1 and H11 groups. After two weeks, the fluorescence intensity ranked from strong to weak as COL1A1 > ROSA26 > H11 > AAVS1: fluorescence in the H11 group was uneven, fluorescence in the ROSA26 group was uniform and strong overall, the AAVS1 group showed the weakest fluorescence, and the COL1A1 group had the most fluorescent cells and the strongest fluorescence. After three weeks of continuous puromycin selection, the fluorescence intensity again ranked from strong to weak as COL1A1 > ROSA26 > H11 > AAVS1; the results are shown in FIG. 12.
4. GFP gene transcription level assay
To compare mRNA transcription levels after integration of the GFP gene at the four different safe harbor sites, and to assess whether the integration site influences the regulation and level of GFP expression, a pair of primers was designed within the exon of the GFP gene. After three weeks of puromycin selection, total RNA was extracted from the cells and reverse-transcribed into cDNA, and the GFP transcription level was measured in primary cells carrying the GFP gene at each of the four safe harbor sites, with wild-type primary cells as a control. GAPDH was used as the reference gene, and relative expression was calculated by the 2^-ΔCt method.
(1) Primer information (Table 1)
Table 1: fluorescent quantitative PCR primer information
(2) Total RNA extraction from cells
Total cellular RNA was extracted using the Simply P Total RNA Extraction Kit (BioFlux).
(3) First-strand cDNA synthesis
The first strand of cDNA was synthesized using the Vazyme HiScript II 1st Strand cDNA Synthesis Kit (R211-01/02) according to the manufacturer's instructions. The specific steps and procedure are as follows:
1) Preparing first strand cDNA synthesis reaction liquid
The mixture shown in Table 2 was prepared in an RNase-free centrifuge tube.
Table 2
Mix gently by pipetting.
2) The first-strand cDNA synthesis reaction was carried out under the conditions shown in Table 3.
Table 3
The product was used immediately for qPCR or stored at -80 °C, avoiding repeated freeze-thaw cycles.
(4) Fluorescent quantitative PCR
Real-time fluorescent quantitative PCR was used to measure the GFP expression level in porcine primary fibroblasts carrying the insertion at each of the four safe harbor sites (ROSA26, AAVS1, H11, COL1A1), with GAPDH as the internal reference gene. The operation steps and program are as follows:
1) The reaction system was prepared as shown in Table 4.
Table 4
2) The qPCR reaction program is shown in Table 5.
Table 5
3) Statistics and analysis
Data were analyzed using SPSS statistical software and expressed as mean ± standard deviation; statistical analysis was performed by two-factor analysis of variance. The 2^-ΔCt values show that, after three weeks of puromycin selection, GFP expression was lower in the AAVS1 and H11 groups and higher in the ROSA26 and COL1A1 groups, and that the differences in GFP transcription level between the COL1A1 and ROSA26 groups and the AAVS1 and H11 groups were highly significant (P < 0.01). The 2^-ΔCt values are shown in Table 6, and the significance analysis is shown in FIG. 13.
Table 6: 2^-ΔCt values
In summary, from the results of fluorescent signal intensity and GFP gene real-time fluorescent quantitative PCR after culturing cells for three weeks, it can be concluded that among four genomic safe harbor sites of ROSA26, AAVS1, H11, and COL1A1, the COL1A1 site has the best expression effect after insertion of foreign gene.
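The 2^-ΔCt calculation used for the qPCR data can be sketched in Python as follows, where ΔCt is the Ct of the target gene (GFP) minus the Ct of the reference gene (GAPDH) in the same sample. The Ct values in the usage lines are hypothetical placeholders, not data from the application, and the function names are illustrative.

```python
from statistics import mean, stdev


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Return 2^-dCt, where dCt = Ct(target) - Ct(reference)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** -delta_ct


def summarize(replicates):
    """Return (mean, sample standard deviation) of 2^-dCt over replicates.

    replicates: iterable of (ct_target, ct_reference) pairs, at least two.
    """
    values = [relative_expression(t, r) for t, r in replicates]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical Ct pairs (GFP, GAPDH) for three technical replicates.
    example = [(20.1, 18.0), (19.9, 18.1), (20.0, 17.9)]
    m, sd = summarize(example)
    print(f"2^-dCt = {m:.3f} +/- {sd:.3f}")
```

A lower target Ct relative to the reference gives a larger 2^-ΔCt value, which is why higher values correspond to higher GFP transcription.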
5. FACS detection of protein expression level of GFP Gene
To compare GFP protein expression after integration of the GFP gene at the four different safe harbor sites, cells were digested with trypsin and centrifuged at 400 g for 4 min, and the supernatant was discarded. The cells were resuspended in 1 mL of medium and each cell suspension was transferred to a flow tube. GFP signal was detected in the FITC channel of a BD FACSMelody flow cytometer, and 5 × 10^4 cells were collected for analysis, with wild-type cells as the negative control, as shown in FIG. 14. The results show that the GFP fluorescence signal ranked COL1A1 > ROSA26 > H11 > AAVS1.
In summary, among the four safe harbor sites ROSA26, AAVS1, H11 and COL1A1, the COL1A1 site is the safe harbor site that expresses the exogenous gene most efficiently in porcine primary cells.
EXAMPLE 4 preparation of a monoclonal clone with MMTV-PyMT expression cassette site-directed insertion into the safe harbor site of porcine COL1A1
1. Co-transfection
Plasmid pKG-U6gRNA (COL 1A1-g 3), plasmid pKG-GE3 and plasmid pKG-MMTV-PyMT (shown in SEQ ID NO: 13) were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.89 μg of plasmid pKG-U6gRNA (COL 1A1-g 3) : 0.99 μg of plasmid pKG-GE3 : 1.12 μg of plasmid pKG-MMTV-PyMT, i.e. a molar ratio of the three DNAs of 3:1:1.
Co-transfection was performed by electroporation using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and a Neon™ Transfection System electroporator (parameters: 1450 V, 10 ms, 3 pulses).
2. Puromycin pressure screening
1. Puromycin screening MMTV-PyMT expression cassette positive insert cells
Forty-eight hours after plasmid electroporation, 1.5 μg/mL puromycin was added for selection, and the medium was replaced daily with medium containing the same concentration of puromycin. After one week of continuous selection, all wild-type control cells had died, and a large number of the cells electroporated with pKG-MMTV-PyMT had also died because of the relatively low electroporation efficiency. During a second week of puromycin selection, only sporadic cell death was observed, and some positive clones began to divide and proliferate so that cell numbers increased steadily. Pressure selection was continued for a third week so that the intracellular plasmid was completely degraded, excluding false-positive cell clones. After three weeks of pressure selection, puromycin was withdrawn; the cells were passaged twice and, once they had recovered, were used for monoclonal sorting.
2. Monoclonal sorting and amplifying culture
(1) After three weeks of puromycin selection, monoclonal sorting was performed. The cells were digested with trypsin, neutralized with complete medium and centrifuged at 500 g for 5 min; the supernatant was discarded, and the pellet was resuspended in 1 mL of complete medium and diluted appropriately. Single cells were picked with a mouth pipette and transferred into a 96-well plate containing 100 μL of complete medium per well, one 96-well plate of single cells being picked per group. The plates were cultured in a constant-temperature incubator at 37 °C, 5% CO2 and 5% O2, and the cell culture medium (containing 1.5% puromycin) was changed every 2-3 days. During this period, the growth of cells in each well was observed under a microscope, and wells containing no cells or non-single-cell clones were excluded;
(2) When the cells in a well of the 96-well plate reached confluence (about 2 weeks), they were digested with trypsin and collected; 2/3 of the cells were seeded into a 6-well plate containing complete medium, and the remaining 1/3 were collected in a 1.5 mL centrifuge tube for subsequent genotyping;
(3) When the cells in the 6-well plate reached 50% confluence, they were digested with 0.25% trypsin (Gibco), collected and frozen in cell cryopreservation solution (90% complete medium + 10% DMSO, by volume).
3. Genome-level identification of single-cell clones with site-directed insertion of the MMTV-PyMT expression cassette at the pig COL1A1 safe harbor site
To determine whether the MMTV-PyMT expression cassette had been successfully inserted at the pig COL1A1 safe harbor site, single-cell clones obtained after puromycin pressure selection were taken, genomic DNA was extracted, PCR amplification was performed (using, respectively, the primer pair sh4-Lr-JDF1414 and sh4-Lr-JDR5965, the primer pair sh4-Rr-JDF282 and sh4-Rr-JDR4723, and the primer pair sh4-wt-JDF1085 and sh4-wt-JDR1560), and the products were analyzed by electrophoresis. Porcine primary adipose stem cells were used as the wild-type control. The primer pair sh4-Lr-JDF1414 / sh4-Lr-JDR5965 identifies whether the MMTV-PyMT expression cassette recombined successfully at the 5' end of the pig COL1A1 safe harbor insertion site; the primer pair sh4-Rr-JDF282 / sh4-Rr-JDR4723 identifies whether it recombined successfully at the 3' end; and the primer pair sh4-wt-JDF1085 / sh4-wt-JDR1560 identifies whether the site-directed insertion is homozygous or heterozygous.
sh4-Lr-JDF1414:CCTGCTGTAAGTGCCGTAGT(SEQ ID NO:29)
sh4-Lr-JDR5965:CTAGGGGCACAGCACGTC(SEQ ID NO:30)
sh4-Rr-JDF282:AAGTTATTAGGTCTGAAGAGGAGTTT(SEQ ID NO:31)
sh4-Rr-JDR4723:CCCATCATTCCGTCCCAGAG(SEQ ID NO:32)
sh4-wt-JDF1085:TGCTGAGTTCTGGCTTCCTG(SEQ ID NO:33)
sh4-wt-JDR1560:TCTACCAAGAGAGTGACCAGCAG(SEQ ID NO:34)
The electrophoresis patterns are shown in FIGS. 15, 16 and 17, respectively. From the electrophoresis results, single-cell clones 1 to 20 were preliminarily identified as clones with successful insertion of MMTV-PyMT at the pig COL1A1 safe harbor site; clones 6 and 10 carry a homozygous site-directed insertion, and clones 1-5, 7-9 and 11-20 carry a heterozygous site-directed insertion (Table 7).
Table 7: Genotypes of single-cell clones with site-directed insertion of the MMTV-PyMT expression cassette at the pig COL1A1 safe harbor site
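The way the three PCR reactions combine into a genotype call can be sketched in Python as follows. The decision rule is an assumption about how the bands are interpreted, not a rule stated in the application: it assumes that the wild-type primer pair yields a band only when at least one unmodified allele remains, and that a targeted clone shows both the 5' and 3' junction bands. The function name and return labels are hypothetical.

```python
def call_genotype(left_junction: bool, right_junction: bool, wild_type: bool) -> str:
    """Classify a clone from presence/absence of three PCR bands.

    left_junction:  band from sh4-Lr-JDF1414 / sh4-Lr-JDR5965 (5' junction)
    right_junction: band from sh4-Rr-JDF282 / sh4-Rr-JDR4723 (3' junction)
    wild_type:      band from sh4-wt-JDF1085 / sh4-wt-JDR1560
    Assumed rule: the wild-type band appears only if an unmodified allele remains.
    """
    targeted = left_junction and right_junction
    if targeted and not wild_type:
        return "homozygous insertion"
    if targeted and wild_type:
        return "heterozygous insertion"
    if not left_junction and not right_junction and wild_type:
        return "wild type"
    # Any other band pattern (e.g. only one junction band) needs follow-up.
    return "inconclusive"


if __name__ == "__main__":
    print(call_genotype(True, True, False))
    print(call_genotype(True, True, True))
    print(call_genotype(False, False, True))
```

Under this reading, a clone with both junction bands and no wild-type band is called homozygous, which is the category reported for clones 6 and 10.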
4. Cloning and producing breast cancer model pig by somatic cell nuclear transfer technology
1. Oocyte in vitro maturation
Pig ovaries were collected from a slaughterhouse, and oocytes were collected and cultured for in vitro maturation (IVM). Cumulus-oocyte complexes (COCs) were aspirated from follicles 3 to 6 mm in diameter. COCs with at least three layers of compact cumulus cells were selected, and approximately 300-400 COCs were cultured in four-well plates containing IVM medium, with about 50 COCs in 200 μL of IVM medium per well. The plates containing COCs were incubated for 42-44 hours at 38.5 °C in an incubator with 5% CO2 and saturated humidity.
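The plating arithmetic implied by these figures (about 50 COCs and 200 μL of IVM medium per well, four wells per plate) can be sketched in Python as follows. The function name is hypothetical, and rounding up to whole wells and plates is an assumption.

```python
import math


def ivm_plate_plan(n_cocs: int, cocs_per_well: int = 50,
                   wells_per_plate: int = 4, medium_ul_per_well: int = 200):
    """Return (wells, plates, total medium in uL) for a batch of COCs."""
    wells = math.ceil(n_cocs / cocs_per_well)
    plates = math.ceil(wells / wells_per_plate)
    return wells, plates, wells * medium_ul_per_well


if __name__ == "__main__":
    for n in (300, 400):
        wells, plates, medium = ivm_plate_plan(n)
        print(f"{n} COCs: {wells} wells on {plates} plates, {medium} uL IVM medium")
```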
2. Somatic Cell Nuclear Transfer (SCNT) and embryo transfer
The SCNT procedure was as follows. The cultured COCs were treated with 0.1% (w/v) hyaluronidase to remove cumulus cells. The first polar body and the adjacent cytoplasm (containing the oocyte nucleus) were removed from the perivitelline space by gentle aspiration in TLH-PVA solution using a beveled glass needle. Positive fibroblasts homozygous for MMTV-PyMT were injected into the perivitelline space of the enucleated oocytes. The reconstructed embryos were fused in fusion medium (0.25 M D-sorbitol, 0.05 mM Mg(C2H3O2)2, 20 mg/mL BSA and 0.5 mM HEPES [acid-free]) with a single direct-current pulse of 200 V/mm for 20 μs using an electrofusion apparatus (LF201, NEPA Gene Co., Ltd., Japan). The embryos were then cultured in PZM-3 for 0.5-1 h and activated with a single pulse of 150 V/mm for 100 ms in activation medium containing 0.25 M D-sorbitol, 0.01 mM Ca(C2H3O2)2, 0.05 mM Mg(C2H3O2)2 and 0.1 mg/mL BSA. The embryos were placed in PZM-3 solution containing 5 μg/mL cytochalasin B and equilibrated for 2 h in an incubator at 38.5 °C with saturated humidity, 5% CO2, 5% O2 and 90% N2, and then cultured in PZM-3 medium under the same conditions until embryo transfer.
SCNT embryos were transferred into oviducts of recipient sows. About 23 days after embryo transfer, pregnancy was confirmed using an ultrasonic scanner (HS-101V, toyota Kogyo Co., japan), and cloned piglets were born at 116-117 days. A total of 7 breast cancer model pigs were produced by 4 successfully pregnant sows.
5. Transcriptional level detection of pig PyMT gene of breast cancer model
To determine whether model pigs with the MMTV-PyMT expression cassette inserted at the COL1A1 safe harbor site express PyMT mRNA, a pair of primers was designed within the MMTV-PyMT expression cassette. Mammary tissue was isolated from PyMT model cloned pigs and from unmodified control cloned pigs (derived from the same cell source), total RNA was extracted and reverse-transcribed into cDNA, and the mRNA expression level of the PyMT gene in pig mammary cells was measured, with mammary cells of the unmodified cloned pigs (referred to as WT cells) as the control. Beta-actin was used as the reference gene, and relative expression was calculated by the 2^-ΔCt method. For the detailed procedure, see Example 3 (section 4, GFP gene transcription level assay).
Primer information is shown in Table 8:
TABLE 8 fluorescent quantitative PCR primer information
Data were analyzed using SPSS statistical software and expressed as mean ± standard deviation; statistical analysis was performed by one-way analysis of variance. The 2^-ΔCt values show that the expression level of the PyMT gene in mammary cells of the modified pigs was significantly higher than in mammary cells of the unmodified cloned pigs (FIG. 18).
In summary, according to the result of real-time fluorescence quantitative PCR, the PyMT gene is remarkably expressed in mammary cells of the modified breast cancer model pig.
6. FACS detection of protein expression level of pig PyMT gene in breast cancer model
To compare PyMT protein expression in mammary cells of modified and unmodified pigs, mammary tissue from model pigs and control pigs was isolated separately, placed in digestion solution containing 250 U/mL collagenase I, 150 U/mL hyaluronidase, 1% penicillin and 1% streptomycin, and digested with shaking at 37 °C for 1 h. The digestion solution was then neutralized with an equal volume of complete medium (DMEM/F12 cell culture medium containing 10% FBS) and passed sequentially through 100 μm and 40 μm cell strainers to obtain single mammary epithelial cells. The cells were washed with PBS and centrifuged, and the supernatant was discarded. The cells were fully resuspended in 90% methanol pre-chilled to -20 °C and fixed for 20 min. After fixation, the cells were centrifuged and the fixative was discarded. Blocking was performed with 3% BSA for 1 h, after which the cells were centrifuged and the blocking solution was discarded. After washing with complete medium, the cells were resuspended in diluted PyMT antibody (Abcam, ab15085; final dilution 1:200) and incubated at room temperature for 2 h. After the antibody incubation, the cells were washed with complete medium, goat anti-rat secondary antibody (Abcam, ab150157) was added at a final dilution of 1:1000, and the cells were incubated at room temperature for 1 h. After thorough washing with complete medium, the cells were resuspended in 500 μL of complete medium and the cell suspension was transferred to a flow tube. The PyMT antibody fluorescence signal was detected in the FITC channel of a BD FACSMelody flow cytometer, and 5 × 10^4 cells were collected for analysis; the results are shown in FIG. 19.
The results showed that the PyMT antibody fluorescence signal was clearly detected in mammary epithelial cells of pigs carrying MMTV-PyMT inserted site-specifically at the COL1A1 safe harbor site (PyMT), whereas no PyMT antibody fluorescence signal was detected in wild-type mammary epithelial cells of control pigs (WT). This indicates that the inserted PyMT is highly expressed in the mammary epithelial cells of the breast cancer model pig, and further indicates that the breast cancer model pig was successfully constructed.
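A comparison of this kind is often summarized by setting a fluorescence gate on the WT control and reporting the percentage of sample events above it. The Python sketch below illustrates that idea only. The source does not describe how gating was performed, so the 99th-percentile gate is an assumption, the function name is hypothetical, and the intensity values are made-up placeholders rather than data from fig. 19.

```python
def percent_positive(sample, control, control_quantile=0.99):
    """Gate on the control distribution and score the sample against it.

    Returns (gate, percentage of sample events strictly above the gate).
    The quantile-based gate is an assumed convention, not the source's method.
    """
    if not sample or not control:
        raise ValueError("sample and control must be non-empty")
    ordered = sorted(control)
    # Index of the chosen quantile, clamped to the last element.
    idx = min(len(ordered) - 1, int(control_quantile * len(ordered)))
    gate = ordered[idx]
    positive = sum(1 for value in sample if value > gate)
    return gate, 100.0 * positive / len(sample)


# Placeholder FITC intensities (arbitrary units), for illustration only.
wt_events = [10, 12, 15, 11, 13, 14, 12, 16, 11, 13]
pymt_events = [210, 180, 14, 250, 300, 190, 12, 220, 260, 240]

gate, pct = percent_positive(pymt_events, wt_events)
print(f"gate = {gate}, PyMT-positive = {pct:.1f}%")
```

In practice the gate would be drawn in the cytometer's analysis software on the full set of collected events; the sketch only shows the arithmetic behind a percent-positive readout.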
Furthermore, the breast cancer model pig prepared by the present application can subsequently be used in biomedical fields such as drug screening, drug efficacy evaluation, gene and cell therapy, and research on the pathogenesis of breast cancer.
The preferred embodiments of the present invention have been described in detail above, but the present invention is not limited to the specific details of the above embodiments, and various simple modifications can be made to the technical solution of the present invention within the scope of the technical concept of the present invention, and all the simple modifications belong to the protection scope of the present invention.
In addition, the specific features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various possible combinations are not described further.
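The sequence listing that follows gives each sequence as rows of residues in space-separated groups of ten, with each row ending in the running residue count, and the `<211>` field states the declared total length. The Python sketch below shows one way to rejoin such rows and check the running counts. The function name `join_listing_rows` is hypothetical and is not part of any sequence-listing tool; the two sample rows are the first two rows of SEQ ID NO: 1.

```python
def join_listing_rows(rows):
    """Rejoin residue rows and verify each row's trailing running count."""
    sequence = ""
    for row in rows:
        # All fields except the last are residue groups; the last is the count.
        *groups, count = row.split()
        sequence += "".join(groups)
        if len(sequence) != int(count):
            raise ValueError(
                f"row ends at residue {len(sequence)}, listing says {count}"
            )
    return sequence


# First two rows of SEQ ID NO: 1, copied from the listing.
rows = [
    "gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60",
    "ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120",
]

seq = join_listing_rows(rows)
print(len(seq))
```

Running the same check over every row of a sequence and comparing the final length with its `<211>` value is a quick way to catch residues lost during copying or text extraction.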
Sequence listing
<110> Nanjing Kidney Gene engineering Co., ltd
<120> Construction method and application of breast cancer model pig
<130> 1
<160> 43
<170> SIPOSequenceListing 1.0
<210> 1
<211> 8484
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 1
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg ggtcttcgag aagacctgtt ttagagctag aaatagcaag ttaaaataag 300
gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgcttttttg ttttagagct 360
agaaatagca agttaaaata aggctagtcc gtttttagcg cgtgcgccaa ttctgcagac 420
aaatggctct agaggtaccc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 480
ccaacgaccc ccgcccattg acgtcaatag taacgccaat agggactttc cattgacgtc 540
aatgggtgga gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc 600
caagtacgcc ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tgtgcccagt 660
acatgacctt atgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta 720
ccatggtcga ggtgagcccc acgttctgct tcactctccc catctccccc ccctccccac 780
ccccaatttt gtatttattt attttttaat tattttgtgc agcgatgggg gcgggggggg 840
ggggggggcg gggcgagggg cggggcgggg cgaggcggag aggtgcggcg gcagccaatc 900
agagcggcgc gctccgaaag tttcctttta tggcgaggcg gcggcggcgg cggccctata 960
aaaagcgaag cgcgcggcgg gcgggagtcg ctgcgcgctg ccttcgcccc gtgccccgct 1020
ccgccgccgc ctcgcgccgc ccgccccggc tctgactgac cgcgttactc ccacaggtga 1080
gcgggcggga cggcccttct cctccgggct gtaattagct gagcaagagg taagggttta 1140
agggatggtt ggttggtggg gtattaatgt ttaattacct ggagcacctg cctgaaatca 1200
ctttttttca ggttggaccg gtgccaccat ggactataag gaccacgacg gagactacaa 1260
ggatcatgat attgattaca aagacgatga cgataagatg gccccaaaga agaagcggaa 1320
ggtcggtatc cacggagtcc cagcagccga caagaagtac agcatcggcc tggacatcgg 1380
caccaactct gtgggctggg ccgtgatcac cgacgagtac aaggtgccca gcaagaaatt 1440
caaggtgctg ggcaacaccg accggcacag catcaagaag aacctgatcg gagccctgct 1500
gttcgacagc ggcgaaacag ccgaggccac ccggctgaag agaaccgcca gaagaagata 1560
caccagacgg aagaaccgga tctgctatct gcaagagatc ttcagcaacg agatggccaa 1620
ggtggacgac agcttcttcc acagactgga agagtccttc ctggtggaag aggataagaa 1680
gcacgagcgg caccccatct tcggcaacat cgtggacgag gtggcctacc acgagaagta 1740
ccccaccatc taccacctga gaaagaaact ggtggacagc accgacaagg ccgacctgcg 1800
gctgatctat ctggccctgg cccacatgat caagttccgg ggccacttcc tgatcgaggg 1860
cgacctgaac cccgacaaca gcgacgtgga caagctgttc atccagctgg tgcagaccta 1920
caaccagctg ttcgaggaaa accccatcaa cgccagcggc gtggacgcca aggccatcct 1980
gtctgccaga ctgagcaaga gcagacggct ggaaaatctg atcgcccagc tgcccggcga 2040
gaagaagaat ggcctgttcg gaaacctgat tgccctgagc ctgggcctga cccccaactt 2100
caagagcaac ttcgacctgg ccgaggatgc caaactgcag ctgagcaagg acacctacga 2160
cgacgacctg gacaacctgc tggcccagat cggcgaccag tacgccgacc tgtttctggc 2220
cgccaagaac ctgtccgacg ccatcctgct gagcgacatc ctgagagtga acaccgagat 2280
caccaaggcc cccctgagcg cctctatgat caagagatac gacgagcacc accaggacct 2340
gaccctgctg aaagctctcg tgcggcagca gctgcctgag aagtacaaag agattttctt 2400
cgaccagagc aagaacggct acgccggcta cattgacggc ggagccagcc aggaagagtt 2460
ctacaagttc atcaagccca tcctggaaaa gatggacggc accgaggaac tgctcgtgaa 2520
gctgaacaga gaggacctgc tgcggaagca gcggaccttc gacaacggca gcatccccca 2580
ccagatccac ctgggagagc tgcacgccat tctgcggcgg caggaagatt tttacccatt 2640
cctgaaggac aaccgggaaa agatcgagaa gatcctgacc ttccgcatcc cctactacgt 2700
gggccctctg gccaggggaa acagcagatt cgcctggatg accagaaaga gcgaggaaac 2760
catcaccccc tggaacttcg aggaagtggt ggacaagggc gcttccgccc agagcttcat 2820
cgagcggatg accaacttcg ataagaacct gcccaacgag aaggtgctgc ccaagcacag 2880
cctgctgtac gagtacttca ccgtgtataa cgagctgacc aaagtgaaat acgtgaccga 2940
gggaatgaga aagcccgcct tcctgagcgg cgagcagaaa aaggccatcg tggacctgct 3000
gttcaagacc aaccggaaag tgaccgtgaa gcagctgaaa gaggactact tcaagaaaat 3060
cgagtgcttc gactccgtgg aaatctccgg cgtggaagat cggttcaacg cctccctggg 3120
cacataccac gatctgctga aaattatcaa ggacaaggac ttcctggaca atgaggaaaa 3180
cgaggacatt ctggaagata tcgtgctgac cctgacactg tttgaggaca gagagatgat 3240
cgaggaacgg ctgaaaacct atgcccacct gttcgacgac aaagtgatga agcagctgaa 3300
gcggcggaga tacaccggct ggggcaggct gagccggaag ctgatcaacg gcatccggga 3360
caagcagtcc ggcaagacaa tcctggattt cctgaagtcc gacggcttcg ccaacagaaa 3420
cttcatgcag ctgatccacg acgacagcct gacctttaaa gaggacatcc agaaagccca 3480
ggtgtccggc cagggcgata gcctgcacga gcacattgcc aatctggccg gcagccccgc 3540
cattaagaag ggcatcctgc agacagtgaa ggtggtggac gagctcgtga aagtgatggg 3600
ccggcacaag cccgagaaca tcgtgatcga aatggccaga gagaaccaga ccacccagaa 3660
gggacagaag aacagccgcg agagaatgaa gcggatcgaa gagggcatca aagagctggg 3720
cagccagatc ctgaaagaac accccgtgga aaacacccag ctgcagaacg agaagctgta 3780
cctgtactac ctgcagaatg ggcgggatat gtacgtggac caggaactgg acatcaaccg 3840
gctgtccgac tacgatgtgg accatatcgt gcctcagagc tttctgaagg acgactccat 3900
cgacaacaag gtgctgacca gaagcgacaa gaaccggggc aagagcgaca acgtgccctc 3960
cgaagaggtc gtgaagaaga tgaagaacta ctggcggcag ctgctgaacg ccaagctgat 4020
tacccagaga aagttcgaca atctgaccaa ggccgagaga ggcggcctga gcgaactgga 4080
taaggccggc ttcatcaaga gacagctggt ggaaacccgg cagatcacaa agcacgtggc 4140
acagatcctg gactcccgga tgaacactaa gtacgacgag aatgacaagc tgatccggga 4200
agtgaaagtg atcaccctga agtccaagct ggtgtccgat ttccggaagg atttccagtt 4260
ttacaaagtg cgcgagatca acaactacca ccacgcccac gacgcctacc tgaacgccgt 4320
cgtgggaacc gccctgatca aaaagtaccc taagctggaa agcgagttcg tgtacggcga 4380
ctacaaggtg tacgacgtgc ggaagatgat cgccaagagc gagcaggaaa tcggcaaggc 4440
taccgccaag tacttcttct acagcaacat catgaacttt ttcaagaccg agattaccct 4500
ggccaacggc gagatccgga agcggcctct gatcgagaca aacggcgaaa ccggggagat 4560
cgtgtgggat aagggccggg attttgccac cgtgcggaaa gtgctgagca tgccccaagt 4620
gaatatcgtg aaaaagaccg aggtgcagac aggcggcttc agcaaagagt ctatcctgcc 4680
caagaggaac agcgataagc tgatcgccag aaagaaggac tgggacccta agaagtacgg 4740
cggcttcgac agccccaccg tggcctattc tgtgctggtg gtggccaaag tggaaaaggg 4800
caagtccaag aaactgaaga gtgtgaaaga gctgctgggg atcaccatca tggaaagaag 4860
cagcttcgag aagaatccca tcgactttct ggaagccaag ggctacaaag aagtgaaaaa 4920
ggacctgatc atcaagctgc ctaagtactc cctgttcgag ctggaaaacg gccggaagag 4980
aatgctggcc tctgccggcg aactgcagaa gggaaacgaa ctggccctgc cctccaaata 5040
tgtgaacttc ctgtacctgg ccagccacta tgagaagctg aagggctccc ccgaggataa 5100
tgagcagaaa cagctgtttg tggaacagca caagcactac ctggacgaga tcatcgagca 5160
gatcagcgag ttctccaaga gagtgatcct ggccgacgct aatctggaca aagtgctgtc 5220
cgcctacaac aagcaccggg ataagcccat cagagagcag gccgagaata tcatccacct 5280
gtttaccctg accaatctgg gagcccctgc cgccttcaag tactttgaca ccaccatcga 5340
ccggaagagg tacaccagca ccaaagaggt gctggacgcc accctgatcc accagagcat 5400
caccggcctg tacgagacac ggatcgacct gtctcagctg ggaggcgaca aaaggccggc 5460
ggccacgaaa aaggccggcc aggcaaaaaa gaaaaagtaa gaattcctag agctcgctga 5520
tcagcctcga ctgtgccttc tagttgccag ccatctgttg tttgcccctc ccccgtgcct 5580
tccttgaccc tggaaggtgc cactcccact gtcctttcct aataaaatga ggaaattgca 5640
tcgcattgtc tgagtaggtg tcattctatt ctggggggtg gggtggggca ggacagcaag 5700
ggggaggatt gggaagagaa tagcaggcat gctggggagc ggccgcagga acccctagtg 5760
atggagttgg ccactccctc tctgcgcgct cgctcgctca ctgaggccgg gcgaccaaag 5820
gtcgcccgac gcccgggctt tgcccgggcg gcctcagtga gcgagcgagc gcgcagctgc 5880
ctgcaggggc gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc 5940
atacgtcaaa gcaaccatag tacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg 6000
tggttacgcg cagcgtgacc gctacacttg ccagcgcctt agcgcccgct cctttcgctt 6060
tcttcccttc ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc 6120
tccctttagg gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgatttgg 6180
gtgatggttc acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg 6240
agtccacgtt ctttaatagt ggactcttgt tccaaactgg aacaacactc aactctatct 6300
cgggctattc ttttgattta taagggattt tgccgatttc ggtctattgg ttaaaaaatg 6360
agctgattta acaaaaattt aacgcgaatt ttaacaaaat attaacgttt acaattttat 6420
ggtgcactct cagtacaatc tgctctgatg ccgcatagtt aagccagccc cgacacccgc 6480
caacacccgc tgacgcgccc tgacgggctt gtctgctccc ggcatccgct tacagacaag 6540
ctgtgaccgt ctccgggagc tgcatgtgtc agaggttttc accgtcatca ccgaaacgcg 6600
cgagacgaaa gggcctcgtg atacgcctat ttttataggt taatgtcatg ataataatgg 6660
tttcttagac gtcaggtggc acttttcggg gaaatgtgcg cggaacccct atttgtttat 6720
ttttctaaat acattcaaat atgtatccgc tcatgagaca ataaccctga taaatgcttc 6780
aataatattg aaaaaggaag agtatgagta ttcaacattt ccgtgtcgcc cttattccct 6840
tttttgcggc attttgcctt cctgtttttg ctcacccaga aacgctggtg aaagtaaaag 6900
atgctgaaga tcagttgggt gcacgagtgg gttacatcga actggatctc aacagcggta 6960
agatccttga gagttttcgc cccgaagaac gttttccaat gatgagcact tttaaagttc 7020
tgctatgtgg cgcggtatta tcccgtattg acgccgggca agagcaactc ggtcgccgca 7080
tacactattc tcagaatgac ttggttgagt actcaccagt cacagaaaag catcttacgg 7140
atggcatgac agtaagagaa ttatgcagtg ctgccataac catgagtgat aacactgcgg 7200
ccaacttact tctgacaacg atcggaggac cgaaggagct aaccgctttt ttgcacaaca 7260
tgggggatca tgtaactcgc cttgatcgtt gggaaccgga gctgaatgaa gccataccaa 7320
acgacgagcg tgacaccacg atgcctgtag caatggcaac aacgttgcgc aaactattaa 7380
ctggcgaact acttactcta gcttcccggc aacaattaat agactggatg gaggcggata 7440
aagttgcagg accacttctg cgctcggccc ttccggctgg ctggtttatt gctgataaat 7500
ctggagccgg tgagcgtgga agccgcggta tcattgcagc actggggcca gatggtaagc 7560
cctcccgtat cgtagttatc tacacgacgg ggagtcaggc aactatggat gaacgaaata 7620
gacagatcgc tgagataggt gcctcactga ttaagcattg gtaactgtca gaccaagttt 7680
actcatatat actttagatt gatttaaaac ttcattttta atttaaaagg atctaggtga 7740
agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg ttccactgag 7800
cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa 7860
tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag 7920
agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata ccaaatactg 7980
ttcttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca ccgcctacat 8040
acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag tcgtgtctta 8100
ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc tgaacggggg 8160
gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc 8220
gtgagctatg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg tatccggtaa 8280
gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac gcctggtatc 8340
tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg tgatgctcgt 8400
caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct 8460
tttgctggcc ttttgctcac atgt 8484
<210> 2
<211> 10476
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 2
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg ggtcttcgag aagacctgtt ttagagctag aaatagcaag ttaaaataag 300
gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgcttttttc tagcgcgtgc 360
gccaattctg cagacaaatg gctctagagg tacccgttac ataacttacg gtaaatggcc 420
cgcctggctg accgcccaac gacccccgcc cattgacgtc aatagtaacg ccaataggga 480
ctttccattg acgtcaatgg gtggagtatt tacggtaaac tgcccacttg gcagtacatc 540
aagtgtatca tatgccaagt acgcccccta ttgacgtcaa tgacggtaaa tggcccgcct 600
ggcattgtgc ccagtacatg accttatggg actttcctac ttggcagtac atctacgtat 660
tagtcatcgc tattaccatg ggggcagagc gcacatcgcc cacagtcccc gagaagttgg 720
ggggaggggt cggcaattga tccggtgcct agagaaggtg gcgcggggta aactgggaaa 780
gtgatgtcgt gtactggctc cgcctttttc ccgagggtgg gggagaaccg tatataagtg 840
cagtagtcgc cgtgaacgtt ctttttcgca acgggtttgc cgccagaaca caggttggac 900
cggtgccacc atggactata aggaccacga cggagactac aaggatcatg atattgatta 960
caaagacgat gacgataaga tggcccccaa aaagaaacga aaggtgggtg ggtccccaaa 1020
gaagaagcgg aaggtcggta tccacggagt cccagcagcc gacaagaagt acagcatcgg 1080
cctggacatc ggcaccaact ctgtgggctg ggccgtgatc accgacgagt acaaggtgcc 1140
cagcaagaaa ttcaaggtgc tgggcaacac cgaccggcac agcatcaaga agaacctgat 1200
cggagccctg ctgttcgaca gcggcgaaac agccgaggcc acccggctga agagaaccgc 1260
cagaagaaga tacaccagac ggaagaaccg gatctgctat ctgcaagaga tcttcagcaa 1320
cgagatggcc aaggtggacg acagcttctt ccacagactg gaagagtcct tcctggtgga 1380
agaggataag aagcacgagc ggcaccccat cttcggcaac atcgtggacg aggtggccta 1440
ccacgagaag taccccacca tctaccacct gagaaagaaa ctggtggaca gcaccgacaa 1500
ggccgacctg cggctgatct atctggccct ggcccacatg atcaagttcc ggggccactt 1560
cctgatcgag ggcgacctga accccgacaa cagcgacgtg gacaagctgt tcatccagct 1620
ggtgcagacc tacaaccagc tgttcgagga aaaccccatc aacgccagcg gcgtggacgc 1680
caaggccatc ctgtctgcca gactgagcaa gagcagacgg ctggaaaatc tgatcgccca 1740
gctgcccggc gagaagaaga atggcctgtt cggaaacctg attgccctga gcctgggcct 1800
gacccccaac ttcaagagca acttcgacct ggccgaggat gccaaactgc agctgagcaa 1860
ggacacctac gacgacgacc tggacaacct gctggcccag atcggcgacc agtacgccga 1920
cctgtttctg gccgccaaga acctgtccga cgccatcctg ctgagcgaca tcctgagagt 1980
gaacaccgag atcaccaagg cccccctgag cgcctctatg atcaagagat acgacgagca 2040
ccaccaggac ctgaccctgc tgaaagctct cgtgcggcag cagctgcctg agaagtacaa 2100
agagattttc ttcgaccaga gcaagaacgg ctacgccggc tacattgacg gcggagccag 2160
ccaggaagag ttctacaagt tcatcaagcc catcctggaa aagatggacg gcaccgagga 2220
actgctcgtg aagctgaaca gagaggacct gctgcggaag cagcggacct tcgacaacgg 2280
cagcatcccc caccagatcc acctgggaga gctgcacgcc attctgcggc ggcaggaaga 2340
tttttaccca ttcctgaagg acaaccggga aaagatcgag aagatcctga ccttccgcat 2400
cccctactac gtgggccctc tggccagggg aaacagcaga ttcgcctgga tgaccagaaa 2460
gagcgaggaa accatcaccc cctggaactt cgaggaagtg gtggacaagg gcgcttccgc 2520
ccagagcttc atcgagcgga tgaccaactt cgataagaac ctgcccaacg agaaggtgct 2580
gcccaagcac agcctgctgt acgagtactt caccgtgtat aacgagctga ccaaagtgaa 2640
atacgtgacc gagggaatga gaaagcccgc cttcctgagc ggcgagcaga aaaaggccat 2700
cgtggacctg ctgttcaaga ccaaccggaa agtgaccgtg aagcagctga aagaggacta 2760
cttcaagaaa atcgagtgct tcgactccgt ggaaatctcc ggcgtggaag atcggttcaa 2820
cgcctccctg ggcacatacc acgatctgct gaaaattatc aaggacaagg acttcctgga 2880
caatgaggaa aacgaggaca ttctggaaga tatcgtgctg accctgacac tgtttgagga 2940
cagagagatg atcgaggaac ggctgaaaac ctatgcccac ctgttcgacg acaaagtgat 3000
gaagcagctg aagcggcgga gatacaccgg ctggggcagg ctgagccgga agctgatcaa 3060
cggcatccgg gacaagcagt ccggcaagac aatcctggat ttcctgaagt ccgacggctt 3120
cgccaacaga aacttcatgc agctgatcca cgacgacagc ctgaccttta aagaggacat 3180
ccagaaagcc caggtgtccg gccagggcga tagcctgcac gagcacattg ccaatctggc 3240
cggcagcccc gccattaaga agggcatcct gcagacagtg aaggtggtgg acgagctcgt 3300
gaaagtgatg ggccggcaca agcccgagaa catcgtgatc gaaatggcca gagagaacca 3360
gaccacccag aagggacaga agaacagccg cgagagaatg aagcggatcg aagagggcat 3420
caaagagctg ggcagccaga tcctgaaaga acaccccgtg gaaaacaccc agctgcagaa 3480
cgagaagctg tacctgtact acctgcagaa tgggcgggat atgtacgtgg accaggaact 3540
ggacatcaac cggctgtccg actacgatgt ggaccatatc gtgcctcaga gctttctgaa 3600
ggacgactcc atcgacaaca aggtgctgac cagaagcgac aagaaccggg gcaagagcga 3660
caacgtgccc tccgaagagg tcgtgaagaa gatgaagaac tactggcggc agctgctgaa 3720
cgccaagctg attacccaga gaaagttcga caatctgacc aaggccgaga gaggcggcct 3780
gagcgaactg gataaggccg gcttcatcaa gagacagctg gtggaaaccc ggcagatcac 3840
aaagcacgtg gcacagatcc tggactcccg gatgaacact aagtacgacg agaatgacaa 3900
gctgatccgg gaagtgaaag tgatcaccct gaagtccaag ctggtgtccg atttccggaa 3960
ggatttccag ttttacaaag tgcgcgagat caacaactac caccacgccc acgacgccta 4020
cctgaacgcc gtcgtgggaa ccgccctgat caaaaagtac cctaagctgg aaagcgagtt 4080
cgtgtacggc gactacaagg tgtacgacgt gcggaagatg atcgccaaga gcgagcagga 4140
aatcggcaag gctaccgcca agtacttctt ctacagcaac atcatgaact ttttcaagac 4200
cgagattacc ctggccaacg gcgagatccg gaagcggcct ctgatcgaga caaacggcga 4260
aaccggggag atcgtgtggg ataagggccg ggattttgcc accgtgcgga aagtgctgag 4320
catgccccaa gtgaatatcg tgaaaaagac cgaggtgcag acaggcggct tcagcaaaga 4380
gtctatcctg cccaagagga acagcgataa gctgatcgcc agaaagaagg actgggaccc 4440
taagaagtac ggcggcttcg acagccccac cgtggcctat tctgtgctgg tggtggccaa 4500
agtggaaaag ggcaagtcca agaaactgaa gagtgtgaaa gagctgctgg ggatcaccat 4560
catggaaaga agcagcttcg agaagaatcc catcgacttt ctggaagcca agggctacaa 4620
agaagtgaaa aaggacctga tcatcaagct gcctaagtac tccctgttcg agctggaaaa 4680
cggccggaag agaatgctgg cctctgccgg cgaactgcag aagggaaacg aactggccct 4740
gccctccaaa tatgtgaact tcctgtacct ggccagccac tatgagaagc tgaagggctc 4800
ccccgaggat aatgagcaga aacagctgtt tgtggaacag cacaagcact acctggacga 4860
gatcatcgag cagatcagcg agttctccaa gagagtgatc ctggccgacg ctaatctgga 4920
caaagtgctg tccgcctaca acaagcaccg ggataagccc atcagagagc aggccgagaa 4980
tatcatccac ctgtttaccc tgaccaatct gggagcccct gccgccttca agtactttga 5040
caccaccatc gaccggaaga ggtacaccag caccaaagag gtgctggacg ccaccctgat 5100
ccaccagagc atcaccggcc tgtacgagac acggatcgac ctgtctcagc tgggaggcga 5160
caaaaggccg gcggccacga aaaaggccgg ccaggcaaaa aagaaaaagg gcggctccaa 5220
gcggcctgcc gcgacgaaga aagcgggaca ggccaagaaa aagaaaggat ccggcgcaac 5280
aaacttctct ctgctgaaac aagccggaga tgtcgaagag aatcctggac cggtgagcaa 5340
gggcgaggag ctgttcaccg gggtggtgcc catcctggtc gagctggacg gcgacgtaaa 5400
cggccacaag ttcagcgtgt ccggcgaggg cgagggcgat gccacctacg gcaagctgac 5460
cctgaagttc atctgcacca ccggcaagct gcccgtgccc tggcccaccc tcgtgaccac 5520
cctgacctac ggcgtgcagt gcttcagccg ctaccccgac cacatgaagc agcacgactt 5580
cttcaagtcc gccatgcccg aaggctacgt ccaggagcgc accatcttct tcaaggacga 5640
cggcaactac aagacccgcg ccgaggtgaa gttcgagggc gacaccctgg tgaaccgcat 5700
cgagctgaag ggcatcgact tcaaggagga cggcaacatc ctggggcaca agctggagta 5760
caactacaac agccacaacg tctatatcat ggccgacaag cagaagaacg gcatcaaggt 5820
gaacttcaag atccgccaca acatcgagga cggcagcgtg cagctcgccg accactacca 5880
gcagaacacc cccatcggcg acggccccgt gctgctgccc gacaaccact acctgagcac 5940
ccagtccgcc ctgagcaaag accccaacga gaagcgcgat cacatggtcc tgctggagtt 6000
cgtgaccgcc gccgggatca ctctcggcat ggacgagctg tacaagggct ccggcgaggg 6060
caggggaagt cttctaacat gcggggacgt ggaggaaaat cccggcccaa ccgagtacaa 6120
gcccacggtg cgcctcgcca cccgcgacga cgtccccagg gccgtacgca ccctcgccgc 6180
cgcgttcgcc gactaccccg ccacgcgcca caccgtcgat ccggaccgcc acatcgagcg 6240
ggtcaccgag ctgcaagaac tcttcctcac gcgcgtcggg ctcgacatcg gcaaggtgtg 6300
ggtcgcggac gacggcgccg cggtggcggt ctggaccacg ccggagagcg tcgaagcggg 6360
ggcggtgttc gccgagatcg gcccgcgcat ggccgagttg agcggttccc ggctggccgc 6420
gcagcaacag atggaaggcc tcctggcgcc gcaccggccc aaggagcccg cgtggttcct 6480
ggccaccgtc ggagtctcgc ccgaccacca gggcaagggt ctgggcagcg ccgtcgtgct 6540
ccccggagtg gaggcggccg agcgcgccgg ggtgcccgcc ttcctggaga cctccgcgcc 6600
ccgcaacctc cccttctacg agcggctcgg cttcaccgtc accgccgacg tcgaggtgcc 6660
cgaaggaccg cgcacctggt gcatgacccg caagcccggt gcctgaacgc gttaagtcga 6720
caatcaacct ctggattaca aaatttgtga aagattgact ggtattctta actatgttgc 6780
tccttttacg ctatgtggat acgctgcttt aatgcctttg tatcatgcta ttgcttcccg 6840
tatggctttc attttctcct ccttgtataa atcctggttg ctgtctcttt atgaggagtt 6900
gtggcccgtt gtcaggcaac gtggcgtggt gtgcactgtg tttgctgacg caacccccac 6960
tggttggggc attgccacca cctgtcagct cctttccggg actttcgctt tccccctccc 7020
tattgccacg gcggaactca tcgccgcctg ccttgcccgc tgctggacag gggctcggct 7080
gttgggcact gacaattccg tggtgttgtc ggggaaatca tcgtcctttc cttggctgct 7140
cgcctgtgtt gccacctgga ttctgcgcgg gacgtccttc tgctacgtcc cttcggccct 7200
caatccagcg gaccttcctt cccgcggcct gctgccggct ctgcggcctc ttccgcgtct 7260
tcgccttcgc cctcagacga gtcggatctc cctttgggcc gcctccccgc gtcgacttta 7320
agaccaatga cttacaaggc agctgtagat cttagccact ttttaaaaga aaagggggga 7380
ctggaagggc taattcactc ccaacgaaga caagatctgc tttttgcttg tactgggtct 7440
ctctggttag accagatctg agcctgggag ctctctggct aactagggaa cccactgctt 7500
aagcctcaat aaagcttgcc ttgagtgctt caagtagtgt gtgcccgtct gttgtgtgac 7560
tctggtaact agagatccct cagacccttt tagtcagtgt ggaaaatctc tagcagggcc 7620
cgtttaaacc cgctgatcag cctcgactgt gccttctagt tgccagccat ctgttgtttg 7680
cccctccccc gtgccttcct tgaccctgga aggtgccact cccactgtcc tttcctaata 7740
aaatgaggaa attgcatcgc attgtctgag taggtgtcat tctattctgg ggggtggggt 7800
ggggcaggac agcaaggggg aggattggga agacaatagc aggcatgctg gggatgcggt 7860
gggctctatg gcctgcaggg gcgcctgatg cggtattttc tccttacgca tctgtgcggt 7920
atttcacacc gcatacgtca aagcaaccat agtacgcgcc ctgtagcggc gcattaagcg 7980
cggcgggtgt ggtggttacg cgcagcgtga ccgctacact tgccagcgcc ttagcgcccg 8040
ctcctttcgc tttcttccct tcctttctcg ccacgttcgc cggctttccc cgtcaagctc 8100
taaatcgggg gctcccttta gggttccgat ttagtgcttt acggcacctc gaccccaaaa 8160
aacttgattt gggtgatggt tcacgtagtg ggccatcgcc ctgatagacg gtttttcgcc 8220
ctttgacgtt ggagtccacg ttctttaata gtggactctt gttccaaact ggaacaacac 8280
tcaactctat ctcgggctat tcttttgatt tataagggat tttgccgatt tcggtctatt 8340
ggttaaaaaa tgagctgatt taacaaaaat ttaacgcgaa ttttaacaaa atattaacgt 8400
ttacaatttt atggtgcact ctcagtacaa tctgctctga tgccgcatag ttaagccagc 8460
cccgacaccc gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg 8520
cttacagaca agctgtgacc gtctccggga gctgcatgtg tcagaggttt tcaccgtcat 8580
caccgaaacg cgcgagacga aagggcctcg tgatacgcct atttttatag gttaatgtca 8640
tgataataat ggtttcttag acgtcaggtg gcacttttcg gggaaatgtg cgcggaaccc 8700
ctatttgttt atttttctaa atacattcaa atatgtatcc gctcatgaga caataaccct 8760
gataaatgct tcaataatat tgaaaaagga agagtatgag tattcaacat ttccgtgtcg 8820
cccttattcc cttttttgcg gcattttgcc ttcctgtttt tgctcaccca gaaacgctgg 8880
tgaaagtaaa agatgctgaa gatcagttgg gtgcacgagt gggttacatc gaactggatc 8940
tcaacagcgg taagatcctt gagagttttc gccccgaaga acgttttcca atgatgagca 9000
cttttaaagt tctgctatgt ggcgcggtat tatcccgtat tgacgccggg caagagcaac 9060
tcggtcgccg catacactat tctcagaatg acttggttga gtactcacca gtcacagaaa 9120
agcatcttac ggatggcatg acagtaagag aattatgcag tgctgccata accatgagtg 9180
ataacactgc ggccaactta cttctgacaa cgatcggagg accgaaggag ctaaccgctt 9240
ttttgcacaa catgggggat catgtaactc gccttgatcg ttgggaaccg gagctgaatg 9300
aagccatacc aaacgacgag cgtgacacca cgatgcctgt agcaatggca acaacgttgc 9360
gcaaactatt aactggcgaa ctacttactc tagcttcccg gcaacaatta atagactgga 9420
tggaggcgga taaagttgca ggaccacttc tgcgctcggc ccttccggct ggctggttta 9480
ttgctgataa atctggagcc ggtgagcgtg gaagccgcgg tatcattgca gcactggggc 9540
cagatggtaa gccctcccgt atcgtagtta tctacacgac ggggagtcag gcaactatgg 9600
atgaacgaaa tagacagatc gctgagatag gtgcctcact gattaagcat tggtaactgt 9660
cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt taatttaaaa 9720
ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa cgtgagtttt 9780
cgttccactg agcgtcagac cccgtagaaa agatcaaagg atcttcttga gatccttttt 9840
ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc gctaccagcg gtggtttgtt 9900
tgccggatca agagctacca actctttttc cgaaggtaac tggcttcagc agagcgcaga 9960
taccaaatac tgttcttcta gtgtagccgt agttaggcca ccacttcaag aactctgtag 10020
caccgcctac atacctcgct ctgctaatcc tgttaccagt ggctgctgcc agtggcgata 10080
agtcgtgtct taccgggttg gactcaagac gatagttacc ggataaggcg cagcggtcgg 10140
gctgaacggg gggttcgtgc acacagccca gcttggagcg aacgacctac accgaactga 10200
gatacctaca gcgtgagcta tgagaaagcg ccacgcttcc cgaagggaga aaggcggaca 10260
ggtatccggt aagcggcagg gtcggaacag gagagcgcac gagggagctt ccagggggaa 10320
acgcctggta tctttatagt cctgtcgggt ttcgccacct ctgacttgag cgtcgatttt 10380
tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc cagcaacgcg gcctttttac 10440
ggttcctggc cttttgctgg ccttttgctc acatgt 10476
<210> 3
<211> 3120
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 3
gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt 60
cttagacgtc aggtggcact tttcggggaa atgtgcgcgg aacccctatt tgtttatttt 120
tctaaataca ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat 180
aatattgaaa aaggaagagt atgagtattc aacatttccg tgtcgccctt attccctttt 240
ttgcggcatt ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg 300
ctgaagatca gttgggtgca cgagtgggtt acatcgaact ggatctcaac agcggtaaga 360
tccttgagag ttttcgcccc gaagaacgtt ttccaatgat gagcactttt aaagttctgc 420
tatgtggcgc ggtattatcc cgtattgacg ccgggcaaga gcaactcggt cgccgcatac 480
actattctca gaatgacttg gttgagtact caccagtcac agaaaagcat cttacggatg 540
gcatgacagt aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca 600
acttacttct gacaacgatc ggaggaccga aggagctaac cgcttttttg cacaacatgg 660
gggatcatgt aactcgcctt gatcgttggg aaccggagct gaatgaagcc ataccaaacg 720
acgagcgtga caccacgatg cctgtagcaa tggcaacaac gttgcgcaaa ctattaactg 780
gcgaactact tactctagct tcccggcaac aattaataga ctggatggag gcggataaag 840
ttgcaggacc acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg 900
gagccggtga gcgtgggtct cgcggtatca ttgcagcact ggggccagat ggtaagccct 960
cccgtatcgt agttatctac acgacgggga gtcaggcaac tatggatgaa cgaaatagac 1020
agatcgctga gataggtgcc tcactgatta agcattggta actgtcagac caagtttact 1080
catatatact ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga 1140
tcctttttga taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt 1200
cagaccccgt agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct 1260
gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc 1320
taccaactct ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgttc 1380
ttctagtgta gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc 1440
tcgctctgct aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg 1500
ggttggactc aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt 1560
cgtgcacaca gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg 1620
agctatgaga aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg 1680
gcagggtcgg aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt 1740
atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag 1800
gggggcggag cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt 1860
gctggccttt tgctcacatg ttctttcctg cgttatcccc tgattctgtg gataaccgta 1920
ttaccgcctt tgagtgagct gataccgctc gccgcagccg aacgaccgag cgcagcgagt 1980
cagtgagcga ggaagcggaa gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc 2040
cgattcatta atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca 2100
acgcaattaa tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc 2160
cggctcgtat gttgtgtgga attgtgagcg gataacaatt tcacacagga aacagctatg 2220
accatgatta cgccaagctt gcatgcaggc ctctgcagtc gacgggcccg ggatccgatg 2280
ataaacatgt gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc 2340
tgttagagag ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac 2400
gtgacgtaga aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat 2460
ggactatcat atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt 2520
gtggaaagga cgaaacaccg ggtcttcgag aagacctgtt ttagagctag aaatagcaag 2580
ttaaaataag gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgcttttttc 2640
tagcgcgtgc gccaattctg cagacaaatg gctctagagg tacccataga tctagatgca 2700
ttcgcgaggt accgagctcg aattcactgg ccgtcgtttt acaacgtcgt gactgggaaa 2760
accctggcgt tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta 2820
atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat 2880
ggcgcctgat gcggtatttt ctccttacgc atctgtgcgg tatttcacac cgcatatggt 2940
gcactctcag tacaatctgc tctgatgccg catagttaag ccagccccga cacccgccaa 3000
cacccgctga cgcgccctga cgggcttgtc tgctcccggc atccgcttac agacaagctg 3060
tgaccgtctc cgggagctgc atgtgtcaga ggttttcacc gtcatcaccg aaacgcgcga 3120
<210> 4
<211> 14138
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 4
ggcgcgccct ctacctgctc tcggacccgt gggggtgggg ggtggaggaa ggagtggggg 60
gtcggtcctg ctggcttgtg ggtgggaggc gcatgttctc caaaaacccg cgcgagctgc 120
aatcctgagg gagctgcagt ggaggaggcg gagagaaggc cgcacccttc tccgcagggg 180
gaggggagtg ccgcaatacc tttatgggag ttctctgctg cctccttttc ctaaggaccg 240
ccctgggcct agaaaaatcc ctccctcccc cgcgatctcg tcatcgcctc catgtcagtt 300
tgctccttct cgattatggg cgggattctt ttgccctggc gcgccccaga cccgggcctg 360
gggggcaagt cggggggcgg ggggaggtcg ggcagggtcc cctgggagga tggggacgtg 420
ctgtgcccct agcggccacc agagggcacc aggacaccac tgcggtcggc tcagcggctc 480
ctgccctggt cagggggcgc caggtcctgc ccctcctggg gagggcgggg ggcgagaagg 540
gcgattttaa ttaacccacg tttcaacatg cacatcccag taatttggaa acattttgtt 600
tccaaagatt cacttaacat tggtttagca acatgaagct ttctatgcaa cccaaggact 660
cagtttttgg cctgttttag tgacaggcaa tcagcaacat gctgcatttc tctccagtgt 720
tgtaatcaaa gaaaccctcc catagcttta aatgatattc cttccccttc caattatgtg 780
gggggaaaac aaccctattc tccacccaga agtgttaact caagaattac attttcaaga 840
agtttccaga ttcgtaaaac cagaattaga tgtctttcac ctaaatgtct cggtgttgac 900
caaaggaaca cacaggtttc tcatttaact tttttaatgg gtctcaaaat tctgtgacaa 960
atttttggtc aagttgtttc cattaaaaag tactgatttt aaaaactaat aacttaaaac 1020
tgccacacgc aaaaaagaaa accaaagtgg tccacaaaac attctccttt ccttctgaag 1080
gttttacgat gcattgttat cattaaccag tcttttacta ctaaacttaa atggccaatt 1140
gaaacaaaca gttctgagac cgttcttcca ccactgatta agagtggggt ggcaggtatt 1200
agggataatg ctagcttact tgtacagctc gtccatgccg agagtgatcc cggcggcggt 1260
cacgaactcc agcaggacca tgtgatcgcg cttctcgttg gggtctttgc tcagggcgga 1320
ctgggtgctc aggtagtggt tgtcgggcag cagcacgggg ccgtcgccga tgggggtgtt 1380
ctgctggtag tggtcggcga gctgcacgct gccgtcctcg atgttgtggc ggatcttgaa 1440
gttcaccttg atgccgttct tctgcttgtc ggccatgata tagacgttgt ggctgttgta 1500
gttgtactcc agcttgtgcc ccaggatgtt gccgtcctcc ttgaagtcga tgcccttcag 1560
ctcgatgcgg ttcaccaggg tgtcgccctc gaacttcacc tcggcgcggg tcttgtagtt 1620
gccgtcgtcc ttgaagaaga tggtgcgctc ctggacgtag ccttcgggca tggcggactt 1680
gaagaagtcg tgctgcttca tgtggtcggg gtagcggctg aagcactgca cgccgtaggt 1740
cagggtggtc acgagggtgg gccagggcac gggcagcttg ccggtggtgc agatgaactt 1800
cagggtcagc ttgccgtagg tggcatcgcc ctcgccctcg ccggacacgc tgaacttgtg 1860
gccgtttacg tcgccgtcca gctcgaccag gatgggcacc accccggtga acagctcctc 1920
gcccttgctc accatggtgg cgtcgaccgt acgtcacgac acctgaaatg gaagaaaaaa 1980
actttgaacc actgtctgag gcttgagaat gaaccaagat ccaaactcaa aaagggcaaa 2040
ttccaaggag aattacatca agtgccaagc tggcctaact tcagtctcca cccactcagt 2100
gtggggaaac tccatcgcat aaaacccctc cccccaacct aaagacgacg tactccaaaa 2160
gctcgagaac taatcgaggt gcctggacgg cgcccggtac tccgtggagt cacatgaagc 2220
gacggctgag gacggaaagg cccttttcct ttgtgtgggt gactcacccg cccgctctcc 2280
cgagcgccgc gtcctccatt ttgagctccc tgcagcaggg ccgggaagcg gccatctttc 2340
cgctcacgca actggtgccg accgggccag ccttgccgcc cagggcgggg cgatacacgg 2400
cggcgcgagg ccaggcacca gagcaggccg gccagcttga gactaccccc gtccgattct 2460
cggtggccgc gctcgcaggc cccgcctcgc cgaacatgtg cgctgggacg cacgggcccc 2520
gtcgccgccc gcggccccaa aaaccgaaat accagtgtgc agatcttggc ccgcatttac 2580
aagactatct tgccagaaaa aaagcgtcgc agcaggtcat caaaaatttt aaatggctag 2640
agacttatcg aaagcagcga gacaggcgcg aaggtgccac cagattcgca cgcggcggcc 2700
ccagcgccca ggccaggcct caactcaagc acgaggcgaa ggggctcctt aagcgcaagg 2760
cctcgaactc tcccacccac ttccaacccg aagctcggga tcaagaatca cgtactgcag 2820
ccagtggaag taattcaagg cacgcaaggg ccataacccg taaagaggcc aggcccgcgg 2880
gaaccacaca cggcacttac ctgtgttctg gcggcaaacc cgttgcgaaa aagaacgttc 2940
acggcgacta ctgcacttat atacggttct cccccaccct cgggaaaaag gcggagccag 3000
tacacgacat cactttccca gtttaccccg cgccaccttc tctaggcacc ggttcaattg 3060
ccgacccctc cccccaactt ctcggggact gtgggcgatg tgcgctctgc ccactgacgg 3120
gcaccggagc cctagattcg attccctttg gggcaaaact caccgcctaa tcccctataa 3180
ctctaccggg gagcccggtg gagagcagac gggctgacgc tgccacctgc cggccatccc 3240
aggataggac cgccgtattc aagtcgccct caggaaggac cctcggggca ccagaggcct 3300
tcgaagcccc aatgagtgag gcaactgagg gtcgcgggtg ccattacaag gcccagccaa 3360
ggcctagagc caaggcttga accgtggggg acccccaagc cccacctgcc caggaacagc 3420
agacactggg acactttgtt tcaggtcctg cccaggcccc tcccactgtg aggctgggat 3480
ttgtcgccca gggtgcagat gagaagagtg gggaaagcag tcctgagcca ggaaattcta 3540
ccgggtaggg gaggcgcttt tcccaaggca gtctggagca tgcgctttag cagccccgct 3600
gggcacttgg cgctacacaa gtggcctctg gcctcgcaca cattccacat ccaccggtag 3660
gcgccaaccg gctccgttct ttggtggccc cttcgcgcca ccttctactc ctcccctagt 3720
caggaagttc ccccccgccc cgcagctcgc gtcgtgcagg acgtgacaaa tggaagtagc 3780
acgtctcact agtctcgtgc agatggacag caccgctgag caatggaagc gggtaggcct 3840
ttggggcagc ggccaatagc agctttgctc cttcgctttc tgggctcaga ggctgggaag 3900
gggtgggtcc gggggcgggc tcaggggcgg gctcaggggc ggggcgggcg cccgaaggtc 3960
ctccggaggc ccggcattct gcacgcttca aaagcgcacg tctgccgcgc tgttctcctc 4020
ttcctcatct ccgggccttt cgacctccta gggccaccat ggtgagcaag ggcgaggacg 4080
acaacatggc catcatcaag gagttcatgc gcttcaaggt gcacatggag ggctccgtga 4140
acggccacga gttcgagatc gagggcgagg gcgagggccg cccctacgag ggcacccaga 4200
ccgccaagct gaaggtgacc aagggcggcc ccctgccctt cgcctgggac atcctgtccc 4260
ctcagttcat gtacggctcc aaggcctacg tgaagcaccc cgccgacatc cccgactact 4320
tgaagctgtc cttccccgag ggcttcaagt gggagcgcgt gatgaacttc gaggacggcg 4380
gcgtggtgac cgtgacccag gactcctccc tgcaggacgg cgagttcatc tacaaggtga 4440
agctgcgcgg caccaacttc ccctccgacg gccccgtaat gcagaagaag accatgggct 4500
gggaggcctc ctccgagcgg atgtaccccg aggacggcgc cctgaagggc gagatcaagc 4560
agaggctgaa gctgaaggac ggcggccact acgacgccga ggtcaagacc acctacaagg 4620
ccaagaagcc cgtgcagctg cccggcgcct acaacgtcaa catcaagctg gacatcacct 4680
cccacaacga ggactacacc atcgtggaac agtacgagcg cgccgagggc cgccactcca 4740
ccggcggcat ggacgagctg tacaagtgag gatccgctga tcagcctcga ctgtgccttc 4800
tagttgccag ccatctgttg tttgcccctc ccccgtgcct tccttgaccc tggaaggtgc 4860
cactcccact gtcctttcct aataaaatga ggaaattgca tcgcattgtc tgagtaggtg 4920
tcattctatt ctggggggtg gggtggggca ggacagcaag ggggaggatt gggaagacaa 4980
tagcaggcat gctggggatg cggtgggctc tatggcttct gaggcggaaa gaacccttct 5040
gaggcggaaa gaaccagctg ccttaatata acttcgtata atgtatgcta tacgaagtta 5100
ttaggtctga agaggagttt acgtccagcc aattctgtgg aatgtgtgtc agttagggtg 5160
tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa agcatgcatc tcaattagtc 5220
agcaaccagg tgtggaaagt ccccaggctc cccagcaggc agaagtatgc aaagcatgca 5280
tctcaattag tcagcaacca tagtcccgcc cctaactccg cccatcccgc ccctaactcc 5340
gcccagttcc gcccattctc cgccccatgg ctgactaatt ttttttattt atgcagaggc 5400
cgaggccgcc tctgcctctg agctattcca gaagtagtga ggaggctttt ttggaggcct 5460
aggcttttgc aaaaagctcc cgggagcttg tatatccatt ttcggcggcc gcgccaccat 5520
gaccgagtac aagcccacgg tgcgcctcgc cacccgcgac gacgtcccca gggccgtacg 5580
caccctcgcc gccgcgttcg ccgactaccc cgccacgcgc cacaccgtcg atccggaccg 5640
ccacatcgag cgggtcaccg agctgcaaga actcttcctc acgcgcgtcg ggctcgacat 5700
cggcaaggtg tgggtcgcgg acgacggcgc cgcggtggcg gtctggacca cgccggagag 5760
cgtcgaagcg ggggcggtgt tcgccgagat cggcccgcgc atggccgagt tgagcggttc 5820
ccggctggcc gcgcagcaac agatggaagg cctcctggcg ccgcaccggc ccaaggagcc 5880
cgcgtggttc ctggccaccg tcggagtctc gcccgaccac cagggcaagg gtctgggcag 5940
cgccgtcgtg ctccccggag tggaggcggc cgagcgcgcc ggggtgcccg ccttcctgga 6000
gacctccgcg ccccgcaacc tccccttcta cgagcggctc ggcttcaccg tcaccgccga 6060
cgtcgaggtg cccgaaggac cgcgcacctg gtgcatgacc cgcaagcccg gtgcctgaga 6120
attcgcggga ctctggggtt cgaaatgacc gaccaagcga cgcccaacct gccatcacga 6180
gatttcgatt ccaccgccgc cttctatgaa aggttgggct tcggaatcgt tttccgggac 6240
gccggctgga tgatcctcca gcgcggggat ctcatgctgg agttcttcgc ccaccccaac 6300
ttgtttattg cagcttataa tggttacaaa taaagcaata gcatcacaaa tttcacaaat 6360
aaagcatttt tttcactgca ttctagttgt ggtttgtcca aactcatcaa tgtatcttat 6420
catgtctgta taccgctcga ctagagcttg cggaaccctt aatataactt cgtataatgt 6480
atgctatacg aagttattag gtccgctggc catctacgag ccaaagactt tcaaatcttt 6540
ggctgccttg gccagtagga ggcgacacga aggatttgct gctgccttgg gggatgggaa 6600
ggaacctgaa ggcatttttt ccagagtggt gcagtaccac tgaggactgt tgctgtattg 6660
attaggaaaa gagacagagt aatttgcagt ttgtttgatt tatactgggc tgcaggtcga 6720
gggatcttca taagagaaga gggacagcta tgactgggag tagtcaggag aggaggaaaa 6780
atctggctag taaaacatgt aaggaaaatt ttagggatgt taaagaaaaa aataacacaa 6840
aacaaaatat aaaaaaaatc taacctcaag tcaaggcttt tctatggaat aaggaatgga 6900
cagcaggggg ctgtttcata tactgatgac ctctttatag ccacctttgt tcatggcagc 6960
cagcatatgg catatgttgc caaactctaa accaaatact cattctgatg ttttaaatga 7020
tttgccctcc catatgtcct tccgagtgag agacacaaaa aattccaaca cactattgca 7080
atgaaaataa atttccttta ttagccagaa gtcagatgct caaggggctt catgatgtcc 7140
ccataatttt tggcagaggg aaaaagatct cagtggtatt tgtgagccag ggcattggcc 7200
acaccagcca ccaccttctg ataggcagcc tgcggtacct tacatggtgg cgaattcgtt 7260
tgccaaaatg atgagacagc acaataacca gcacgttgcc caggagctgt aggaaaaaga 7320
agaaggcatg aacatggtta gcagaggctc tagagccgcc ggtcacacgc cagaagccga 7380
accccgccct gccccgtccc ccccgaaggc agccgtcccc ctgcggcagc cccgaggctg 7440
gagatggaga aggggacggc ggcgcggcga cgcacgaagg ccctccccgc ccatttcctt 7500
cctgccggcg ccgcaccgct tcgcccgcgc ccgctagagg gggtgcggcg gcgcctccca 7560
gatttcggct ccgccagatt tgggacaaag gaagtccctg cgccctctcg cacgattacc 7620
ataaaaggca atggctgcgg ctcgccgcgc ctcgacagcc gccggcgctc cggggccgcc 7680
gcgcccctcc cccgagccct ccccggcccg aggcggcccc gccccgcccg gcacccccac 7740
ctgccgccac cccccgcccg gcacggcgag ccccgcgcca cgccccgcac ggagccccgc 7800
acccgaagcc gggccgtgct cagcaactcg gggagggggg tgcagggggg ggttacagcc 7860
cgaccgccgc gcccacaccc cctgctcacc cccccacgca cacaccccgc acgcagcctt 7920
tgttcccctc gcagcccccc cgcaccgcgg ggcaccgccc ccggccgcgc tcccctcgcg 7980
cacacgcgga gcgcacaaag ccccgcgccg cgcccgcagc gctcacagcc gccgggcagc 8040
gcgggccgca cgcggcgctc cccacgcaca cacacacgca cgcacccccc gagccgctcc 8100
cccccgcaca aagggccctc ccggagccct ttaaggcttt cacgcagcca cagaaaagaa 8160
acgagccgtc attaaaccaa gcgctaatta cagcccggag gagaagggcc gtcccgcccg 8220
ctcacctgtg ggagtaacgc ggtcagtcag agccggggcg ggcggcgcga ggcggcgcgg 8280
agcggggcac ggggcgaagg caacgcagcg actcccgccc gccgcgcgct tcgcttttta 8340
tagggccgcc gccgccgccg cctcgccata aaaggaaact ttcggagcgc gccgctctga 8400
ttggctgccg ccgcacctct ccgcctcgcc ccgccccgcc cctcgccccg ccccgccccg 8460
cctggcgcgc gccccccccc cccccgcccc catcgctgca caaaataatt aaaaaataaa 8520
taaatacaaa attgggggtg gggagggggg ggagatgggg agagtgaagc agaacgtggg 8580
gctcacctcg acccatggta atagcgatga ctaatacgta gatgtactgc caagtaggaa 8640
agtcccataa ggtcatgtac tgggcataat gccaggcggg ccatttaccg tcattgacgt 8700
caataggggg cgtacttggc atatgataca cttgatgtac tgccaagtgg gcagtttacc 8760
gtaaatagtc cacccattga cgtcaatgga aagtccctat tggcgttact atgggaacat 8820
acgtcattat tgacgtcaat gggcgggggt cgttgggcgg tcagccaggc gggccattta 8880
ccgtaagtta tgtaacgcgg aactccatat atgggctatg aactaatgac cccgtaattg 8940
attactatta ataactagtc aataatcaat gtcgtaaatg tcgtaaatgt ctcagctagt 9000
caggtagtaa aaggtgtcaa ctaggcagtg gcagagcagg attcaaattc agggctgttg 9060
tgatgcctcc gcagactctg agcgccacct ggtggtaatt tgtctgtgcc tcttctgacg 9120
tggaagaaca gcaactaaca cactaacacg gcatttacta tgggccagcc attgtacgcg 9180
ttgcttaacc tgattcttgg gcgttgtcct gcaggggatt gagcaggtgt acgaggacga 9240
gcccaatttc tctatattcc cacagtcttg agtttgtgtc acaaaataat tatagtgggg 9300
tggagatggg aaatgagtcc aggcaacacc taagcctgat tttatgcatt gagactgcgt 9360
gttattacta aagatctttg tgtcgcaatt tcctgatgaa gggagatagg ttaaaaagca 9420
cggatctact gagttttaca gtcatcccat ttgtagactt ttgctacacc accaaagtat 9480
agcatctgag attaaatatt aatctccaaa ccttaggccc cctcacttgc atccttacgg 9540
tcagataact ctcactcata ctttaagccc attttgtttg ttgtacttgc tcatccagtc 9600
ccagacatag cattggcttt ctcctcacct gttttaggta gccagcaagt catgaaatca 9660
gataagttcc accaccaatt aacactaccc atcttgagca taggcccaac agtgcattta 9720
ttcctcattt actgatgttc gtgaatattt accttgattt tcattttttt ctttttctta 9780
agctgggatt ttactcctga ccctattcac agtcagatga tcttgactac cactgcgatt 9840
ggacctgagg ttcagcaata ctccccttta tgtcttttga atacttttca ataaatctgt 9900
ttgtattttc attagttagt aactgagctc agttgccgta atgctaatag cttccaaact 9960
agtgtctctg tctccagtat ctgataaatc ttaggtgttg ctgggacagt tgtcctaaaa 10020
ttaagataaa gcatgaaaat aactgacaca actccattac tggctcctaa ctacttaaac 10080
aatgcattct atcatcacaa atgtgaaaaa ggagttccct cagtggacta accttatctt 10140
ttctcaacac ctttttcttt gcacaatttt ccacacatgc ctacaaaaag tacttatgcg 10200
gccgccataa aagttttgtt actttataga agaaattttg agtttttgtt ttttttaata 10260
aataaataaa cataaataaa ttgtttgttg aatttattat tagtatgtaa gtgtaaatat 10320
aataaaactt aatatctatt caaattaata aataaacctc gatatacaga ccgataaaac 10380
acatgcgtca attttacaca tgattatctt taacgtacgt cacaatatga ttatctttct 10440
agggttaatc tagctgcgtg ttctgcagcg tgtcgagcat cttcatctgc tccatcacgc 10500
tgtaaaacac atttgcaccg cgagtctgcc cgtcctccac gggttcaaaa acgtgaatga 10560
acgaggcgcg ctcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta 10620
cccaacttaa tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg 10680
cccgcaccga tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg gacgcgccct 10740
gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc gctacacttg 10800
ccagcgccct agcgcccgct cctttcgctt tcttcccttc ctttctcgcc acgttcgccg 10860
gctttccccg tcaagctcta aatcgggggc tccctttagg gttccgattt agtgctttac 10920
ggcacctcga ccccaaaaaa cttgattagg gtgatggttc acgtagtggg ccatcgccct 10980
gatagacggt ttttcgccct ttgacgttgg agtccacgtt ctttaatagt ggactcttgt 11040
tccaaactgg aacaacactc aaccctatct cggtctattc ttttgattta taagggattt 11100
tgccgatttc ggcctattgg ttaaaaaatg agctgattta acaaaaattt aacgcgaatt 11160
ttaacaaaat attaacgctt acaatttagg tggcactttt cggggaaatg tgcgcggaac 11220
ccctatttgt ttatttttct aaatacattc aaatatgtat ccgctcatga gacaataacc 11280
ctgataaatg cttcaataat attgaaaaag gaagagtatg agtattcaac atttccgtgt 11340
cgcccttatt cccttttttg cggcattttg ccttcctgtt tttgctcacc cagaaacgct 11400
ggtgaaagta aaagatgctg aagatcagtt gggtgcacga gtgggttaca tcgaactgga 11460
tctcaacagc ggtaagatcc ttgagagttt tcgccccgaa gaacgttttc caatgatgag 11520
cacttttaaa gttctgctat gtggcgcggt attatcccgt attgacgccg ggcaagagca 11580
actcggtcgc cgcatacact attctcagaa tgacttggtt gagtactcac cagtcacaga 11640
aaagcatctt acggatggca tgacagtaag agaattatgc agtgctgcca taaccatgag 11700
tgataacact gcggccaact tacttctgac aacgatcgga ggaccgaagg agctaaccgc 11760
ttttttgcac aacatggggg atcatgtaac tcgccttgat cgttgggaac cggagctgaa 11820
tgaagccata ccaaacgacg agcgtgacac cacgatgcct gtagcaatgg caacaacgtt 11880
gcgcaaacta ttaactggcg aactacttac tctagcttcc cggcaacaat taatagactg 11940
gatggaggcg gataaagttg caggaccact tctgcgctcg gcccttccgg ctggctggtt 12000
tattgctgat aaatctggag ccggtgagcg tggttcacgc ggtatcattg cagcactggg 12060
gccagatggt aagccctccc gtatcgtagt tatctacacg acggggagtc aggcaactat 12120
ggatgaacga aatagacaga tcgctgagat aggtgcctca ctgattaagc attggtaact 12180
gtcagaccaa gtttactcat atatacttta gattgattta aaacttcatt tttaatttaa 12240
aaggatctag gtgaagatcc tttttgataa tctcatgacc aaaatccctt aacgtgagtt 12300
ttcgttccac tgagcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt 12360
ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg 12420
tttgccggat caagagctac caactctttt tccgaaggta actggcttca gcagagcgca 12480
gataccaaat actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt 12540
agcaccgcct acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga 12600
taagtcgtgt cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc 12660
gggctgaacg gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact 12720
gagataccta cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga 12780
caggtatccg gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg 12840
aaacgcctgg tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt 12900
tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt 12960
acggttcctg gccttttgct ggccttttgc tcacatgttc tttcctgcgt tatcccctga 13020
ttctgtggat aaccgtatta ccgcctttga gtgagctgat accgctcgcc gcagccgaac 13080
gaccgagcgc agcgagtcag tgagcgagga agcggaagag cgcccaatac gcaaaccgcc 13140
tctccccgcg cgttggccga ttcattaatg cagctggcac gacaggtttc ccgactggaa 13200
agcgggcagt gagcgcaacg caattaatgt gagttagctc actcattagg caccccaggc 13260
tttacacttt atgcttccgg ctcgtatgtt gtgtggaatt gtgagcggat aacaatttca 13320
cacaggaaac agctatgacc atgattacgc caagcgcgcc cgccgggtaa ctcacggggt 13380
atccatgtcc atttctgcgg catccagcca ggatacccgt cctcgctgac gtaatatccc 13440
agcgccgcac cgctgtcatt aatctgcaca ccggcacggc agttccggct gtcgccggta 13500
ttgttcgggt tgctgatgcg cttcgggctg accatccgga actgtgtccg gaaaagccgc 13560
gacgaactgg tatcccaggt ggcctgaacg aacagttcac cgttaaaggc gtgcatggcc 13620
acaccttccc gaatcatcat ggtaaacgtg cgttttcgct caacgtcaat gcagcagcag 13680
tcatcctcgg caaactcttt ccatgccgct tcaacctcgc gggaaaaggc acgggcttct 13740
tcctccccga tgcccagata gcgccagctt gggcgatgac tgagccggaa aaaagacccg 13800
acgatatgat cctgatgcag ctagattaac cctagaaaga tagtctgcgt aaaattgacg 13860
catgcattct tgaaatattg ctctctcttt ctaaatagcg cgaatccgtc gctgtgcatt 13920
taggacatct cagtcgccgc ttggagctcc cgtgaggcgt gcttgtcaat gcggtaagtg 13980
tcactgattt tgaactataa cgaccgcgtg agtcaaaatg acgcatgatt atcttttacg 14040
tgacttttaa gatttaactc atacgataat tatattgtta tttcatgttc tacttacgtg 14100
ataacttatt atatatatat tttcttgtta tagatatc 14138
<210> 5
<211> 345
<212> DNA
<213> Artificial Sequence
<400> 5
ggcgcgccct ctacctgctc tcggacccgt gggggtgggg ggtggaggaa ggagtggggg 60
gtcggtcctg ctggcttgtg ggtgggaggc gcatgttctc caaaaacccg cgcgagctgc 120
aatcctgagg gagctgcagt ggaggaggcg gagagaaggc cgcacccttc tccgcagggg 180
gaggggagtg ccgcaatacc tttatgggag ttctctgctg cctccttttc ctaaggaccg 240
ccctgggcct agaaaaatcc ctccctcccc cgcgatctcg tcatcgcctc catgtcagtt 300
tgctccttct cgattatggg cgggattctt ttgccctggc gcgcc 345
<210> 6
<211> 1012
<212> DNA
<213> Artificial Sequence
<400> 6
cttaacctga ttcttgggcg ttgtcctgca ggggattgag caggtgtacg aggacgagcc 60
caatttctct atattcccac agtcttgagt ttgtgtcaca aaataattat agtggggtgg 120
agatgggaaa tgagtccagg caacacctaa gcctgatttt atgcattgag actgcgtgtt 180
attactaaag atctttgtgt cgcaatttcc tgatgaaggg agataggtta aaaagcacgg 240
atctactgag ttttacagtc atcccatttg tagacttttg ctacaccacc aaagtatagc 300
atctgagatt aaatattaat ctccaaacct taggccccct cacttgcatc cttacggtca 360
gataactctc actcatactt taagcccatt ttgtttgttg tacttgctca tccagtccca 420
gacatagcat tggctttctc ctcacctgtt ttaggtagcc agcaagtcat gaaatcagat 480
aagttccacc accaattaac actacccatc ttgagcatag gcccaacagt gcatttattc 540
ctcatttact gatgttcgtg aatatttacc ttgattttca tttttttctt tttcttaagc 600
tgggatttta ctcctgaccc tattcacagt cagatgatct tgactaccac tgcgattgga 660
cctgaggttc agcaatactc ccctttatgt cttttgaata cttttcaata aatctgtttg 720
tattttcatt agttagtaac tgagctcagt tgccgtaatg ctaatagctt ccaaactagt 780
gtctctgtct ccagtatctg ataaatctta ggtgttgctg ggacagttgt cctaaaatta 840
agataaagca tgaaaataac tgacacaact ccattactgg ctcctaacta cttaaacaat 900
gcattctatc atcacaaatg tgaaaaagga gttccctcag tggactaacc ttatcttttc 960
tcaacacctt tttctttgca caattttcca cacatgccta caaaaagtac tt 1012
<210> 7
<211> 1073
<212> DNA
<213> Artificial Sequence
<400> 7
gtgctgagtc cttttcccat cccacccacc tggagctccc ctcttccagt cctgagccac 60
ttgaactggc ctggtttttg ccatcctgcg ctgccctctc tccggactcg agccactgct 120
gagggcctca ggccagtcca tcctcgtctt gtctctttcg ccctgctctt tccccacctt 180
gagcgctctt aaccagcctg gcccgtgcca cctctactct gccatcgaat gctgccccac 240
tttctcgagt ccgccacttc tcccagcttc accggtaccc actgtttccc ctagtccagg 300
caggtaccac tttccctgag cgtcctcctc ctctctcctg ggcctgtgct gcttcttttc 360
ccgctctctg gcctgggccg tttcttcggc cagcccccga gccttccatg ccctttcctt 420
caggtttctg ctcttcatcc ttggtctctg ccatctgttg ccatgtaagg gtgctctttc 480
ctgagccatc gccctcaagg cgctctgctc ctcaagtgga tgcttccctc gcctggctca 540
cctcctgctc tctctcctgc ccccttcacc tgcgtgccct cctcattctc cctctgtgcc 600
acctctggcc ttgcactgta ggctctctct tggggatgtt tctccttctc cacacacttc 660
tctttcactc tgtcctcttg ctttgtgtgg gcctgcagcg ttaccctttt ttctgggcac 720
actcagagca ccctcctctt tctggttctg ggccacctgt ctgtcctcgg gtcatcttgc 780
tctctctgcc tggatgccct cctgtggctt tgggcagctt ctccctcctt cagagtgcac 840
cgccagttct cctaggcccg gtcacttccc cttcccaggg gacctagagc cctgctaggt 900
cctctctctc cacaacctgg gcccccaaac ctttccaaaa caccttgctt tctgcctcca 960
ttggtcttgt gttccagagc cagagtcact atatgtccca gaaccaggat tccctctggt 1020
tctgagggct tttatcgcat cccctgcctg gctgcagtgg gtctttgggc gcc 1073
<210> 8
<211> 260
<212> DNA
<213> Artificial Sequence
<400> 8
gacaggccac agaagagcct ctactcctcc ctctgtcccc gaggctgtct ccctcccagt 60
cttcccagct caggccagtc cccaggcctc tcttccctgc cagagcccgt caggttcggt 120
tactttgggg cccagagagg accctgtgaa ggaagcgtgg gtaggggcac gggaatgggg 180
aggatgcctg aagaggcccc cttagccaga agaggagcag aagaggagca ggtacccaga 240
agaggagcag ttcagggaaa 260
<210> 9
<211> 546
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 9
aaatacccac gtttattggg acaaaagttg ttagggaaaa tggggcctca gagttatgat 60
tcaagtcata attctttcca tttataattt cactcgagac tctgttaact gattccttgt 120
gtgttgtatc ttactcctca gctcacaatt acttttagtt attcacctta actgtatgaa 180
taacagtgga gaaaaggatt ctaccagaat actctaatta tggttttgag tcccctttcc 240
agactgaaga tttttcagtc tttttgatct gaggtgattt ttcagtcttt tcgatctgag 300
gtgacagtct caagctcctc aattcaccca gtctcttgat acttgtccat ttagggccac 360
caaagctact ttgacttcat actagagagt caattaatga ggccattctc tgatggacag 420
gtgaagcagg caaggtgact atattttgac taaacggtag aaaacagcct gagtgttaac 480
agtgtagcct ataaaaccca gagctgccca ccctgatcta aacttccagg aacataagaa 540
cgcgcc 546
<210> 10
<211> 1009
<212> DNA
<213> Artificial Sequence
<400> 10
agtaggtcac atttcagtaa aacctggctt tgtggattga gcatggtctg tctcttcctg 60
gtacttcatt agtcccctaa gtgggatttg ctgagcaaga ctcctcaatt acagaaatac 120
tccagtttag aattctcgca aaggcttttt gtttccacaa gtagaatcta gaaagcaatc 180
tcaagtaaca acagcagaga cctgaatccc aatccatctt tcctgtgtgt cctcttttac 240
ctccttccct ttcatgttga accaacagtc ctttttcagt ctagaagcta gtacgaaaga 300
aatgtacaga tgtaggtacc aagcaaagcc attagccaat aactggtgag atggagctaa 360
gaggaaataa aagtgttcct aagaatagca cagcagaagc tagatccaca gatcttaaaa 420
caattttggt tgagtaagag tagaggcaaa agaggaagct aataatgcag tttttaggag 480
ctaagagcca gataaagggt aagggcagga ggaagtgcta tctcagctaa cgagatacat 540
gaaacaacgg tggaagtcca gcaggcacaa gatgagttga gaagcaatca gggccagaag 600
gatgtgcaag gcctcaaaat aaaaaagcac agggccacag ggaaccttat ggaaattaaa 660
aggaagagga tgcagtcagg agaggaaaaa atagtgctcc ctcccccatg cccaaggaag 720
cagctgagca gccagtactt gggaagttag tagtaataag ttggtaagag ggagttctgt 780
tcgtggctca atggttaaca aatcagacta gaaaccgtga ggttgcgggt ttgatccctg 840
gccttgctca gtgggttaag gatccggcat tgccgtgacc tgtggtgtag gtcacagacg 900
tggctcagtt cccgcattcc tgtggctctg gtgtaggctg gtggctacag ctctgattag 960
acccctaggc tgggaacctc catatgccct ggaagtggcc gtagaaaag 1009
<210> 11
<211> 878
<212> DNA
<213> Artificial Sequence
<400> 11
ggatggggac tcatgtgaat tttctaaagg tgctatttaa acggggggca cgagtgccgg 60
ctttggacag ggccgctcgc tctccaccct ttcttcttcc ccctcggccg cctctcaccc 120
cctgaggcct ctctcccccc acgacctcct ctctctcctc tgaaaccctc tcctcctcag 180
ctgcatccca ccctcgtggc ctctctctct ctctgtctgt cctgtgtcct ctctcactgg 240
gtttcagagc acagatgccc aaagcacaaa agcagttttc ccctggggtg ggaggaagca 300
agagactttg tacctatttt gtatgtgtat aataatttga gatgttttta attattttga 360
ttgctggaat aaagcatgtg gaaatgaccc aaaccaatct tgcactggcc tcctgatttc 420
cttccttgga gacggaggga gggggagacc tgggggaggg cgcttggggg ggggtgggct 480
ctcttctttc tgcgctcccc ccccccacct ccaacacctt gacgacccct cctgcttccg 540
cttgcctttc tcaggcttta acactttctc ctcgccctct cagcatgcgc atgcgcgtgc 600
ctctacctcc cccgcacatc ctggcctgcc caccctgaat ggcctggccc agcgatgcca 660
ccaactctct cgctccgtcc acggctgggg aggggggcac tctgcagggt tggggggcac 720
tgggaggctg ggttgggtga gggaggggtg cctgggcccc caccccccag caagttctct 780
ccctaggcga actggagggt cgtctggcct cttgagcctt gttgctggct ctgagctcta 840
ccaagagagt gaccagcagg accgcaccat cacgcgcc 878
<210> 12
<211> 727
<212> DNA
<213> Artificial Sequence
<400> 12
gtggttgctg agactgcgtg ggggcccaag gagacctgga gaaaggaatg cttcctgctc 60
cttcttctgg ggccccagga gagccttccc agggccttgg agaggtgctg tccagggact 120
aaccctgtgc tctaggaagg ctgcaggccc tgaccagctg ggcaggtcct gggtccctcc 180
tggccttcta agttccccaa acatgagacc tctgggtgtg gggtggcctg gggaggtcat 240
tttgcccagg ccctacctcc tgcccattcc taaccctttt taaaaatctg tgcgtcctct 300
tcttccttct tctccctccc ttcccttttc gctcaccctc tgctgctggc ctgagagccg 360
gaggccccca gggggaaggc gactggtctc ctccccagtc tcagggaagg gagacagaga 420
atccaggaag ccagaactca gcagacgaag cacccaggga cctagagatg ggttgaaaag 480
ttgacagctg tcccacctgc ctcccaaggt ctcagggcct aaacctccaa ggcaggaaag 540
gcccctgtcc ctccctgggg tccatagaaa gagggacaag tctgcacgga ccatttgctg 600
taatattaac accttggctg tcattaggta gtcttggctg ttaattatgt cctgtgataa 660
tgtattatta gcacgccgac cacatagggt agggaactgc agctagtaaa caaaagtttg 720
ttcctat 727
<210> 13
<211> 10339
<212> DNA
<213> Artificial Sequence
<400> 13
ggatggggac tcatgtgaat tttctaaagg tgctatttaa acggggggca cgagtgccgg 60
ctttggacag ggccgctcgc tctccaccct ttcttcttcc ccctcggccg cctctcaccc 120
cctgaggcct ctctcccccc acgacctcct ctctctcctc tgaaaccctc tcctcctcag 180
ctgcatccca ccctcgtggc ctctctctct ctctgtctgt cctgtgtcct ctctcactgg 240
gtttcagagc acagatgccc aaagcacaaa agcagttttc ccctggggtg ggaggaagca 300
agagactttg tacctatttt gtatgtgtat aataatttga gatgttttta attattttga 360
ttgctggaat aaagcatgtg gaaatgaccc aaaccaatct tgcactggcc tcctgatttc 420
cttccttgga gacggaggga gggggagacc tgggggaggg cgcttggggg ggggtgggct 480
ctcttctttc tgcgctcccc ccccccacct ccaacacctt gacgacccct cctgcttccg 540
cttgcctttc tcaggcttta acactttctc ctcgccctct cagcatgcgc atgcgcgtgc 600
ctctacctcc cccgcacatc ctggcctgcc caccctgaat ggcctggccc agcgatgcca 660
ccaactctct cgctccgtcc acggctgggg aggggggcac tctgcagggt tggggggcac 720
tgggaggctg ggttgggtga gggaggggtg cctgggcccc caccccccag caagttctct 780
ccctaggcga actggagggt cgtctggcct cttgagcctt gttgctggct ctgagctcta 840
ccaagagagt gaccagcagg accgcaccat cacgcgcccc agacccgggc ctggggggca 900
agtcgggggg cggggggagg tcgggcaggg tcccctggga ggatggggac gtgctgtgcc 960
cctagcggcc accagagggc accaggacac cactgcggtc ggctcagcgg ctcctgccct 1020
ggtcaggggg cgccaggtcc tgcccctcct ggggagggcg gggggcgaga agggcgattg 1080
cagaaatggt tgaactcccg agagtgtcct acacctaggg gagaagcagc caaggggttg 1140
tttcccacca aggacgaccc gtctgcgcac aaacggatga gcccatcaga caaagacata 1200
ttcattctct gctgcaaact tggcatagct ctgctttgcc tggggctatt gggggaagtt 1260
gcggttcgtg ctcgcagggc tctcaccctt gactctttca ataataactc ttctgtgcaa 1320
gattacaatc taaacaattc ggagaactcg accttcctcc tgaggcaagg accacagcca 1380
acttcctctt acaagccgca tcgattttgt ccttcagaaa tagaaataag aatgcttgct 1440
aaaaattata tttttaccaa taagaccaat ccaataggta gattattagt tactatgtta 1500
agaaatgaat cattatcttt tagtactatt tttactcaaa ttcagaagtt agaaatggga 1560
atagaaaata gaaagagacg ctcaacctca attgaagaac aggtgcaagg actattgacc 1620
acaggcctag aagtaaaaaa gggaaaaaag agtgtttttg tcaaaatagg agacaggtgg 1680
tggcaaccag ggacttatag gggaccttac atctacagac caacagatgc ccccttacca 1740
tatacaggaa gatatgactt aaattgggat aggtgggtta cagtcaatgg ctataaagtg 1800
ttatatagat ccctcccctt tcgtgaaaga ctcgccagag ctagacctcc ttggtgtatg 1860
ttgtctcaag aaaagaaaga cgacatgaaa caacaggtac atgattatat ttatctagga 1920
acaggaatgc acttttgggg aaagattttc cataccaagg aggggacagt ggctggacta 1980
atagaacatt attctgcaaa aacttatggc atgagttatt atgattagcc ttgatttgcc 2040
caaccttgcg gttcccaagg cttaagtaag tttttggtta caaactgttc ttaaaacaag 2100
gatgtgagac aagtggtttc ctgacttggt ttggtatcaa aggttctgat ctgagctctg 2160
agtgttctat tttcctatgt tcttttggaa tttatccaaa tcttatgtaa atgcttatgt 2220
aaaccaagat ataaaagagt gctgattttt tgagtaaact tgcaacagtc ctaacattca 2280
cctcttgtgt gtttgtgtct gttcgccatc ccgtctccgc tcgtcactta tccttcactt 2340
tccagagggt ccccccgcag accccggcga ccctcaggtc ggccgactgc ggcatctaga 2400
gccaccatgg atagagttct gagcagagct gacaaagaaa ggctgctaga acttctaaaa 2460
cttcccagac aactatgggg ggattttgga agaatgcagc aggcatataa gcagcagtca 2520
ctgctactgc acccagacaa aggtggaagc catgccttaa tgcaggaatt gaacagtctc 2580
tggggaacat ttaaaactga agtatacaat ctgagaatga atctaggagg aaccggcttc 2640
caggtaagaa ggctacatgc ggatgggtgg aatctaagta ccaaagacac ctttggtgat 2700
agatactacc agcggttctg cagaatgcct cttacctgcc tagtaaatgt taaatacagc 2760
tcatgtagtt gtatattatg cctgcttaga aagcaacata gagagctcaa agacaaatgt 2820
gatgccaggt gcctagtact tggagaatgt ttttgtcttg aatgttacat gcaatggttt 2880
ggaacaccaa cccgagatgt gctgaacctg tatgcagact tcattgcaag catgcctata 2940
gactggctgg acctggatgt gcacagcgtg tataatccaa aacggcggag cgaggaactg 3000
aggagagcgg ccacagtcca ctacacgatg actactggtc attcagctat ggaagcaagt 3060
acttcacaag ggaatggaat gatttcttca gaaagtggga ccccagctac cagtcgccgc 3120
ctaagactgc cgagtcttct gagcaacccg acctattctg ttatgaggag ccactcctat 3180
cccccaaccc gagttctcca acagatacac ccgcacatac tgctggaaga agacgaaatc 3240
cttgtgttgc tgagcccgat gacagcatat ccccggaccc ccccagaact cctgtatcca 3300
gaaagcgacc aagaccagct ggagccactg gaggaggagg aggaggagta catgccaatg 3360
gaggatctgt atttggacat cctaccgggg gaacaagtac cccagctcat ccccccccct 3420
atcattccca gggcgggtct gagtccatgg gagggtctga ttcttcggga tttgcagagg 3480
gctcatttcg atccgatcct agatgcgagt cagagaatga gagctactca cagagctgct 3540
ctcagagctc attcaatgca acgccaccta agaaggctag ggaggaccct gctcctagtg 3600
actttcctag cagccttact gggtatttgt ctcatgctat ttattctaat aaaacgttcc 3660
cggcatttct agggccgcga ctctagagtc ggggcggccg gccgcttcga gcagacatga 3720
ctgtgccttc tagttgccag ccatctgttg tttgcccctc ccccgtgcct tccttgaccc 3780
tggaaggtgc cactcccact gtcctttcct aataaaatga ggaaattgca tcgcattgtc 3840
tgagtaggtg tcattctatt ctggggggtg gggtggggca ggacagcaag ggggaggatt 3900
gggaagacaa tagcaggcat gctggggatg cggtgggctc tatggaacaa caacaattgc 3960
attcatttta tgtttcaggt tcagggggag gtgtgggagg tctgaggcgg aaagaaccag 4020
ctgccttaat ataacttcgt ataatgtatg ctatacgaag ttattaggtc tgaagaggag 4080
tttacgtcca gccaattctg tggaatgtgt gtcagttagg gtgtggaaag tccccaggct 4140
ccccagcagg cagaagtatg caaagcatgc atctcaatta gtcagcaacc aggtgtggaa 4200
agtccccagg ctccccagca ggcagaagta tgcaaagcat gcatctcaat tagtcagcaa 4260
ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt tccgcccatt 4320
ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc gcctctgcct 4380
ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt tgcaaaaagc 4440
tcccgggagc ttgtatatcc attttcggcg gccgcgccac catgaccgag tacaagccca 4500
cggtgcgcct cgccacccgc gacgacgtcc ccagggccgt acgcaccctc gccgccgcgt 4560
tcgccgacta ccccgccacg cgccacaccg tcgatccgga ccgccacatc gagcgggtca 4620
ccgagctgca agaactcttc ctcacgcgcg tcgggctcga catcggcaag gtgtgggtcg 4680
cggacgacgg cgccgcggtg gcggtctgga ccacgccgga gagcgtcgaa gcgggggcgg 4740
tgttcgccga gatcggcccg cgcatggccg agttgagcgg ttcccggctg gccgcgcagc 4800
aacagatgga aggcctcctg gcgccgcacc ggcccaagga gcccgcgtgg ttcctggcca 4860
ccgtcggagt ctcgcccgac caccagggca agggtctggg cagcgccgtc gtgctccccg 4920
gagtggaggc ggccgagcgc gccggggtgc ccgccttcct ggagacctcc gcgccccgca 4980
acctcccctt ctacgagcgg ctcggcttca ccgtcaccgc cgacgtcgag gtgcccgaag 5040
gaccgcgcac ctggtgcatg acccgcaagc ccggtgcctg agaattcgcg ggactctggg 5100
gttcgaaatg accgaccaag cgacgcccaa cctgccatca cgagatttcg attccaccgc 5160
cgccttctat gaaaggttgg gcttcggaat cgttttccgg gacgccggct ggatgatcct 5220
ccagcgcggg gatctcatgc tggagttctt cgcccacccc aacttgttta ttgcagctta 5280
taatggttac aaataaagca atagcatcac aaatttcaca aataaagcat ttttttcact 5340
gcattctagt tgtggtttgt ccaaactcat caatgtatct tatcatgtct gtataccgct 5400
cgactagagc ttgcggaacc cttaatataa cttcgtataa tgtatgctat acgaagttat 5460
taggtccgct ggccatctac gagccaaaga ctttcaaatc tttggctgcc ttggccagta 5520
ggaggcgaca cgaaggattt gctgctgcct tgggggatgg gaaggaacct gaaggcattt 5580
tttccagagt ggtgcagtac cactgaggac tgttgctgta ttgattagga aaagagacag 5640
agtaatttgc agtttgtttg atttatactg tggttgctga gactgcgtgg gggcccaagg 5700
agacctggag aaaggaatgc ttcctgctcc ttcttctggg gccccaggag agccttccca 5760
gggccttgga gaggtgctgt ccagggacta accctgtgct ctaggaaggc tgcaggccct 5820
gaccagctgg gcaggtcctg ggtccctcct ggccttctaa gttccccaaa catgagacct 5880
ctgggtgtgg ggtggcctgg ggaggtcatt ttgcccaggc cctacctcct gcccattcct 5940
aacccttttt aaaaatctgt gcgtcctctt cttccttctt ctccctccct tcccttttcg 6000
ctcaccctct gctgctggcc tgagagccgg aggcccccag ggggaaggcg actggtctcc 6060
tccccagtct cagggaaggg agacagagaa tccaggaagc cagaactcag cagacgaagc 6120
acccagggac ctagagatgg gttgaaaagt tgacagctgt cccacctgcc tcccaaggtc 6180
tcagggccta aacctccaag gcaggaaagg cccctgtccc tccctggggt ccatagaaag 6240
agggacaagt ctgcacggac catttgctgt aatattaaca ccttggctgt cattaggtag 6300
tcttggctgt taattatgtc ctgtgataat gtattattag cacgccgacc acatagggta 6360
gggaactgca gctagtaaac aaaagtttgt tcctatatgc ggccgccata aaagttttgt 6420
tactttatag aagaaatttt gagtttttgt tttttttaat aaataaataa acataaataa 6480
attgtttgtt gaatttatta ttagtatgta agtgtaaata taataaaact taatatctat 6540
tcaaattaat aaataaacct cgatatacag accgataaaa cacatgcgtc aattttacac 6600
atgattatct ttaacgtacg tcacaatatg attatctttc tagggttaat ctagctgcgt 6660
gttctgcagc gtgtcgagca tcttcatctg ctccatcacg ctgtaaaaca catttgcacc 6720
gcgagtctgc ccgtcctcca cgggttcaaa aacgtgaatg aacgaggcgc gctcactggc 6780
cgtcgtttta caacgtcgtg actgggaaaa ccctggcgtt acccaactta atcgccttgc 6840
agcacatccc cctttcgcca gctggcgtaa tagcgaagag gcccgcaccg atcgcccttc 6900
ccaacagttg cgcagcctga atggcgaatg ggacgcgccc tgtagcggcg cattaagcgc 6960
ggcgggtgtg gtggttacgc gcagcgtgac cgctacactt gccagcgccc tagcgcccgc 7020
tcctttcgct ttcttccctt cctttctcgc cacgttcgcc ggctttcccc gtcaagctct 7080
aaatcggggg ctccctttag ggttccgatt tagtgcttta cggcacctcg accccaaaaa 7140
acttgattag ggtgatggtt cacgtagtgg gccatcgccc tgatagacgg tttttcgccc 7200
tttgacgttg gagtccacgt tctttaatag tggactcttg ttccaaactg gaacaacact 7260
caaccctatc tcggtctatt cttttgattt ataagggatt ttgccgattt cggcctattg 7320
gttaaaaaat gagctgattt aacaaaaatt taacgcgaat tttaacaaaa tattaacgct 7380
tacaatttag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg tttatttttc 7440
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat gcttcaataa 7500
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat tccctttttt 7560
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct 7620
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag cggtaagatc 7680
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa agttctgcta 7740
tgtggcgcgg tattatcccg tattgacgcc gggcaagagc aactcggtcg ccgcatacac 7800
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct tacggatggc 7860
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac tgcggccaac 7920
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca caacatgggg 7980
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat accaaacgac 8040
gagcgtgaca ccacgatgcc tgtagcaatg gcaacaacgt tgcgcaaact attaactggc 8100
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc ggataaagtt 8160
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga taaatctgga 8220
gccggtgagc gtggttcacg cggtatcatt gcagcactgg ggccagatgg taagccctcc 8280
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg aaatagacag 8340
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca agtttactca 8400
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc 8460
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca 8520
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc 8580
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta 8640
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgtcctt 8700
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc 8760
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg 8820
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg 8880
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag 8940
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc 9000
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat 9060
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg 9120
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc 9180
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga taaccgtatt 9240
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca 9300
gtgagcgagg aagcggaaga gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg 9360
attcattaat gcagctggca cgacaggttt cccgactgga aagcgggcag tgagcgcaac 9420
gcaattaatg tgagttagct cactcattag gcaccccagg ctttacactt tatgcttccg 9480
gctcgtatgt tgtgtggaat tgtgagcgga taacaatttc acacaggaaa cagctatgac 9540
catgattacg ccaagcgcgc ccgccgggta actcacgggg tatccatgtc catttctgcg 9600
gcatccagcc aggatacccg tcctcgctga cgtaatatcc cagcgccgca ccgctgtcat 9660
taatctgcac accggcacgg cagttccggc tgtcgccggt attgttcggg ttgctgatgc 9720
gcttcgggct gaccatccgg aactgtgtcc ggaaaagccg cgacgaactg gtatcccagg 9780
tggcctgaac gaacagttca ccgttaaagg cgtgcatggc cacaccttcc cgaatcatca 9840
tggtaaacgt gcgttttcgc tcaacgtcaa tgcagcagca gtcatcctcg gcaaactctt 9900
tccatgccgc ttcaacctcg cgggaaaagg cacgggcttc ttcctccccg atgcccagat 9960
agcgccagct tgggcgatga ctgagccgga aaaaagaccc gacgatatga tcctgatgca 10020
gctagattaa ccctagaaag atagtctgcg taaaattgac gcatgcattc ttgaaatatt 10080
gctctctctt tctaaatagc gcgaatccgt cgctgtgcat ttaggacatc tcagtcgccg 10140
cttggagctc ccgtgaggcg tgcttgtcaa tgcggtaagt gtcactgatt ttgaactata 10200
acgaccgcgt gagtcaaaat gacgcatgat tatcttttac gtgactttta agatttaact 10260
catacgataa ttatattgtt atttcatgtt ctacttacgt gataacttat tatatatata 10320
ttttcttgtt atagatatc 10339
<210> 14
<211> 421
<212> PRT
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 14
Met Asp Arg Val Leu Ser Arg Ala Asp Lys Glu Arg Leu Leu Glu Leu
1 5 10 15
Leu Lys Leu Pro Arg Gln Leu Trp Gly Asp Phe Gly Arg Met Gln Gln
20 25 30
Ala Tyr Lys Gln Gln Ser Leu Leu Leu His Pro Asp Lys Gly Gly Ser
35 40 45
His Ala Leu Met Gln Glu Leu Asn Ser Leu Trp Gly Thr Phe Lys Thr
50 55 60
Glu Val Tyr Asn Leu Arg Met Asn Leu Gly Gly Thr Gly Phe Gln Val
65 70 75 80
Arg Arg Leu His Ala Asp Gly Trp Asn Leu Ser Thr Lys Asp Thr Phe
85 90 95
Gly Asp Arg Tyr Tyr Gln Arg Phe Cys Arg Met Pro Leu Thr Cys Leu
100 105 110
Val Asn Val Lys Tyr Ser Ser Cys Ser Cys Ile Leu Cys Leu Leu Arg
115 120 125
Lys Gln His Arg Glu Leu Lys Asp Lys Cys Asp Ala Arg Cys Leu Val
130 135 140
Leu Gly Glu Cys Phe Cys Leu Glu Cys Tyr Met Gln Trp Phe Gly Thr
145 150 155 160
Pro Thr Arg Asp Val Leu Asn Leu Tyr Ala Asp Phe Ile Ala Ser Met
165 170 175
Pro Ile Asp Trp Leu Asp Leu Asp Val His Ser Val Tyr Asn Pro Lys
180 185 190
Arg Arg Ser Glu Glu Leu Arg Arg Ala Ala Thr Val His Tyr Thr Met
195 200 205
Thr Thr Gly His Ser Ala Met Glu Ala Ser Thr Ser Gln Gly Asn Gly
210 215 220
Met Ile Ser Ser Glu Ser Gly Thr Pro Ala Thr Ser Arg Arg Leu Arg
225 230 235 240
Leu Pro Ser Leu Leu Ser Asn Pro Thr Tyr Ser Val Met Arg Ser His
245 250 255
Ser Tyr Pro Pro Thr Arg Val Leu Gln Gln Ile His Pro His Ile Leu
260 265 270
Leu Glu Glu Asp Glu Ile Leu Val Leu Leu Ser Pro Met Thr Ala Tyr
275 280 285
Pro Arg Thr Pro Pro Glu Leu Leu Tyr Pro Glu Ser Asp Gln Asp Gln
290 295 300
Leu Glu Pro Leu Glu Glu Glu Glu Glu Glu Tyr Met Pro Met Glu Asp
305 310 315 320
Leu Tyr Leu Asp Ile Leu Pro Gly Glu Gln Val Pro Gln Leu Ile Pro
325 330 335
Pro Pro Ile Ile Pro Arg Ala Gly Leu Ser Pro Trp Glu Gly Leu Ile
340 345 350
Leu Arg Asp Leu Gln Arg Ala His Phe Asp Pro Ile Leu Asp Ala Ser
355 360 365
Gln Arg Met Arg Ala Thr His Arg Ala Ala Leu Arg Ala His Ser Met
370 375 380
Gln Arg His Leu Arg Arg Leu Gly Arg Thr Leu Leu Leu Val Thr Phe
385 390 395 400
Leu Ala Ala Leu Leu Gly Ile Cys Leu Met Leu Phe Ile Leu Ile Lys
405 410 415
Arg Ser Arg His Phe
420
<210> 15
<211> 7246
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 15
ggcctaactg gccgcagaaa tggttgaact cccgagagtg tcctacacct aggggagaag 60
cagccaaggg gttgtttccc accaaggacg acccgtctgc gcacaaacgg atgagcccat 120
cagacaaaga catattcatt ctctgctgca aacttggcat agctctgctt tgcctggggc 180
tattggggga agttgcggtt cgtgctcgca gggctctcac ccttgactct ttcaataata 240
actcttctgt gcaagattac aatctaaaca attcggagaa ctcgaccttc ctcctgaggc 300
aaggaccaca gccaacttcc tcttacaagc cgcatcgatt ttgtccttca gaaatagaaa 360
taagaatgct tgctaaaaat tatattttta ccaataagac caatccaata ggtagattat 420
tagttactat gttaagaaat gaatcattat cttttagtac tatttttact caaattcaga 480
agttagaaat gggaatagaa aatagaaaga gacgctcaac ctcaattgaa gaacaggtgc 540
aaggactatt gaccacaggc ctagaagtaa aaaagggaaa aaagagtgtt tttgtcaaaa 600
taggagacag gtggtggcaa ccagggactt ataggggacc ttacatctac agaccaacag 660
atgccccctt accatataca ggaagatatg acttaaattg ggataggtgg gttacagtca 720
atggctataa agtgttatat agatccctcc cctttcgtga aagactcgcc agagctagac 780
ctccttggtg tatgttgtct caagaaaaga aagacgacat gaaacaacag gtacatgatt 840
atatttatct aggaacagga atgcactttt ggggaaagat tttccatacc aaggagggga 900
cagtggctgg actaatagaa cattattctg caaaaactta tggcatgagt tattatgatt 960
agccttgatt tgcccaacct tgcggttccc aaggcttaag taagtttttg gttacaaact 1020
gttcttaaaa caaggatgtg agacaagtgg tttcctgact tggtttggta tcaaaggttc 1080
tgatctgagc tctgagtgtt ctattttcct atgttctttt ggaatttatc caaatcttat 1140
gtaaatgctt atgtaaacca agatataaaa gagtgctgat tttttgagta aacttgcaac 1200
agtcctaaca ttcacctctt gtgtgtttgt gtctgttcgc catcccgtct ccgctcgtca 1260
cttatccttc actttccaga gggtcccccc gcagaccccg gcgaccctca ggtcggccga 1320
ctgcggcagg cctcggcggc caagcttggc aatccggtac tgttggtaaa gccaccatgg 1380
aagatgccaa aaacattaag aagggcccag cgccattcta cccactcgaa gacgggaccg 1440
ccggcgagca gctgcacaaa gccatgaagc gctacgccct ggtgcccggc accatcgcct 1500
ttaccgacgc acatatcgag gtggacatta cctacgccga gtacttcgag atgagcgttc 1560
ggctggcaga agctatgaag cgctatgggc tgaatacaaa ccatcggatc gtggtgtgca 1620
gcgagaatag cttgcagttc ttcatgcccg tgttgggtgc cctgttcatc ggtgtggctg 1680
tggccccagc taacgacatc tacaacgagc gcgagctgct gaacagcatg ggcatcagcc 1740
agcccaccgt cgtattcgtg agcaagaaag ggctgcaaaa gatcctcaac gtgcaaaaga 1800
agctaccgat catacaaaag atcatcatca tggatagcaa gaccgactac cagggcttcc 1860
aaagcatgta caccttcgtg acttcccatt tgccacccgg cttcaacgag tacgacttcg 1920
tgcccgagag cttcgaccgg gacaaaacca tcgccctgat catgaacagt agtggcagta 1980
ccggattgcc caagggcgta gccctaccgc accgcaccgc ttgtgtccga ttcagtcatg 2040
cccgcgaccc catcttcggc aaccagatca tccccgacac cgctatcctc agcgtggtgc 2100
catttcacca cggcttcggc atgttcacca cgctgggcta cttgatctgc ggctttcggg 2160
tcgtgctcat gtaccgcttc gaggaggagc tattcttgcg cagcttgcaa gactataaga 2220
ttcaatctgc cctgctggtg cccacactat ttagcttctt cgctaagagc actctcatcg 2280
acaagtacga cctaagcaac ttgcacgaga tcgccagcgg cggggcgccg ctcagcaagg 2340
aggtaggtga ggccgtggcc aaacgcttcc acctaccagg catccgccag ggctacggcc 2400
tgacagaaac aaccagcgcc attctgatca cccccgaagg ggacgacaag cctggcgcag 2460
taggcaaggt ggtgcccttc ttcgaggcta aggtggtgga cttggacacc ggtaagacac 2520
tgggtgtgaa ccagcgcggc gagctgtgcg tccgtggccc catgatcatg agcggctacg 2580
ttaacaaccc cgaggctaca aacgctctca tcgacaagga cggctggctg cacagcggcg 2640
acatcgccta ctgggacgag gacgagcact tcttcatcgt ggaccggctg aagagcctga 2700
tcaaatacaa gggctaccag gtagccccag ccgaactgga gagcatcctg ctgcaacacc 2760
ccaacatctt cgacgccggg gtcgccggcc tgcccgacga cgatgccggc gagctgcccg 2820
ccgcagtcgt cgtgctggaa cacggtaaaa ccatgaccga gaaggagatc gtggactatg 2880
tggccagcca ggttacaacc gccaagaagc tgcgcggtgg tgttgtgttc gtggacgagg 2940
tgcctaaagg actgaccggc aagttggacg cccgcaagat ccgcgagatt ctcattaagg 3000
ccaagaaggg cggcaagatc gccgtgaatt ctcacggctt ccctcccgag gtggaggagc 3060
aggccgccgg caccctgccc atgagctgcg cccaggagag cggcatggat agacaccctg 3120
ctgcttgcgc cagcgccagg atcaacgtct aaggccgcga ctctagagtc ggggcggccg 3180
gccgcttcga gcagacatga taagatacat tgatgagttt ggacaaacca caactagaat 3240
gcagtgaaaa aaatgcttta tttgtgaaat ttgtgatgct attgctttat ttgtaaccat 3300
tataagctgc aataaacaag ttaacaacaa caattgcatt cattttatgt ttcaggttca 3360
gggggaggtg tgggaggttt tttaaagcaa gtaaaacctc tacaaatgtg gtaaaatcga 3420
taaggatccg tttgcgtatt gggcgctctt ccgctgatct gcgcagcacc atggcctgaa 3480
ataacctctg aaagaggaac ttggttagct accttctgag gcggaaagaa ccagctgtgg 3540
aatgtgtgtc agttagggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa 3600
agcatgcatc tcaattagtc agcaaccagg tgtggaaagt ccccaggctc cccagcaggc 3660
agaagtatgc aaagcatgca tctcaattag tcagcaacca tagtcccgcc cctaactccg 3720
cccatcccgc ccctaactcc gcccagttcc gcccattctc cgccccatgg ctgactaatt 3780
ttttttattt atgcagaggc cgaggccgcc tctgcctctg agctattcca gaagtagtga 3840
ggaggctttt ttggaggcct aggcttttgc aaaaagctcg attcttctga cactagcgcc 3900
accatgaaga agcccgaact caccgctacc agcgttgaaa aatttctcat cgagaagttc 3960
gacagtgtga gcgacctgat gcagttgtcg gagggcgaag agagccgagc cttcagcttc 4020
gatgtcggcg gacgcggcta tgtactgcgg gtgaatagct gcgctgatgg cttctacaaa 4080
gaccgctacg tgtaccgcca cttcgccagc gctgcactac ccatccccga agtgttggac 4140
atcggcgagt tcagcgagag cctgacatac tgcatcagta gacgcgccca aggcgttact 4200
ctccaagacc tccccgaaac agagctgcct gctgtgttac agcctgtcgc cgaagctatg 4260
gatgctattg ccgccgccga cctcagtcaa accagcggct tcggcccatt cgggccccaa 4320
ggcatcggcc agtacacaac ctggcgggat ttcatttgcg ccattgctga tccccatgtc 4380
taccactggc agaccgtgat ggacgacacc gtgtccgcca gcgtagctca agccctggac 4440
gaactgatgc tgtgggccga agactgtccc gaggtgcgcc acctcgtcca tgccgacttc 4500
ggcagcaaca acgtcctgac cgacaacggc cgcatcaccg ccgtaatcga ctggtccgaa 4560
gctatgttcg gggacagtca gtacgaggtg gccaacatct tcttctggcg gccctggctg 4620
gcttgcatgg agcagcagac tcgctacttc gagcgccggc atcccgagct ggccggcagc 4680
cctcgtctgc gagcctacat gctgcgcatc ggcctggatc agctctacca gagcctcgtg 4740
gacggcaact tcgacgatgc tgcctgggct caaggccgct gcgatgccat cgtccgcagc 4800
ggggccggca ccgtcggtcg cacacaaatc gctcgccgga gcgcagccgt atggaccgac 4860
ggctgcgtcg aggtgctggc cgacagcggc aaccgccggc ccagtacacg accgcgcgct 4920
aaggaggtag gtcgagttta aactctagaa ccggtcatgg ccgcaataaa atatctttat 4980
tttcattaca tctgtgtgtt ggttttttgt gtgttcgaac tagatgctgt cgaccgatgc 5040
ccttgagagc cttcaaccca gtcagctcct tccggtgggc gcggggcatg actatcgtcg 5100
ccgcacttat gactgtcttc tttatcatgc aactcgtagg acaggtgccg gcagcgctct 5160
tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca 5220
gctcactcaa aggcggtaat acggttatcc acagaatcag gggataacgc aggaaagaac 5280
atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt 5340
ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg 5400
cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc 5460
tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc 5520
gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc 5580
aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac 5640
tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt 5700
aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct 5760
aactacggct acactagaag aacagtattt ggtatctgcg ctctgctgaa gccagttacc 5820
ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt 5880
ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg 5940
atcttttcta cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc 6000
atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg aagttttaaa 6060
tcaatctaaa gtatatatga gtaaacttgg tctgacagcg gccgcaaatg ctaaaccact 6120
gcagtggtta ccagtgcttg atcagtgagg caccgatctc agcgatctgc ctatttcgtt 6180
cgtccatagt ggcctgactc cccgtcgtgt agatcactac gattcgtgag ggcttaccat 6240
caggccccag cgcagcaatg atgccgcgag agccgcgttc accggccccc gatttgtcag 6300
caatgaacca gccagcaggg agggccgagc gaagaagtgg tcctgctact ttgtccgcct 6360
ccatccagtc tatgagctgc tgtcgtgatg ctagagtaag aagttcgcca gtgagtagtt 6420
tccgaagagt tgtggccatt gctactggca tcgtggtatc acgctcgtcg ttcggtatgg 6480
cttcgttcaa ctctggttcc cagcggtcaa gccgggtcac atgatcaccc atattatgaa 6540
gaaatgcagt cagctcctta gggcctccga tcgttgtcag aagtaagttg gccgcggtgt 6600
tgtcgctcat ggtaatggca gcactacaca attctcttac cgtcatgcca tccgtaagat 6660
gcttttccgt gaccggcgag tactcaacca agtcgttttg tgagtagtgt atacggcgac 6720
caagctgctc ttgcccggcg tctatacggg acaacaccgc gccacatagc agtactttga 6780
aagtgctcat catcgggaat cgttcttcgg ggcggaaaga ctcaaggatc ttgccgctat 6840
tgagatccag ttcgatatag cccactcttg cacccagttg atcttcagca tcttttactt 6900
tcaccagcgt ttcggggtgt gcaaaaacag gcaagcaaaa tgccgcaaag aagggaatga 6960
gtgcgacacg aaaatgttgg atgctcatac tcgtcctttt tcaatattat tgaagcattt 7020
atcagggtta ctagtacgtc tctcaaggat aagtaagtaa tattaaggta cgggaggtat 7080
tggacaggcc gcaataaaat atctttattt tcattacatc tgtgtgttgg ttttttgtgt 7140
gaatcgatag tactaacata cgctctccat caaaacaaaa cgaaacaaaa caaactagca 7200
aaataggctg tccccagtgc aagtgcaggt gccagaacat ttctct 7246
<210> 16
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 16
agttatggca gaactcagtg 20
<210> 17
<211> 23
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 17
ccccatccaa agtttttaaa gga 23
<210> 18
<211> 23
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 18
tgtggcagat gtcacagttt agg 23
<210> 19
<211> 25
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 19
caccgagtta tggcagaact cagtg 25
<210> 20
<211> 25
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 20
aaaccactga gttctgccat aactc 25
<210> 21
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 21
gaaggagcaa actgacatgg 20
<210> 22
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 22
tgcagtgggt ctttggggac 20
<210> 23
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 23
ttccaggaac ataagaaagt 20
<210> 24
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 24
gcagtctcag caaccactga 20
<210> 25
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 25
ggtcggagtg aacggatttg 20
<210> 26
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 26
ccatttgatg ttggcgggat 20
<210> 27
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 27
agatccgcca caacatcgag 20
<210> 28
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 28
gtccatgccg agagtgatcc 20
<210> 29
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 29
cctgctgtaa gtgccgtagt 20
<210> 30
<211> 18
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 30
ctaggggcac agcacgtc 18
<210> 31
<211> 26
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 31
aagttattag gtctgaagag gagttt 26
<210> 32
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 32
cccatcattc cgtcccagag 20
<210> 33
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 33
tgctgagttc tggcttcctg 20
<210> 34
<211> 23
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 34
tctaccaaga gagtgaccag cag 23
<210> 35
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 35
cacgccatcc tgcgtctgga 20
<210> 36
<211> 20
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 36
agcaccgtgt tggcgtagag 20
<210> 37
<211> 22
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 37
gtctccgctc gtcacttatc ct 22
<210> 38
<211> 22
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 38
ctagcagcct ttctttgtca gc 22
<210> 39
<211> 1266
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 39
atggatagag ttctgagcag agctgacaaa gaaaggctgc tagaacttct aaaacttccc 60
agacaactat ggggggattt tggaagaatg cagcaggcat ataagcagca gtcactgcta 120
ctgcacccag acaaaggtgg aagccatgcc ttaatgcagg aattgaacag tctctgggga 180
acatttaaaa ctgaagtata caatctgaga atgaatctag gaggaaccgg cttccaggta 240
agaaggctac atgcggatgg gtggaatcta agtaccaaag acacctttgg tgatagatac 300
taccagcggt tctgcagaat gcctcttacc tgcctagtaa atgttaaata cagctcatgt 360
agttgtatat tatgcctgct tagaaagcaa catagagagc tcaaagacaa atgtgatgcc 420
aggtgcctag tacttggaga atgtttttgt cttgaatgtt acatgcaatg gtttggaaca 480
ccaacccgag atgtgctgaa cctgtatgca gacttcattg caagcatgcc tatagactgg 540
ctggacctgg atgtgcacag cgtgtataat ccaaaacggc ggagcgagga actgaggaga 600
gcggccacag tccactacac gatgactact ggtcattcag ctatggaagc aagtacttca 660
caagggaatg gaatgatttc ttcagaaagt gggaccccag ctaccagtcg ccgcctaaga 720
ctgccgagtc ttctgagcaa cccgacctat tctgttatga ggagccactc ctatccccca 780
acccgagttc tccaacagat acacccgcac atactgctgg aagaagacga aatccttgtg 840
ttgctgagcc cgatgacagc atatccccgg acccccccag aactcctgta tccagaaagc 900
gaccaagacc agctggagcc actggaggag gaggaggagg agtacatgcc aatggaggat 960
ctgtatttgg acatcctacc gggggaacaa gtaccccagc tcatcccccc ccctatcatt 1020
cccagggcgg gtctgagtcc atgggagggt ctgattcttc gggatttgca gagggctcat 1080
ttcgatccga tcctagatgc gagtcagaga atgagagcta ctcacagagc tgctctcaga 1140
gctcattcaa tgcaacgcca cctaagaagg ctagggagga ccctgctcct agtgactttc 1200
ctagcagcct tactgggtat ttgtctcatg ctatttattc taataaaacg ttcccggcat 1260
ttctag 1266
<210> 40
<211> 1104
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 40
aataaatgca ctgttgggcc tatgctcaag atgggtagtg ttaattggtg gtggaactta 60
tctgatttca tgacttgctg gctacctaaa acaggtgagg agaaagccaa tgctatgtct 120
gggactggat gagcaagtac aacaaacaaa atgggcttaa agtatgagtg agagttatct 180
gaccgtaagg atgcaagtga gggggcctaa ggtttggaga ttaatattta atctcagatg 240
ctatactttg gtggtgtagc aaaagtctac aaatgggatg actgtaaaac tcagtagatc 300
cgtgcttttt aacctatctc ccttcatcag gaaattgcga cacaaagatc tttagtaata 360
acacgcagtc tcaatgcata aaatcaggct taggtgttgc ctggactcat ttcccatctc 420
caccccacta taattatttt gtgacacaaa ctcaagactg tgggaatata gagaaattgg 480
gctcgtcctc gtacacctgc tcaatcccct gcaggacaac gcccaagaat caggttaagc 540
cagggcaaaa gaatcccgcc cataatcgag aaggagcaaa ctgacatgga ggcgatgacg 600
agatcgcggg ggagggaggg atttttctag gcccagggcg gtccttagga aaaggaggca 660
gcagagaact cccataaagg tattgcggca ctcccctccc cctgcggaga agggtgcggc 720
cttctctccg cctcctccac tgcagctccc tcaggattgc agctcgcgcg ggtttttgga 780
gaacatgcgc ctcccaccca caagccagca ggaccgaccc cccactcctt cctccacccc 840
ccacccccac gggtccgaga gcaggtagag ggctagtctc gtccttcagg cggcggacgc 900
ccagggcgga gccgcagtca ccaccaccca gaagcctcgg cccggcagcc cgcccccgcc 960
tcctgcgcgc gcttcctgcc acgttgcgca ggggcgaggg gccagacact gcggcgctgg 1020
cctcggggag ggccgtacca aagaccgcct ccctgccgac tcgcgtagtg gtttcgctca 1080
tttgggaccc aagccaataa caag 1104
<210> 41
<211> 1056
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 41
tgctctctct cctgccccct tcacctgcgt gccctcctca ttctccctct gtgccacctc 60
tggccttgca ctgtaggctc tctcttgggg atgtttctcc ttctccacac acttctcttt 120
cactctgtcc tcttgctttg tgtgggcctg cagcgttacc cttttttctg ggcacactca 180
gagcaccctc ctctttctgg ttctgggcca cctgtctgtc ctcgggtcat cttgctctct 240
ctgcctggat gccctcctgt ggctttgggc agcttctccc tccttcagag tgcaccgcca 300
gttctcctag gcccggtcac ttccccttcc caggggacct agagccctgc taggtcctct 360
ctctccacaa cctgggcccc caaacctttc caaaacacct tgctttctgc ctccattggt 420
cttgtgttcc agagccagag tcactatatg tcccagaacc aggattccct ctggttctga 480
gggcttttat cgcatcccct gcctggctgc agtgggtctt tggggacagg ccacagaaga 540
gcctctactc ctccctctgt ccccgaggct gtctccctcc cagtcttccc agctcaggcc 600
agtccccagg cctctcttcc ctgccagagc ccgtcaggtt cggttacttt ggggcccaga 660
gaggaccctg tgaaggaagc gtgggtaggg gcacgggaat ggggaggatg cctgaagagg 720
cccccttagc cagaagagga gcagaagagg agcaggtacc cagaagagga gcagttcagg 780
gaaatagaag agtcccgagc tctttttttt tttttttttt atttcttttc ttttcttttc 840
tttttatggc agcatccgtg gtatatggag gttcccagcc taggggtcag atcatacctg 900
caactgccag cctacaccac agccacagca ctcaggatcc gagctgcatc tgcggcttac 960
gccacaggtc acagcaacgc tggatcctta acccactgaa tgaggccagg gattgaacct 1020
gcaacctcat gcacactatg ctggggtctt aatcgg 1056
<210> 42
<211> 1108
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 42
acttcctcct gcccttaccc tttatctggc tcttagctcc taaaaactgc attattagct 60
tcctcttttg cctctactct tactcaacca aaattgtttt aagatctgtg gatctagctt 120
ctgctgtgct attcttagga acacttttat ttcctcttag ctccatctca ccagttattg 180
gctaatggct ttgcttggta cctacatctg tacatttctt tcgtactagc ttctagactg 240
aaaaaggact gttggttcaa catgaaaggg aaggaggtaa aagaggacac acaggaaaga 300
tggattggga ttcaggtctc tgctgttgtt acttgagatt gctttctaga ttctacttgt 360
ggaaacaaaa agcctttgcg agaattctaa actggagtat ttctgtaatt gaggagtctt 420
gctcagcaaa tcccacttag gggactaatg aagtaccagg aagagacaga ccatgctcaa 480
tccacaaagc caggttttac tgaaatgtga cctactttct tatgttcctg gaagtttaga 540
tcagggtggg cagctctggg ttttataggc tacactgtta acactcaggc tgttttctac 600
cgtttagtca aaatatagtc accttgcctg cttcacctgt ccatcagaga atggcctcat 660
taattgactc tctagtatga agtcaaagta gctttggtgg ccctaaatgg acaagtatca 720
agagactggg tgaattgagg agcttgagac tgtcacctca gatcgaaaag actgaaaaat 780
cacctcagat caaaaagact gaaaaatctt cagtctggaa aggggactca aaaccataat 840
tagagtattc tggtagaatc cttttctcca ctgttattca tacagttaag gtgaataact 900
aaaagtaatt gtgagctgag gagtaagata caacacacaa ggaatcagtt aacagagtct 960
cgagtgaaat tataaatgga aagaattatg acttgaatca taactctgag gccccatttt 1020
ccctaacaac ttttgtccca ataaacgtgg gtatttgttt gggagaaact atcatataca 1080
tgattaccca gtaaacagac tgtttact 1108
<210> 43
<211> 1089
<212> DNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 43
actttgtacc tattttgtat gtgtataata atttgagatg tttttaatta ttttgattgc 60
tggaataaag catgtggaaa tgacccaaac caatcttgca ctggcctcct gatttccttc 120
cttggagacg gagggagggg gagacctggg ggagggcgct tggggggggg tgggctctct 180
tctttctgcg ctcccccccc ccacctccaa caccttgacg acccctcctg cttccgcttg 240
cctttctcag gctttaacac tttctcctcg ccctctcagc atgcgcatgc gcgtgcctct 300
acctcccccg cacatcctgg cctgcccacc ctgaatgtcc tggcccagcg atgccaccaa 360
ctctctcgct ccgtccacgg ctggggaggg gggcactctg cagggttggg gggcactggg 420
aggctgggtt gggtgaggga ggggtgcctg ggcccccacc ccccagcaag ttctctccct 480
aggcgaactg gagggtcgtc tggcctcttg agccttgttg ctggctctga gctctaccaa 540
gagagtgacc agcaggaccg caccatcagt ggttgctgag actgcgtggg ggcccaagga 600
gacctggaga aaggaatgct tcctgctcct tcttctgggg ccccaggaga gccttcccag 660
ggccttggag aggtgctgtc cagggactaa ccctgtgctc taggaaggct gcaggccctg 720
accagctggg caggtcctgg gtccctcctg gccttctaag ttccccaaac atgagacctc 780
tgggtgtggg gtggcctggg gaggtcattt tgcccaggcc ctacctcctg cccattccta 840
acccttttta aaaatctgtg cgtcctcttc ttccttcttc tccctccctt cccttttcgc 900
tcaccctctg ctgctggcct gagagccgga ggcccccagg gggaaggcga ctggtctcct 960
ccccagtctc agggaaggga gacagagaat ccaggaagcc agaactcagc agacgaagca 1020
cccagggacc tagagatggg ttgaaaagtt gacagctgtc ccacctgcct cccaaggtct 1080
cagggccta 1089
Claims (10)
1. A method for constructing a pig cell expressing PyMT, characterized in that a nucleotide sequence encoding PyMT is inserted into a pig safe harbor site to obtain a pig cell expressing the polypeptide shown in SEQ ID NO:14, wherein the nucleotide sequence encoding PyMT is shown in SEQ ID NO:39, expression of the nucleotide sequence encoding PyMT in the pig cell is controlled by an exogenous promoter, the exogenous promoter is MMTV-LTR, and the pig safe harbor site is selected from the pig ROSA26, AAVS1, H11 or COL1A1 safe harbor site;
the construction method comprises co-transfecting a safe harbor site vector, an sgRNA vector and a Cas vector into pig cells, wherein the Cas vector comprises nucleotide sequences encoding a Cas9 protein, EGFP and a Puro resistance protein;
the nucleotide sequence encoding PyMT is inserted into the pig safe harbor site using the safe harbor site vector, the safe harbor site vector comprising the nucleotide sequence encoding PyMT and a safe harbor site vector backbone, the backbone comprising a 5' homology arm and a 3' homology arm of the safe harbor insertion site, the nucleotide sequence encoding PyMT being located between the 5' homology arm and the 3' homology arm, and the safe harbor site vector backbone being selected from any one of the following:
a) a ROSA26 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO:5 and whose 3' homology arm is shown in SEQ ID NO:6;
b) an AAVS1 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO:7 and whose 3' homology arm is shown in SEQ ID NO:8;
c) an H11 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO:9 and whose 3' homology arm is shown in SEQ ID NO:10;
or d) a COL1A1 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO:11 and whose 3' homology arm is shown in SEQ ID NO:12;
the sgRNA vector comprises an sgRNA targeting the ROSA26, AAVS1, H11 or COL1A1 safe harbor site, wherein:
the nucleotide sequence of the sgRNA targeting ROSA26 is shown in SEQ ID NO:21,
the nucleotide sequence of the sgRNA targeting AAVS1 is shown in SEQ ID NO:22,
the nucleotide sequence of the sgRNA targeting H11 is shown in SEQ ID NO:23,
the nucleotide sequence of the sgRNA targeting COL1A1 is shown in SEQ ID NO:24.
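Claim 1 pairs the PyMT coding sequence (SEQ ID NO:39, 1266 nt) with the encoded polypeptide (SEQ ID NO:14, 421 residues). That pairing can be spot-checked with a short script. The sketch below is an illustration only and is not part of the claimed method. It assumes the standard genetic code. The names `translate`, `SEQ39_START`, `SEQ39_END` and `SEQ14_START` are hypothetical. Only the first 60 nt and last 6 nt of SEQ ID NO:39 and the first 20 residues of SEQ ID NO:14 are copied from the sequence listing.

```python
from itertools import product

# Standard genetic code with codons enumerated in t, c, a, g order
# (assumption: no alternative codon table applies to this construct).
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon; spaces are ignored."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 60 nt and last 6 nt of SEQ ID NO:39, as printed in the listing.
SEQ39_START = "atggatagag ttctgagcag agctgacaaa gaaaggctgc tagaacttct aaaacttccc"
SEQ39_END = "ttctag"
# First 20 residues of SEQ ID NO:14 in one-letter code.
SEQ14_START = "MDRVLSRADKERLLELLKLP"

print(translate(SEQ39_START) == SEQ14_START)  # N-terminus agrees
print(translate(SEQ39_END))                   # final Phe codon plus stop codon
print(3 * (421 + 1) == 1266)                  # 421 codons plus one stop codon
```

The length check reflects that 421 sense codons plus one stop codon account for all 1266 nt of SEQ ID NO:39.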
2. The construction method according to claim 1, wherein the nucleotide sequence of the ROSA26 safe harbor site region together with 500 bp upstream and 500 bp downstream thereof is shown in SEQ ID NO:40; the nucleotide sequence of the AAVS1 safe harbor site region together with 500 bp upstream and 500 bp downstream thereof is shown in SEQ ID NO:41; the nucleotide sequence of the H11 safe harbor site region together with 500 bp upstream and 500 bp downstream thereof is shown in SEQ ID NO:42; and the nucleotide sequence of the COL1A1 safe harbor site region together with 500 bp upstream and 500 bp downstream thereof is shown in SEQ ID NO:43.
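The sgRNA sequences of claim 1 and the site regions of claim 2 can be cross-checked by locating each guide in its region and reading the three bases that follow it. The sketch below is a minimal illustration under stated assumptions, not the applicant's procedure. It assumes an NGG protospacer-adjacent motif, as used by Streptococcus pyogenes Cas9; the claims only say "Cas9". The helper names `revcomp`, `clean` and `find_guide` are hypothetical. The two 60-nt fragments are copied from the sequence listing: positions 541-600 of SEQ ID NO:40 and positions 481-540 of SEQ ID NO:42.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def revcomp(seq: str) -> str:
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def clean(seq: str) -> str:
    """Remove the listing's 10-nt block spacing and normalise case."""
    return seq.replace(" ", "").lower()


def find_guide(region: str, guide: str):
    """Return (strand, 0-based start on the printed strand, 3-nt PAM) or None."""
    region, guide = clean(region), clean(guide)
    pos = region.find(guide)
    if pos != -1:
        end = pos + len(guide)
        return "+", pos, region[end:end + 3]
    pos = region.find(revcomp(guide))
    if pos != -1:
        # On the reverse strand the PAM lies 5' of the match on the printed strand.
        return "-", pos, revcomp(region[max(pos - 3, 0):pos])
    return None


# SEQ ID NO:40, positions 541-600 (ROSA26 region) and SEQ ID NO:21 (guide).
ROSA26_FRAGMENT = "cagggcaaaa gaatcccgcc cataatcgag aaggagcaaa ctgacatgga ggcgatgacg"
SGRNA_ROSA26 = "gaaggagcaa actgacatgg"
# SEQ ID NO:42, positions 481-540 (H11 region) and SEQ ID NO:23 (guide).
H11_FRAGMENT = "tccacaaagc caggttttac tgaaatgtga cctactttct tatgttcctg gaagtttaga"
SGRNA_H11 = "ttccaggaac ataagaaagt"

for name, region, guide in [("ROSA26", ROSA26_FRAGMENT, SGRNA_ROSA26),
                            ("H11", H11_FRAGMENT, SGRNA_H11)]:
    strand, start, pam = find_guide(region, guide)
    print(name, strand, start, pam, "NGG" if pam[1:] == "gg" else "no NGG")
```

In these fragments the ROSA26 guide matches the printed strand and the H11 guide matches the reverse strand. Both are followed by an `agg` motif, which fits the assumed NGG requirement.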
3. The construction method according to claim 1, wherein the nucleotide sequence of the MMTV-LTR is shown in SEQ ID NO:15.
4. The construction method according to claim 1, wherein the pig cells are pig fibroblasts or mammary cells.
5. The construction method according to claim 1, wherein the Cas vector further comprises an EF1a promoter, a WPRE element and a 3' LTR sequence element.
6. The construction method according to claim 1, wherein the Cas vector comprises, in order from 5' to 3': a CMV enhancer, an EF1a promoter, a nuclear localization signal, a nucleotide sequence encoding the Cas protein, a nuclear localization signal, a nucleotide sequence encoding the self-cleaving polypeptide P2A, a nucleotide sequence encoding EGFP, a nucleotide sequence encoding the self-cleaving polypeptide T2A, a nucleotide sequence encoding the Puro resistance protein, a WPRE sequence element, a 3' LTR sequence element and a polyA signal sequence element.
7. The construction method according to claim 1, wherein the nucleotide sequence of the Cas vector is shown in SEQ ID NO:2.
8. A method for constructing a breast cancer model pig, comprising transferring a pig cell obtained by the construction method according to any one of claims 1 to 7 into an enucleated pig oocyte to obtain the breast cancer model pig.
9. Use of a pig cell obtained by the construction method according to any one of claims 1 to 7 in preparing an animal model of breast cancer, in screening drugs for treating breast cancer, or in studying the pathogenesis of breast cancer.
10. Use of a breast cancer model pig obtained by the construction method according to claim 8 in screening drugs for treating breast cancer or in studying the pathogenesis of breast cancer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110187956.7A CN114958758B (en) | 2021-02-18 | 2021-02-18 | Construction method and application of breast cancer model pig |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114958758A (en) | 2022-08-30 |
CN114958758B (en) | 2024-04-23 |
Family
ID=82954269
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110187956.7A Active CN114958758B (en) | 2021-02-18 | 2021-02-18 | Construction method and application of breast cancer model pig |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114958758B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1981196A (en) * | 2004-05-19 | 2007-06-13 | 哥本哈根大学 | ADAM12, a novel marker for abnormal cell function |
CN102137939A (en) * | 2008-08-29 | 2011-07-27 | 霍夫曼-拉罗奇有限公司 | Diagnosis and treatment of tumors independent of VEGF |
CN104087615A (en) * | 2014-07-03 | 2014-10-08 | 上海交通大学医学院附属第九人民医院 | Strain construction method of hemangioma animal model |
CN105112449A (en) * | 2015-09-02 | 2015-12-02 | 中国农业大学 | CD28 gene overexpression vector and application thereof |
CN110283847A (en) * | 2019-06-04 | 2019-09-27 | 西北农林科技大学 | A kind of while site-directed integration FAD3 and FABP4 gene carrier and recombinant cell |
CN110358792A (en) * | 2019-07-19 | 2019-10-22 | 华中农业大学 | Fixed point integration of foreign gene is to the targeting vector construction method of ACTB downstream of gene and its application |
CN110651046A (en) * | 2017-02-22 | 2020-01-03 | 艾欧生物科学公司 | Nucleic acid constructs comprising gene editing multiple sites and uses thereof |
CN110951784A (en) * | 2019-12-29 | 2020-04-03 | 华中农业大学 | Unmarked pig β -defensin 2 gene site-directed knock-in plasmid vector and application thereof |
KR20200110557A (en) * | 2019-03-15 | 2020-09-24 | 국립암센터 | Method of manufacturing breast cancer animal model and uses thereof |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9901081B2 (en) * | 2012-08-23 | 2018-02-27 | Buck Institute For Research On Aging | Transgenic mouse for determining the role of senescent cells in cancer |
Non-Patent Citations (3)
Title |
---|
Induction of Mammary Tumors by Expression of Polyomavirus Middle T Oncogene: A Transgenic Mouse Model for Metastatic Disease; Chantale T. Guy et al.; Molecular and Cellular Biology; Vol. 12, No. 3; pp. 954-961 * |
Insights from transgenic mouse models of PyMT-induced breast cancer: recapitulating human breast cancer progression in vivo; Sherif Attalla et al.; Oncogene; pp. 475-491 * |
Screening and application of transgene-friendly integration sites in pigs; Ma Linyuan; China Doctoral Dissertations Full-text Database, Agricultural Science and Technology series; No. 5; pp. 6-24, Sections 1.2-1.4, Table 1-4 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||