HK1139737A1 - Genetic analysis systems and methods - Google Patents
- Publication number
- HK1139737A1 (application HK10106416.1A)
- Authority
- HK
- Hong Kong
- Prior art keywords
- individual
- phenotype
- profile
- genotype
- genomic
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/30—Data warehousing; Computing architectures
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6827—Hybridisation assays for detection of mutation or polymorphism
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6834—Enzymatic or biochemical coupling of nucleic acids to a solid phase
- C12Q1/6837—Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/10—Ploidy or copy number detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/40—Population genetics; Linkage disequilibrium
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/172—Haplotypes
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Genetics & Genomics (AREA)
- Analytical Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Organic Chemistry (AREA)
- Molecular Biology (AREA)
- Theoretical Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Evolutionary Biology (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Immunology (AREA)
- Pathology (AREA)
- Biochemistry (AREA)
- Microbiology (AREA)
- General Engineering & Computer Science (AREA)
- Bioethics (AREA)
- Databases & Information Systems (AREA)
- Physiology (AREA)
- Ecology (AREA)
- Hospice & Palliative Care (AREA)
- Oncology (AREA)
- Hematology (AREA)
- General Physics & Mathematics (AREA)
- Medicinal Chemistry (AREA)
- Food Science & Technology (AREA)
- Urology & Nephrology (AREA)
- Biomedical Technology (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides methods of determining a Genetic Composite Index score by assessing the association between an individual's genotype and at least one disease or condition. The assessment comprises comparing an individual's genomic profile with a database of medically relevant genetic variations that have been established to associate with at least one disease or condition.
Description
Background
Human genome sequencing and other recent advances in human genomics have revealed that the genomes of any two people are more than 99.9% similar. The relatively small amount of variation in DNA between individuals is responsible for differences in phenotypic traits and is associated with many human diseases, susceptibility to various diseases, and response to disease treatment. Inter-individual variation in DNA occurs in both coding and non-coding regions and includes base changes at specific sites in the genomic DNA sequence, as well as insertions and deletions of DNA. Changes that occur at a single base position in the genome are referred to as single nucleotide polymorphisms, or "SNPs".
Although SNPs are relatively rare in the human genome, they account for a large portion of inter-individual DNA sequence variation, with one SNP occurring approximately every 1,200 base pairs in the human genome (see International HapMap Project, www.hapmap.org). As more human genetic information becomes available, the complexity of SNPs is beginning to be understood. In turn, the occurrence of SNPs in the genome is being correlated with the presence of, and/or susceptibility to, a variety of diseases and conditions.
As these correlations are established and other advances in human genetics are made, medical and personal care is moving toward a personalized approach in which a patient makes appropriate medical and other choices by taking his or her genomic information into account, among other factors. Thus, there is a need to provide individuals and their health care providers with information specific to the individual's personal genome, thereby enabling personalized medical and other decisions.
Disclosure of Invention
The present invention provides a method of assessing genotype correlations of an individual, the method comprising: a) obtaining a genetic sample of the individual; b) generating a genomic profile of the individual; c) determining the individual's genotype correlations with phenotypes by comparing the genomic profile of the individual to a current database of human genotype correlations with phenotypes; d) reporting the results from step c) to the individual or to a health care manager of the individual; e) updating the database of human genotype correlations with additional human genotype correlations as they become known; f) updating the genotype correlations of the individual by comparing the genomic profile of the individual from step c), or a portion thereof, with the additional human genotype correlations and determining additional genotype correlations for the individual; and g) reporting the results from step f) to the individual or to a health care manager of the individual.
The present invention further provides a business method of assessing genotype correlations of an individual, the method comprising: a) obtaining a genetic sample of the individual; b) generating a genomic profile of the individual; c) determining the genotype correlations of the individual by comparing the genomic profile of the individual to a database of human genotype correlations; d) providing the individual, in an encrypted manner, with the results of determining the genotype correlations of the individual; e) updating the database of human genotype correlations with additional human genotype correlations as they become known; f) updating the genotype correlations of the individual by comparing the genomic profile of the individual, or a portion thereof, with the additional human genotype correlations and determining additional genotype correlations for the individual; and g) providing the individual or a health care manager of the individual with the results of updating the genotype correlations of the individual.
Another aspect of the invention is a method of generating a phenotype profile of an individual, the method comprising: a) providing a rule set comprising rules, each rule indicating a correlation between at least one genotype and at least one phenotype; b) providing a data set comprising a genomic profile of each of a plurality of individuals, wherein each genomic profile comprises a plurality of genotypes; c) periodically updating the rule set with at least one new rule, wherein the at least one new rule indicates a correlation between a genotype and a phenotype not previously associated with each other in the rule set; d) applying each new rule to the genomic profile of at least one of the individuals, thereby correlating at least one genotype with at least one phenotype for that individual; and, optionally, e) generating a report comprising the phenotype profile of the individual.
The present invention also provides a system comprising: a) a rule set comprising rules, each rule indicating a correlation between at least one genotype and at least one phenotype; b) code for periodically updating the rule set with at least one new rule, wherein the at least one new rule indicates a correlation between a genotype and a phenotype not previously associated with each other in the rule set; c) a database comprising genomic profiles of a plurality of individuals; d) code for applying the rule set to the genomic profile of an individual to determine a phenotype profile of that individual; and e) code for generating a report for each individual.
Another aspect of the present invention is transmission over a network, in an encrypted or unencrypted manner, within the methods and systems described above.
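The rule set, profile database, update step, and report step listed above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation the specification describes: the class names (`Rule`, `PhenotypeService`), the marker and phenotype labels, and the simplification that a rule matches a single marker with an exact genotype call are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """One genotype-phenotype correlation (hypothetical structure)."""
    marker: str     # e.g. a SNP identifier
    genotype: str   # genotype call that triggers the rule
    phenotype: str  # phenotype correlated with that genotype


@dataclass
class PhenotypeService:
    rules: list = field(default_factory=list)
    profiles: dict = field(default_factory=dict)    # individual id -> {marker: genotype}
    phenotypes: dict = field(default_factory=dict)  # individual id -> set of phenotypes

    def add_profile(self, person, genomic_profile):
        """Store a genomic profile and apply every existing rule to it."""
        self.profiles[person] = dict(genomic_profile)
        self.phenotypes[person] = set()
        self._apply(person, self.rules)

    def update_rules(self, new_rules):
        """Add rules not yet in the rule set and apply only those to stored profiles."""
        fresh = [r for r in new_rules if r not in self.rules]
        self.rules.extend(fresh)
        for person in self.profiles:
            self._apply(person, fresh)

    def _apply(self, person, rules):
        profile = self.profiles[person]
        for rule in rules:
            if profile.get(rule.marker) == rule.genotype:
                self.phenotypes[person].add(rule.phenotype)

    def report(self, person):
        """Return the individual's phenotype profile as a sorted list."""
        return sorted(self.phenotypes[person])


# Made-up identifiers, for illustration only.
service = PhenotypeService()
service.add_profile("individual-1", {"snp_A": "AG", "snp_B": "CC"})
service.update_rules([Rule("snp_A", "AG", "phenotype X"),
                      Rule("snp_B", "TT", "phenotype Y")])
report = service.report("individual-1")
```

Because the genomic profile is stored once, a rule update only has to evaluate the new rules against it; the sample itself is never re-genotyped.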
Incorporation by Reference
All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Drawings
FIG. 1 is a flow diagram illustrating a method aspect of the invention.
FIG. 2 is an example of a genomic DNA quality control measure.
FIG. 3 is an example of a hybridization quality control measure.
FIG. 4 is a table of representative genotype correlations from publications, with the SNPs tested and effect estimates. A-I) show single-locus genotype correlations; J) shows a two-locus genotype correlation; K) shows a three-locus genotype correlation; L) is an index of the ethnicity and country abbreviations used in A-K; M) is an index of the Short Phenotype Name abbreviations used in A-K, with heritability and the references for the heritability.
FIGS. 5A-J are tables of representative genotype correlations with effect estimates.
FIGS. 6A-F are tables of representative genotype correlations and estimated relative risks.
FIG. 7 is an example report.
FIG. 8 is an illustration of a system for analyzing and transmitting genomic and phenotypic profiles over a network.
FIG. 9 is a flow chart illustrating a business method aspect of the invention.
FIG. 10: effect of the prevalence estimate on relative risk estimates. Hardy-Weinberg equilibrium is assumed, and each curve corresponds to a different value of the allele frequency in the population. The two black lines correspond to odds ratios of 9 and 6, the two red lines to odds ratios of 6 and 4, and the two blue lines to odds ratios of 3 and 2.
FIG. 11: effect of the allele frequency estimate on relative risk estimates. Each curve corresponds to a different value of the prevalence in the population. The two black lines correspond to odds ratios of 9 and 6, the two red lines to odds ratios of 6 and 4, and the two blue lines to odds ratios of 3 and 2.
FIG. 12: pairwise comparison of the absolute values produced by the different models.
FIG. 13: pairwise comparison of rank values (GCI scores) based on the different models. Spearman correlations between the different pairs are given in Table 2.
FIG. 14: effect of the reported prevalence on GCI scores. The Spearman correlation between any two prevalence values is at least 0.99.
FIG. 15: diagram of an example web page from a personal portal.
FIG. 16: diagram of an example web page from a personal portal illustrating a person's risk of prostate cancer.
FIG. 17: diagram of an example web page from a personal portal illustrating a person's risk of Crohn's disease.
FIG. 18: histogram of GCI scores for multiple sclerosis, based on HapMap, using 2 SNPs.
FIG. 19: lifetime risk of multiple sclerosis for an individual, calculated using GCI Plus.
FIG. 20: histogram of GCI scores for Crohn's disease.
FIG. 21: table of multi-locus correlations.
FIG. 22: table of SNPs and phenotypic associations.
FIG. 23: table of phenotypes and prevalence.
FIG. 24: glossary of the abbreviations used in FIGS. 21, 22 and 25.
FIG. 25: table of SNPs and phenotypic associations.
Detailed Description
The present invention provides methods and systems for generating phenotype profiles based on stored genomic profiles of individuals or groups of individuals, and for conveniently generating original and updated phenotype profiles based on the stored genomic profiles. The genomic profile is generated by genotyping a biological sample obtained from the individual. The biological sample may be any sample from which a genetic sample can be derived, such as a buccal swab, saliva, blood, hair, or any other type of tissue sample. Genotypes are then determined from the biological sample. A genotype may be any genetic variant or biomarker, for example, a single nucleotide polymorphism (SNP), a haplotype, or a sequence of the genome. The genotype may be the entire genomic sequence of the individual. Genotypes can be derived from high-throughput analysis that generates thousands or millions of data points, e.g., microarray analysis for most or all known SNPs. In other embodiments, genotypes may also be determined by high-throughput sequencing.
The genotypes form the genomic profile of the individual. Genomic profiles are stored digitally and can be readily accessed at any point in time to generate phenotype profiles. A phenotype profile is generated by applying rules that correlate or associate a genotype with a phenotype. Rules may be formulated based on scientific studies that demonstrate a correlation between a genotype and a phenotype. The correlation may be validated or confirmed by a committee of one or more experts. By applying the rules to the genomic profile of an individual, the association between the individual's genotype and a phenotype can be determined. The individual's phenotype profile will contain this determination. The determination may be a positive correlation between the genotype of the individual and a given phenotype, such that the individual has the given phenotype or will develop it. Alternatively, it may be determined that the individual does not have, or will not develop, a given phenotype. In other embodiments, the determination may be a risk factor, an estimate, or a probability that the individual has or will develop a phenotype.
The determination may be based on multiple rules; for example, several rules may be applied to a genomic profile to determine the association of an individual's genotype with a particular phenotype. The determination may also take into account individual-specific factors such as race, gender, lifestyle (e.g., diet and exercise habits), age, environment (e.g., place of residence), family medical history, personal medical history, and other known phenotypes. These factors may be incorporated by modifying existing rules to include them. Alternatively, separate rules may be generated from these factors and applied to the individual's phenotype determination after the existing rules have been applied.
A phenotype may be any measurable trait or characteristic, such as susceptibility to a disease or response to a drug treatment. Other phenotypes that may be included are physical and mental traits such as height, weight, hair color, eye color, sunburn sensitivity, size, memory, intelligence, optimism, and overall temperament. Phenotypes may also include genetic comparisons with other individuals or organisms. For example, individuals may be interested in the similarity between their genomic profile and that of a celebrity. They may also compare their genomic profile to those of other organisms (e.g., bacteria, plants, or other animals).
Together, the collection of correlated phenotypes determined for an individual constitutes the phenotype profile for that individual. The phenotype profile may be accessed through an online portal, which may optionally be an encrypted online portal. Alternatively, the phenotype profile may be provided in paper form as it exists at a particular time, with subsequent updates also provided in paper form. Access to the phenotype profile may be provided to registered users, that is, individuals who subscribe to a service that generates rules for correlations between phenotypes and genotypes, determines the genomic profile of an individual, applies the rules to the genomic profile, and generates the phenotype profile of the individual. Access may also be provided to non-registered users, who may have limited access to their phenotype profiles and/or reports, or may be allowed to generate an initial report or phenotype profile but to generate updated reports only through a paid subscription. Health care managers and providers, such as caregivers, physicians, and genetic counselors, may also have access to the phenotype profile.
In another aspect of the invention, genomic profiles may be generated for both registered and non-registered users and stored digitally, but access to phenotype profiles and reports may be limited to registered users. In another variation, both registered and non-registered users may have access to their genotype and phenotype profiles, but non-registered users have restricted access or may generate only limited reports, whereas registered users have full access and may generate full reports. In another embodiment, both registered and non-registered users may initially have full access or receive a full initial report, but only registered users may access reports updated based on their stored genomic profile.
In another aspect of the invention, information on the association of multiple genetic markers with one or more diseases or conditions is combined and analyzed to produce a Genetic Composite Index (GCI) score. The score incorporates known risk factors as well as other information and assumptions, such as allele frequencies and the prevalence of the disease. The GCI can be used to quantitatively assess the association of a disease or condition with the combined effect of a set of genetic markers. The GCI score can give people without training in genetics a reliable (i.e., robust), understandable, and/or intuitive sense of their individual risk of disease relative to a relevant population, based on current scientific research. The GCI score may be used to generate a GCI Plus score. The GCI Plus score may contain all of the GCI assumptions and, in addition, the risk of the condition (e.g., lifetime risk), age-defined prevalence, and/or age-defined incidence. The lifetime risk of an individual can then be calculated as a GCI Plus score, which is proportional to the individual's GCI score divided by the average GCI score. The average GCI score may be determined from a group of individuals of similar ancestral background, such as a group of Caucasians, Asians, East Indians, or another group with a common ancestral background. The group may consist of at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55 or 60 individuals. In certain embodiments, the average GCI score may be determined from at least 75, 80, 95, or 100 individuals. The GCI Plus score can be determined by determining the GCI score of the individual, dividing the GCI score by the average relative risk, and multiplying by the lifetime risk of the condition or phenotype. For example, a GCI Plus score may be calculated using the data in FIG. 22 and/or FIG. 25 and the information in FIG. 24, as shown in FIG. 19.
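The GCI Plus arithmetic described above can be written out as a short Python sketch. It assumes that the proportionality constant is the population lifetime risk and that the average is taken over the GCI scores of a reference group, which is one reading of the paragraph; the function name `gci_plus`, the minimum group size parameter, and all numeric values are hypothetical and are not taken from the figures.

```python
from statistics import mean


def gci_plus(individual_gci, reference_gci_scores, lifetime_risk, min_group_size=5):
    """Scale an individual's GCI score to a lifetime-risk estimate.

    individual_gci: GCI score of the individual.
    reference_gci_scores: GCI scores of a reference group with a similar
        ancestral background.
    lifetime_risk: population lifetime risk of the condition, between 0 and 1.
    min_group_size: smallest reference group accepted (assumed default of 5).
    """
    if len(reference_gci_scores) < min_group_size:
        raise ValueError("reference group is too small")
    average_gci = mean(reference_gci_scores)
    # GCI Plus = lifetime risk * (individual GCI / average GCI)
    return lifetime_risk * individual_gci / average_gci


# Made-up numbers: an individual with 1.5 times the average score,
# for a condition with a 2% lifetime risk in the population.
reference = [0.8, 1.0, 1.2, 1.0, 1.0]
score = gci_plus(1.5, reference, 0.02)
```

With these made-up inputs the average reference score is 1.0, so the estimate comes out at roughly 0.03, that is, a lifetime risk of about 3% against a population risk of 2%.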
The present invention encompasses the use of the GCI scores described herein, and one of skill in the art will readily recognize the use of GCI Plus scores or variants thereof in place of the GCI scores described herein.
In one embodiment, a GCI score is generated for each disease or condition of interest. These GCI scores can be collected to form a risk profile for the individual. The GCI scores may be stored digitally so that they can be readily accessed at any point in time to generate a risk profile. The risk profile may be broken down by broad disease class, such as cancer, heart disease, metabolic disorder, mental disorder, bone disease, or age-onset disorder. Broad disease classes can be further broken down into subclasses. For example, for a broad class such as cancer, subclasses may be listed by type (sarcoma, carcinoma, leukemia, etc.) or by tissue specificity (nerve, breast, ovary, testis, prostate, bone, lymph node, pancreas, esophagus, stomach, liver, brain, lung, kidney, etc.).
In another embodiment, a GCI score is generated for the individual that provides readily understandable information about the individual's risk of acquiring, or susceptibility to, at least one disease or condition. In one embodiment, multiple GCI scores are generated for different diseases or conditions. In another embodiment, at least one GCI score may be accessed through an online portal. Alternatively, the at least one GCI score may be provided in paper form, with subsequent updates also provided in paper form. In one embodiment, access to at least one GCI score is provided to registered users, who are individuals subscribed to the service. In an alternative embodiment, access is provided to non-registered users, who may have limited access to at least one of their GCI scores, or may be allowed to generate an initial report of at least one of their GCI scores but to generate updated reports only through a paid subscription. In another embodiment, health care managers and providers, such as caregivers, physicians, and genetic counselors, may also have access to at least one of the individual's GCI scores.
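The breakdown of a risk profile into broad classes and subclasses can be illustrated with a nested mapping. The following Python sketch is an assumption-laden illustration: the `CATEGORIES` table, the "uncategorized" and "general" fallback labels, and the example scores are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup from condition to (broad class, subclass).
CATEGORIES = {
    "prostate cancer": ("cancer", "prostate"),
    "breast cancer": ("cancer", "breast"),
    "type 2 diabetes": ("metabolic disorder", None),
}


def build_risk_profile(gci_scores):
    """Group per-condition GCI scores as class -> subclass -> {condition: score}."""
    profile = defaultdict(lambda: defaultdict(dict))
    for condition, score in gci_scores.items():
        category, subclass = CATEGORIES.get(condition, ("uncategorized", None))
        profile[category][subclass or "general"][condition] = score
    # Convert to plain dicts so the result is easy to serialize or print.
    return {category: dict(subclasses) for category, subclasses in profile.items()}


# Made-up GCI scores for illustration.
risk_profile = build_risk_profile({
    "prostate cancer": 1.4,
    "breast cancer": 0.7,
    "type 2 diabetes": 0.9,
})
```

Keeping the scores keyed by condition inside each subclass lets a report list a whole class (for example, all cancers) or drill down to one tissue without recomputing anything.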
There may also be a basic registration mode. The base registry may provide a phenotype profile in which registered users may choose to apply all existing rules to their genomic profile, or to apply a subset of existing rules to their genomic profile. For example, they may choose to apply only rules for treatable (actionable) disease phenotypes. The base registrations may have different levels within the registration hierarchy. For example, the different levels may depend on the number of phenotypes registered users want to associate with their genomic profile, or on the number of people who can access their phenotypic profile. Another level of basic enrollment may incorporate individual-specific factors such as a phenotype that is already known (e.g., age, gender, or medical history) into their phenotype profile. Yet another level of basic enrollment may allow an individual to generate at least one GCI score for a disease or condition. A variant of this level may further allow an individual to specify that an automatic update of the at least one GCI score for a disease or condition be generated if any change in the at least one GCI score is due to a change in the analysis used to generate the at least one GCI score. In some implementations, the individual can be notified of the automatic update by email, voice message, text message, postal delivery, or facsimile.
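The automatic-update variant described above amounts to comparing the stored GCI scores with the scores produced by the revised analysis and notifying the individual only where something changed. A minimal Python sketch follows, under several assumptions: the function names, the numeric tolerance, and the use of a plain callable as the delivery channel (standing in for email, text message, and so on) are all hypothetical.

```python
def changed_scores(old_scores, new_scores, tolerance=1e-9):
    """Return {condition: (old, new)} for scores that are new or have changed."""
    changed = {}
    for condition, new in new_scores.items():
        old = old_scores.get(condition)
        if old is None or abs(new - old) > tolerance:
            changed[condition] = (old, new)
    return changed


def notify(changed, send):
    """Send one message per changed score through the supplied channel."""
    for condition, (old, new) in sorted(changed.items()):
        send(f"GCI score for {condition} updated: {old} -> {new}")


# Made-up scores; the outbox list stands in for a real delivery channel.
stored = {"condition A": 1.10, "condition B": 0.95}
recomputed = {"condition A": 1.25, "condition B": 0.95}
outbox = []
notify(changed_scores(stored, recomputed), outbox.append)
```

Only condition A produces a message here, because the recomputed score for condition B is unchanged.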
Registered users may also generate reports with their phenotype profiles and information about the phenotypes (e.g., genetic and medical information about the phenotypes). For example, the prevalence of a phenotype in a population, genetic variants for association, molecular mechanisms that cause a phenotype, methods of treatment for a phenotype, treatment options for a phenotype, and prophylactic actions may be included in the report. In other embodiments, the report may also include information such as the similarity between the genotype of the individual and the genotypes of other individuals (e.g., celebrities or other known individuals). Information about similarity can be, but is not limited to, percent homology, number of identical variations, and possibly similar phenotypes. The reports may further include at least one GCI score.
If the report is accessed online, the report may also provide a link to other locations with further information about the phenotype, a link to an online support team and message board of people with the same phenotype or one or more similar phenotypes, a link to contact an online genetic advisor or physician, or a link to schedule a telephone call or live appointment with a genetic advisor or physician. If the report is in paper form, the information may be the location of the linked site or the telephone number and address of the genetic counselor or doctor. Registered users can also select which phenotypes to include in their phenotype profile and which information to include in their reports. The profile and report may also be made available to the individual's health care manager or provider, such as a caregiver, doctor, psychiatrist, psychologist, therapist, or genetic counselor. The registered user can also choose whether the form and report, or portions thereof, are available to the individual's healthcare manager or provider.
The present invention may also include a premium level of registration. At the premium level, the registered user's genomic profile is maintained digitally after the initial phenotype profile and report are generated, and registered users can generate phenotype profiles and reports with updated correlations from recent studies. In another embodiment, the registered users can generate risk profiles and reports using updated correlations from recent studies. As studies reveal new correlations between genotypes and phenotypes, diseases or conditions, new rules will be generated based on these new correlations and can be applied to genomic profiles that have been stored and maintained. The new rules may associate genotypes that have not been previously associated with any phenotype, associate genotypes with new phenotypes, correct existing correlations, or provide a basis for adjusting GCI scores based on newly discovered associations between genotypes and diseases or conditions. Registered users may be notified of the new correlations via email or other electronic means, and if a phenotype is of interest, they may choose to update their phenotype profile with the new correlations. Registered users may select a registration mode in which they pay for each update, for multiple updates within a specified time period (e.g., 3 months, 6 months, or 1 year), or for unlimited updates. Another level of registration may be one in which, rather than the individual selecting when to update their phenotype profile or risk profile, the profile is updated automatically whenever new rules are generated based on new correlations.
In another aspect of registration, a registered user may introduce the following services to a non-registered user: generating rules of correlation between phenotype and genotype, determining a genomic profile of the individual, applying the rules to the genomic profile, and generating a phenotypic profile of the individual. The registered user may be prompted by an introduction to a preferred service subscription price or to upgrade his existing registration. Individuals introduced may have free access or may enjoy discounted registration fees for a limited period of time.
Phenotype profiles and reports and risk profiles and reports may be generated for both human and non-human individuals. For example, the subject may include other mammals, such as cattle, horses, sheep, dogs, or cats. As used herein, a registered user is a human individual who subscribes to a service by purchasing or paying for one or more services. Services may include, but are not limited to, one or more of the following: determining a genomic profile of themselves or another individual (e.g., a registered user's child or pet); obtaining a phenotype spectrum; updating phenotypic profiles and obtaining reports based on their genomic and phenotypic profiles.
In another aspect of the invention, a "field-deployed" mechanism can be derived from an aggregation of individuals to generate a phenotypic profile of the individuals. In a preferred embodiment, the individual may have an initial phenotype profile generated based on genetic information. For example, an initial phenotype profile is generated that includes risk factors for different phenotypes and suggested therapeutic or prophylactic measures. For example, the phenotype profile may include information about available medications for a condition and/or recommendations for dietary changes or exercise regimens. Individuals may choose to see or contact a doctor or genetic counselor through a web portal or telephone to discuss their phenotype profile. The individual may decide to take a certain course of action, e.g. take a specific medication, change their diet, etc.
The individual may then subsequently submit a biological sample to assess changes in their physical state and possible changes in risk factors. The individual may determine the change by submitting the biological sample directly to the entity that generates the genomic profile and the phenotype profile (or to a related entity, such as an entity contracted by the entity that generates the genomic profile and the phenotype profile). Alternatively, the individual may utilize a "field-deployed" mechanism, wherein the individual may submit their saliva, blood, or other biological sample to a detection device at their home, where it is analyzed by a third party and the data are transmitted for inclusion in another phenotype profile. For example, an individual may receive an initial phenotype report based on their genetic data, reporting that the individual has an increased lifetime risk of myocardial infarction (MI). The report may also have recommendations for preventive measures to reduce the risk of MI, such as cholesterol lowering drugs and dietary changes. The individual may choose to contact a genetic counselor or physician to discuss the report and preventive measures, and decide to change their diet. After following the new diet for a period of time, the individual may visit their personal physician to measure their cholesterol level. The new information (the cholesterol level) may be transmitted (e.g., via the Internet) to the entity holding the genomic information and used to generate a new phenotype profile for the individual, as well as new risk factors for myocardial infarction and/or other conditions.
Individuals may also use "field-deployed" mechanisms or direct mechanisms to determine their individual response to a particular drug treatment. For example, an individual may measure their response to a drug, and this information may be used to determine a more effective treatment. Information that can be measured includes, but is not limited to, metabolite levels, glucose levels, ion levels (e.g., calcium, sodium, potassium, iron), vitamin levels, blood cell counts, body mass index (BMI), protein levels, transcript levels, heart rate, etc. These can be determined by readily available methods and can be included in algorithms that, in conjunction with the initial genomic profile, determine a revised overall risk assessment score.
The term "biological sample" refers to any biological sample that can be isolated from an individual, including samples from which genetic material can be isolated. As used herein, "genetic sample" refers to DNA and/or RNA obtained from or derived from an individual.
As used herein, the term "genome" is intended to mean the entire set of chromosomal DNA found in the nucleus of a human cell. The term "genomic DNA" refers to one or more chromosomal DNA molecules, or a portion of a chromosomal DNA molecule, that naturally occurs in the nucleus of a human cell.
The term "genomic profile" refers to a set of information about an individual's genes, such as the presence or absence of a particular SNP or mutation. The genomic profile includes the genotype of the individual. The genomic profile may also be a substantially complete genomic sequence of an individual. In some embodiments, the genomic profile may be at least 60%, 80%, or 95% of the entire genomic sequence of the individual. The genomic profile may be about 100% of the entire genomic sequence of an individual. When referring to a genomic profile, "a portion thereof" refers to the genomic profile of a subset of the whole genome.
The term "genotype" refers to the specific genetic composition of an individual's DNA. The genotype may include genetic variants and genetic markers of the individual. Genetic markers and genetic variants may include nucleotide repeats, nucleotide insertions, nucleotide deletions, chromosomal translocations, chromosomal duplications, or copy number variations. Copy number variations may include microsatellite repeats, nucleotide repeats, centromeric repeats or telomeric repeats. The genotype may also be a SNP, haplotype, or diplotype. Haplotypes may refer to loci or alleles. Haplotypes can also refer to a set of single nucleotide polymorphisms (SNPs) on a single chromatid that are statistically associated. A diplotype is a set of haplotypes.
The term "single nucleotide polymorphism," or "SNP," refers to a particular locus on a chromosome that exhibits variability (e.g., at least one percent (1%)) in the identity of the nitrogenous base present at that locus within the human population. For example, where one individual may have adenosine (A) at a particular nucleotide position of a given gene, another individual may have cytosine (C), guanine (G) or thymine (T) at that position, such that a SNP is present at that particular position.
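As a rough illustration of the one percent threshold mentioned in this definition, the following Python sketch tallies the bases observed at one position across a population and reports whether the position would qualify as a SNP. The function names, the `threshold` parameter, and the toy counts are assumptions made for illustration only; they are not part of the described method.

```python
from collections import Counter

def allele_frequencies(observed_bases):
    """Return the fraction of chromosomes carrying each base at one locus."""
    counts = Counter(observed_bases)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}

def is_snp(observed_bases, threshold=0.01):
    """Treat a locus as a SNP when at least two bases are observed and the
    second most common base reaches the threshold (assumed to be 1%)."""
    freqs = sorted(allele_frequencies(observed_bases).values(), reverse=True)
    return len(freqs) > 1 and freqs[1] >= threshold

# Toy population (hypothetical): 980 chromosomes carry A, 20 carry C.
population = ["A"] * 980 + ["C"] * 20
print(is_snp(population))
```

A position at which a second base appears in fewer than one percent of chromosomes would return `False` under this assumed threshold.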
As used herein, the term "SNP genomic profile" refers to the base content of a given individual's DNA at a SNP location of the entire individual's whole genomic DNA sequence. "SNP profile" refers to a complete genomic profile, or to a portion thereof, such as a more localized SNP profile that may be associated with a particular gene or a particular set of genes.
The term "phenotype" is used to describe a quantitative trait or characteristic of an individual. Phenotypes include, but are not limited to, medical and non-medical conditions. Medical conditions include diseases and disorders. Phenotypes may also include physical traits such as hair color, physiological traits such as lung capacity, mental traits such as memory retention, emotional traits such as anger control ability, ethnic characteristics such as ethnic background, familial characteristics such as the position of an individual's birth, and age characteristics such as age expectations or age of onset of different phenotypes. Phenotypes may also be monogenic, where it is believed that one gene may be associated with a phenotype; or polygenic, wherein more than one gene is associated with a phenotype.
"rules" are used to define the correlation between genotype and phenotype. The rules may define the relevance by a numerical value, such as by a percentage, risk factor, or confidence score. The rules may include correlations of multiple genotypes with phenotypes. A "rule set" includes more than one rule. A "new rule" may be a rule that indicates a correlation between a genotype and a phenotype for which the rule does not currently exist. The new rules may associate unassociated genotypes with phenotypes. The new rules may also associate genotypes that have been associated with a phenotype with a previously unassociated phenotype. The "new rule" may also be an existing rule that is modified by other factors, including another rule. Existing rules may be modified due to known characteristics of the individual, such as race, family, geography, gender, age, family history, or other previously determined phenotype.
As used herein, "genotype correlation" refers to the statistical association between an individual's genotype (e.g., the presence of a certain mutation or mutations) and the likelihood of being predisposed to a phenotype (e.g., a particular disease, condition, physical state, and/or mental state). The frequency with which a particular phenotype is observed in the presence of a particular genotype determines the degree of genotype correlation, or the likelihood that a particular phenotype will occur. For example, as detailed herein, SNPs that result in the apolipoprotein E4 isoform are associated with a predisposition to early onset Alzheimer's disease. Genotype correlations may also refer to correlations in which a phenotype is not likely to result, or to negative correlations. Genotype correlations may also represent an estimate that an individual has a phenotype or is predisposed to developing a phenotype. Genotype correlations can be represented by numerical values, such as percentages, relative risk factors, effect estimates, or confidence scores.
The term "phenotype profile" refers to a collection of multiple phenotypes associated with a genotype or genotypes of an individual. The phenotype profile may include information generated by applying one or more rules to the genomic profile or information about genotype correlations applied to the genomic profile. A phenotype profile may be generated by applying rules that relate multiple genotypes to a phenotype. The probability or the evaluation can be expressed as a numerical value, for example as a percentage, as a numerical risk factor or as a numerical confidence interval. The probability may also be expressed as high, medium, or low. The phenotype profile may also indicate the presence or risk of a phenotype. For example, the phenotype profile may indicate the presence of blue eyes or a high risk of developing diabetes. The phenotypic profile may also indicate a predicted prognosis, therapeutic effect, or response to treatment of the medical condition.
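The relationship between a rule set, a genomic profile, and the resulting phenotype profile can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in this specification: the `Rule` fields, the marker and phenotype names, the numerical values, and the cut-offs used for "high", "medium", and "low" are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    marker: str         # hypothetical marker identifier
    genotype: str       # genotype to which the rule applies
    phenotype: str      # phenotype correlated with that genotype
    probability: float  # numerical value attached to the correlation

def label(probability, low=0.2, high=0.6):
    """Express a numerical value as high, medium, or low (assumed cut-offs)."""
    if probability >= high:
        return "high"
    if probability >= low:
        return "medium"
    return "low"

def phenotype_profile(genomic_profile, rule_set):
    """Apply every matching rule; keep the largest value per phenotype."""
    best = {}
    for rule in rule_set:
        if genomic_profile.get(rule.marker) == rule.genotype:
            previous = best.get(rule.phenotype, 0.0)
            best[rule.phenotype] = max(previous, rule.probability)
    return {name: (value, label(value)) for name, value in best.items()}

# Hypothetical rule set and genomic profile.
rules = [
    Rule("marker_1", "AA", "phenotype X", 0.70),
    Rule("marker_2", "CT", "phenotype Y", 0.10),
    Rule("marker_3", "GG", "phenotype X", 0.30),
]
genomic_profile = {"marker_1": "AA", "marker_2": "CT", "marker_3": "TT"}
profile = phenotype_profile(genomic_profile, rules)
print(profile)
```

Taking the maximum when several rules point to the same phenotype is only one possible design choice; the specification does not prescribe how multiple rules are combined in this sketch.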
The term "risk profile" refers to a collection of GCI scores for more than one disease or condition. GCI scores are based on analysis of associations between an individual's genotype and one or more diseases or conditions. The risk profile may display GCI scores grouped by disease category. Further, the risk profile may show how the GCI scores are predicted to change as the individual ages or as various risk factors are adjusted. For example, the GCI score for a particular disease may take into account changes in diet or the effects of preventive measures taken (smoking cessation, medication, bilateral radical mastectomy, hysterectomy). The GCI score may be displayed as a numerical metric, a graphical display, an auditory feedback, or a combination of any of the foregoing.
As used herein, the term "online portal" refers to a source of information that an individual conveniently accesses through a computer and Internet website, telephone, or other means that allows similar access to the information. The online portal may be an encrypted website. The website may provide links to other encrypted and unencrypted websites, such as links to encrypted websites having a phenotype profile of the individual or links to unencrypted websites (e.g., message boards of individuals sharing a particular phenotype).
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, cell biology, biochemistry and immunology, which are within the skill of the art. These conventional techniques include nucleic acid isolation, polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Specific illustrations of suitable techniques are given by reference to the examples herein. However, other equivalent conventional methods may also be used. Other conventional techniques and descriptions can be found in standard laboratory manuals and literature, for example: Genome Analysis: A Laboratory Manual Series (Vols. I-IV), PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press); Stryer, L. (1995) Biochemistry (4th Ed.), Freeman, New York; Gait, "Oligonucleotide Synthesis: A Practical Approach," 1984, IRL Press, London; Nelson and Cox (2000), Lehninger, Principles of Biochemistry, 3rd Ed., W.H. Freeman Pub., New York, N.Y.; and Berg et al. (2002) Biochemistry, 5th Ed., W.H. Freeman Pub., New York, N.Y., all of which are incorporated herein by reference in their entirety.
The methods of the invention include analyzing the genomic profile of an individual to provide the individual with molecular information about phenotypes. As detailed herein, an individual provides a genetic sample from which a personal genomic profile is generated. The individual's genomic profile is queried for genotype correlations by comparing it against a database of established and validated human genotype correlations. The database of established and validated genotype correlations can be drawn from the peer-reviewed literature and further reviewed and validated by a committee of one or more experts in the field, such as geneticists, epidemiologists or statisticians. In a preferred embodiment, rules are formulated based on validated genotype correlations and applied to the genomic profile of the individual to generate a phenotype profile. The results of the analysis of the individual's genomic profile (the phenotype profile) are provided to the individual or the individual's healthcare manager along with explanatory and supportive information, making it possible to personalize the individual's healthcare choices.
The method of the invention is described in detail in FIG. 1, wherein a genomic profile of an individual is first generated. The individual's genomic profile will include information about the individual's genes based on genetic variations and genetic markers. Genetic variations are genotypes, which make up the genomic profile. Such genetic variations or genetic markers include, but are not limited to, single nucleotide polymorphisms, single and/or multiple nucleotide repeats, single and/or multiple nucleotide deletions, microsatellite repeats (a small number of nucleotide repeats, typically with 5 to 1,000 repeat units), dinucleotide repeats, trinucleotide repeats, sequence rearrangements (including translocations and repeats), copy number variations (deletions and additions at specific loci), and the like. Other genetic variations include chromosomal repeats and translocations as well as centromeric and telomeric repeats.
Genotypes may also include haplotypes and diplotypes. In some embodiments, the genomic profile may have at least 100,000, 300,000, 500,000, or 1,000,000 genotypes. In some embodiments, the genomic profile may be substantially the entire genomic sequence of an individual. In other embodiments, the genomic profile is at least 60%, 80%, or 95% of the entire genomic sequence of the individual. The genomic profile may be about 100% of the entire genomic sequence of an individual. Genetic samples containing the target substance include, but are not limited to, unamplified genomic DNA or RNA samples or amplified DNA (or cDNA). The target substance may be a specific region of genomic DNA comprising a genetic marker of particular interest.
In step 102 of FIG. 1, a genetic sample of an individual is isolated from a biological sample of the individual. These biological samples include, but are not limited to, blood, hair, skin, saliva, semen, urine, fecal material, sweat, oral cavity (buccal), and various body tissues. In some embodiments, the tissue sample may be collected directly from the individual, e.g., the oral sample may be obtained by swabbing the inside of the cheek of the individual with a swab. Other samples such as saliva, semen, urine, fecal material, or sweat may also be provided by the individual himself. Other biological samples may be taken by a health care professional (e.g., a phlebotomist, nurse, or doctor). For example, a blood sample may be drawn from an individual by a nurse. Tissue biopsies can be performed by health care professionals, and health care professionals can also utilize the kit to efficiently obtain a sample. Small cylindrical skin samples may be removed or small tissue or fluid samples may be removed using a needle.
In some embodiments, a kit is provided to an individual having a sample collection container for a biological sample of the individual. The kit may also provide instructions for the individual to directly collect their own sample, such as how much hair, urine, sweat or saliva to provide. The kit may also include instructions for the individual to request that a tissue sample be extracted by a health care professional. The kit may include a location where the sample may be collected by a third party, for example, the kit may be provided to a healthcare facility where the sample is subsequently collected from the individual. The kit may also provide a return package for delivering the sample to a sample processing facility where the genetic material is isolated from the biological sample (step 104).
Genetic samples of DNA or RNA can be isolated from biological samples according to any of several known biochemical and molecular biological methods; see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory, N.Y.) (1989). There are also several commercially available kits and reagents for isolating DNA or RNA from biological samples, such as those available from DNA Genotek, Gentra Systems, Qiagen, Ambion, and other suppliers. Buccal sample kits are readily commercially available, e.g., the MasterAmp™ Buccal Swab DNA extraction kit from Epicentre Biotechnologies, as are kits for extracting DNA from blood samples, e.g., Extract-N-Amp™ from Sigma Aldrich. DNA from other tissues can be obtained by digesting the tissue with protease and applying heat treatment, centrifuging the sample, and extracting unwanted materials using phenol-chloroform, leaving the DNA in the aqueous phase. The DNA may then be further isolated by ethanol precipitation.
In a preferred embodiment, genomic DNA is isolated from saliva. For example, using DNA self-collection kit technology available from DNA Genotek, individuals collect saliva samples for clinical processing. The samples can be conveniently stored and transported at room temperature. After the sample is delivered to the appropriate laboratory for processing, the DNA is isolated by heat denaturation and protease digestion of the sample (typically for at least 1 hour at 50 °C, using reagents supplied by the collection kit supplier). The sample is then centrifuged and the supernatant is subjected to ethanol precipitation. The DNA pellet is suspended in a buffer suitable for subsequent analysis.
In another embodiment, RNA can be used as the genetic sample. In particular, expressed genetic variations can be identified from mRNA. The term "messenger RNA" or "mRNA" includes, but is not limited to, pre-mRNA transcripts, transcript processing intermediates, mature mRNA ready for translation, and transcripts of a gene or genes, or nucleic acids derived from mRNA transcripts. Transcript processing may include splicing, editing, and degradation. As used herein, a nucleic acid derived from an mRNA transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template. Thus, cDNA reverse transcribed from mRNA, DNA amplified from cDNA, RNA transcribed from amplified DNA, and the like are all derived from mRNA transcripts. RNA can be isolated from any of several body tissues using methods known in the art; for example, RNA can be isolated from unfractionated whole blood using the PAXgene™ Blood RNA System available from PreAnalytiX. Typically, mRNA will be reverse transcribed into cDNA, which is then used or amplified for genetic variation analysis.
Prior to genomic profiling, the genetic sample is typically amplified, either from DNA or from cDNA reverse transcribed from RNA. DNA can be amplified by a variety of methods, many of which use PCR. See, for example, PCR Technology: Principles and Applications for DNA Amplification (Ed. H.A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (Eds. Innis et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (Eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. Nos. 4,683,202, 4,683,195, 4,800,159, 4,965,188, and 5,333,675, each of which is incorporated herein by reference in its entirety.
Other suitable amplification methods include the ligase chain reaction (LCR) (e.g., Wu and Wallace, Genomics 4, 560 (1989), Landegren et al., Science 241, 1077 (1988) and Barringer et al., Gene 89: 117 (1990)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86: 1173-1177 (1989) and WO88/10315), self-sustained sequence replication (Guatelli et al., Proc. Nat. Acad. Sci. USA 87: 1874-1878 (1990) and WO90/06995), selective amplification of target polynucleotide sequences (U.S. Pat. No. 6,410,276), consensus sequence primed polymerase chain reaction (CP-PCR) (U.S. Pat. No. 4,437,975), arbitrarily primed polymerase chain reaction (AP-PCR) (U.S. Pat. No. 5,413,909), nucleic acid sequence based amplification (NASBA), rolling circle amplification (RCA), and circle-to-circle amplification (C2CA) (Dahl et al., Proc. Natl. Acad. Sci. 101: 4548-4553 (2004)). (See U.S. Pat. Nos. 5,409,818, 5,554,517, and 6,063,603, each of which is incorporated herein by reference.) Other amplification methods that may be used are described in U.S. Pat. Nos. 5,242,794, 5,494,810, 5,409,818, 4,988,617, 6,063,603, and 5,554,517, and U.S. patent application No. 09/854,317, each of which is incorporated herein by reference.
The generation of the genomic profile in step 106 is accomplished using any of several methods. Several methods are known in the art for identifying genetic variations, and these include, but are not limited to, DNA sequencing by any of several methods, PCR-based methods, fragment length polymorphism assays (restriction fragment length polymorphism (RFLP), cleavage fragment length polymorphism (CFLP)), hybridization methods using allele-specific oligonucleotides as templates (e.g., the TaqMan PCR method, the Invader method, the DNA chip method), methods using primer extension reactions, mass spectrometry (the MALDI-TOF/MS method), and the like.
In one embodiment, high density DNA arrays are used for SNP identification and profile generation. These arrays are commercially available from Affymetrix and Illumina (see Affymetrix GeneChip 500K Assay Manual, Affymetrix, Santa Clara, CA (incorporated by reference); Sentrix humanHap650Y genotyping beadchip, Illumina, San Diego, CA).
For example, a SNP profile can be generated by genotyping more than 900,000 SNPs using the Affymetrix Genome Wide Human SNP Array 6.0. Alternatively, more than 500,000 SNPs can be determined through whole-genome sampling analysis using the Affymetrix GeneChip Human Mapping 500K Array Set. In these assays, a subset of the human genome is amplified by a single primer amplification reaction using restriction enzyme digested, adaptor ligated human genomic DNA. As shown in FIG. 2, the concentration of the ligated DNA can then be determined. The amplified DNA is then fragmented and the quality of the sample is determined before proceeding to step 106. If the sample meets the PCR and fragmentation criteria, the sample is denatured, labeled and then hybridized to a microarray consisting of small DNA probes at specific locations on a coated quartz surface. The amount of label hybridized to each probe, as a function of the amplified DNA sequence, is monitored to generate sequence information and ultimately SNP genotypes.
The Affymetrix GeneChip 500K Assay is used according to the manufacturer's instructions. Briefly, isolated genomic DNA is first digested with either the NspI or the StyI restriction endonuclease. The digested DNA is then ligated to an NspI or StyI adaptor oligonucleotide that anneals to the NspI- or StyI-restricted DNA, respectively. The adaptor-ligated DNA is then amplified by PCR to produce amplified DNA fragments of about 200 to 1100 base pairs, as confirmed by gel electrophoresis. PCR products that meet the amplification criteria are purified and quantified for fragmentation. The PCR products are fragmented with DNase I for optimal DNA chip hybridization. After fragmentation, the DNA fragments should be less than 250 base pairs, and on average 180 base pairs, as confirmed by gel electrophoresis. Samples meeting the fragmentation criteria are then labeled with a biotin compound using terminal deoxynucleotidyl transferase. The labeled fragments are then denatured and hybridized to a GeneChip 250K array. After hybridization, the array is stained prior to scanning in a three-step process: streptavidin phycoerythrin (SAPE) staining, followed by an antibody amplification step with a biotinylated anti-streptavidin antibody (goat), and a final staining with streptavidin phycoerythrin (SAPE). After labeling, the array is covered with array holding buffer and then scanned with a scanner such as the Affymetrix GeneChip Scanner 3000.
After the Affymetrix GeneChip Human Mapping 500K Array Set is scanned, data analysis is performed according to the manufacturer's instructions, as shown in FIG. 3. Briefly, raw data are acquired using GeneChip Operating Software (GCOS). Data can also be acquired using Affymetrix GeneChip Command Console™. The initial data are then analyzed using GeneChip Genotyping Analysis Software (GTYPE). For the purposes of the present invention, samples with a GTYPE call rate of less than 80% are excluded. The samples are then examined using BRLMM and/or SNiPer algorithm analysis. Samples with a BRLMM call rate of less than 95% or a SNiPer call rate of less than 98% are excluded. Finally, association analysis is performed, and samples with a SNiPer quality index of less than 0.45 and/or a Hardy-Weinberg p-value of less than 0.00001 are excluded.
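The exclusion thresholds listed in this paragraph can be expressed as a simple filter. The Python sketch below is only an illustration: the `SampleQC` record, its field names, and the example values are hypothetical, and no vendor software interface is being modeled. Only the numeric thresholds (80%, 95%, 98%, 0.45, and 0.00001) come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SampleQC:
    sample_id: str
    gtype_call_rate: float                     # fraction between 0 and 1
    brlmm_call_rate: Optional[float] = None    # None if BRLMM was not run
    sniper_call_rate: Optional[float] = None   # None if SNiPer was not run
    sniper_index: Optional[float] = None       # SNiPer index (threshold 0.45)
    hardy_weinberg_p: Optional[float] = None   # Hardy-Weinberg p-value

def exclusion_reasons(qc: SampleQC) -> List[str]:
    """Return every threshold from the text that the sample fails."""
    reasons = []
    if qc.gtype_call_rate < 0.80:
        reasons.append("GTYPE call rate below 80%")
    if qc.brlmm_call_rate is not None and qc.brlmm_call_rate < 0.95:
        reasons.append("BRLMM call rate below 95%")
    if qc.sniper_call_rate is not None and qc.sniper_call_rate < 0.98:
        reasons.append("SNiPer call rate below 98%")
    if qc.sniper_index is not None and qc.sniper_index < 0.45:
        reasons.append("SNiPer index below 0.45")
    if qc.hardy_weinberg_p is not None and qc.hardy_weinberg_p < 0.00001:
        reasons.append("Hardy-Weinberg p-value below 0.00001")
    return reasons

# Hypothetical samples: S1 passes every check, S2 fails the first one.
samples = [
    SampleQC("S1", 0.97, 0.99, 0.99, 0.80, 0.5),
    SampleQC("S2", 0.75, 0.96, 0.99, 0.80, 0.5),
]
kept = [s.sample_id for s in samples if not exclusion_reasons(s)]
print(kept)
```

Returning the list of failed checks, rather than a single pass/fail flag, is a design choice of this sketch that makes it easier to see why a sample was dropped.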
Alternatively, or in addition to DNA microarray analysis, genetic variations such as SNPs and mutations can be detected by DNA sequencing. DNA sequencing may also be used to sequence a substantial portion or all of an individual's genomic sequence. In general, common DNA sequencing is based on polyacrylamide gel fractionation to resolve a population of chain-terminated fragments (Sanger et al., Proc. Natl. Acad. Sci. USA 74: 5463-5467 (1977)). Alternative methods have been developed, and continue to be developed, to increase the speed and ease of DNA sequencing. For example, high throughput and single molecule sequencing platforms are commercially available from, or are being developed by, 454 Life Sciences (Branford, CT) (Margulies et al, Nature, (2005) 437: 376-.
After the genomic profile of the individual is generated in step 106, the profile is stored digitally in step 108, optionally in an encrypted manner. The genomic profile is encoded in a computer-readable format for storage as part of a data set, and may be stored in a database, where the genomic profile may be "deposited" and can be accessed again at a later time. The data set includes a plurality of data points, where each data point relates to an individual. Each data point may have a plurality of data elements. One data element is a unique identifier that is used to identify the genomic profile of an individual. The unique identifier may also be a bar code. Another data element is genotype information, such as a SNP or a nucleotide sequence of the genome of the individual. Data elements corresponding to the genotype information can also be included in the data points. For example, if the genotype information includes SNPs identified by microarray analysis, other data elements may include microarray SNP identification numbers, SNP rs numbers, and polymorphic nucleotides. Other data elements may be the chromosomal location of the genotype information, quality measures of the data, raw data files, data images, and extracted intensity scores.
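One way to picture a data point and its data elements is as a small record type. The sketch below is an assumption-laden illustration in Python: the class names, field names, and all example values other than the rs number (which appears elsewhere in this specification) are hypothetical, and encryption and database storage are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class SnpCall:
    array_snp_id: str                  # microarray SNP identification number
    rs_number: str                     # SNP rs number
    nucleotides: str                   # polymorphic nucleotides observed
    chromosome: Optional[str] = None   # chromosomal location (chromosome)
    position: Optional[int] = None     # chromosomal location (coordinate)
    quality: Optional[float] = None    # quality measure of the data

@dataclass
class GenomicProfileRecord:
    unique_id: str                                    # could be a bar code
    calls: Dict[str, SnpCall] = field(default_factory=dict)
    raw_data_file: Optional[str] = None               # path to raw data

    def add_call(self, call: SnpCall) -> None:
        """Store a call, indexed by its rs number."""
        self.calls[call.rs_number] = call

    def genotype(self, rs_number: str) -> Optional[str]:
        """Return the stored nucleotides for an rs number, if any."""
        call = self.calls.get(rs_number)
        return call.nucleotides if call else None

# Hypothetical identifier, array ID, and genotype.
record = GenomicProfileRecord(unique_id="BARCODE-0001")
record.add_call(SnpCall("ARRAY_SNP_0001", "rs3817198", "CT"))
print(record.genotype("rs3817198"))
```

Indexing calls by rs number makes later lookups against correlation rules straightforward, which is the main reason for that choice here.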
Individual-specific factors, such as physical data, medical data, race, family, geography, gender, age, family history, demographic data, exposure data, lifestyle data, behavioral data, and other known phenotypes, may also be included as data elements. For example, these factors may include, but are not limited to, the individual's: place of birth, parents and/or grandparents, relatives, ancestors' places of residence, environmental conditions, known health conditions, known drug interactions, home hygiene conditions, lifestyle conditions, diet, exercise habits, marital status, and physical measurement data (e.g., weight, height, cholesterol level, heart rate, blood pressure, glucose level, and other measurement data known in the art). The above factors for the individual's relatives or ancestors (e.g., parents and grandparents) may also be included as data elements and used to determine the individual's risk for a phenotype or condition.
The specific factors may be obtained from a questionnaire or from an individual's healthcare manager. Information from the map of "savings" can then be accessed and used as needed. For example, in an initial assessment of genotype correlations of an individual, the entire information of the individual (typically SNPs or other genomic sequences across or taken from the entire genome) will be analyzed for determining genotype correlations. In subsequent analyses, all or a portion of the information from the stored or deposited genome map may be accessed as needed or appropriate.
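The data set described above can be pictured as one record per individual, holding a unique identifier, genotype data elements, and individual-specific factors. The following is only a minimal sketch of such a record; the class names, field names, and the identifier value are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GenotypeCall:
    """One genotype data element, e.g. a SNP call from a microarray."""
    rs_number: str                      # SNP rs number, e.g. "rs3817198"
    alleles: str                        # the individual's two alleles, e.g. "CT"
    chromosome: Optional[str] = None    # chromosomal location, if recorded
    quality: Optional[float] = None     # hypothetical quality measure


@dataclass
class GenomicProfileRecord:
    """One data point of the data set: everything stored for one individual."""
    unique_id: str                      # unique identifier; may be a bar code
    genotypes: Dict[str, GenotypeCall] = field(default_factory=dict)
    factors: Dict[str, object] = field(default_factory=dict)  # age, gender, ...

    def add_call(self, call: GenotypeCall) -> None:
        # Index calls by rs number so rules can look them up later.
        self.genotypes[call.rs_number] = call


# Hypothetical individual with a single SNP call and two stored factors.
record = GenomicProfileRecord(unique_id="ID-0001")
record.add_call(GenotypeCall(rs_number="rs3817198", alleles="CT"))
record.factors.update({"age": 38, "gender": "F"})
```

Keying the genotype elements by rs number is a design choice made here for convenience; the specification does not prescribe a storage layout.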
Comparison of genomic profiles with genotype-associated databases
In step 110, genotype correlations are obtained from the scientific literature. Genotype correlations for a genetic variation are determined from analysis of a population of individuals who have been tested for the presence or absence of one or more phenotypic traits of interest and for their genotype profiles. The alleles of each genetic variation or polymorphism in the genotype profile are then examined to determine whether the presence of a particular allele is associated with the trait of interest. The correlation analysis can be performed by standard statistical methods, and statistically significant correlations between genetic variations and phenotypic traits are recorded. For example, it may be determined that the presence of allele A1 of polymorphism A is associated with heart disease. As a further example, it may be found that the combined presence of allele A1 at polymorphism A and allele B1 at polymorphism B is associated with an increased risk of cancer. The results of the analysis may be published in peer-reviewed literature, confirmed by other research groups, and/or analyzed by committees of experts (e.g., geneticists, statisticians, epidemiologists, and physicians), and may also be validated.
Examples of correlations between genotypes and phenotypes are shown in FIGS. 4, 5, and 6; the rules relating genotypes to phenotypes that are applied to the genomic profile are based on these correlations. For example, in FIGS. 4A and 4B, each row corresponds to a phenotype/locus/ethnicity, and FIGS. 4C through 4I contain further information on the correlation for each of these rows. By way of example, the "phenotype name abbreviation" BC in FIG. 4A is an abbreviation for breast cancer, as noted in the index of phenotype name abbreviations in FIG. 4M. In the row BC_4 (which is the class name of the locus), the gene LSP1 is associated with breast cancer. As shown in FIG. 4C, the published or functional SNP identified for this correlation is rs3817198, the published risk allele is C, and the non-risk allele is T. The published SNPs and alleles are identified through publications (e.g., the seminal publications in FIGS. 4E-G). In the LSP1 example of FIG. 4E, the seminal publication is Easton et al., Nature 447: 713-720 (2007). FIGS. 22 and 25 list further correlations. The correlations in FIGS. 22 and 25 can be used to calculate an individual's risk for a condition or phenotype, e.g., by calculating a GCI or GCI Plus score. The GCI or GCI Plus score may also incorporate information such as the prevalence of the condition, as in FIG. 23.
Alternatively, correlations may be generated from stored genomic profiles. For example, individuals with stored genomic profiles may also have stored known phenotypic information. Analysis of the stored genomic profiles and known phenotypes can generate genotype correlations. As an example, 250 individuals with stored genomic profiles also have stored information indicating a previous diagnosis of diabetes. Their genomic profiles are analyzed and compared to those of a control group of non-diabetic individuals. If the individuals previously diagnosed with diabetes are found to carry a particular genetic variant at a higher rate than the control group, a genotype correlation can be made between that genetic variant and diabetes.
In step 112, rules are formed based on the validated correlations between genetic variants and particular phenotypes. Rules may be generated, for example, based on the correlated genotypes and phenotypes listed in Table 1. Rules based on correlations may incorporate other factors, such as gender (e.g., FIG. 4) or ethnicity (FIGS. 4 and 5), to produce effect estimates as in FIGS. 4 and 5. Other measures produced by the rules may be estimated relative risk increases, as in FIG. 6. The effect estimates and estimated relative risk increases may be taken from, or calculated from, the published literature. Alternatively, the rules may be based on correlations generated from stored genomic profiles and previously known phenotypes. In some embodiments, the rules may be based on the correlations in FIGS. 22 and 25.
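The diabetes example above amounts to a case-control comparison of carrier rates. A minimal sketch of that comparison follows, using an odds ratio and a Pearson chi-square statistic on a 2x2 table. The specification only says "standard statistical methods", so the choice of these two statistics is an assumption, and apart from the 250 cases every count below is hypothetical.

```python
def odds_ratio(case_carriers: int, case_total: int,
               control_carriers: int, control_total: int) -> float:
    """Odds of carrying the variant among cases divided by the odds among controls."""
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    return (a * d) / (b * c)


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# 250 previously diagnosed individuals (from the example) against 250 controls.
# The carrier counts (100 and 50) and the control group size are hypothetical.
or_estimate = odds_ratio(100, 250, 50, 250)
chi2 = chi_square_2x2(100, 150, 50, 200)

# 3.841 is the 5% critical value of the chi-square distribution with 1 degree of freedom.
is_significant = chi2 > 3.841
```

In a real analysis the significance threshold would need correction for the number of variants tested; that step is omitted here.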
In a preferred embodiment, the genetic variant is a SNP. Although SNPs occur at single sites, an individual who carries a particular SNP allele at one site often predictably carries particular SNP alleles at other sites. The correlation between a SNP and an allele that predisposes an individual to a disease or condition arises through linkage disequilibrium, in which non-random association of alleles at two or more loci occurs in a population more or less frequently than would be expected from random formation through recombination.
Other genetic markers or variants (e.g., nucleotide repeats or insertions) may also be in linkage disequilibrium with genetic markers that have been shown to be associated with a particular phenotype. For example, a nucleotide insertion is correlated with a phenotype, and a SNP is in linkage disequilibrium with the nucleotide insertion. A rule is formed based on the correlation between the SNP and the phenotype. A rule based on the correlation between the nucleotide insertion and the phenotype may also be formed. Either rule or both rules may be applied to the genomic profile, since the presence of the SNP may give one risk factor and the other rule may give another risk factor, and when they are combined the risk may be increased.
Through linkage disequilibrium, a disease-predisposing allele cosegregates with a specific allele of a SNP or with a combination of specific alleles of SNPs. A particular combination of SNP alleles along a chromosome is called a haplotype, and the region of DNA in which they occur in combination may be called a haplotype block. Although a haplotype block may consist of one SNP, a typical haplotype block represents a series of 2 or more contiguous SNPs that exhibit low haplotype diversity between individuals and generally have a low recombination frequency. A haplotype can be identified by identifying one or more SNPs located in the haplotype block. Thus, in general, SNP profiling can be used to identify a haplotype block without having to identify all SNPs in that haplotype block.
Genotype correlations between SNP haplotype patterns and diseases, conditions, or physical states are increasingly becoming known. For a given disease, the haplotype patterns of a group of people known to have the disease are compared to those of a group of people without the disease. By analyzing many individuals, the frequencies of polymorphisms in a population can be determined, and these frequencies or genotypes can then be correlated with a particular phenotype (e.g., a disease or condition). Examples of known SNP-disease correlations include complement factor H polymorphisms in age-related macular degeneration (Klein et al., Science, 308: 385-) and a variant of the INSIG2 gene (Herbert et al., Science, 312: 279-283 (2006)). Other known SNP correlations include, for example, polymorphisms in the 9p21 region including CDKN2A and B (e.g., rs10757274, rs2383206, rs13333040, rs2383207, and rs10116277, associated with myocardial infarction (Helgadottir et al., Science, 316: 1491-.
SNPs may be functional or non-functional. For example, a functional SNP has an effect on cellular function and thereby results in a phenotype, whereas a non-functional SNP is functionally silent but may be in linkage disequilibrium with a functional SNP. SNPs may also be synonymous or non-synonymous. Synonymous SNPs are SNPs whose different forms result in the same polypeptide sequence, and they are non-functional SNPs. If a SNP results in different polypeptides, the SNP is non-synonymous and may be functional or non-functional. SNPs or other genetic markers used to identify the haplotypes in a diplotype (which is 2 or more haplotypes) may also be used to correlate phenotypes associated with the diplotype. Information about an individual's haplotypes, diplotypes, and SNP profile may be included in the individual's genomic profile.
In a preferred embodiment, for a rule generated from a genetic marker that is in linkage disequilibrium with another genetic marker correlated with a phenotype, the genetic marker may have an r² or D' score greater than 0.5; these scores are commonly used in the art to measure linkage disequilibrium. In preferred embodiments, the score is greater than 0.6, 0.7, 0.8, 0.90, 0.95, or 0.99. As a result, in the present invention, the genetic marker used to correlate a phenotype with an individual's genomic profile may be the same as or different from the functional or published SNP correlated with the phenotype. For example, for BC_4, the test SNP and the published SNP are the same, as are the tested risk and non-risk alleles and the published risk and non-risk alleles (FIGS. 4A and 4C). However, for BC_5 (CASP8 and its correlation with breast cancer), the test SNP differs from the functional or published SNP, as do the tested risk and non-risk alleles from the published risk and non-risk alleles. The tested and published alleles are oriented relative to the plus strand of the genome, and from these columns the homozygous risk or non-risk genotype can be inferred, which can be used to generate rules to apply to the genomic profiles of individuals, e.g., registered users. In some embodiments, a test SNP is not identified; instead, using the published SNP information, the allelic difference or SNP may be identified by another analytical method (e.g., TaqMan). For example, for AMD_5 in FIG. 25A, the published SNP is rs1061170, but no test SNP is identified. A test SNP can be identified by LD analysis with the published SNP. Alternatively, rather than using a test SNP, the genome of an individual can be assessed for the SNP using TaqMan or another equivalent assay.
The test SNPs may be "DIRECT" or "TAG" SNPs (FIGS. 4E-G, FIG. 5). A direct SNP is a test SNP that is the same as the published or functional SNP, e.g., for BC_4. A direct SNP may also be used for the FGFR2 correlation with breast cancer, using SNP rs1073640 in Europeans and Asians, where the minor allele is A and the other allele is G (Easton et al., Nature 447: 1087-. Another published or functional SNP for the FGFR2 correlation with breast cancer in Europeans and Asians is rs1219648 (Hunter et al., Nat. Genet. 39: 870-874 (2007)). A tag SNP is a test SNP that differs from the functional or published SNP, as for BC_5. Tag SNPs may also be used for other genetic variants, e.g., for CAMTA1 (rs4908449), 9p21 (rs10757274, rs2383206, rs13333040, rs2383207, rs10116277), COL1A1 (rs1800012), FVL (rs6025), HLA-DQA1 (rs 498888889, rs2588331), eNOS (rs1799983), MTHFR (rs1801133), and APC (rs28933380).
Databases of SNPs are publicly available from, for example, the International HapMap Project (see www.hapmap.org, The International HapMap Consortium, Nature, 426: 789-. These databases provide SNP haplotype patterns or enable them to be determined. Thus, these SNP databases enable the detection of genetic risk factors underlying a wide range of diseases and conditions (e.g., cancer, inflammatory diseases, cardiovascular diseases, neurodegenerative diseases, and infectious diseases). These diseases or conditions may be actionable, meaning that treatments and therapies currently exist for them. Treatment may include prophylactic treatment and treatment that ameliorates symptoms and conditions, including lifestyle changes.
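The r² and D' scores named above have standard population-genetics definitions for a pair of biallelic loci, which the specification itself does not spell out. The sketch below computes both from haplotype and allele frequencies and applies the 0.5 cutoff from the text. The function names and the example frequencies are hypothetical.

```python
from typing import Tuple


def ld_scores(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float, float]:
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying both allele A1 and allele B1
    p_a:  frequency of allele A1
    p_b:  frequency of allele B1
    """
    d = p_ab - p_a * p_b  # deviation from independent assortment
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


def usable_as_test_marker(p_ab: float, p_a: float, p_b: float,
                          threshold: float = 0.5) -> bool:
    """True if either r^2 or |D'| exceeds the threshold (0.5 in the text)."""
    _, d_prime, r_squared = ld_scores(p_ab, p_a, p_b)
    return r_squared > threshold or d_prime > threshold


# Hypothetical frequencies: both alleles at 50%, joint haplotype at 40%.
d, d_prime, r_squared = ld_scores(0.4, 0.5, 0.5)
```

With these illustrative frequencies D' is about 0.6 and r² about 0.36, so the marker passes on D' but not on r²; a stricter pipeline might require r² alone to exceed the cutoff.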
Many other phenotypes can also be detected, such as physical traits, physiological traits, mental traits, emotional traits, ethnicity, ancestry, and age. Physical traits may include height, hair color, eye color, body, or traits such as energy, endurance, and agility. Mental traits may include intelligence, memory, or learning. Ethnicity and ancestry may include identification of ancestors or ethnicity, or of where the individual's ancestors originated. Age may be a determination of the individual's actual age, or of the age at which the individual's genetics place them relative to the general population. For example, an individual's actual age may be 38, while their genetics may indicate that their memory or physical well-being is that of an average 28-year-old. An additional age trait may be the predicted lifespan of the individual.
Other phenotypes may also include non-medical conditions, such as "entertainment" phenotypes. These phenotypes may include comparisons with known individuals, e.g., foreign dignitaries, politicians, celebrities, inventors, athletes, musicians, artists, business people, and notorious individuals (e.g., criminals). Other "entertainment" phenotypes may include comparison with other organisms, such as bacteria, insects, plants, or non-human animals. For example, an individual may be interested in seeing how their genomic profile compares to that of their pet dog or of a president.
In step 114, the rules are applied to the stored genomic profile to generate the phenotype profile of step 116. For example, the information in FIG. 4, 5, or 6 may form the basis of the rules or tests applied to the genomic profile of an individual. The rules may include the information in FIG. 4 about the test SNPs and alleles and the effect estimates, where UNITS is the unit of the effect estimate, e.g., OR (odds ratio, with 95% confidence interval) or mean. In a preferred embodiment, the effect estimate may be a genotypic risk (FIGS. 4C-G), such as the risk for risk homozygotes (homoz or RR), risk heterozygotes (heteroz or RN), and non-risk homozygotes (homoz or NN). In other embodiments, the effect estimate may be a carrier risk, which is RR or RN versus NN. In still further embodiments, the effect estimate may be allele-based, i.e., an allelic risk, e.g., R versus N. There may also be two-locus (FIG. 4J) or three-locus (FIG. 4K) genotype effect estimates (e.g., RRRR, RRNN, etc., among the 9 possible genotype combinations for a two-locus effect estimate). The frequencies of the test SNPs in the public HapMap are also recorded in FIGS. 4H and 4I.
In other embodiments, the information from FIGS. 21, 22, 23, and/or 25 can be used to generate information to apply to the genomic profile of an individual. For example, the information can be used to generate a GCI or GCI Plus score for the individual (e.g., FIG. 19). The score can be used to generate information on genetic risk (e.g., estimated lifetime risk) for one or more conditions in the individual's phenotype profile (e.g., FIG. 15). The method allows calculation of an estimated lifetime risk or relative risk for one or more phenotypes or conditions as listed in FIG. 22 or 25. The risk for a single condition may be based on one or more SNPs. For example, the estimated risk for a phenotype or condition may be based on at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 SNPs, wherein the SNPs used to estimate risk may be published SNPs, test SNPs, or both (e.g., FIG. 25).
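The rule application of step 114 can be sketched as a lookup: classify the individual's genotype at the test SNP as RR, RN, or NN, then read off the effect estimate stored with the rule. In the sketch below the SNP rs3817198 and its alleles C (risk) and T (non-risk) come from the BC_4 example; the odds ratio values, class names, and function names are hypothetical placeholders, not values from FIG. 4.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Optional, Tuple


@dataclass
class Rule:
    phenotype: str
    test_snp: str
    risk_allele: str
    non_risk_allele: str
    effect: Dict[str, float]  # effect estimate per genotype class: RR, RN, NN


def genotype_class(genotype: str, rule: Rule) -> Optional[str]:
    """Map a two-letter genotype such as "CT" to RR, RN, or NN for this rule."""
    if len(genotype) != 2:
        return None
    if any(a not in (rule.risk_allele, rule.non_risk_allele) for a in genotype):
        return None  # unexpected allele: the rule cannot be applied
    return {2: "RR", 1: "RN", 0: "NN"}[genotype.count(rule.risk_allele)]


def apply_rules(profile: Dict[str, str],
                rules: List[Rule]) -> List[Tuple[str, str, float]]:
    """Apply every rule whose test SNP is present in the genomic profile."""
    results = []
    for rule in rules:
        genotype = profile.get(rule.test_snp)
        cls = genotype_class(genotype, rule) if genotype else None
        if cls is not None:
            results.append((rule.phenotype, cls, rule.effect[cls]))
    return results


# Odds ratios here are hypothetical placeholders.
bc_rule = Rule("breast cancer", "rs3817198", "C", "T",
               {"RR": 1.2, "RN": 1.1, "NN": 1.0})
phenotype_profile = apply_rules({"rs3817198": "CT"}, [bc_rule])

# The 9 two-locus genotype combinations mentioned in the text (RRRR, RRNN, ...).
two_locus = ["".join(pair) for pair in product(("RR", "RN", "NN"), repeat=2)]
```

A carrier-risk or allele-based estimate would use the same classification step with a different effect table.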
The estimated risk for the state may be based on the SNPs listed in fig. 22 or 25. In some embodiments, the risk of a condition may be based on at least one SNP. For example, an individual's assessment of risk for Alzheimer's Disease (AD), colorectal cancer (CRC), Osteoarthritis (OA), or exfoliative glaucoma (XFG) may be based on 1 SNP (e.g., rs4420638 for AD, rs 69883267 for CRC, rs4911178 for OA, and rs2165241 for XFG). For other states, such as obesity (BMIOB), graves' disease (GD), or Hemochromatosis (HEM), the estimated risk of an individual may be based on at least 1 or 2 SNPs (e.g., rs9939609 and/or rs9291171 for BMIOB; DRB1 0301DQA1 and/or rs3087243 for GD; rs 0501800562 and/or rs129128 for HEM). For states such as, but not limited to, Myocardial Infarction (MI), Multiple Sclerosis (MS) or Psoriasis (PS), 1,2 or 3 SNPs may be used to assess an individual's risk for these states (e.g., rs1866389, rs1333049 and/or rs6922269 for MI; rs6897932, rs12722489 and/or DRB1 x 1501 for MS; rs6859018, rs11209026 and/or HLAC 0602 for PS). To assess the individual risk of Restless Legs Syndrome (RLS) or celiac disease (CelD), 1,2, 3 or 4 SNPs may be used (e.g., rs6904723, rs2300478, rs1026732 and/or rs9296249 for RLS; rs6840978, rs11571315, rs2187668 and/or DQA1 x 0301 DQB1 x 0302 for CelD). For Prostate Cancer (PC) or lupus (SLE), 1,2, 3, 4 or 5 SNPs may be used to assess an individual's risk for PC or SLE (e.g., rs4242384, rs 69883267, rs 169901979, rs17765344 and/or rs4430796 for PC, rs12531711, rs10954213, rs2004640, DRB1 0301 and/or DRB1 1501 for SLE). To assess the lifetime risk of an individual for macular degeneration (AMD) or Rheumatoid Arthritis (RA), 1,2, 3, 4,5 or 6 SNPs may be used (e.g. rs10737680, rs10490924, rs541862, rs2230199, rs1061170, and/or rs9332739 for AMD, rs6679677, rs11203367, rs6457617, DRB 0101, DRB1 × 1, and/or DRB 04084 × 0404 for RA). 
To assess an individual's lifetime risk for breast cancer (BC), 1, 2, 3, 4, 5, 6, or 7 SNPs may be used (e.g., rs3803662, rs2981582, rs4700485, rs3817198, rs17468277, rs6721996 and/or rs3803662). To assess an individual's lifetime risk for Crohn's disease (CD) or type 2 diabetes (T2D), 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or 11 SNPs may be used (e.g., rs2066845, rs5743293, rs10883365, rs17234657, rs10210302, rs9858542, rs11805303, rs1000113, rs17221417, rs2542151 and/or rs10761659 for CD; rs13266634, rs4506565, rs10012946, rs 10006992, rs10811661, rs 77512288738, rs8050136, rs1111875, rs4402960, rs5215 and/or rs1801282 for T2D). In some embodiments, the SNP used as a basis for risk determination may be in linkage disequilibrium with a SNP described above or listed in FIG. 22 or 25.
The phenotype profile of an individual may include a number of phenotypes. In particular, assessing a patient's risk for a disease or other condition (e.g., likely drug response, including metabolism, efficacy, and/or safety) by the methods of the invention enables prognostic or diagnostic analysis of susceptibility to a variety of unrelated diseases and conditions, whether in symptomatic, presymptomatic, or asymptomatic individuals, including carriers of one or more disease- or condition-predisposing alleles. Thus, these methods provide an overall assessment of an individual's susceptibility to diseases or conditions without the need to decide in advance to test for any particular disease or condition. For example, the methods of the invention enable assessment of an individual's susceptibility to any of the various conditions listed in Table 1 or FIG. 4, 5, or 6, based on the individual's genomic profile. Moreover, these methods allow an individual's estimated lifetime risk or relative risk to be assessed for one or more phenotypes or conditions, such as those in FIG. 22 or 25.
The assessment preferably provides information about 2 or more of these conditions, and more preferably 3, 4, 5, 10, 20, 50, 100, or even more of these conditions. In a preferred embodiment, at least 20 rules are applied to the genomic profile of an individual to obtain the phenotype profile. In other embodiments, at least 50 rules are applied to the genomic profile of the individual. A single rule may be applied for a phenotype, such as a monogenic phenotype. More than one rule may also be applied for a single phenotype, such as a polygenic phenotype or a monogenic phenotype in which multiple genetic variants within a single gene affect the probability of having the phenotype.
After the initial scan of an individual's genomic profile, the individual's genotype correlations are updated by comparison with additional nucleotide variants (e.g., SNPs) as these additional variants become known. For example, step 110 can be performed periodically, e.g., daily, weekly, or monthly, by one or more persons of ordinary skill in the field of genetics, who search the scientific literature for new genotype correlations. The new genotype correlations may then be further validated by a committee of one or more experts in the field. Step 112 may then also be updated periodically with new rules based on the newly validated correlations.
A new rule may cover a genotype or phenotype that has no existing rule. For example, a genotype not previously correlated with any phenotype is found to correlate with a new or existing phenotype. A new rule may also cover a phenotype with which no genotype was previously correlated. New rules may also be determined for genotypes and phenotypes that already have existing rules. For example, a rule exists based on the correlation between genotype A and phenotype A. New research reveals that genotype B correlates with phenotype A, and a new rule based on this correlation is made. Another example is the discovery that phenotype B is associated with genotype A, in which case a new rule is made accordingly.
Rules may also be made when correlations are discovered based on known correlations that were not initially identified in the published scientific literature. For example, it may be reported that genotype C is correlated with phenotype C. Another publication reports that genotype D is correlated with phenotype D. Phenotypes C and D are related symptoms, e.g., phenotype C may be shortness of breath and phenotype D a small lung capacity. A correlation between genotype C and phenotype D, or between genotype D and phenotype C, may be discovered and validated by statistical methods using the existing stored genomic profiles of individuals with genotypes C and D and phenotypes C and D, or by further research. A new rule may then be generated based on the newly discovered and validated correlation. In another embodiment, the stored genomic profiles of a number of individuals with a specific or related phenotype may be studied to determine a genotype common to those individuals, and a correlation may be determined. A new rule may be generated based on this correlation.
Rules may also be made to modify existing rules. For example, the correlation between a genotype and a phenotype may be determined in part by a known characteristic of the individual, such as ethnicity, ancestry, geography, gender, age, family history, or any other known phenotype of the individual. Rules based on these known individual characteristics may be made and incorporated into an existing rule to provide a modified rule. The choice of which modified rule to apply depends on the specific factors of the individual. For example, a rule may be based on a 35% probability that an individual has phenotype E when the individual has genotype E. However, if the individual is of a particular ethnicity, the probability is 5%. A new rule may be made based on this result and applied to individuals of that particular ethnicity. Alternatively, the existing rule giving the 35% value may be applied, followed by another rule based on ethnicity for that phenotype. Rules based on known individual characteristics may be determined from the scientific literature or from studies of stored genomic profiles. As new rules are generated, they may be added and applied to genomic profiles in step 114, or they may be applied periodically, for example at least once a year.
Information on an individual's risk of disease can also be expanded as technology advances allow for higher-resolution SNP genomic profiles. As described above, an initial SNP genomic profile can readily be generated using microarray technology that scans 500,000 SNPs. Given the nature of haplotype blocks, this number allows for a representative profile of all SNPs in an individual's genome. Nonetheless, an estimated 10 million SNPs occur commonly in the human genome (the International HapMap Project; www.hapmap.org). As technological advances allow practical, cost-efficient resolution of SNPs at a greater level of detail (e.g., microarrays of 1,000,000, 1,500,000, 2,000,000, 3,000,000 or more SNPs, or whole genome sequencing), more detailed SNP genomic profiles can be generated. Likewise, advances in computational analysis methods will enable cost-efficient analysis of finer SNP genomic profiles and updating of the master database of SNP-disease correlations.
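The 35% versus 5% example can be read as a base rule plus an override keyed on a known individual characteristic. Below is a minimal sketch of that reading; the identifier `ethnicity_x` stands in for the unnamed "particular ethnicity", and the function and table names are hypothetical.

```python
from typing import Dict, Optional

BASE_PROBABILITY = 0.35  # from the example: genotype E implies phenotype E at 35%

# Override for the "particular ethnicity" of the example; the key is a placeholder.
ETHNICITY_OVERRIDES: Dict[str, float] = {"ethnicity_x": 0.05}


def probability_of_phenotype_e(has_genotype_e: bool,
                               ethnicity: Optional[str]) -> Optional[float]:
    """Return the modified rule's probability, or None if the rule does not apply."""
    if not has_genotype_e:
        return None
    # Fall back to the existing rule when no characteristic-specific rule exists.
    return ETHNICITY_OVERRIDES.get(ethnicity, BASE_PROBABILITY)
```

The same pattern extends to other characteristics named in the text, such as gender or age, by adding further override tables.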
After generating the phenotype profile at step 116, the registered users or their healthcare managers may access their genomic or phenotype profiles through an online portal or website as in step 118. Reports including phenotype profiles and other information about phenotype profiles and genomic profiles may also be provided to registered users or their healthcare managers, as described in steps 120 and 122. The report may be printed out, stored in a registered user's computer, or viewed online.
FIG. 7 illustrates an example online report. The registered user may choose to display a single phenotype or more than one phenotype. The registered users may also have different View options, for example, a "Quick View" option as shown in FIG. 7. The phenotype may be a medical condition and the different treatments and symptoms in the quick report may be linked to other web pages containing further information about the treatment. For example, by clicking on a medication, a website may be directed that includes information about dosage, cost, side effects, and efficacy. The drug may also be compared to other treatments. The website may also include a link to the website of the pharmaceutical manufacturer. Another link may provide the registered user with the option of generating a pharmacogenomic (pharmacogenomic) map, which will include information on their likely response to the drug based on their genomic map. Links to alternatives to medication may also be provided, such as preventive behavior (e.g., fitness and weight loss); and may also provide links to dietary supplements, dietary plans, and links to nearby health clubs, health clinics, health and rehabilitation providers, metropolitan spa (day spa), and the like. Educational and informative videos, summaries of available treatments, possible therapies, and general advice may also be provided.
The online report may also provide a link to schedule individual doctors or genetic counseling appointments or to access an online genetic counselor or doctor, thereby providing the registered user with the opportunity to query more information about their phenotype profile. Links to online genetic consultation and physician queries may also be provided on the online report.
Reports may also be viewed in other forms, such as a composite view of a single phenotype, where more detail is provided for each category. For example, there may be more detailed statistics regarding the likelihood of a phenotype occurring for registered users; more information about typical symptoms or phenotypes, such as a range of symptoms representative of a medical condition or a physical non-medical condition (e.g., height); or more information about genes and genetic variants, such as population prevalence, e.g., in the world or in different countries, or in different age ranges or genders. For example, fig. 15 shows a summary of estimated lifetime risks for a number of states. The individual may view more information about a particular condition, such as prostate cancer (figure 16) or crohn's disease (figure 17).
In another embodiment, the report may be of an "entertaining" phenotype, e.g., the similarity of an individual's genomic profile to that of a known individual (e.g., albert einstein). The report can show the percent similarity between the individual genomic profile and the individual genomic profile of einstein, and can further show the predicted IQ of einstein and the predicted IQ of the individual. Further information may include the genomic profile of the total population and how its IQ is compared to the genomic profile and IQ of the individual and einstein.
In another embodiment, the report may display all phenotypes that have been associated with the genomic profile of the registered user. In other embodiments, the report may only show a phenotype determined to be positively correlated with the genomic profile of the individual. Individuals may select particular sub-classes that display phenotypes in other forms, such as medical-only phenotypes or disposable medical phenotypes only. For example, the disposable phenotypes and their associated genotypes may include crohn's disease (associated with IL23R and CARD 15), type 1 diabetes (associated with HLA-DR/DQ), lupus (associated with HLA-DRB1), psoriasis (HLA-C), multiple sclerosis (HLA-DQA1), graves' disease (HLA-DRB1), rheumatoid arthritis (HLA-DRB1), type 2 diabetes (TCF7L2), breast cancer (BRCA2), colon cancer (APC), situational memory (KIBRA), and osteoporosis (COL1a 1). Individuals may also select sub-classes that show phenotypes in the report, e.g., inflammatory diseases of medical conditions only or physical traits of non-medical conditions only. In some embodiments, an individual may choose to display all of the states for which an estimated risk is calculated for the individual by highlighting those states for which an estimated risk is calculated (e.g., fig. 15A, D), states with only a higher risk (fig. 15B), or states with only a lower risk (fig. 15C).
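The display choices at the end of this paragraph (all conditions, only higher-risk conditions, only lower-risk conditions) can be sketched as a simple filter over the estimated risks. The text does not say what "higher" and "lower" are measured against; the sketch assumes a comparison with the population-average lifetime risk, and all numbers and names below are hypothetical.

```python
from typing import Dict, List, Tuple

# condition -> (individual's estimated lifetime risk, assumed population average)
RiskTable = Dict[str, Tuple[float, float]]


def select_conditions(estimates: RiskTable, view: str = "all") -> List[str]:
    """Return the condition names to display for the chosen report view."""
    if view == "all":
        return sorted(estimates)
    if view == "higher":
        return sorted(c for c, (risk, avg) in estimates.items() if risk > avg)
    if view == "lower":
        return sorted(c for c, (risk, avg) in estimates.items() if risk < avg)
    raise ValueError(f"unknown view: {view}")


# Hypothetical estimates, for illustration only.
estimates: RiskTable = {
    "Crohn's disease": (0.010, 0.005),
    "prostate cancer": (0.12, 0.17),
    "type 2 diabetes": (0.30, 0.25),
}
higher_only = select_conditions(estimates, "higher")
lower_only = select_conditions(estimates, "lower")
```

Highlighting within the "all" view, as in FIGS. 15A and 15D, would be a presentation detail layered on top of the same table.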
The information delivered and communicated to the individual may be encrypted and confidential and access to the information by the individual may be controlled. Information derived from complex genomic profiles can be provided to individuals as regulatory-approved, understandable, medically relevant, and/or highly influential data. Information may also be of general importance, regardless of medical treatment. Information may be delivered to an individual cryptographically in several ways, including, but not limited to, an entrance interface and/or mail. More preferably, the information is provided to the individual encrypted via a portal interface to which the individual has secure and confidential access (if the individual so chooses). This interface is preferably provided through an online, internet web portal, or alternatively, through the phone or other means that allows private, secure, and easy-to-use access. The data transmission of genomic profiles, phenotypic profiles and reports over the network is provided to the individual or its health care manager.
Accordingly, FIG. 8 is a block diagram illustrating a representative example logic device through which phenotype profiles and reports may be generated. FIG. 8 shows a computer system (or digital device) 800 for receiving and storing a genomic profile, analyzing genotype correlations, generating rules based on genotype correlations, applying rules to the genomic profile, and generating phenotypic profiles and reports. The computer system 800 may be understood as a logical device capable of reading instructions from the media 811 and/or the network port 805, the network port 805 optionally being connected to a server 809 having a fixed media 812. The system shown in fig. 8 includes a CPU 801, a disk drive 803, an optional input device (e.g., keyboard 815 and/or mouse 816), and an optional monitor 807. Data communication with the server 809 at the local or remote location may be accomplished via the communication medium shown. A communication medium may include any means for transmitting and/or receiving data. The communication medium may be, for example, a network connection, a wireless connection, or an internet connection. This connection may provide communication over the World Wide Web. It is envisioned that data pertaining to the present invention may be transmitted over such means over a network or connection for receipt and/or verification by a party 822. Recipient 822 may be, but is not limited to, an individual, a registered user, a healthcare provider, or a healthcare manager. In one embodiment, the computer-readable medium comprises a medium adapted to convey the results of an analysis of a biological sample or genotype correlation. The medium may comprise results on a phenotype profile of an individual subject, wherein such results are obtained using the methods described herein.
The personal portal will preferably serve as the basic interface for the individual receiving and evaluating the genomic data. The portal will enable an individual to track the progress of their samples from collection to testing and to track the results. Through portal visits, individuals are presented with the relative risk of common genetic diseases based on their genomic profiles. The registered user can select through the portal which rules to apply to their genomic profile.
In one embodiment, one or more web pages will list phenotypes with a box next to each phenotype that registered users can check to include that phenotype in their phenotype profiles. Each phenotype can be linked to information about that phenotype to help registered users make an informed choice about which phenotypes to include in their phenotype profile. The web page may also organize phenotypes by disease group (e.g., treatable diseases or non-treatable diseases). For example, the registered user may select only actionable phenotypes, such as HLA-DQA1 and celiac disease. Registered users may also choose to display pre-symptomatic or post-symptomatic treatments for a phenotype. For example, an individual may select actionable phenotypes that have a pre-symptomatic treatment (beyond increased screening); for celiac disease, the pre-symptomatic treatment is a gluten-free diet. Another example is Alzheimer's disease, for which the pre-symptomatic treatments are statins, exercise, vitamins, and mental activity. Thrombosis is another example, for which the pre-symptomatic treatment is to avoid oral contraceptives and to avoid sitting still for prolonged periods. An example of a phenotype with an approved post-symptomatic treatment is wet AMD, associated with CFH, for which an individual may undergo laser treatment.
Phenotypes may also be organized by type or class of disease or condition, such as neurological, cardiovascular, endocrine, immunological, and the like. Phenotypes can also be grouped into medical and non-medical phenotypes. Other groupings of phenotypes on the web page may be by physical traits, physiological traits, mental traits, or emotional traits. The web page may further provide for selecting a whole group of phenotypes by checking a single box, for example, all phenotypes, only medically relevant phenotypes, only non-medically relevant phenotypes, only actionable phenotypes, only non-actionable phenotypes, different disease groups, or "entertainment" phenotypes. The "entertainment" phenotypes may include comparisons to a celebrity or other well-known individual, or comparisons to other animals or even other organisms. A list of genomic profiles available for comparison may also be provided on a web page for the registered user to select for comparison with the registered user's own genomic profile.
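By way of illustration only, the grouping and selection of phenotypes on such a web page might be sketched as follows. The class name, field names, catalog entries, and flag values are hypothetical and are not taken from the specification; the flags simply follow the examples given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    name: str
    medical: bool     # medical vs. non-medical phenotype
    treatable: bool   # a treatment or preventive measure is described
    group: str        # e.g. "immunological", "cardiovascular", "entertainment"


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    Phenotype("Celiac disease", True, True, "immunological"),
    Phenotype("Thrombosis", True, True, "cardiovascular"),
    Phenotype("Eye color", False, False, "physical trait"),
    Phenotype("Celebrity comparison", False, False, "entertainment"),
]


def select_phenotypes(catalog, *, medical=None, treatable=None, group=None):
    """Return names of phenotypes matching every filter that is not None."""
    selected = []
    for p in catalog:
        if medical is not None and p.medical != medical:
            continue
        if treatable is not None and p.treatable != treatable:
            continue
        if group is not None and p.group != group:
            continue
        selected.append(p.name)
    return selected


# Checking a single "treatable medical phenotypes only" box:
print(select_phenotypes(CATALOG, medical=True, treatable=True))
# Checking the "entertainment" box:
print(select_phenotypes(CATALOG, group="entertainment"))
```

Leaving a filter unset corresponds to not checking that box, so calling the function with no filters returns all phenotypes.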
The online portal may also provide a search engine to help registered users navigate the portal, search for a particular phenotype, or search for particular terms or information revealed by their phenotype profile or report. The portal may also provide links to partner services and product offerings. Additional links to support groups, message boards, and chat rooms for individuals with common or similar phenotypes may also be provided. The online portal may also provide links to other sites with more information about the phenotypes in the registered user's phenotype profile. The online portal may also provide services that allow registered users to share their phenotype profile and reports with friends, family, or healthcare managers. Registered users may choose which phenotypes in their phenotype profile to share with their friends, family, or healthcare managers.
The phenotype profiles and reports provide personalized genotype correlations for individuals. The genotype correlations provided to an individual can be used to inform personal health care and lifestyle choices. If a strong correlation is found between a genetic variant and a disease for which treatment is available, detection of the genetic variant can help in deciding to begin treatment of the disease and/or monitoring of the individual. Where a correlation is statistically significant but not regarded as strong, the individual may discuss the information with a personal physician and decide on an appropriate, beneficial course of action. Potential courses of action that may benefit an individual in view of a particular genotype correlation include administering therapeutic treatment, monitoring for a potential need for treatment or for the effects of treatment, or making lifestyle changes in diet, exercise, and other personal habits/activities. For example, a treatable phenotype such as celiac disease may have a pre-symptomatic treatment of a gluten-free diet. Likewise, through pharmacogenomics, genotype correlation information can be applied to predict the likely response of an individual to treatment with a particular drug or regimen of drugs, such as the likely efficacy or safety of a particular drug treatment.
The registered user may choose to provide the genomic profile and the phenotype profile to a healthcare manager, such as a physician or genetic counselor. The genomic and phenotype profiles may be accessed directly by the healthcare manager, printed by the registered user for delivery to the healthcare manager, or sent directly to the healthcare manager through the online portal (e.g., via a link on an online report).
The delivery of this relevant information will enable the patient to act in concert with his or her physician. In particular, discussions between a patient and the patient's physician may be facilitated by the personal portal, by links to medical information, and by the ability to integrate the patient's genomic information into his or her medical records. The medical information may include prevention and wellness information. The information provided to an individual patient by the present invention will enable the patient to make informed choices about his or her health care. In this way, patients will be able to make choices that may help them avoid and/or delay diseases that their individual genomic profile (inherited DNA) makes more likely. In addition, the patient will be able to adopt a treatment regimen tailored to his or her specific medical needs. Individuals will also be able to access their genotype data should they develop a disease and need this information to help their physician develop a treatment strategy.
Genotype correlation information can also be used in conjunction with genetic counseling to advise couples considering reproduction about potential genetic concerns for the mother, father, and/or child. A genetic counselor can provide information and support to registered users whose phenotype profile shows an increased risk for a particular condition or disease. Genetic counselors can interpret information about the condition, analyze inheritance patterns and risk of recurrence, and discuss available options with registered users. The genetic counselor can also provide supportive counseling and refer registered users to community or national support services. Genetic counseling may be included in a specific registration plan. In some embodiments, genetic counseling may be scheduled within 24 hours of a request and may be available at times such as evenings, Saturdays, Sundays, and/or holidays.
The individual's portal will also facilitate the delivery of additional information beyond the initial screening. Individuals will be informed of new scientific discoveries relevant to their personal genetic profile, such as information about new therapeutic or preventive strategies for their current or potential conditions. New findings may also be communicated to their healthcare managers. In a preferred embodiment, the registered user or their healthcare provider is electronically notified of new genotype correlations and new studies concerning the phenotypes in the registered user's phenotype profile. In other embodiments, emails about "entertainment" phenotypes are sent to registered users; e.g., an email can inform them that their genomic profile is 77% identical to that of Abraham Lincoln and that further information is available through the online portal.
The present invention also provides a computer code system for generating new rules, revising rules, combining rules, periodically updating rule sets with new rules, securely maintaining a genomic profile database, applying rules to genomic profiles to determine phenotypic profiles, and for generating reports. The computer code informs the registered user of new or revised correlations and new or revised reports, such as reports with new prevention and health information, information about new treatments under development, or available new treatments.
Business method
The present invention provides a business method for assessing genotype correlations of individuals based on a comparison of a patient's genomic profile to a clinical database of established medically relevant nucleotide variants. The present invention further provides a business method that uses a stored genomic profile of an individual to assess novel correlations that were initially unknown, generating an updated phenotype profile of the individual without requiring the individual to submit additional biological samples. FIG. 9 is a flow chart illustrating the business method.
The revenue stream for the business method of the present invention is generated in part in step 101, when an individual initially requests and purchases a personal genomic profile for genotype correlations with a variety of common human diseases, conditions, and physical states. The request and purchase may be made through a number of sources including, but not limited to, an online web portal, an online health service, and the individual's personal physician or a similar source of personal medical attention. In alternative embodiments, the genomic profile may be provided free of charge and the revenue stream may be generated in a subsequent step (e.g., step 103).
A registered user or consumer makes a request to purchase a phenotype profile. In response to the request and purchase, a collection kit is provided to the consumer in step 103 for collecting the biological sample used for genetic sample isolation. When the request is made online, by telephone, or through another source such that the consumer cannot readily obtain the collection kit in person, the collection kit is provided by expedited delivery, such as a courier service offering same-day or overnight delivery. The collection kit includes a container for the sample and packaging materials for expedited delivery of the sample to the laboratory where the genomic profile is generated. The kit may also include instructions for sending the sample to a sample processing facility or laboratory and instructions for accessing the consumer's genomic and phenotype profiles, which may be done through an online portal.
As explained in detail above, genomic DNA can be obtained from any of a variety of types of biological samples. Preferably, genomic DNA is isolated from saliva using a commercially available collection kit (e.g., a kit available from DNA Genotek). The use of saliva and such a kit enables non-invasive sample collection, since it is convenient for the consumer to provide a saliva sample in a container from the collection kit, and then seal the container. Additionally, saliva samples can be stored and transported at room temperature.
After the biological sample is deposited in the collection or specimen container, the consumer delivers the sample to the laboratory for processing in step 105. Typically, the consumer may use the packaging material provided in the collection kit to deliver/send the sample to the laboratory through rapid delivery, such as a courier service on the same day or overnight.
Laboratories that process samples and generate genomic profiles may follow appropriate government agency guidelines and regulations. For example, in the United States, a processing laboratory may be regulated by one or more federal agencies, such as the Food and Drug Administration (FDA) or the Centers for Medicare and Medicaid Services (CMS), and/or by one or more state agencies. Clinical laboratories in the United States may be accredited or approved under the Clinical Laboratory Improvement Amendments of 1988 (CLIA).
In step 107, the laboratory processes the sample as previously described to isolate a genetic sample of DNA or RNA. The isolated genetic sample is then analyzed and a genomic profile is generated in step 109. Preferably, a genomic SNP profile is generated. As described above, several methods can be used to generate SNP profiles. Preferably, high-density arrays (e.g., commercially available platforms from Affymetrix or Illumina) are used for SNP identification and profiling. For example, as described in more detail above, SNP profiles may be generated using an Affymetrix GeneChip assay. As technology evolves, other technology vendors may be able to generate high-density SNP profiles. In another embodiment, the genomic profile of the registered user will be the genomic sequence of the registered user.
After the genomic profile of the individual is generated, the genotype data is preferably encrypted, imported in step 111, and deposited in step 113 in an encrypted database or vault where the information is stored for future use. The genomic profile and related information may be confidential, with access to this private information and genomic profile restricted as directed by the individual and/or his or her personal physician. Others (e.g., the individual's family and genetic counselor) may also be granted access by the registered user.
The database or vault may be located on site at the processing laboratory. Alternatively, the database may be located at a separate location. In this case, the genomic profile data generated by the processing laboratory may be transferred in step 111 to a separate facility that houses the database.
After the genomic profile of the individual is generated, the individual's genetic variants are compared in step 115 to a clinical database of established medically relevant genetic variants. Alternatively, the genotype correlations may not be medically relevant but may still be included in the genotype correlation database, for example, physical traits such as eye color, or "entertainment" phenotypes such as similarity to a celebrity's genomic profile.
Medically relevant SNPs can be established through the scientific literature and related sources. Non-SNP genetic variants can also be established as correlating with a phenotype. Typically, the correlation of SNPs with a given disease is established by comparing the haplotype patterns of a group of people known to have the disease to those of a group of people without the disease. By analyzing many individuals, the frequencies of polymorphisms in a population can be determined, and these genotype frequencies can in turn be correlated with a particular phenotype (e.g., a disease or condition). Alternatively, the phenotype may be a non-medical condition.
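As a minimal sketch of such a comparison between a group with the disease and a group without it, the following computes allele frequencies and an allelic odds ratio from hypothetical counts. The choice of the odds ratio as the summary statistic, the function names, and the counts are all assumptions made for illustration; this passage does not prescribe a particular statistic.

```python
def allele_frequency(allele_count, total_alleles):
    """Fraction of observed alleles that are the allele of interest."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return allele_count / total_alleles


def odds_ratio(case_risk, case_other, control_risk, control_other):
    """Allelic odds ratio from a 2x2 table of allele counts.

    case_risk / case_other:       risk and non-risk allele counts in affected individuals
    control_risk / control_other: the same counts in unaffected individuals
    """
    if case_other == 0 or control_risk == 0:
        raise ValueError("odds ratio is undefined when a denominator count is zero")
    return (case_risk * control_other) / (case_other * control_risk)


# Hypothetical counts: 100 affected and 100 unaffected individuals (200 alleles each).
freq_cases = allele_frequency(60, 200)      # risk-allele frequency in the affected group
freq_controls = allele_frequency(40, 200)   # risk-allele frequency in the unaffected group
ratio = odds_ratio(60, 140, 40, 160)

print(f"cases: {freq_cases:.2f}, controls: {freq_controls:.2f}, odds ratio: {ratio:.2f}")
```

An odds ratio above 1 indicates that the allele is more common among affected individuals; whether the difference is statistically significant would require a separate test that is not shown here.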
Correlated SNPs and non-SNP genetic variants can also be determined by analyzing the stored genomic profiles of individuals, rather than from the available published literature. Individuals with stored genomic profiles may disclose phenotypes that have previously been determined. The genotypes of individuals who disclose a phenotype can be compared to those of individuals without the phenotype to determine correlations that can then be applied to other genomic profiles. Individuals whose genomic profiles have been determined may fill out a questionnaire about phenotypes that have previously been determined. The questionnaire may include questions about medical and non-medical conditions, such as previously diagnosed diseases, family history of medical conditions, lifestyle, physical traits, mental traits, age, social life, environment, and the like.
In one embodiment, individuals who fill out a questionnaire can have their genomic profile determined free of charge. In some embodiments, individuals fill out questionnaires periodically to gain free access to their profile and reports. In other embodiments, individuals who have filled out a questionnaire may be given a registration upgrade that provides a higher level of access than their previous registration level, or they may purchase or renew a registration at a reduced price.
To ensure scientific accuracy and importance, all information deposited in the database of medically relevant genetic variants in step 121 is first approved by a research/clinical advisory board and, if warranted, reviewed and overseen in step 119 by an appropriate governmental agency. For example, in the United States, the FDA may provide oversight by approving the algorithms used to validate data relating to genetic variants (typically SNPs, transcript levels, or mutations). In step 123, the scientific literature and other relevant sources are monitored for additional correlations between genetic variants and diseases or conditions; once their accuracy and importance have been confirmed, and upon review and approval by governmental agencies, these additional genotype correlations are added to the master database in step 125.
The combination of a database of approved and validated medically relevant genetic variants with a genome-wide individual profile will advantageously allow genetic risk assessment for a large number of diseases or conditions. After an individual's genomic profile has been compiled, the individual's genotype correlations may be determined by comparing the individual's nucleotide (genetic) variants or genetic markers to a database of human nucleotide variants that have been correlated with a particular phenotype (e.g., a disease, condition, or physical state). By comparing the individual's genomic profile to the master database of genotype correlations, individuals can be informed whether they are positive or negative for a genetic risk factor, and to what degree. Individuals will receive relative risk and/or predisposition data for a wide range of scientifically validated disease states (e.g., Alzheimer's disease, cardiovascular disease, blood clotting). For example, the genotype correlations in Table 1 may be included. In addition, SNP disease correlations in the database may include, but are not limited to, those shown in FIG. 4. Other correlations in FIGS. 5 and 6 may also be included. The business method of the present invention thus provides risk analysis for a large number of diseases and conditions without requiring any advance assumption about which diseases and conditions may be involved.
In other embodiments, the genotype correlations paired with the genome-wide individual profile are for non-medically relevant phenotypes, such as "entertainment" phenotypes or physical traits such as hair color. In a preferred embodiment, a rule or rule set is applied to the genomic profile or SNP profile of the individual, as described above. Applying the rules to the genomic profile generates a phenotype profile for the individual.
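One possible way to picture the application of a rule set to a genomic profile is sketched below. The `Rule` structure, the marker identifiers, the genotypes, and the effect labels are hypothetical placeholders rather than validated correlations, and the specification does not limit rules to single-marker lookups of this kind.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    snp_id: str      # marker the rule tests (hypothetical identifier)
    genotype: str    # genotype that triggers the rule
    phenotype: str   # phenotype correlated with that genotype
    effect: str      # illustrative label, not a validated risk estimate


def apply_rules(genomic_profile, rules):
    """Build a phenotype profile from a {snp_id: genotype} genomic profile."""
    phenotype_profile = {}
    for rule in rules:
        if genomic_profile.get(rule.snp_id) == rule.genotype:
            phenotype_profile.setdefault(rule.phenotype, []).append(rule.effect)
    return phenotype_profile


# Hypothetical rule set and genomic profile, for illustration only.
RULES = [
    Rule("snp_001", "AA", "phenotype_A", "increased risk"),
    Rule("snp_002", "CT", "phenotype_B", "increased risk"),
    Rule("snp_003", "GG", "hair color", "trait present"),
]
PROFILE = {"snp_001": "AA", "snp_002": "CC", "snp_003": "GG"}

RESULT = apply_rules(PROFILE, RULES)
print(RESULT)
```

Here the second rule does not fire because the stored genotype differs from the one the rule tests, so `phenotype_B` is absent from the resulting phenotype profile.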
Thus, as new correlations are discovered and validated, the master database of human genotype correlations is expanded with the additional genotype correlations. Updates may be made by accessing the relevant information from the individual genomic profiles stored in the database, as needed or appropriate. For example, a newly known genotype correlation may be based on a particular gene variant. Whether an individual may be affected by the new genotype correlation can then be determined by retrieving and comparing only that gene's portion of the individual's complete genomic profile.
The results of the genomic query are preferably analyzed and interpreted so that they can be presented to the individual in an understandable format. In step 117, the results of the initial screening are then provided to the patient in a secure, confidential manner, either by mail or through an online portal interface, as described in detail above.
The report may include a phenotype profile as well as genomic information about the phenotypes in the phenotype profile, e.g., basic genetic information about the genes involved or statistical information about the genetic variants in different populations. Other information based on phenotype profiles that may be included in the report includes preventive strategies, health information, treatment methods, symptom recognition, early detection protocols, intervention protocols, and further identification and classification of phenotypes. Controlled, moderated updates are made, or can be made, after the initial screening of the individual's genomic profile.
When new genotype correlations emerge and are validated and approved, the individual's genomic profile is updated, or is made available for updating, in conjunction with updates to the master database. New rules based on new genotype correlations may be applied to the initial genomic profile to provide an updated phenotype profile. An updated genotype correlation profile may be generated in step 127 by comparing the relevant portion of the individual's genomic profile to the new genotype correlations. For example, if a new genotype correlation is found based on variation in a particular gene, the portion of the individual's genomic profile corresponding to that gene can be analyzed for the new genotype correlation. In this case, one or more new rules may be applied to generate an updated phenotype profile, rather than reapplying the entire rule set, including rules that have already been applied. In step 129, the results of the updated genotype correlations are provided to the individual in an encrypted manner.
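The idea of updating from the stored profile by applying only the new rules, rather than the whole rule set, could be sketched as follows. The function name, the tuple layout of a rule, and all identifiers in the example data are hypothetical; the sketch assumes a phenotype profile that simply records which markers matched.

```python
def update_phenotype_profile(genomic_profile, phenotype_profile, new_rules):
    """Apply only newly added rules to a stored genomic profile.

    genomic_profile:   {snp_id: genotype}, stored from the original sample
    phenotype_profile: {phenotype: set of matching snp_ids} from earlier screening
    new_rules:         iterable of (snp_id, genotype, phenotype) tuples
    Returns a new phenotype profile; the inputs are left unchanged.
    """
    updated = {ph: set(snps) for ph, snps in phenotype_profile.items()}
    for snp_id, genotype, phenotype in new_rules:
        # Only the markers named by the new rules are looked up.
        if genomic_profile.get(snp_id) == genotype:
            updated.setdefault(phenotype, set()).add(snp_id)
    return updated


# Hypothetical stored data, for illustration only.
stored_profile = {"snp_a": "AG", "snp_b": "CC", "snp_c": "TT"}
initial = {"phenotype_x": {"snp_a"}}
new_rules = [("snp_c", "TT", "phenotype_y"), ("snp_b", "CT", "phenotype_z")]

updated = update_phenotype_profile(stored_profile, initial, new_rules)
print(updated)
```

Because the genomic profile is read from storage, no additional biological sample is involved in producing the updated phenotype profile.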
The initial and updated phenotype profiles may be services provided to registered users or consumers. Different levels of registration for genomic profiling and combinations thereof may be provided. Likewise, the registration level may be varied to provide individuals with a choice of the amount of service they wish to receive with their genotype correlations. In this way, the level of service provided will vary with the registration level of the service purchased by the individual.
Entry-level registration may include a genomic profile and an initial phenotype profile. This may be the base registration level. There may be different levels of service within the base registration level. For example, a particular registration level may provide referrals to genetic counseling, to physicians with particular expertise in treating or preventing a particular disease, and other service options. Genetic counseling can be obtained online or by telephone. In another embodiment, the price of the registration may depend on the number of phenotypes the individual selects for their phenotype profile. Another option may be whether the registered user chooses to have access to online genetic counseling.
In another case, registration may provide an initial genome-wide genotype correlation while maintaining the genomic profile of the individual in the database; this database may be encrypted if the individual so chooses. After this initial analysis, subsequent analyses and additional results may be completed upon request and additional payment by the individual. This may be an advanced registration level.
In one embodiment of the business method of the present invention, the individual's risks are updated and the corresponding information is made available to the individual on a registration basis. Registered users who purchase advanced registrations may obtain updates. Registration for genotype correlation analysis can provide updates for a particular category or subset of new genotype correlations according to the individual's preferences. For example, an individual may wish to learn only of genotype correlations for which a known course of treatment or prevention exists. To help the individual decide whether to have additional analyses performed, the individual may be provided with information about additional genotype correlations that have become available. This information can conveniently be mailed or emailed to registered users.
Within the advanced registration level, there may be further levels of service, such as those mentioned for the base registration level. Other registration models may be provided within the advanced level. For example, the highest level may provide unlimited updates and reports to registered users. The registered user's profile may be updated whenever new correlations and rules are determined. At this level, registered users may also grant access to an unlimited number of individuals, such as family members and healthcare managers. Registered users may also have unlimited access to online genetic counselors and physicians.
The next registration level within the advanced tier may be more limited, for example providing a limited number of updates. Registered users may obtain a limited number of updates to their genomic profile during the registration period, e.g., 4 times a year. In another registration level, registered users may have their stored genomic profile updated once a week, once a month, or once a year. In another embodiment, registered users may choose only a limited number of phenotypes against which to update their genomic profile.
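The relationship between registration levels and update allowances might be modeled as in the following sketch. The level names and the `may_update` helper are hypothetical; only the "4 times a year" figure and the notion of unlimited updates come from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistrationLevel:
    name: str
    updates_per_year: Optional[int]  # None means unlimited updates


# Hypothetical levels, for illustration only.
BASE = RegistrationLevel("base", 0)               # initial phenotype profile only
LIMITED = RegistrationLevel("limited", 4)         # e.g., 4 updates a year
UNLIMITED = RegistrationLevel("unlimited", None)  # updates whenever new rules appear


def may_update(level, updates_used_this_year):
    """Return True if another update is allowed under the given level."""
    if level.updates_per_year is None:
        return True
    return updates_used_this_year < level.updates_per_year


print(may_update(BASE, 0), may_update(LIMITED, 3), may_update(LIMITED, 4), may_update(UNLIMITED, 100))
```

A weekly, monthly, or yearly schedule could be expressed in the same structure by choosing the corresponding allowance per year.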
The personal portal will also conveniently enable individuals to maintain a registry of risk or relevance updates and/or information updates, or request updated risk assessments and information. As described above, different levels of registration may be provided to enable an individual to select various levels of genotype correlation results and updates, and registered users may select different levels of registration through their personal portals.
Any of these registration options will contribute to the revenue stream for the business method of the present invention. The revenue stream for the commercial process of the present invention is also increased by adding new consumers and registered users, wherein new genomic profiles are added to the database.
Table 1: Representative genes having genetic variants correlated with a phenotype.
Gene | Phenotype |
A2M | Alzheimer's disease |
ABCA1 | Cholesterol, HDL |
ABCB1 | HIV |
ABCB1 | Epilepsy |
ABCB1 | Complications of renal transplantation |
ABCB1 | Digoxin, serum concentration |
ABCB1 | Crohn's disease; ulcerative colitis |
ABCB1 | Parkinson's disease |
ABCC8 | Type 2 diabetes mellitus |
ABCC8 | Diabetes mellitus, type 2 |
ABO | Myocardial infarction |
ACADM | Medium chain acyl-CoA dehydrogenase deficiency |
ACDC | Type 2 diabetes mellitus |
ACE | Type 2 diabetes mellitus |
ACE | Hypertension |
ACE | Alzheimer's disease |
ACE | Myocardial infarction |
ACE | Cardiovascular |
ACE | Left ventricular hypertrophy |
ACE | Coronary artery disease |
ACE | Atherosclerosis, coronary |
ACE | Retinopathy, diabetic |
ACE | Systemic Lupus Erythematosus (SLE) |
ACE | Blood pressure, arterial |
ACE | Erectile dysfunction |
ACE | Lupus |
ACE | Polycystic kidney disease |
ACE | Stroke |
ACP1 | Diabetes mellitus, type 1 |
ACSM1(LIP)c | Cholesterol levels |
ADAM33 | Asthma |
ADD1 | Hypertension |
ADD1 | Blood pressure, arterial |
ADH1B | Alcohol abuse |
ADH1C | Alcohol abuse |
ADIPOQ | Diabetes mellitus, type 2 |
ADIPOQ | Obesity |
ADORA2A | Panic disorder |
ADRB1 | Hypertension |
ADRB1 | Heart failure |
ADRB2 | Asthma |
ADRB2 | Hypertension |
ADRB2 | Obesity |
ADRB2 | Blood pressure, arterial |
ADRB2 | Type 2 diabetes mellitus |
ADRB3 | Obesity |
ADRB3 | Type 2 diabetes mellitus |
ADRB3 | Hypertension |
AGT | Hypertension |
AGT | Type 2 diabetes mellitus |
AGT | Essential hypertension |
AGT | Myocardial infarction |
AGTR1 | Hypertension |
AGTR2 | Hypertension |
AHR | Breast cancer |
ALAD | Lead toxicity |
ALDH2 | Alcoholism |
ALDH2 | Alcohol abuse |
ALDH2 | Colorectal cancer |
ALDRL2 | Type 2 diabetes mellitus |
ALOX5 | Asthma |
ALOX5AP | Asthma |
APBB1 | Alzheimer's disease |
APC | Colorectal cancer |
APEX1 | Lung cancer |
APOA1 | Atherosclerosis, coronary |
APOA1 | Cholesterol, HDL |
APOA1 | Coronary artery disease |
APOA1 | Type 2 diabetes mellitus |
APOA4 | Type 2 diabetes mellitus |
APOA5 | Triglycerides |
APOA5 | Atherosclerosis, coronary |
APOB | Hypercholesterolemia |
APOB | Obesity |
APOB | Cardiovascular |
APOB | Coronary artery disease |
APOB | Coronary heart disease |
APOB | Type 2 diabetes mellitus |
APOC1 | Alzheimer's disease |
APOC3 | Triglycerides |
APOC3 | Type 2 diabetes mellitus |
APOE | Alzheimer's disease |
APOE | Type 2 diabetes mellitus |
APOE | Multiple sclerosis |
APOE | Atherosclerosis, coronary |
APOE | Parkinson's disease |
APOE | Coronary heart disease |
APOE | Myocardial infarction |
APOE | Stroke |
APOE | Alzheimer's disease |
APOE | Coronary artery disease |
APP | Alzheimer's disease |
AR | Prostate cancer |
AR | Breast cancer |
ATM | Breast cancer |
ATP7B | Wilson's disease |
ATXN8OS | Spinocerebellar ataxia |
BACE1 | Alzheimer's disease |
BCHE | Alzheimer's disease |
BDKRB2 | Hypertension |
BDNF | Alzheimer's disease |
BDNF | Bipolar disorder |
BDNF | Parkinson's disease |
BDNF | Schizophrenia |
BDNF | Memory |
BGLAP | Bone mineral density |
BRAF | Thyroid cancer |
BRCA1 | Breast cancer |
BRCA1 | Breast cancer; ovarian cancer |
BRCA1 | Ovarian cancer |
BRCA2 | Breast cancer |
BRCA2 | Breast cancer; ovarian cancer |
BRCA2 | Ovarian cancer |
BRIP1 | Breast cancer |
C4A | Systemic Lupus Erythematosus (SLE) |
CALCR | Bone mineral density |
CAMTA1 | Episodic memory |
CAPN10 | Diabetes mellitus, type 2 |
CAPN10 | Type 2 diabetes mellitus |
CAPN3 | Muscular dystrophy |
CARD15 | Crohn's disease |
CARD15 | Crohn's disease; ulcerative colitis |
CARD15 | Inflammatory bowel disease |
CART | Obesity |
CASR | Bone mineral density |
CCKAR | Schizophrenia |
CCL2 | Systemic Lupus Erythematosus (SLE) |
CCL5 | HIV |
CCL5 | Asthma |
CCND1 | Colorectal cancer |
CCR2 | HIV |
CCR2 | HIV infection |
CCR2 | Hepatitis C |
CCR2 | Myocardial infarction |
CCR3 | Asthma |
CCR5 | HIV |
CCR5 | HIV infection |
CCR5 | Hepatitis C |
CCR5 | Asthma |
CCR5 | Multiple sclerosis |
CD14 | Atopy |
CD14 | Asthma |
CD14 | Crohn's disease |
CD14 | Crohn's disease; ulcerative colitis |
CD14 | Periodontitis |
CD14 | Total IgE |
CDH1 | Prostate cancer |
CDH1 | Colorectal cancer |
CDKN2A | Melanoma |
CDSN | Psoriasis vulgaris |
CEBPA | Leukemia, myeloid |
CETP | Atherosclerosis, coronary |
CETP | Coronary heart disease |
CETP | Hypercholesterolemia |
CFH | Macular degeneration |
CFTR | Cystic fibrosis |
CFTR | Pancreatitis |
CFTR | Cystic fibrosis |
CHAT | Alzheimer's disease |
CHEK2 | Breast cancer |
CHRNA7 | Schizophrenia |
CMA1 | Atopic dermatitis |
CNR1 | Schizophrenia |
COL1A1 | Bone mineral density |
COL1A1 | Osteoporosis |
COL1A2 | Bone mineral density |
COL2A1 | Osteoarthritis |
COMT | Schizophrenia |
COMT | Breast cancer |
COMT | Parkinson's disease |
COMT | Bipolar disorder |
COMT | Obsessive-compulsive disorder |
COMT | Alcoholism |
CR1 | Systemic Lupus Erythematosus (SLE) |
CRP | C-reactive protein |
CST3 | Alzheimer's disease |
CTLA4 | Type 1 diabetes mellitus |
CTLA4 | Graves' disease |
CTLA4 | Multiple sclerosis |
CTLA4 | Rheumatoid arthritis |
CTLA4 | Systemic Lupus Erythematosus (SLE) |
CTLA4 | Lupus erythematosus |
CTLA4 | Celiac disease |
CTSD | Alzheimer's disease |
CX3CR1 | HIV |
CXCL12 | HIV |
CXCL12 | HIV infection |
CYBA | Atherosclerosis, coronary |
CYBA | Hypertension |
CYP11B2 | Hypertension |
CYP11B2 | Left ventricular hypertrophy |
CYP17A1 | Breast cancer |
CYP17A1 | Prostate cancer |
CYP17A1 | Endometriosis |
CYP17A1 | Endometrial cancer |
CYP19A1 | Breast cancer |
CYP19A1 | Prostate cancer |
CYP19A1 | Endometriosis |
CYP1A1 | Lung cancer |
CYP1A1 | Breast cancer |
CYP1A1 | Colorectal cancer |
CYP1A1 | Prostate cancer |
CYP1A1 | Esophageal cancer |
CYP1A1 | Endometriosis |
CYP1A1 | Cytogenetic studies |
CYP1A2 | Schizophrenia |
CYP1A2 | Colorectal cancer |
CYP1B1 | Breast cancer |
CYP1B1 | Glaucoma |
CYP1B1 | Prostate cancer |
CYP21A2 | 21-hydroxylase deficiency |
CYP21A2 | Congenital adrenal hyperplasia |
CYP21A2 | Adrenal hyperplasia, congenital |
CYP2A6 | Smoking behaviour |
CYP2A6 | Nicotine |
CYP2A6 | Lung cancer |
CYP2C19 | Helicobacter pylori infection |
CYP2C19 | Phenytoin |
CYP2C19 | Stomach disease |
CYP2C8 | Malaria, Plasmodium falciparum |
CYP2C9 | Anticoagulant complications |
CYP2C9 | Warfarin sensitivity |
CYP2C9 | Warfarin therapy, response to |
CYP2C9 | Colorectal cancer |
CYP2C9 | Phenytoin |
CYP2C9 | Acenocoumarol response |
CYP2C9 | Blood coagulation disorders |
CYP2C9 | Hypertension |
CYP2D6 | Colorectal cancer |
CYP2D6 | Parkinson's disease |
CYP2D6 | CYP2D6 poor metabolizer phenotype |
CYP2E1 | Lung cancer |
CYP2E1 | Colorectal cancer |
CYP3A4 | Prostate cancer |
CYP3A5 | Prostate cancer |
CYP3A5 | Esophageal cancer |
CYP46A1 | Alzheimer's disease |
DBH | Schizophrenia |
DHCR7 | Smith-Lemli-Opitz syndrome |
DISC1 | Schizophrenia |
DLST | Alzheimer's disease |
DMD | Muscular dystrophy |
DRD2 | Alcoholism |
DRD2 | Schizophrenia |
DRD2 | Smoking behaviour |
DRD2 | Parkinson's disease |
DRD2 | Tardive dyskinesia |
DRD3 | Schizophrenia |
DRD3 | Tardive dyskinesia |
DRD3 | Bipolar disorder |
DRD4 | Attention deficit hyperactivity disorder |
DRD4 | Schizophrenia |
DRD4 | Novelty seeking |
DRD4 | ADHD |
DRD4 | Personality traits |
DRD4 | Abuse of heroin |
DRD4 | Abuse of alcohol |
DRD4 | Alcoholism |
DRD4 | Personality disorder |
DTNBP1 | Schizophrenia |
EDN1 | Hypertension |
EGFR | Lung cancer |
ELAC2 | Prostate cancer |
ENPP1 | Type 2 diabetes mellitus |
EPHB2 | Prostate cancer |
EPHX1 | Lung cancer |
EPHX1 | Colorectal cancer |
EPHX1 | Cytogenetic studies |
EPHX1 | Chronic obstructive pulmonary disease/COPD |
ERBB2 | Breast cancer |
ERCC1 | Lung cancer |
ERCC1 | Colorectal cancer |
ERCC2 | Lung cancer |
ERCC2 | Cytogenetic studies |
ERCC2 | Cancer of the bladder |
ERCC2 | Colorectal cancer |
ESR1 | Bone mineral density |
ESR1 | Bone mineral density |
ESR1 | Breast cancer |
ESR1 | Endometriosis |
ESR1 | Osteoporosis |
ESR2 | Bone mineral density |
ESR2 | Breast cancer |
Estrogen receptors | Bone mineral density |
F2 | Coronary heart disease |
F2 | Stroke |
F2 | Thromboembolism, venous |
F2 | Pre-eclampsia |
F2 | Thrombosis |
F5 | Thromboembolism, venous |
F5 | Pre-eclampsia |
F5 | Myocardial infarction |
F5 | Stroke |
F5 | Stroke, ischemic |
F7 | Atherosclerosis, coronary |
F7 | Myocardial infarction |
F8 | Hemophilia |
F9 | Hemophilia |
FABP2 | Type 2 diabetes mellitus |
FAS | Alzheimer's disease |
FASLG | Multiple sclerosis |
FCGR2A | Systemic Lupus Erythematosus (SLE) |
FCGR2A | Lupus erythematosus |
FCGR2A | Periodontitis |
FCGR2A | Rheumatoid arthritis |
FCGR2B | Lupus erythematosus |
FCGR2B | Systemic Lupus Erythematosus (SLE) |
FCGR3A | Systemic Lupus Erythematosus (SLE) |
FCGR3A | Lupus erythematosus |
FCGR3A | Periodontitis |
FCGR3A | Arthritis |
FCGR3A | Rheumatoid arthritis |
FCGR3B | Periodontitis |
FCGR3B | Periodontal disease |
FCGR3B | Lupus erythematosus |
FGB | Fibrinogen |
FGB | Myocardial infarction |
FGB | Coronary heart disease |
FLT3 | Leukemia, myeloid |
FLT3 | Leukemia |
FMR1 | Fragile X syndrome |
FRAXA | Fragile X syndrome |
FUT2 | Helicobacter pylori infection |
FVL | Factor V Leiden |
G6PD | G6PD deficiency |
G6PD | Hyperbilirubinemia |
GABRA5 | Bipolar disorder |
GBA | Gaucher disease |
GBA | Parkinson's disease |
GCGR(FAAH,ML4R,UCP2) | Body weight/obesity |
GCK | Type 2 diabetes mellitus |
GCLM(F12,TLR4) | Atherosclerosis, myocardial infarction |
GDNF | Schizophrenia |
GHRL | Obesity |
GJB1 | Charcot-Marie-Tooth disease |
GJB2 | Deafness |
GJB2 | Hearing loss, sensorineural nonsyndromic |
GJB2 | Hearing loss, sensorineural |
GJB2 | Hearing loss/deafness |
GJB6 | Hearing loss, sensorineural nonsyndromic |
GJB6 | Hearing loss/deafness |
GNAS | Hypertension |
GNB3 | Hypertension |
GPX1 | Lung cancer |
GRIN1 | Schizophrenia |
GRIN2B | Schizophrenia |
GSK3B | Bipolar disorder |
GSTM1 | Lung cancer |
GSTM1 | Colorectal cancer |
GSTM1 | Breast cancer |
GSTM1 | Prostate cancer |
GSTM1 | Cytogenetic studies |
GSTM1 | Cancer of the bladder |
GSTM1 | Esophageal cancer |
GSTM1 | Head and neck cancer |
GSTM1 | Leukemia |
GSTM1 | Parkinson's disease |
GSTM1 | Stomach cancer |
GSTP1 | Lung cancer |
GSTP1 | Colorectal cancer |
GSTP1 | Breast cancer |
GSTP1 | Cytogenetic studies |
GSTP1 | Prostate cancer |
GSTT1 | Lung cancer |
GSTT1 | Colorectal cancer |
GSTT1 | Breast cancer |
GSTT1 | Prostate cancer |
GSTT1 | Cancer of the bladder |
GSTT1 | Cytogenetic studies |
GSTT1 | Asthma |
GSTT1 | Toxicity of benzene |
GSTT1 | Esophageal cancer |
GSTT1 | Head and neck cancer |
GYS1 | Type 2 diabetes mellitus |
HBB | Thalassemia |
HBB | Thalassemia, beta- |
HD | Huntington's chorea |
HFE | Hemochromatosis |
HFE | Iron level |
HFE | Colorectal cancer |
HK2 | Type 2 diabetes mellitus |
HLA | Rheumatoid arthritis |
HLA | Type 1 diabetes mellitus |
HLA | Behcet's disease |
HLA | Celiac disease |
HLA | Psoriasis vulgaris |
HLA | Graves disease |
HLA | Multiple sclerosis |
HLA | Schizophrenia |
HLA | Asthma |
HLA | Diabetes mellitus |
HLA | Lupus |
HLA-A | Leukemia |
HLA-A | HIV |
HLA-A | Diabetes mellitus, type 1 |
HLA-A | Graft versus host disease |
HLA-A | Multiple sclerosis |
HLA-B | Leukemia |
HLA-B | Behcet's disease |
HLA-B | Celiac disease |
HLA-B | Diabetes mellitus, type 1 |
HLA-B | Graft versus host disease |
HLA-B | Sarcoidosis |
HLA-C | Psoriasis vulgaris |
HLA-DPA1 | Measles |
HLA-DPB1 | Diabetes mellitus, type 1 |
HLA-DPB1 | Asthma |
HLA-DQA1 | Diabetes mellitus, type 1 |
HLA-DQA1 | Celiac disease |
HLA-DQA1 | Cervical cancer |
HLA-DQA1 | Asthma |
HLA-DQA1 | Multiple sclerosis |
HLA-DQA1 | Diabetes, type 2; diabetes mellitus, type 1 |
HLA-DQA1 | Lupus erythematosus |
HLA-DQA1 | Pregnancy loss, recurrent |
HLA-DQA1 | Psoriasis vulgaris |
HLA-DQB1 | Diabetes mellitus, type 1 |
HLA-DQB1 | Celiac disease |
HLA-DQB1 | Multiple sclerosis |
HLA-DQB1 | Cervical cancer |
HLA-DQB1 | Lupus erythematosus |
HLA-DQB1 | Pregnancy loss, recurrent |
HLA-DQB1 | Arthritis |
HLA-DQB1 | Asthma |
HLA-DQB1 | HIV |
HLA-DQB1 | Lymphoma |
HLA-DQB1 | Tuberculosis |
HLA-DQB1 | Rheumatoid arthritis |
HLA-DQB1 | Diabetes mellitus, type 2 |
HLA-DQB1 | Graft versus host disease |
HLA-DQB1 | Narcolepsy |
HLA-DQB1 | Arthritis, rheumatoid |
HLA-DQB1 | Cholangitis, sclerosing |
HLA-DQB1 | Diabetes, type 2; diabetes mellitus, type 1 |
HLA-DQB1 | Graves' disease |
HLA-DQB1 | Hepatitis C |
HLA-DQB1 | Hepatitis C, chronic |
HLA-DQB1 | Malaria |
HLA-DQB1 | Malaria, Plasmodium falciparum |
HLA-DQB1 | Melanoma |
HLA-DQB1 | Psoriasis vulgaris |
HLA-DQB1 | Sjogren's syndrome |
HLA-DQB1 | Systemic Lupus Erythematosus (SLE) |
HLA-DRB1 | Diabetes mellitus, type 1 |
HLA-DRB1 | Multiple sclerosis |
HLA-DRB1 | Systemic Lupus Erythematosus (SLE) |
HLA-DRB1 | Rheumatoid arthritis |
HLA-DRB1 | Cervical cancer |
HLA-DRB1 | Arthritis |
HLA-DRB1 | Celiac disease |
HLA-DRB1 | Lupus erythematosus |
HLA-DRB1 | Sarcoidosis |
HLA-DRB1 | HIV |
HLA-DRB1 | Tuberculosis |
HLA-DRB1 | Graves' disease |
HLA-DRB1 | Lymphoma |
HLA-DRB1 | Psoriasis vulgaris |
HLA-DRB1 | Asthma |
HLA-DRB1 | Crohn's disease |
HLA-DRB1 | Graft versus host disease |
HLA-DRB1 | Hepatitis C, chronic |
HLA-DRB1 | Narcolepsy |
HLA-DRB1 | Sclerosis, systemic |
HLA-DRB1 | Sjogren's syndrome |
HLA-DRB1 | Type 1 diabetes mellitus |
HLA-DRB1 | Arthritis, rheumatoid |
HLA-DRB1 | Cholangitis, sclerosing |
HLA-DRB1 | Diabetes, type 2; diabetes mellitus, type 1 |
HLA-DRB1 | Helicobacter pylori infection |
HLA-DRB1 | Hepatitis C |
HLA-DRB1 | Juvenile arthritis |
HLA-DRB1 | Leukemia |
HLA-DRB1 | Malaria |
HLA-DRB1 | Melanoma |
HLA-DRB1 | Pregnancy loss, recurrent |
HLA-DRB3 | Psoriasis vulgaris |
HLA-G | Pregnancy loss, recurrent |
HMOX1 | Atherosclerosis, coronary |
HNF4A | Diabetes mellitus, type 2 |
HNF4A | Type 2 diabetes mellitus |
HSD11B2 | Hypertension |
HSD17B1 | Breast cancer |
HTR1A | Depression, major |
HTR1B | Dependence on alcohol |
HTR1B | Alcoholism |
HTR2A | Memory |
HTR2A | Schizophrenia |
HTR2A | Bipolar disorder |
HTR2A | Depression |
HTR2A | Depression, major |
HTR2A | Suicide |
HTR2A | Alzheimer's disease |
HTR2A | Anorexia nervosa |
HTR2A | Hypertension |
HTR2A | Obsessive-compulsive disorder |
HTR2C | Schizophrenia |
HTR6 | Alzheimer's disease |
HTR6 | Schizophrenia |
HTRA1 | Wet age-related macular degeneration |
IAPP | Type 2 diabetes mellitus |
IDE | Alzheimer's disease |
IFNG | Tuberculosis |
IFNG | Type 1 diabetes mellitus |
IFNG | Graft versus host disease |
IFNG | Hepatitis B |
IFNG | Multiple sclerosis |
IFNG | Asthma |
IFNG | Breast cancer |
IFNG | Kidney transplantation |
IFNG | Complications of renal transplantation |
IFNG | Longevity |
IFNG | Pregnancy loss, recurrent |
IGFBP3 | Breast cancer |
IGFBP3 | Prostate cancer |
IL10 | Systemic Lupus Erythematosus (SLE) |
IL10 | Asthma |
IL10 | Graft versus host disease |
IL10 | HIV |
IL10 | Kidney transplantation |
IL10 | Complications of renal transplantation |
IL10 | Hepatitis B |
IL10 | Juvenile arthritis |
IL10 | Longevity |
IL10 | Multiple sclerosis |
IL10 | Pregnancy loss, recurrent |
IL10 | Rheumatoid arthritis |
IL10 | Tuberculosis |
IL12B | Type 1 diabetes mellitus |
IL12B | Asthma |
IL13 | Asthma |
IL13 | Atopy |
IL13 | Chronic obstructive pulmonary disease/COPD |
IL13 | Graves' disease |
IL1A | Periodontitis |
IL1A | Alzheimer's disease |
IL1B | Periodontitis |
IL1B | Alzheimer's disease |
IL1B | Stomach cancer |
IL1R1 | Type 1 diabetes mellitus |
IL1RN | Stomach cancer |
IL2 | Asthma; eczema; allergic diseases |
IL4 | Asthma |
IL4 | Atopy |
IL4 | HIV |
IL4R | Asthma |
IL4R | Atopy |
IL4R | Total serum IgE |
IL6 | Bone mineralization |
IL6 | Kidney transplantation |
IL6 | Complications of renal transplantation |
IL6 | Longevity |
IL6 | Multiple sclerosis |
IL6 | Bone mineral density |
IL6 | Bone mineral density |
IL6 | Colorectal cancer |
IL6 | Juvenile arthritis |
IL6 | Rheumatoid arthritis |
IL9 | Asthma |
INHA | Premature ovarian failure |
INS | Type 1 diabetes mellitus |
INS | Type 2 diabetes mellitus |
INS | Diabetes mellitus, type 1 |
INS | Obesity |
INS | Prostate cancer |
INSIG2 | Obesity |
INSR | Type 2 diabetes mellitus |
INSR | Hypertension |
INSR | Polycystic ovarian syndrome |
IPF1 | Diabetes mellitus, type 2 |
IRS1 | Type 2 diabetes mellitus |
IRS1 | Diabetes mellitus, type 2 |
IRS2 | Diabetes mellitus, type 2 |
ITGB3 | Myocardial infarction |
ITGB3 | Atherosclerosis, coronary |
ITGB3 | Coronary heart disease |
ITGB3 | Myocardial infarction |
KCNE1 | EKG, abnormal |
KCNE2 | EKG, abnormal |
KCNH2 | EKG, abnormal |
KCNH2 | Long QT syndrome |
KCNJ11 | Diabetes mellitus, type 2 |
KCNJ11 | Type 2 diabetes mellitus |
KCNN3 | Schizophrenia |
KCNQ1 | EKG, abnormal |
KCNQ1 | Long QT syndrome |
KIBRA | Episodic memory |
KLK1 | Hypertension |
KLK3 | Prostate cancer |
KRAS | Colorectal cancer |
LDLR | Hypercholesterolemia |
LDLR | Hypertension |
LEP | Obesity |
LEPR | Obesity |
LIG4 | Breast cancer |
LIPC | Atherosclerosis, coronary |
LPL | Coronary artery disease |
LPL | Hyperlipidemia |
LPL | Triglycerides |
LRP1 | Alzheimer's disease |
LRP5 | Bone mineral density |
LRRK2 | Parkinson's disease |
LRRK2 | Parkinson's disease |
LTA | Type 1 diabetes mellitus |
LTA | Asthma |
LTA | Systemic Lupus Erythematosus (SLE) |
LTA | Septicemia |
LTC4S | Asthma |
MAOA | Alcoholism |
MAOA | Schizophrenia |
MAOA | Bipolar disorder |
MAOA | Smoking behaviour |
MAOA | Personality disorder |
MAOB | Parkinson's disease |
MAOB | Smoking behaviour |
MAPT | Parkinson's disease |
MAPT | Alzheimer's disease |
MAPT | Dementia |
MAPT | Frontotemporal dementia |
MAPT | Progressive supranuclear palsy |
MC1R | Melanoma |
MC3R | Obesity |
MC4R | Obesity |
MECP2 | Rett syndrome |
MEFV | Familial Mediterranean fever |
MEFV | Amyloidosis |
MICA | Type 1 diabetes mellitus |
MICA | Behcet's disease |
MICA | Celiac disease |
MICA | Rheumatoid arthritis |
MICA | Systemic Lupus Erythematosus (SLE) |
MLH1 | Colorectal cancer |
MME | Alzheimer's disease |
MMP1 | Lung cancer |
MMP1 | Ovarian cancer |
MMP1 | Periodontitis |
MMP3 | Myocardial infarction |
MMP3 | Ovarian cancer |
MMP3 | Rheumatoid arthritis |
MPO | Lung cancer |
MPO | Alzheimer's disease |
MPO | Breast cancer |
MPZ | Charcot-Marie-Tooth disease |
MS4A2 | Asthma |
MS4A2 | Atopy |
MSH2 | Colorectal cancer |
MSH6 | Colorectal cancer |
MSR1 | Prostate cancer |
MTHFR | Colorectal cancer |
MTHFR | Type 2 diabetes mellitus |
MTHFR | Neural tube defect |
MTHFR | Homocysteine |
MTHFR | Thromboembolism, venous |
MTHFR | Atherosclerosis, coronary |
MTHFR | Alzheimer's disease |
MTHFR | Esophageal cancer |
MTHFR | Pre-eclampsia |
MTHFR | Pregnancy loss, recurrent |
MTHFR | Stroke |
MTHFR | Thrombosis, deep vein |
MT-ND1 | Diabetes mellitus, type 2 |
MTR | Colorectal cancer |
MT-RNR1 | Hearing loss, sensorineural nonsyndromic |
MTRR | Neural tube defect |
MTRR | Homocysteine |
MT-TL1 | Diabetes mellitus, type 2 |
MUTYH | Colorectal cancer |
MYBPC3 | Cardiomyopathy |
MYH7 | Cardiomyopathy |
MYOC | Glaucoma, primary open angle |
MYOC | Glaucoma |
NAT1 | Colorectal cancer |
NAT1 | Breast cancer |
NAT1 | Cancer of the bladder |
NAT2 | Colorectal cancer |
NAT2 | Cancer of the bladder |
NAT2 | Breast cancer |
NAT2 | Lung cancer |
NBN | Breast cancer |
NCOA3 | Breast cancer |
NCSTN | Alzheimer's disease |
NEUROD1 | Type 1 diabetes mellitus |
NF1 | Neurofibromatosis 1 |
NOS1 | Asthma |
NOS2A | Multiple sclerosis |
NOS3 | Hypertension |
NOS3 | Coronary heart disease |
NOS3 | Atherosclerosis, coronary |
NOS3 | Coronary artery disease |
NOS3 | Myocardial infarction |
NOS3 | Acute coronary syndrome |
NOS3 | Blood pressure, arterial |
NOS3 | Pre-eclampsia |
NOS3 | Nitric oxide |
NOS3 | Alzheimer's disease |
NOS3 | Asthma |
NOS3 | Type 2 diabetes mellitus |
NOS3 | Cardiovascular diseases |
NOS3 | Behcet's disease |
NOS3 | Erectile dysfunction |
NOS3 | Renal failure, chronic |
NOS3 | Toxicity of lead |
NOS3 | Left ventricular hypertrophy |
NOS3 | Pregnancy loss, recurrent |
NOS3 | Retinopathy, diabetic |
NOS3 | Stroke |
NOTCH4 | Schizophrenia |
NPY | Abuse of alcohol |
NQO1 | Lung cancer |
NQO1 | Colorectal cancer |
NQO1 | Toxicity of benzene |
NQO1 | Cancer of the bladder |
NQO1 | Parkinson's disease |
NR3C2 | Hypertension |
NR4A2 | Parkinson's disease |
NRG1 | Schizophrenia |
NTF3 | Schizophrenia |
OGG1 | Lung cancer |
OGG1 | Colorectal cancer |
OLR1 | Alzheimer's disease |
OPA1 | Glaucoma |
OPRM1 | Abuse of alcohol |
OPRM1 | Dependence on drugs |
OPTN | Glaucoma, primary open angle |
P450 | Metabolism of drugs |
PADI4 | Rheumatoid arthritis |
PAH | Phenylketonuria/PKU |
PAI1 | Coronary heart disease |
PAI1 | Asthma |
PALB2 | Breast cancer |
PARK2 | Parkinson's disease |
PARK7 | Parkinson's disease |
PDCD1 | Lupus erythematosus |
PINK1 | Parkinson's disease |
PKA | Memory |
PKC | Memory |
PLA2G4A | Schizophrenia |
PNOC | Schizophrenia |
POMC | Obesity |
PON1 | Atherosclerosis, coronary |
PON1 | Parkinson's disease |
PON1 | Type 2 diabetes mellitus |
PON1 | Atherosclerosis of arteries |
PON1 | Coronary artery disease |
PON1 | Coronary heart disease |
PON1 | Alzheimer's disease |
PON1 | Longevity |
PON2 | Atherosclerosis, coronary |
PON2 | Premature delivery |
PPARG | Type 2 diabetes mellitus |
PPARG | Obesity |
PPARG | Diabetes mellitus, type 2 |
PPARG | Colorectal cancer |
PPARG | Hypertension |
PPARGC1A | Diabetes mellitus, type 2 |
PRKCZ | Type 2 diabetes mellitus |
PRL | Systemic Lupus Erythematosus (SLE) |
PRNP | Alzheimer's disease |
PRNP | Creutzfeldt-Jakob disease |
PRNP | Jakob-Creutzfeldt disease |
PRODH | Schizophrenia |
PRSS1 | Pancreatitis |
PSEN1 | Alzheimer's disease |
PSEN2 | Alzheimer's disease |
PSMB8 | Type 1 diabetes mellitus |
PSMB9 | Type 1 diabetes mellitus |
PTCH | Skin cancer, non-melanoma |
PTGIS | Hypertension |
PTGS2 | Colorectal cancer |
PTH | Bone mineral density |
PTPN11 | Noonan syndrome |
PTPN22 | Rheumatoid arthritis |
PTPRC | Multiple sclerosis |
PVT1 | End stage renal disease |
RAD51 | Breast cancer |
RAGE | Retinopathy, diabetic |
RB1 | Retinoblastoma |
RELN | Schizophrenia |
REN | Hypertension |
RET | Thyroid cancer |
RET | Hirschsprung's disease |
RFC1 | Neural tube defect |
RGS4 | Schizophrenia |
RHO | Retinitis pigmentosa |
RNASEL | Prostate cancer |
RYR1 | Malignant hyperthermia |
SAA1 | Amyloidosis |
SCG2 | Hypertension |
SCG3 | Obesity |
SCGB1A1 | Asthma |
SCN5A | Brugada syndrome |
SCN5A | EKG, abnormal |
SCN5A | Long QT syndrome |
SCNN1B | Hypertension |
SCNN1G | Hypertension |
SERPINA1 | COPD |
SERPINA3 | Alzheimer's disease |
SERPINA3 | COPD |
SERPINA3 | Parkinson's disease |
SERPINE1 | Myocardial infarction |
SERPINE1 | Type 2 diabetes mellitus |
SERPINE1 | Atherosclerosis, coronary |
SERPINE1 | Obesity |
SERPINE1 | Pre-eclampsia |
SERPINE1 | Stroke |
SERPINE1 | Hypertension |
SERPINE1 | Pregnancy loss, recurrent |
SERPINE1 | Thromboembolism, venous |
SLC11A1 | Tuberculosis |
SLC22A4 | Crohn's disease; ulcerative colitis |
SLC22A5 | Crohn's disease; ulcerative colitis |
SLC2A1 | Type 2 diabetes mellitus |
SLC2A2 | Type 2 diabetes mellitus |
SLC2A4 | Type 2 diabetes mellitus |
SLC3A1 | Cystinuria |
SLC6A3 | Attention deficit hyperactivity disorder |
SLC6A3 | Parkinson's disease |
SLC6A3 | Smoking behaviour |
SLC6A3 | Alcoholism |
SLC6A3 | Schizophrenia |
SLC6A4 | Depression |
SLC6A4 | Depression, major |
SLC6A4 | Schizophrenia |
SLC6A4 | Suicide |
SLC6A4 | Alcoholism |
SLC6A4 | Bipolar disorder |
SLC6A4 | Personality traits |
SLC6A4 | Attention deficit hyperactivity disorder |
SLC6A4 | Alzheimer's disease |
SLC6A4 | Personality disorder |
SLC6A4 | Panic disorder |
SLC6A4 | Abuse of alcohol |
SLC6A4 | Affective disorders |
SLC6A4 | Anxiety disorders |
SLC6A4 | Smoking behaviour |
SLC6A4 | Depression, major; bipolar disorder |
SLC6A4 | Abuse of heroin |
SLC6A4 | Irritable bowel syndrome |
SLC6A4 | Migraine headache |
SLC6A4 | Obsessive-compulsive disorder |
SLC6A4 | Suicidal behavior |
SLC7A9 | Cystinuria |
SNAP25 | ADHD |
SNCA | Parkinson's disease |
SOD1 | ALS/amyotrophic lateral sclerosis |
SOD2 | Breast cancer |
SOD2 | Lung cancer |
SOD2 | Prostate cancer |
SPINK1 | Pancreatitis |
SPP1 | Multiple sclerosis |
SRD5A2 | Prostate cancer |
STAT6 | Asthma |
STAT6 | Total IgE |
SULT1A1 | Breast cancer |
SULT1A1 | Colorectal cancer |
TAP1 | Type 1 diabetes mellitus |
TAP1 | Lupus erythematosus |
TAP2 | Type 1 diabetes mellitus |
TAP2 | Diabetes mellitus, type 1 |
TBX21 | Asthma |
TBXA2R | Asthma |
TCF1 | Diabetes mellitus, type 2 |
TCF1 | Type 2 diabetes mellitus |
TF | Alzheimer's disease |
TGFB1 | Breast cancer |
TGFB1 | Kidney transplantation |
TGFB1 | Complications of renal transplantation |
TH | Schizophrenia |
THBD | Myocardial infarction |
TLR4 | Asthma |
TLR4 | Crohn's disease; ulcerative colitis |
TLR4 | Septicemia |
TNF | Asthma |
TNFA | Cerebrovascular disease |
TNF | Type 1 diabetes mellitus |
TNF | Rheumatoid arthritis |
TNF | Systemic Lupus Erythematosus (SLE) |
TNF | Kidney transplantation |
TNF | Psoriasis vulgaris |
TNF | Septicemia |
TNF | Type 2 diabetes mellitus |
TNF | Alzheimer's disease |
TNF | Crohn's disease |
TNF | Diabetes mellitus, type 1 |
TNF | Hepatitis B |
TNF | Complications of renal transplantation |
TNF | Multiple sclerosis |
TNF | Schizophrenia |
TNF | Celiac disease |
TNF | Obesity |
TNF | Pregnancy loss, recurrent |
TNFRSF11B | Bone mineral density |
TNFRSF1A | Rheumatoid arthritis |
TNFRSF1B | Rheumatoid arthritis |
TNFRSF1B | Systemic Lupus Erythematosus (SLE) |
TNFRSF1B | Arthritis |
TNNT2 | Cardiomyopathy |
TP53 | Lung cancer |
TP53 | Breast cancer |
TP53 | Colorectal cancer |
TP53 | Prostate cancer |
TP53 | Cervical cancer |
TP53 | Ovarian cancer |
TP53 | Smoking |
TP53 | Esophageal cancer |
TP73 | Lung cancer |
TPH1 | Suicide |
TPH1 | Depression, major |
TPH1 | Suicidal behavior |
TPH1 | Schizophrenia |
TPMT | Thiopurine methyltransferase activity |
TPMT | Leukemia |
TPMT | Inflammatory bowel disease |
TPMT | Thiopurine S-methyltransferase phenotype |
TSC1 | Tuberous sclerosis |
TSC2 | Tuberous sclerosis |
TSHR | Graves' disease |
TYMS | Colorectal cancer |
TYMS | Stomach cancer |
TYMS | Esophageal cancer |
UCHL1 | Parkinson's disease |
UCP1 | Obesity |
UCP2 | Obesity |
UCP3 | Obesity |
UGT1A1 | Hyperbilirubinemia |
UGT1A1 | Gilbert syndrome |
UGT1A6 | Colorectal cancer |
UGT1A7 | Colorectal cancer |
UTS2 | Diabetes mellitus, type 2 |
VDR | Bone mineral density |
VDR | Prostate cancer |
VDR | Bone mineral density |
VDR | Type 1 diabetes mellitus |
VDR | Osteoporosis |
VDR | Bone mass |
VDR | Breast cancer |
VDR | Toxicity of lead |
VDR | Tuberculosis |
VDR | Type 2 diabetes mellitus |
VEGF | Breast cancer |
Vit D rec | Idiopathic short stature |
VKORC1 | Warfarin therapy, response thereto |
WNK4 | Hypertension |
XPA | Lung cancer |
XPC | Lung cancer |
XPC | Cytogenetic studies |
XRCC1 | Lung cancer |
XRCC1 | Cytogenetic studies |
XRCC1 | Breast cancer |
XRCC1 | Cancer of the bladder |
XRCC2 | Breast cancer |
XRCC3 | Breast cancer |
XRCC3 | Cytogenetic studies |
XRCC3 | Lung cancer |
XRCC3 | Cancer of the bladder |
ZDHHC8 | Schizophrenia |
Genetic Composite Index (GCI)
The etiology of many conditions or diseases is attributed to both genetic and environmental factors. Recent advances in genotyping technology have provided opportunities to identify new associations between disease and genetic markers throughout the genome. Indeed, many recent studies have found these associations, where a particular allele or genotype is associated with an increased risk of disease. Some of these studies include collecting a set of test cases and a set of controls and comparing the allelic distribution of genetic markers between the two populations. In some of these studies, the association between a particular genetic marker and a disease was determined in isolation from other genetic markers that were handled as background and did not play a role in statistical analysis.
Genetic markers and variants may include SNPs, nucleotide repeats, nucleotide insertions, nucleotide deletions, chromosomal translocations, chromosomal duplications, or copy number variations. Copy number variations may include microsatellite repeats, nucleotide repeats, centromere repeats or telomere repeats.
In one aspect of the invention, information about the association of multiple genetic markers with one or more diseases or conditions is combined and analyzed to derive a GCI score. GCI scoring can be used to provide people without genetic training with reliable (i.e., robust), understandable, and/or intuitive knowledge of their individual risk of disease compared to a relevant population based on current scientific research. In one embodiment, the method of generating a reliable GCI score for the combined effect of different loci is based on the reported individual risk for each locus studied. For example, a disease or condition of interest is identified and then sources of information (including, but not limited to, databases, patent publications, and scientific literature) are queried for information regarding the association of the disease or condition with one or more genetic loci. These sources of information are validated and evaluated using quality criteria. In some embodiments, the evaluation process includes multiple steps. In other embodiments, the information sources are evaluated against a plurality of quality criteria. Information derived from information resources is used to identify odds ratios or relative risks of one or more genetic loci for each disease or condition of interest.
In alternative embodiments, the odds ratio (OR) or relative risk (RR) for at least one genetic locus is not available from the available sources of information. The RR is then calculated using (1) the reported ORs of multiple alleles of the same locus, (2) allele frequencies from datasets (e.g., the HapMap dataset), and/or (3) disease/condition prevalence from available resources (e.g., the CDC, the National Center for Health Statistics, etc.) to derive the RRs for all alleles of interest. In one embodiment, the ORs of multiple alleles of the same locus are assessed separately or independently. In a preferred embodiment, the ORs of multiple alleles of the same locus are combined to account for the dependency between the ORs of the different alleles. In some embodiments, established disease models (including, but not limited to, models such as multiplicative, additive, Harvard-modified, and dominant-effect models) are used to generate intermediate scores representing individual risk according to the selected model.
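As a rough illustration of how relative risks might be derived from reported odds ratios, genotype frequencies, and prevalence, the following Python sketch solves for the disease risk of a reference genotype so that the frequency-weighted risks add up to the prevalence. The function names, the bisection approach, and the example numbers are assumptions made for illustration only; the text above does not specify the exact calculation.

```python
def risk_from_or(odds_ratio, baseline_risk):
    """Disease risk of a genotype whose odds are `odds_ratio` times the baseline odds."""
    odds = odds_ratio * baseline_risk / (1.0 - baseline_risk)
    return odds / (1.0 + odds)


def relative_risks(odds_ratios, genotype_freqs, prevalence, tol=1e-12):
    """Return (baseline_risk, relative_risks) for each genotype.

    odds_ratios    -- OR of each genotype versus the reference genotype (OR = 1)
    genotype_freqs -- population frequency of each genotype (should sum to 1)
    prevalence     -- overall disease prevalence in the population

    Hypothetical helper: bisection on the baseline risk p0 so that
    sum(freq * risk) equals the prevalence. The implied prevalence grows
    monotonically with p0, so bisection converges.
    """
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        implied = sum(f * risk_from_or(o, mid)
                      for o, f in zip(odds_ratios, genotype_freqs))
        if implied < prevalence:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    p0 = (lo + hi) / 2.0
    return p0, [risk_from_or(o, p0) / p0 for o in odds_ratios]


# Illustrative numbers only: three genotypes at one locus.
example_ors = [1.0, 1.5, 2.25]
example_freqs = [0.49, 0.42, 0.09]
baseline, rrs = relative_risks(example_ors, example_freqs, prevalence=0.01)
```

For a rare condition the resulting RRs stay close to the ORs; the gap widens as prevalence increases, which is why prevalence enters the calculation at all.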
In another embodiment, a method is used that analyzes multiple models of a disease or condition of interest and correlates the results obtained from these different models; this minimizes the errors that may be introduced by selecting a particular disease model. This approach also minimizes the impact that reasonable errors in the prevalence, allele frequency, and OR estimates derived from the information sources have on the calculation of relative risk. Because the effect of the prevalence estimate on the RR is "linear," or monotonic, an incorrect prevalence estimate has little or no effect on the final rank score, provided that the same model is applied consistently to all individuals for whom a report is generated.
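The idea of scoring the same individuals under more than one disease model and checking whether the models agree can be sketched as follows. The multiplicative and additive formulas used here are common textbook forms and are assumed rather than taken from the text; the Harvard-modified and dominant-effect models are not shown, and all names and numbers are hypothetical.

```python
from math import prod


def multiplicative_score(relative_risks):
    """Intermediate score under a multiplicative model: product of per-locus RRs."""
    return prod(relative_risks)


def additive_score(relative_risks):
    """Intermediate score under an additive model: 1 plus the summed excess risks."""
    return 1.0 + sum(rr - 1.0 for rr in relative_risks)


def ranking(scores):
    """Indices of individuals ordered from lowest to highest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i])


# Illustrative per-locus relative risks for four hypothetical individuals.
population = [
    [1.0, 1.0, 1.0],
    [1.2, 1.0, 0.9],
    [1.2, 1.5, 1.0],
    [1.4, 1.5, 1.3],
]
mult_scores = [multiplicative_score(person) for person in population]
add_scores = [additive_score(person) for person in population]

# If both models order the individuals the same way, the final rank
# does not depend on which of the two models was chosen.
models_agree = ranking(mult_scores) == ranking(add_scores)
```

Comparing rankings rather than raw scores reflects the point made above: what matters for the final rank score is the ordering of individuals, which is less sensitive to the choice of model than the absolute values.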
In another embodiment, a method is used that treats environmental, behavioral, or demographic data as an additional "locus." In related embodiments, such data may be obtained from information sources such as medical or scientific literature or databases (e.g., the association of smoking with lung cancer, or data from insurance health risk assessments). In one embodiment, a GCI score is generated for one or more complex diseases. Complex diseases can be influenced by multiple genes, environmental factors, and their interactions. When studying complex diseases, a large number of possible interactions need to be analyzed. In one embodiment, a procedure such as the Bonferroni correction is used to correct for multiple comparisons. In an alternative embodiment, when the tests are independent or show a particular type of dependency, the overall significance level (also known as the "family-wise error rate") is controlled using the Simes test (Sarkar S. (1998) Some probability inequalities for ordered MTP2 random variables: a proof of the Simes conjecture. Ann Stat 26:494-504). The Simes test rejects the global null hypothesis that all K test-specific null hypotheses are true if p(k) ≤ αk/K for any k in 1, ..., K, where p(1) ≤ ... ≤ p(K) are the ordered p-values (Simes RJ (1986) An improved Bonferroni procedure for multiple tests of significance. Biometrika 73:751-754).
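The two corrections named above can be sketched in a few lines of Python. This is a minimal illustration with made-up p-values and hypothetical function names, not an implementation drawn from the text.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Bonferroni correction: reject test i only if p_i <= alpha / K."""
    k_total = len(p_values)
    return [p <= alpha / k_total for p in p_values]


def simes_global_test(p_values, alpha=0.05):
    """Simes test: reject the global null hypothesis (all K nulls true)
    if p(k) <= alpha * k / K for at least one k, with p(1) <= ... <= p(K)."""
    k_total = len(p_values)
    ordered = sorted(p_values)
    return any(p <= alpha * k / k_total
               for k, p in enumerate(ordered, start=1))


# Illustrative p-values for three hypothetical interaction tests.
p_vals = [0.02, 0.03, 0.04]
per_test = bonferroni_reject(p_vals)      # no single test passes alpha / 3
global_reject = simes_global_test(p_vals)  # 0.03 <= 0.05 * 2 / 3, so True
```

In this example no individual test survives the Bonferroni threshold, yet the Simes criterion still rejects the global null hypothesis, which illustrates why Simes (1986) describes it as an improved Bonferroni procedure.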
Other embodiments that may be used in the context of multi-gene and multi-environmental-factor analysis control the false-discovery rate, i.e., the expected proportion of rejected null hypotheses that are falsely rejected. This approach is particularly useful when, as in microarray studies, a portion of the null hypotheses can be assumed to be false. Devlin et al. (2003, Analysis of multilocus models of association. Genet Epidemiol 25:36-47) proposed a variation of the Benjamini and Hochberg (1995, Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289-300) step-up procedure that controls the false-discovery rate when testing a large number of possible gene-gene interactions in multilocus association studies. The Benjamini and Hochberg procedure is related to the Simes test: setting k* = max k such that p(k) ≤ αk/K, it rejects the k* null hypotheses corresponding to p(1), ..., p(k*). In fact, when all null hypotheses are true, the Benjamini and Hochberg procedure reduces to the Simes test (Benjamini Y, Yekutieli D (2001) The control of the false discovery rate in multiple testing under dependency. Ann Stat 29:1165-1188).
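A minimal Python sketch of the Benjamini and Hochberg step-up rule described above follows. It shows the plain procedure only, not the Devlin et al. variation; the function name and p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Finds k* = max k such that p(k) <= alpha * k / K over the ordered
    p-values and rejects the k* hypotheses with the smallest p-values.
    Returns a list of booleans in the original input order.
    """
    k_total = len(p_values)
    order = sorted(range(k_total), key=lambda i: p_values[i])
    k_star = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / k_total:
            k_star = rank  # keep the largest rank that meets its threshold
    rejected = [False] * k_total
    for idx in order[:k_star]:
        rejected[idx] = True
    return rejected


# Illustrative p-values: the second one misses its own threshold (0.025)
# but is still rejected because the third meets 0.0375 (step-up behaviour).
decisions = benjamini_hochberg([0.005, 0.03, 0.035, 0.9])
```

Because at least one hypothesis is rejected exactly when some p(k) ≤ αk/K, the procedure makes a rejection in the same cases in which the Simes test rejects the global null hypothesis.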
In some embodiments, the individual is ranked against a population of individuals based on the individual's intermediate score to generate a final rank score, which may be expressed as a rank in the population, such as the 99, 98, 97, 96, 95, 94, 93, 92, 91, 90, 89, 88, 87, 86, 85, 84, 83, 82, 81, 80, 79, 78, 77, 76, 75, 74, 73, 72, 71, 70, 69, 65, 60, 55, 50, 45, 40, 35, 30, 25, 20, 15, 10, 5, or 0th percentile. In another embodiment, the rank score may be displayed as a range, such as the 100th to 95th percentile, the 95th to 85th percentile, the 85th to 60th percentile, or any sub-range between the 100th and 0th percentile. In yet another embodiment, the individual is ranked in quartiles, such as the top 75th quartile or the lowest 25th quartile. In a further embodiment, the individual is ranked in comparison to the mean or median score of the population.
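The ranking step can be illustrated with a short Python sketch that converts an intermediate score into a percentile and a quartile relative to a reference population. The percentile definition used here (share of the population scoring at or below the individual) and the quartile numbering are assumptions, as are the function names and sample scores.

```python
from bisect import bisect_right


def percentile_rank(score, population_scores):
    """Percentage of the reference population scoring at or below `score`."""
    ordered = sorted(population_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)


def quartile(percentile):
    """Quartile 1 (lowest) to 4 (highest) for a percentile in [0, 100]."""
    return min(int(percentile // 25) + 1, 4)


# Illustrative intermediate scores for a hypothetical reference population.
reference = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]
individual_percentile = percentile_rank(1.85, reference)  # 9 of 10 at or below
individual_quartile = quartile(individual_percentile)
```

The same function can be pointed at a narrower reference list when the comparison population is restricted, for example by geography or age group.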
In one embodiment, the population to which the individual is compared includes a large number of people from different geographic and ethnic backgrounds, such as a global population. In other embodiments, the population to which the individual is compared is limited to a particular geography, family, race, gender, age (fetal, neonatal, child, juvenile, adolescent, adult, elderly individual), or disease state (e.g., symptomatic, asymptomatic, carrier, early onset, late onset). In some embodiments, the population to which the individual is compared is derived from information reported by public and/or private information sources.
In one embodiment, the GCI score or GCIPlus score of the individual is visualized using a display device. In some embodiments, a display screen (e.g., a computer monitor or television screen) is used for visual display, such as a personal portal with associated information. In another embodiment, the display device is a static display, such as a printed page. In one embodiment, the display may include, but is not limited to, one or more of the following: bins (e.g., 1-5, 6-10, 11-15, 16-20, 21-25, 26-30, 31-35, 36-40, 41-45, 46-50, 51-55, 56-60, 61-65, 66-70, 71-75, 76-80, 81-85, 86-90, 91-95, 96-100), a color or grayscale gradient, a thermometer, a gauge, a pie chart, a histogram, or a bar graph. For example, FIGS. 18 and 19 show different displays for MS and FIG. 20 is for Crohn's disease. In another embodiment, a thermometer is used to display the GCI score and the disease/condition prevalence. In another embodiment, the thermometer displays a level that varies with the reported GCI score, e.g., FIGS. 15-17, with the color corresponding to risk. The thermometer may display a color change with increasing GCI score (e.g., gradually changing from blue at a lower GCI score to red at a higher GCI score). In a related embodiment, the thermometer displays both a level that varies with the reported GCI score and a color change that follows the increase in risk level.
In an alternative embodiment, the individual's GCI score is communicated to the individual using auditory feedback. In one embodiment, the auditory feedback is a verbal explanation that the risk level is high or low. In another embodiment, the auditory feedback is a recitation of a particular GCI score, such as a number, percentile, range, quartile, or comparison to a population mean or median GCI score. In one embodiment, a live person delivers the auditory feedback either in person or through a communication device, such as a telephone (landline, cellular, or satellite), or through a personal portal. In another embodiment, the auditory feedback is delivered by an automated system (e.g., a computer). In one embodiment, the auditory feedback is delivered as part of an Interactive Voice Response (IVR) system, a technology that allows a computer to detect voice and touch tones over a normal telephone call. In another embodiment, the individual may interact with a central server through an IVR system. An IVR system can respond with pre-recorded or dynamically generated audio to interact with individuals and provide them with auditory feedback on their risk level. In one embodiment, the individual may call a number answered by the IVR system. After optional authentication by an identification code, a security code, or a voice recognition program, the IVR system prompts the individual to select an option from a menu, such as a touch-tone or voice menu. One of these options may provide the individual with his or her risk level.
In another embodiment, the GCI score of an individual is visualized using a display device and communicated using auditory feedback, for example through a personal portal. This combination may include a visual display of the GCI score and an audible feedback that discusses the relevance of the GCI score to the overall health of the individual and possible precautions that may be proposed.
In one embodiment, the GCI score is generated using a multi-step method. Initially, for each condition to be studied, the relative risk is calculated from the odds ratio of each genetic marker. For each prevalence value p = 0.01, 0.02, ..., 0.5, GCI scores for the HapMap CEU population are calculated based on the prevalence and the HapMap allele frequencies. If the GCI scores do not change under varying prevalence, the only assumption considered is the multiplicative model. Otherwise, it is determined that the model is sensitive to prevalence. For any combination of no-call values, the distribution of relative risks and scores in the HapMap population is obtained. For each new individual, the individual's score is compared to the HapMap distribution, and the resulting score is the individual's rank in this population. The resolution of the reported scores may be low due to the assumptions made in the process. The population is divided into quantiles (3-6 bins), and the reported bin is the one into which the individual's rank falls. The number of bins may differ between diseases based on considerations such as the resolution of the scores for each disease. In the case of ties between the scores of different HapMap individuals, an average rank is used.
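The ranking and binning step can be illustrated as follows. This is a minimal sketch, assuming that ties are handled with a mid-rank convention and that bins are equal-width quantile bins; the function names `percentile_rank` and `quantile_bin` and the reference scores are hypothetical and do not come from the described system.

```python
def percentile_rank(reference_scores, score):
    """Fraction of the reference population ranked below `score`.

    Reference individuals tied with `score` count as half, which corresponds
    to assigning the average rank to a group of tied scores.
    """
    below = sum(1 for s in reference_scores if s < score)
    tied = sum(1 for s in reference_scores if s == score)
    return (below + 0.5 * tied) / len(reference_scores)


def quantile_bin(reference_scores, score, n_bins=4):
    """Return the 1-based quantile bin (1 = lowest) into which the rank falls."""
    rank = percentile_rank(reference_scores, score)
    # A rank of exactly 1.0 is placed in the top bin.
    return min(int(rank * n_bins), n_bins - 1) + 1


# Hypothetical reference scores (e.g., scores of reference-panel individuals).
reference = [1.0, 2.0, 2.0, 3.0, 4.0]
print(percentile_rank(reference, 2.0), quantile_bin(reference, 2.0, n_bins=4))
```

The number of bins is a parameter because, as noted above, it may differ between diseases.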
In one embodiment, a higher GCI score is interpreted as indicating an increased risk of acquiring or being diagnosed with a condition or disease. In another embodiment, a mathematical model is used to derive the GCI score. In some embodiments, the GCI score is based on a mathematical model that accounts for the incomplete nature of the underlying information about a population and/or a disease or condition. In some embodiments, the mathematical model includes at least one assumption made as part of the basis for calculating the GCI score, wherein the assumption includes, but is not limited to: the assumption that the odds ratios are given; the assumption that the prevalence of the condition is known; the assumption that the genotype frequencies in the population are known; the assumption that the customers are from the same ancestral background as the populations used in the studies and in HapMap; and the assumption that the combined risk is the product of the risk factors of the individual genetic markers. In some embodiments, the GCI may also include the assumption that the frequency of a multi-marker genotype is the product of the frequencies at each SNP or individual genetic marker (e.g., that different SNPs or genetic markers are independent across the population).
Multiplicative model
In one embodiment, the GCI score is calculated under the assumption that the risk attributed to a set of genetic markers is the product of the risks attributed to the individual genetic markers. This means that each genetic marker contributes to the risk of disease independently of the other genetic markers. Formally, there are k genetic markers with risk alleles $r_1, \dots, r_k$ and non-risk alleles $n_1, \dots, n_k$. At SNP i, the three possible genotypes are denoted $r_i r_i$, $n_i r_i$ and $n_i n_i$. The genotype information of an individual can be described by a vector $(g_1, \dots, g_k)$, where $g_i$ is 0, 1 or 2 according to the number of risk alleles at position i. We denote by $\lambda_1^i$ the relative risk of the heterozygous genotype at position i compared to the homozygous non-risk genotype at the same position. In other words, we define $\lambda_1^i = P(D \mid n_i r_i) / P(D \mid n_i n_i)$. Similarly, we denote the relative risk of the $r_i r_i$ genotype by $\lambda_2^i = P(D \mid r_i r_i) / P(D \mid n_i n_i)$. Under the multiplicative model, we assume that the risk of an individual with genotype $(g_1, \dots, g_k)$ is $GCI(g_1, \dots, g_k) = \prod_{i=1}^{k} \lambda_{g_i}^i$, where $\lambda_0^i = 1$. Multiplicative models have previously been used in the literature to simulate case-control studies or for visualization purposes.
Assessing relative risk
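In another embodiment, the relative risks for the different genetic markers are known and the multiplicative model can be used for risk assessment. However, in some embodiments that involve association studies, the study design prevents relative risks from being reported. In some case-control studies, the relative risk cannot be calculated directly from the data without further assumptions. Instead of the relative risk, it is customary to report the odds ratio (OR) of a genotype, which is the odds of disease among carriers of a given risk genotype ($r_i r_i$ or $n_i r_i$) divided by the odds of disease among individuals who do not carry it. Formally,
$$OR_1^i = \frac{P(D \mid n_i r_i)}{1 - P(D \mid n_i r_i)} \cdot \frac{1 - P(D \mid n_i n_i)}{P(D \mid n_i n_i)}, \qquad OR_2^i = \frac{P(D \mid r_i r_i)}{1 - P(D \mid r_i r_i)} \cdot \frac{1 - P(D \mid n_i n_i)}{P(D \mid n_i n_i)}$$
Finding the relative risk from the odds ratio may require additional assumptions. For example, assume that the genotype frequencies in the entire population, $a = f_{n_i n_i}$, $b = f_{n_i r_i}$ and $c = f_{r_i r_i}$, are known or estimated (these may be estimated from existing datasets, e.g., the HapMap dataset comprising 120 chromosomes), and/or that the prevalence of the disease, $p = P(D)$, is known. From these three frequencies, one can derive:
$$p = a \cdot P(D \mid n_i n_i) + b \cdot P(D \mid n_i r_i) + c \cdot P(D \mid r_i r_i)$$
By the definition of relative risk, after dividing by $p \, P(D \mid n_i n_i)$, the first equation can be rewritten as:
$$\frac{1}{P(D \mid n_i n_i)} = \frac{a + b\lambda_1 + c\lambda_2}{p}$$
and therefore the two odds ratio equations can be rewritten as:
$$OR_1 = \lambda_1 \frac{a + b\lambda_1 + c\lambda_2 - p}{a + b\lambda_1 + c\lambda_2 - p\lambda_1}, \qquad OR_2 = \lambda_2 \frac{a + b\lambda_1 + c\lambda_2 - p}{a + b\lambda_1 + c\lambda_2 - p\lambda_2} \qquad (1)$$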
It should be noted that equation system 1 is equivalent to the Zhang and Yu formula when a = 1 (a non-risk allele frequency of 1) (Zhang J and Yu K, What's the relative risk? A method of correcting the odds ratio in cohort studies of common outcomes. JAMA, 280: 1690-1, 1998, the entire contents of which are incorporated by reference). In contrast to the Zhang and Yu formula, some embodiments of the invention take into account the allele frequencies in the population, which may affect the relative risk. Still other embodiments account for the interdependence of the relative risks, as opposed to calculating each relative risk independently.
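Equation system 1 can be rewritten as two quadratic equations with up to four possible solutions. A gradient descent algorithm can be used to solve these equations, with the starting point set to the odds ratios, i.e., $\lambda_1 = OR_1$ and $\lambda_2 = OR_2$.
For example:
$$f_1(\lambda_1, \lambda_2) = \lambda_1 (a + b\lambda_1 + c\lambda_2 - p) - OR_1 (a + b\lambda_1 + c\lambda_2 - p\lambda_1)$$
$$f_2(\lambda_1, \lambda_2) = \lambda_2 (a + b\lambda_1 + c\lambda_2 - p) - OR_2 (a + b\lambda_1 + c\lambda_2 - p\lambda_2)$$
Finding a solution to these equations is equivalent to finding the minimum of the function $g(\lambda_1, \lambda_2) = f_1(\lambda_1, \lambda_2)^2 + f_2(\lambda_1, \lambda_2)^2$.
Therefore,
$$\frac{\partial g}{\partial \lambda_j} = 2 f_1 \frac{\partial f_1}{\partial \lambda_j} + 2 f_2 \frac{\partial f_2}{\partial \lambda_j}, \qquad j = 1, 2$$
In this example, we start by setting $x_0 = OR_1$, $y_0 = OR_2$. We set the value $\varepsilon = 10^{-10}$ as a tolerance constant throughout the algorithm. In iteration i, we define a step size $\gamma$. Then, we set
$$x_i = x_{i-1} - \gamma \frac{\partial g}{\partial \lambda_1}(x_{i-1}, y_{i-1}), \qquad y_i = y_{i-1} - \gamma \frac{\partial g}{\partial \lambda_2}(x_{i-1}, y_{i-1})$$
These iterations are repeated until $g(x_i, y_i) <$ tolerance, where the tolerance is set to $10^{-7}$ in the provided code.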
In this embodiment, these equations yield positive solutions for different values of a, b, c, p, $OR_1$ and $OR_2$ (FIG. 10).
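The gradient descent approach to equation system 1 can be sketched as follows. This is an illustrative sketch only and is not the code referred to in the text: it minimizes the sum of squared residuals of the two quadratic equations, starts from the odds ratios, and, as an assumption of this sketch, replaces the step-size rule of the text with a simple backtracking (step-halving) rule. The function names, the stopping tolerance and the numerical inputs are all hypothetical.

```python
def odds_ratios_from_relative_risks(a, b, c, p, l1, l2):
    """Forward map of equation system 1: relative risks -> odds ratios."""
    C = a + b * l1 + c * l2
    return l1 * (C - p) / (C - p * l1), l2 * (C - p) / (C - p * l2)


def solve_relative_risks(a, b, c, p, or1, or2, tol=1e-12, max_iter=100000):
    """Minimize g = f1^2 + f2^2 by gradient descent, starting at (OR1, OR2)."""

    def residuals(l1, l2):
        C = a + b * l1 + c * l2
        f1 = l1 * (C - p) - or1 * (C - p * l1)
        f2 = l2 * (C - p) - or2 * (C - p * l2)
        return f1, f2

    def g(l1, l2):
        f1, f2 = residuals(l1, l2)
        return f1 * f1 + f2 * f2

    def grad(l1, l2):
        C = a + b * l1 + c * l2
        f1, f2 = residuals(l1, l2)
        # Partial derivatives of the two quadratic residuals.
        df1_d1 = (C - p) + l1 * b - or1 * (b - p)
        df1_d2 = c * (l1 - or1)
        df2_d1 = b * (l2 - or2)
        df2_d2 = (C - p) + l2 * c - or2 * (c - p)
        return (2 * (f1 * df1_d1 + f2 * df2_d1),
                2 * (f1 * df1_d2 + f2 * df2_d2))

    x, y = or1, or2
    for _ in range(max_iter):
        val = g(x, y)
        if val < tol:
            break
        gx, gy = grad(x, y)
        step = 1.0
        # Backtracking (assumption of this sketch): halve the step until
        # g decreases sufficiently.
        while (step > 1e-16 and
               g(x - step * gx, y - step * gy) > val - 1e-4 * step * (gx * gx + gy * gy)):
            step *= 0.5
        x, y = x - step * gx, y - step * gy
    return x, y


# Hypothetical inputs: genotype frequencies under HWE with risk allele
# frequency 0.3, prevalence 0.1, and odds ratios generated from known risks.
or1, or2 = odds_ratios_from_relative_risks(0.49, 0.42, 0.09, 0.1, 1.5, 2.25)
lam1, lam2 = solve_relative_risks(0.49, 0.42, 0.09, 0.1, or1, or2)
print(lam1, lam2)
```

Generating the odds ratios from known relative risks, as in the usage lines, gives a simple round-trip check that the solver recovers the values it started from.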
Robustness of relative risk assessment
In some embodiments, the effect of different parameters (prevalence, allele frequencies, and odds ratio errors) on the estimates of relative risk is determined. To determine the effect of the allele frequency and prevalence estimates on the relative risk values, the relative risks are calculated (under HWE) for a set of different odds ratios and different allele frequencies, and the results are plotted for prevalence values in the range of 0 to 1 (FIG. 10). In addition, for a fixed prevalence value, the resulting relative risks can be plotted as a function of the risk allele frequency (FIG. 11). When $p = 0$, $\lambda_1 = OR_1$ and $\lambda_2 = OR_2$, and when $p = 1$, $\lambda_1 = \lambda_2 = 1$. This can be calculated directly from the equations. In addition, in some embodiments, when the risk allele frequency is high, $\lambda_1$ is closer to a linear function, and $\lambda_2$ is closer to a concave function with a bounded second derivative. In the limiting case, when $c = 1$, $\lambda_2 = OR_2 + p(1 - OR_2)$ and $\lambda_1 = \frac{OR_1 \, (OR_2 + p(1 - OR_2))}{OR_2 + p(OR_1 - OR_2)}$. If $OR_1 \approx OR_2$, the latter is also close to a linear function. When the risk allele frequency is low, $\lambda_1$ and $\lambda_2$ approach the behavior of the function 1/p. In the limiting case, when $c = 0$ (so that, under HWE, $a = 1$), $\lambda_j = \frac{OR_j}{1 - p + p \, OR_j}$ for j = 1, 2. This indicates that for high risk allele frequencies, an incorrect prevalence estimate will not significantly affect the resulting relative risk. In addition, for low risk allele frequencies, if the correct prevalence p is replaced by a prevalence value $p' = \alpha p$, then the resulting relative risk will be off by a factor of at most $\max(\alpha, 1/\alpha)$. This is illustrated in panels (c) and (d) of FIG. 11. It should be noted that for high risk allele frequencies the two plots are quite similar, whereas for low allele frequencies there is a larger deviation between the relative risk values, which is nevertheless less than a factor of 2.
Calculating GCI score
In one embodiment, the genetic composite index is calculated using a reference set representing the relevant population. This reference set may be one of the populations in the HapMap or another genotype dataset.
In this embodiment, the GCI is calculated as follows. For each of the k risk loci, the relative risk is calculated from the odds ratio using equation system 1. Then, a multiplicative score is calculated for each individual in the reference set. The GCI of an individual with a multiplicative score of s is the fraction of all individuals in the reference dataset with a score s' ≤ s. For example, if 50% of the individuals in the reference set have a multiplicative score of at most s, then the individual's final GCI score will be 0.5.
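The calculation above can be illustrated with a short Python sketch. It assumes that the per-locus relative risks have already been obtained from the odds ratios; the function names, the two-SNP relative risks and the four-person reference set are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import prod


def multiplicative_score(genotype, relative_risks):
    """Product of per-locus relative risks for a genotype vector (g_1, ..., g_k).

    relative_risks[i] = (lambda_1, lambda_2) for SNP i; a genotype with no
    risk allele (g_i = 0) contributes a factor of 1.
    """
    return prod(1.0 if g == 0 else relative_risks[i][g - 1]
                for i, g in enumerate(genotype))


def gci(individual, reference_genotypes, relative_risks):
    """Fraction of reference individuals whose score is at most the individual's."""
    s = multiplicative_score(individual, relative_risks)
    ref_scores = [multiplicative_score(g, relative_risks)
                  for g in reference_genotypes]
    return sum(1 for r in ref_scores if r <= s) / len(ref_scores)


# Hypothetical relative risks for two SNPs and a tiny reference set.
relative_risks = [(1.5, 2.0), (1.2, 1.8)]
reference_set = [(0, 0), (1, 0), (0, 1), (2, 2)]
print(gci((1, 0), reference_set, relative_risks))  # 3 of 4 reference scores are <= 1.5
```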
Other models
In one embodiment, a multiplicative model is used. In alternative embodiments, other models may be used to determine the GCI score. Other suitable models include, but are not limited to:
An additive model. In the additive model, the risk of an individual with genotype $(g_1, \dots, g_k)$ is assumed to be $GCI(g_1, \dots, g_k) = \sum_{i=1}^{k} \lambda_{g_i}^i$.
A generalized additive model. In the generalized additive model, it is assumed that there is a function f such that the risk of an individual with genotype $(g_1, \dots, g_k)$ is $GCI(g_1, \dots, g_k) = \sum_{i=1}^{k} f(\lambda_{g_i}^i)$.
Harvard modified score (Het). This score was derived by G.A. Colditz et al. and is applied here to genetic markers (Harvard report on cancer prevention volume 4: Harvard cancer risk index. Cancer Causes and Control, 11: 477-). Although the function f operates on odds ratios rather than on relative risks, the Het score is essentially a generalized additive score. This is useful in situations where the relative risk is difficult to estimate. To define the function f, an intermediate function g is first defined on the odds ratios.
Then the value $het = \sum_{i=1}^{k} p_{het}^i \, g(OR_1^i)$ is calculated, where $p_{het}^i$ is the frequency of individuals heterozygous for SNP i in the entire reference population. The function f is then defined as $f(x) = g(x)/het$, and the Harvard modified score (Het) is simply defined as $\sum_{i=1}^{k} f(OR_{g_i}^i)$.
Harvard modified score (Hom). This score is similar to the Het score, except that the value het is replaced by the value $hom = \sum_{i=1}^{k} p_{hom}^i \, g(OR_2^i)$, where $p_{hom}^i$ is the frequency of individuals homozygous for the risk allele.
The maximum odds ratio. In this model, it is assumed that one of the genetic markers (the one with the largest odds ratio) gives a lower bound on the combined risk of the entire set of markers. Formally, the score of an individual with genotype $(g_1, \dots, g_k)$ is $GCI(g_1, \dots, g_k) = \max_{i=1,\dots,k} OR_{g_i}^i$.
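Two of the alternative models listed above, the additive model and the maximum odds ratio, can be sketched in Python as follows. The sketch assumes that a genotype with no risk allele contributes a value of 1 (the reference relative risk or odds ratio); the function names and numerical values are hypothetical. The Harvard modified scores are omitted because they depend on the intermediate function g.

```python
def additive_score(genotype, relative_risks):
    """Sum of per-locus relative risks; g_i = 0 contributes the reference value 1."""
    return sum(1.0 if g == 0 else relative_risks[i][g - 1]
               for i, g in enumerate(genotype))


def max_odds_ratio_score(genotype, odds_ratios):
    """Largest per-locus odds ratio carried by the individual."""
    return max(1.0 if g == 0 else odds_ratios[i][g - 1]
               for i, g in enumerate(genotype))


# Hypothetical per-SNP values: (heterozygous, homozygous-risk) for each SNP.
relative_risks = [(1.5, 2.0), (1.2, 1.8)]
odds_ratios = [(1.6, 2.4), (1.3, 2.1)]
print(additive_score((1, 0), relative_risks))      # 1.5 + 1.0
print(max_odds_ratio_score((1, 2), odds_ratios))   # max(1.6, 2.1)
```

Because the reported GCI is a rank within a reference population, only the ordering of individuals induced by each score matters when the models are compared.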
Comparison between scores
In one embodiment, GCI scores were calculated under multiple models across the HapMap CEU population for 10 SNPs associated with T2D. The SNPs involved are rs7754840, rs4506565, rs7756992, rs10811661, rs12804210, rs8050136, rs1111875, rs4402960, rs5215 and rs1801282. The odds ratios of the three possible genotypes of each of these SNPs are reported in the literature. The CEU population consists of thirty mother-father-child trios. To avoid dependencies, only the sixty parents were used. One individual with a no-call at one of the 10 SNPs was excluded, resulting in a set of 59 individuals. The GCI rank of each individual was then calculated using several different models.
It can be observed that for this dataset the different models produce highly correlated results (FIGS. 12 and 13). The Spearman correlation was calculated between each pair of models (Table 2), showing that the additive and multiplicative models have a correlation coefficient of 0.97, so the GCI score is robust to the choice between the additive and multiplicative models. Similarly, the correlation between the Harvard modified score and the multiplicative model was 0.83, and the correlation coefficient between the Harvard score and the additive model was 0.7. However, using the maximum odds ratio as the genetic score results in a dichotomous score defined by a single SNP. Overall, these results show that ranking the scores provides a robust framework that minimizes model dependence.
Table 2: spearman correlation of score distribution of CEU data between model pairs.
The effect of variation in T2D prevalence on the resulting distribution was determined. The prevalence values were varied between 0.001 and 0.512 (FIG. 14). For T2D, it can be seen that different prevalence values result in the same ordering of individuals (Spearman correlation > 0.99), so an arbitrary fixed prevalence value of 0.001 can be assumed.
Extending a model to an arbitrary number of variants
In another embodiment, the model may be extended to the case where any number of possible variants occur. The previous considerations relate to the case where there are three possible variants (nn, nr, rr). In general, when multi-SNP associations are known, any number of variants may be found in the population. For example, when the interaction between two genetic markers is associated with a condition, there are nine possible variants. This results in eight different odds ratios.
To generalize the original formula, it can be assumed that there are k+1 possible variants $a_0, \dots, a_k$ with frequencies $f_0, f_1, \dots, f_k$, measured odds ratios $1, OR_1, \dots, OR_k$, and unknown relative risks $1, \lambda_1, \dots, \lambda_k$. It can further be assumed that all relative risks and odds ratios are measured with respect to $a_0$, and therefore $\lambda_i = \frac{P(D \mid a_i)}{P(D \mid a_0)}$ and $OR_i = \frac{P(D \mid a_i)}{1 - P(D \mid a_i)} \cdot \frac{1 - P(D \mid a_0)}{P(D \mid a_0)}$. Based on:
$$p = \sum_{i=0}^{k} f_i \, P(D \mid a_i)$$
one can determine
$$P(D \mid a_0) = \frac{p}{\sum_{i=0}^{k} f_i \lambda_i}$$
And, if we set $C = \sum_{i=0}^{k} f_i \lambda_i$, this results in the following equation:
$$OR_i = \lambda_i \frac{C - p}{C - p\lambda_i}$$
and therefore
$$\lambda_i = \frac{C \cdot OR_i}{C - p + p \cdot OR_i}$$
or
$$1 = \sum_{i=0}^{k} \frac{f_i \, OR_i}{C - p + p \cdot OR_i}, \qquad OR_0 = 1$$
The latter is an equation in one variable (C). It may have several different solutions (in principle, up to k+1 different solutions). A standard optimization tool (e.g., gradient descent) may be used to find the solution closest to $C_0 = \sum f_i t_i$.
The present invention uses a stable scoring framework for risk factor quantification. Although different genetic models may result in different scores, the results are often correlated. Thus, the quantification of risk factors is generally independent of the model used.
Estimating relative risk in case-control studies
Methods for estimating relative risk from the odds ratios of multiple alleles in case-control studies are also provided in the present invention. In contrast to previous approaches, this approach takes into account the allele frequencies, the prevalence of the disease, and the dependence between the relative risks of the different alleles. The performance of this method was evaluated on simulated case-control studies and found to be extremely accurate.
Methods
When testing a particular SNP for association with disease D, R and N denote the risk and non-risk alleles of this SNP. P(D | RR), P(D | RN) and P(D | NN) denote the probability of being affected by the disease given that the individual is homozygous for the risk allele, heterozygous, or homozygous for the non-risk allele, respectively. $f_{RR}$, $f_{RN}$ and $f_{NN}$ denote the frequencies of the three genotypes in the population. Using these definitions, the relative risks are defined as $\lambda_{RR} = \frac{P(D \mid RR)}{P(D \mid NN)}$ and $\lambda_{RN} = \frac{P(D \mid RN)}{P(D \mid NN)}$.
In a case-control study, the values P(RR | D) and P(RR | ~D) (i.e., the frequency of RR in cases and in controls), as well as P(RN | D), P(RN | ~D), P(NN | D) and P(NN | ~D) (i.e., the frequencies of RN and NN in cases and in controls), can be estimated. To estimate the relative risks, Bayes' law can be used to derive: $\lambda_{RR} = \frac{P(RR \mid D) \, f_{NN}}{P(NN \mid D) \, f_{RR}}$ and $\lambda_{RN} = \frac{P(RN \mid D) \, f_{NN}}{P(NN \mid D) \, f_{RN}}$.
Thus, if the genotype frequencies are known, they can be used to calculate the relative risks. The genotype frequencies in the population cannot be calculated from the case-control study itself, since they depend on the prevalence of the disease in the population. In particular, if the prevalence of the disease is p(D):
fRR=P(RR|D)p(D)+P(RR|~D)(1-p(D))
fRN=P(RN|D)p(D)+P(RN|~D)(1-p(D))
fNN=P(NN|D)p(D)+P(NN|~D)(1-p(D))。
When p(D) is sufficiently small, the genotype frequencies can be approximated by those in the control population, but when the prevalence is high this is not an accurate estimate. However, if a reference dataset (e.g., HapMap [cite]) is given, the genotype frequencies can be estimated from the reference dataset.
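The relations above can be combined into a short Python sketch: the population genotype frequencies are first reconstructed from the case and control frequencies and the prevalence, and the relative risks then follow from Bayes' law. The function name, the dictionary layout and the numerical values are hypothetical; the example frequencies are generated from assumed penetrances so that the expected relative risks are known in advance.

```python
def relative_risks_from_case_control(cases, controls, prevalence):
    """Estimate population genotype frequencies and relative risks.

    cases / controls: genotype frequencies P(G | D) and P(G | ~D) for the
    genotypes "RR", "RN" and "NN".
    """
    genotypes = ("RR", "RN", "NN")
    # f_G = P(G | D) p(D) + P(G | ~D) (1 - p(D))
    freq = {g: cases[g] * prevalence + controls[g] * (1.0 - prevalence)
            for g in genotypes}
    # P(D | G) is proportional to P(G | D) / f_G; the prevalence cancels in the ratio.
    lam_rr = (cases["RR"] / freq["RR"]) / (cases["NN"] / freq["NN"])
    lam_rn = (cases["RN"] / freq["RN"]) / (cases["NN"] / freq["NN"])
    return freq, lam_rr, lam_rn


# Hypothetical population: genotype frequencies and penetrances P(D | G).
f = {"RR": 0.09, "RN": 0.42, "NN": 0.49}
penetrance = {"RR": 0.10, "RN": 0.075, "NN": 0.05}   # lambda_RR = 2.0, lambda_RN = 1.5
p = sum(f[g] * penetrance[g] for g in f)              # prevalence p(D)
cases = {g: f[g] * penetrance[g] / p for g in f}
controls = {g: f[g] * (1 - penetrance[g]) / (1 - p) for g in f}

freq, lam_rr, lam_rn = relative_risks_from_case_control(cases, controls, p)
print(round(lam_rr, 6), round(lam_rn, 6))
```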
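Most recent studies do not use a reference dataset to estimate the relative risk and report only the odds ratios. The odds ratios can be written as $OR_{RR} = \frac{P(RR \mid D) \, P(NN \mid {\sim}D)}{P(NN \mid D) \, P(RR \mid {\sim}D)}$ and $OR_{RN} = \frac{P(RN \mid D) \, P(NN \mid {\sim}D)}{P(NN \mid D) \, P(RN \mid {\sim}D)}$.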
The odds ratio is often advantageous since it is usually not necessary to have an estimate of allele frequency in the population; to calculate odds ratios, what is generally required is genotype frequency in cases and controls.
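In some cases, the genotype data itself is not available, but summary data (e.g., odds ratios) is available. This is the case when a meta-analysis is performed based on the results of previous case-control studies. For this case, we show how to find the relative risks from the odds ratios. We use the fact that:
$$p(D) = f_{RR} P(D \mid RR) + f_{RN} P(D \mid RN) + f_{NN} P(D \mid NN)$$
If this equation is divided by P(D | NN), we get
$$\frac{p(D)}{P(D \mid NN)} = f_{RR}\lambda_{RR} + f_{RN}\lambda_{RN} + f_{NN}$$
This enables the odds ratio to be written in the form:
$$OR_{RR} = \frac{P(D \mid RR)}{1 - P(D \mid RR)} \cdot \frac{1 - P(D \mid NN)}{P(D \mid NN)} = \lambda_{RR} \frac{f_{RR}\lambda_{RR} + f_{RN}\lambda_{RN} + f_{NN} - p(D)}{f_{RR}\lambda_{RR} + f_{RN}\lambda_{RN} + f_{NN} - p(D)\lambda_{RR}}$$
By a similar calculation, the following system of equations is obtained:
$$OR_{RR} = \lambda_{RR} \frac{f_{RR}\lambda_{RR} + f_{RN}\lambda_{RN} + f_{NN} - p(D)}{f_{RR}\lambda_{RR} + f_{RN}\lambda_{RN} + f_{NN} - p(D)\lambda_{RR}}, \qquad OR_{RN} = \lambda_{RN} \frac{f_{RR}\lambda_{RR} + f_{RN}\lambda_{RN} + f_{NN} - p(D)}{f_{RR}\lambda_{RR} + f_{RN}\lambda_{RN} + f_{NN} - p(D)\lambda_{RN}} \qquad \text{(Equation 1)}$$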
If odds ratios, genotype frequencies in the population, and prevalence of disease are known, then relative risk can be obtained by solving this system of equations.
It should be noted that these are two quadratic equations, so they have at most four solutions. However, as shown below, there is typically only one feasible solution.
It should be noted that when $f_{NN} = 1$, equation system 1 is equivalent to the Zhang and Yu formula; here, however, the allele frequencies in the population are taken into account. Moreover, our method takes into account the fact that the two relative risks depend on each other, whereas previous methods propose calculating each relative risk independently.
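Relative risk of multiallelic loci. The calculation is somewhat more involved when multiple markers or other multiallelic variants are considered. Let $a_0, a_1, \dots, a_k$ denote the k+1 possible alleles, where $a_0$ is the non-risk allele. The allele frequencies in the population of the k+1 possible alleles are assumed to be $f_0, f_1, f_2, \dots, f_k$. For allele i, the relative risk and odds ratio are defined as
$$\lambda_i = \frac{P(D \mid a_i)}{P(D \mid a_0)}, \qquad OR_i = \frac{P(D \mid a_i)}{1 - P(D \mid a_i)} \cdot \frac{1 - P(D \mid a_0)}{P(D \mid a_0)}$$
The following equation holds for the prevalence of the disease:
$$p(D) = \sum_{i=0}^{k} f_i \, P(D \mid a_i)$$
Thus, by dividing both sides of the equation by $P(D \mid a_0)$, we get:
$$\frac{p(D)}{P(D \mid a_0)} = \sum_{i=0}^{k} f_i \lambda_i, \qquad \lambda_0 = 1$$
obtaining:
$$OR_i = \lambda_i \frac{\sum_{j=0}^{k} f_j \lambda_j - p(D)}{\sum_{j=0}^{k} f_j \lambda_j - p(D)\lambda_i}$$
By setting $C = \sum_{i=0}^{k} f_i \lambda_i$, we obtain $\lambda_i = \frac{C \cdot OR_i}{C - p(D) + p(D) \cdot OR_i}$. Thus, by the definition of C, we derive:
$$1 = \sum_{i=0}^{k} \frac{f_i \, OR_i}{C - p(D) + p(D) \cdot OR_i}, \qquad OR_0 = 1$$
This is a polynomial equation in the single variable C. Once C is determined, the relative risks are determined. The polynomial has degree k+1, and therefore we expect at most k+1 solutions. However, since the right side of the equation is strictly decreasing as a function of C, there is typically only one solution to this equation. This solution is easily found using a binary search, since it is bounded between C = 1 and an upper bound determined by the odds ratios.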
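The binary search over C can be sketched in Python as follows. The sketch uses the relation between a relative risk and its odds ratio for a fixed C, namely λ_i = C·OR_i / (C − p + p·OR_i), and searches for the C at which the weighted sum of these relative risks reproduces C. As an assumption of this sketch (derived here, not taken from the text), all odds ratios are taken to be at least 1, in which case each relative risk lies between 1 and its odds ratio and the solution lies between C = 1 and C = Σ f_i·OR_i. The function names and inputs are hypothetical.

```python
def relative_risks_multiallelic(freqs, odds_ratios, prevalence, tol=1e-12):
    """Recover relative risks from odds ratios by binary search on C.

    freqs[0] and odds_ratios[0] (= 1.0) refer to the non-risk allele a_0.
    Assumes every odds ratio is >= 1.
    """
    p = prevalence

    def rhs(C):
        # Strictly decreasing in C for C > p; equals 1 at the solution.
        return sum(f * o / (C - p + p * o) for f, o in zip(freqs, odds_ratios))

    lo, hi = 1.0, sum(f * o for f, o in zip(freqs, odds_ratios))
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if rhs(mid) > 1.0:
            lo = mid          # solution lies to the right of mid
        else:
            hi = mid
    C = (lo + hi) / 2.0
    return [o * C / (C - p + p * o) for o in odds_ratios]


def odds_ratios_multiallelic(freqs, relative_risks, prevalence):
    """Forward map, used only to build a self-consistent example."""
    C = sum(f * l for f, l in zip(freqs, relative_risks))
    return [l * (C - prevalence) / (C - prevalence * l) for l in relative_risks]


# Hypothetical three-allele locus with known relative risks.
freqs = [0.6, 0.3, 0.1]
true_rr = [1.0, 1.4, 2.2]
ors = odds_ratios_multiallelic(freqs, true_rr, prevalence=0.08)
estimated_rr = relative_risks_multiallelic(freqs, ors, prevalence=0.08)
print([round(x, 6) for x in estimated_rr])
```

The same routine covers the three-genotype case by treating NN, RN and RR as the variants a_0, a_1 and a_2 with their genotype frequencies.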
Robustness of relative risk estimation. The effect of the various parameters (prevalence, allele frequencies, and odds ratio errors) on the estimates of relative risk was determined. To determine the effect of the allele frequency and prevalence estimates on the relative risk values, the relative risks were calculated for a set of different odds ratios and different allele frequencies (under HWE), and the results were plotted for prevalence values in the range of 0 to 1.
In addition, for a fixed prevalence value, the resulting relative risks were plotted as a function of the risk allele frequency. In all cases, when p(D) = 0, $\lambda_{RR} = OR_{RR}$ and $\lambda_{RN} = OR_{RN}$, and when p(D) = 1, $\lambda_{RR} = \lambda_{RN} = 1$. This can be calculated directly from Equation 1. In addition, when the risk allele frequency is high, $\lambda_{RR}$ approaches linear behavior and $\lambda_{RN}$ approaches a concave function with a bounded second derivative. When the risk allele frequency is low, $\lambda_{RR}$ and $\lambda_{RN}$ approach the behavior of the function 1/p(D). This means that for high risk allele frequencies, an incorrect estimate of the prevalence will not greatly affect the resulting relative risk.
The following examples illustrate and explain the present invention. The scope of the invention is not limited to these examples.
Example I
Generation and analysis of SNP profiles
The individual is provided with a sample tube in a kit (e.g., purchased from DNA Genotek) in which the individual deposits a saliva sample (approximately 4ml) from which genomic DNA will be extracted. Saliva samples were delivered to CLIA certified laboratories for processing and analysis. Typically, the sample is delivered to the testing facility by overnight mailing in a shipping container that is conveniently provided to the individual within the collection kit.
In a preferred embodiment, the genomic DNA is isolated from saliva. For example, using the DNA self-collection kit technology provided by DNA Genotek, an individual collects approximately 4ml of saliva samples for clinical processing. After delivery of the sample to an appropriate laboratory for processing, the DNA is isolated by heat denaturation and protease digestion of the sample (typically for at least one hour at 50 ℃ using reagents provided by the collection kit supplier). Subsequently, the sample was centrifuged, and the supernatant was subjected to ethanol precipitation. The DNA pellet is suspended in a buffer suitable for subsequent analysis.
Genomic DNA of an individual is isolated from a saliva sample according to well known procedures and/or procedures provided by the manufacturer of the collection kit. Typically, the sample is first heat denatured and protease digested. Next, the sample was centrifuged, and the supernatant was retained. The supernatant was then subjected to ethanol precipitation to obtain a precipitate containing approximately 5-16 ug of genomic DNA. The DNA pellet was suspended in 10mM Tris (pH 7.6), 1mM EDTA (TE). SNP profiles were generated by hybridizing genomic DNA to commercially available high density SNP arrays (e.g., those provided by Affymetrix or Illumina) using instrumentation and instructions provided by the array manufacturer. Individual SNP profiles are stored in an encrypted database or vault.
The patient's data structure is queried for risk-conferring SNPs by comparison with a clinical database of established, medically relevant SNPs whose presence in the genome is associated with a given disease or condition. The database includes information on the statistical association of specific SNPs and SNP haplotypes with a particular disease or condition. For example, as shown in Example III, polymorphisms in the apolipoprotein E gene lead to distinct isoforms of the protein, which in turn are associated with the statistical likelihood of developing Alzheimer's disease. As another example, individuals with a variant of the coagulation protein factor V, known as factor V Leiden, have an increased tendency to form blood clots. Many genes in which SNPs are associated with disease or condition phenotypes are shown in Table 1. The scientific accuracy and importance of the information in the database is approved by a research/clinical advisory committee and can be reviewed by a supervising governmental agency. The database can be continuously updated as more SNP-disease associations emerge from the scientific community.
The results of the analysis of the individual SNP profiles are securely provided to the patient through an online portal or email. The patient is provided with explanatory and supportive information, such as the information on factor V Leiden shown in example IV. Secure access to the individual's SNP profile information (e.g., through an online portal) would facilitate discussion with the patient's physician and give the ability to select for personalized medicine.
Example II
Updating of genotype correlations
In response to a request for an initial determination of an individual's genotype correlations, a genomic profile is generated, the genotype correlations are obtained, and the results are provided to the individual as described in Example I. After the initial determination of the individual's genotype correlations, updated correlations are or can be determined later, as additional genotype correlations become known. The subscriber has a premium-level subscription and the subscriber's genotype profile is stored in an encrypted database. The updated correlations are performed on the stored genotype profile.
For example, as described in Example I above, the initial genotype correlations have determined that a particular individual does not have ApoE4 and is therefore not predisposed to early-onset Alzheimer's disease, and that this individual does not have factor V Leiden. After this initial determination, a new correlation becomes known and validated, such that polymorphisms in a given gene (say, gene XYZ) are correlated with a given condition (say, condition 321). This new genotype correlation is added to the master database of human genotype correlations. An update is then provided to the particular individual by first retrieving the data for the relevant gene XYZ from the individual's genomic profile stored in the encrypted database. The individual's gene XYZ data is compared to the gene XYZ information in the updated master database. From this comparison, the individual's susceptibility or predisposition to condition 321 is determined. The result of this determination is added to the individual's genotype correlations. The updated result, indicating whether the individual is susceptible or genetically predisposed to condition 321, is provided to the individual along with explanatory and supportive information.
Example III
Association of the ApoE4 locus with Alzheimer's disease
The risk of Alzheimer's Disease (AD) has been shown to be associated with polymorphisms in the apolipoprotein e (ApoE) gene, which result in three isoforms of ApoE known as ApoE2, ApoE3 and ApoE 4. These isoforms differ from each other by one or two amino acids at residues 112 and 158 of the APOE protein. ApoE2 contains cysteine/cysteine at position 112/158; ApoE3 contains cysteine/arginine at position 112/158; and ApoE4 contains arginine/arginine at position 112/158. As shown in Table 3, the risk of Alzheimer's disease onset at a younger age increases with the APOE ε 4 gene copy number. Also, as shown in table 3, the relative risk of AD increases with APOE ∈ 4 gene copy number.
Table 3: Prevalence of AD risk alleles (Corder et al., Science 261: 921-3, 1993)
APOE ε4 copies | Prevalence | Risk of Alzheimer's disease | Age of onset |
0 | 73% | 20% | 84 |
1 | 24% | 47% | 75 |
2 | 3% | 91% | 68 |
Table 4: Relative risk of AD with ApoE4 (Farrer et al., JAMA 278: 1349-56, 1997)
APOE genotype | Odds ratio |
ε2ε2 | 0.6 |
ε2ε3 | 0.6 |
ε3ε3 | 1.0 |
ε2ε4 | 22.6 |
ε3ε4 | 3.2 |
ε4ε4 | 14.9 |
Example IV
Information on factor V Leiden-positive patients
The following is an example of information that may be provided to an individual whose genomic SNP profile shows the presence of the factor V Leiden gene. The individual may have a basic subscription, in which case the information may be provided in an initial report.
What is factor V Leiden?
Factor V Leiden is not a disease; it is the presence of a specific gene inherited from one's parents. Factor V Leiden is a variant of the protein factor V (5), which is required for blood clotting. People with factor V deficiency are more likely to bleed severely, while people with factor V Leiden have blood with an increased tendency to clot.
People who carry the factor V Leiden gene have a 5-fold higher risk of developing a blood clot (thrombosis) than the rest of the population. However, many people with this gene never develop blood clots. In the UK and the USA, 5% of the population carries one or more factor V Leiden genes, which is far more than the number of people who will actually suffer from thrombosis.
How do you get factor V Leiden?
The factor V gene is inherited from one's parents. As with all genetic traits, one gene is inherited from the mother and one from the father. Thus, it is possible to inherit two normal genes, one factor V Leiden gene and one normal gene, or two factor V Leiden genes. Having one factor V Leiden gene results in a slightly higher risk of developing thrombosis, but having two results in a much greater risk.
What are the symptoms of factor V Leiden?
There are no signs unless you have a blood clot (thrombosis).
What are the danger signals?
The most common problem is blood clots in the legs. Leg swelling, pain and redness indicate this problem. In rarer cases, blood clots in the lungs (pulmonary thrombosis) may occur, which make breathing difficult. Depending on the size of the blood clot, this condition can range from barely noticeable to severe breathing difficulty. In even rarer cases, blood clots may occur in the arm or other parts of the body. Since these clots form in veins that carry blood to the heart rather than in arteries that carry blood from the heart, factor V Leiden does not increase the risk of coronary thrombosis.
What can I do to avoid blood clots?
Factor V Leiden only slightly increases the risk of blood clots, and many people with this condition never develop thrombosis. There are many things one can do to avoid blood clots. Avoid standing or sitting in the same position for a long time. When traveling long distances, it is important to exercise regularly; the blood must not be left "standing still". Being overweight or smoking will greatly increase the risk of blood clots. Women carrying the factor V Leiden gene should not take the contraceptive pill, because this would significantly increase the chance of thrombosis. Women carrying the factor V Leiden gene should also consult their physician before becoming pregnant, because pregnancy also increases the risk of thrombosis.
How do doctors find out if you have factor V Leiden?
The gene for factor V Leiden can be found in blood samples.
Blood clots in the leg or arm are typically determined by ultrasound examination.
A clot may also be detected by X-ray after a substance is injected into the blood to make the clot visible. Blood clots in the lungs are more difficult to find, but physicians will often use radioactive materials to test the distribution of blood flow in the lungs and the distribution of air flow into the lungs. The two distribution patterns should match; a mismatch indicates the presence of a blood clot.
How is factor V Leiden treated?
Persons with factor V Leiden do not require treatment unless their blood begins to clot, in which case the physician will prescribe a blood-thinning (anticoagulant) drug, such as warfarin (e.g., warfarin sodium) or heparin, to prevent further blood clotting. Treatment will typically last three to six months, but may take longer if there are several blood clots. In severe cases, the course of medication may continue indefinitely; in extremely rare cases, blood clots may require surgical removal.
How is factor V Leiden treated during pregnancy?
Women carrying two factor V Leiden genes will need to receive treatment with the anticoagulant drug heparin during pregnancy. The same treatment applies to women carrying only one factor V Leiden gene who have previously had a blood clot themselves or who have a family history of blood clotting.
All women carrying the factor V Leiden gene may need to wear special stockings to prevent blood clots in the latter half of pregnancy. After the birth of the child, they may be prescribed the anticoagulant drug heparin.
Prognosis
The risk of developing blood clots increases with age, but in an age-based survey of 100 people carrying the gene, only a few were found to have suffered from thrombosis. The National Society of Genetic Counselors (NSGC) can provide a list of genetic counselors in your area and information about establishing a family history. Their online database can be searched at www.nsgc.org/consumer.
While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that many alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that the invention cover methods and structures within the scope of these claims and their equivalents.
Claims (78)
1. A method of assessing genotype correlations of an individual, the method comprising:
a) obtaining a genetic sample of the individual, wherein the genetic sample is DNA;
b) generating a genomic profile of the individual from the genetic sample;
c) determining a genotype correlation for the individual by comparing the genomic profile of the individual to a database of correlations of current human genotypes and phenotypes to determine, for each phenotype of interest, a plurality of relative risk or odds ratios for a plurality of alleles, including risk alleles or non-risk alleles, of the individual;
d) updating the human genotype correlations database with additional human genotype correlations, when the additional human genotype correlations are known; and
e) updating the genotype correlation of the individual by comparing the genomic profile of the individual of step c), or a portion thereof, to the additional human genotype correlation and determining additional genotype correlations for the individual.
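Steps c) through e) of claim 1 can be pictured with a short Python sketch. It is a minimal illustration under stated assumptions, not the claimed implementation: the Correlation class, the assess and update functions, and the layout of the profile (a mapping from variant identifier to the set of alleles carried) are all hypothetical, and the identifiers and odds ratios in the usage lines are placeholders rather than real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correlation:
    """One database entry: an allele of a variant linked to a phenotype."""
    variant: str
    allele: str
    phenotype: str
    odds_ratio: float


def assess(profile, correlations):
    """Compare a genomic profile against a collection of correlations.

    `profile` maps a variant id to the set of alleles the individual carries.
    Returns phenotype -> list of (variant, allele, odds_ratio).
    """
    result = {}
    for c in correlations:
        if c.allele in profile.get(c.variant, ()):
            result.setdefault(c.phenotype, []).append(
                (c.variant, c.allele, c.odds_ratio)
            )
    return result


def update(profile, database, individual_result, additional):
    """Add new correlations to the database, then compare only the additions."""
    database.extend(additional)
    for phenotype, hits in assess(profile, additional).items():
        individual_result.setdefault(phenotype, []).extend(hits)
    return individual_result


if __name__ == "__main__":
    # Placeholder identifiers and values, for illustration only.
    profile = {"variant_1": {"A", "G"}, "variant_2": {"T"}}
    database = [Correlation("variant_1", "A", "phenotype_X", 1.5)]
    result = assess(profile, database)
    update(profile, database, result,
           [Correlation("variant_2", "T", "phenotype_X", 0.8)])
    print(result)
```

In this sketch the stored profile is reused when the database grows, so only the additional correlations are compared against it.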
2. The method of claim 1, wherein the genetic sample is obtained by a third party.
3. The method of claim 1, wherein the generating a genomic profile is performed by a third party.
4. The method of claim 1, further comprising calculating a GCI score, wherein the GCI is calculated from a plurality of relative risks or odds ratios.
5. The method of claim 1, wherein the genomic profile comprises single nucleotide polymorphisms, nucleotide insertions, nucleotide deletions, chromosomal translocations, chromosomal duplications, or copy number variations.
6. The method of claim 1, wherein the genomic profile is the entire genome of the individual.
7. The method of claim 1, wherein the method comprises assessing 2 or more genotype correlations.
8. The method of claim 1, wherein the method comprises assessing 10 or more genotype correlations.
9. The method of claim 1, wherein said human genotype correlation database comprises genetic variants in one or more genes listed in Table 1 or Figures 4, 5, 6, 22, or 25 and phenotypes associated with the genetic variants.
10. The method of claim 1, wherein said human genotype correlation database comprises genetic variants determined from said genomic profile of said individual and predetermined phenotypes revealed by said individual.
11. The method of claim 9 or 10, wherein the genetic variant is a single nucleotide polymorphism, a nucleotide insertion, a nucleotide deletion, a chromosomal translocation, a chromosomal duplication, or a copy number variation.
12. The method of claim 1, wherein the genetic sample is blood, hair, skin, saliva, semen, urine, fecal matter, sweat, or an oral sample.
13. The method of claim 1, wherein the genomic profile is generated using a high density DNA microarray, DNA sequencing, or PCR-based methods.
14. The method of claim 4, wherein at least one of physical data, medical data, ethnicity, family, geography, gender, age, family history, known phenotype, demographic data, exposure data, lifestyle data, or behavioral data of the individual is incorporated into the calculation of the GCI.
15. The method of claim 1, wherein the genomic profile of the individual is compared to a correlation between a SNP and a phenotype, wherein the SNP is:
rs 69883267 when the phenotype is colorectal cancer; rs2165241 when the phenotype is exfoliative glaucoma; rs9939609 when the phenotype is obesity; rs3087243 or DRB1 0301 when the phenotype is Graves' disease; rs1800562 when the phenotype is hemochromatosis; rs 6969 when the phenotype is myocardial infarction; rs6897932, rs12722489 or DRB1 1501 when the phenotype is multiple sclerosis; rs11209026 when the phenotype is psoriasis (PS); rs2300478, rs1026732 or rs9296249 when the phenotype is restless leg syndrome; rs6840978 or rs2187668 when the phenotype is celiac disease; rs 69883267, rs 30909 or rs 30906 when the phenotype is prostate cancer; dr5744798 when the phenotype is lupus erythematosus; dr5431711, DRB 3809, DRB 125355639 or DRB 36933 9 when the phenotype is rheumatoid arthritis; dr6446579 or DRB 4705, DRB 125357243 when the phenotype is lupus erythematosus; DRB 125357263, DRB 12535729, or DRB 3564049 when the phenotype is lupus erythematosus; rs2981582, rs3817198 or rs3803662, rs 2066846845, rs10883365, rs17234657, rs10210302, rs9858542, rs11805303, rs1000113, rs2542151 or rs10761659 when the phenotype is Crohn's disease; rs13266634, rs4506565, rs7756992, rs10811661, rs8050136, rs1111875, rs4402960, rs5215 or rs1801282 when the phenotype is type 2 diabetes.
16. The method of claim 15, further comprising:
f) calculating at least one GCI score for said phenotype in combination with said relative risk or odds ratio.
17. A method of assessing genotype correlations of an individual, the method comprising:
a) obtaining a plurality of genetic samples from a plurality of individuals;
b) providing a rule set comprising rules, each rule indicating a correlation between at least one genotype and at least one phenotype;
c) providing a data set comprising a genomic profile of each individual of the plurality of individuals, wherein each genomic profile comprises a plurality of genotypes;
d) determining a genotype correlation for the individual by comparing the genomic profile of the individual to a database of correlations of current human genotypes and phenotypes to determine, for each phenotype of interest, a plurality of relative risk or odds ratios for a plurality of alleles, including risk alleles or non-risk alleles, of the individual;
e) periodically updating the rule set with at least one new rule, wherein the at least one new rule indicates a correlation between genotypes and phenotypes that were previously unrelated to each other in the rule set; and
f) applying each new rule to the genomic profile of at least one of the individuals, thereby correlating at least one genotype with at least one phenotype for the individual.
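The rule-set steps of claim 17 can be sketched as follows. This Python example is a hedged illustration only: the Rule and RuleSet classes, the apply_rules function, and the representation of a genomic profile as a set of genotype labels are assumptions introduced here, and the labels in the usage lines are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Rule:
    """A rule links one or more genotypes to one phenotype."""
    genotypes: FrozenSet[str]  # all must be present for the rule to match
    phenotype: str


class RuleSet:
    def __init__(self, rules):
        self.rules: List[Rule] = list(rules)

    def add_new_rules(self, candidates):
        """Add only rules not already in the set and return the ones added."""
        added = []
        for rule in candidates:
            if rule not in self.rules:
                self.rules.append(rule)
                added.append(rule)
        return added


def apply_rules(rules, profiles):
    """Return individual id -> set of phenotypes matched by `rules`.

    `profiles` maps an individual id to the set of genotypes in that
    individual's genomic profile.
    """
    matched: Dict[str, Set[str]] = {}
    for person, genotypes in profiles.items():
        hits = {r.phenotype for r in rules if r.genotypes <= genotypes}
        if hits:
            matched[person] = hits
    return matched


if __name__ == "__main__":
    # Placeholder genotype and phenotype labels, for illustration only.
    profiles = {"ind_1": {"g1", "g2"}, "ind_2": {"g3"}}
    rule_set = RuleSet([Rule(frozenset({"g1"}), "p1")])
    new_rules = rule_set.add_new_rules([Rule(frozenset({"g2"}), "p2")])
    print(apply_rules(new_rules, profiles))
```

Returning only the newly added rules lets the periodic update apply just those rules to the stored profiles instead of re-running the whole rule set.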
18. The method of claim 17, further comprising:
f) generating a report comprising the phenotype profile of the individual.
19. The method of claim 17, further comprising, after step b):
i) applying the rules of the rule set to the genomic profile of the individual to determine a set of phenotypic profiles of the individual; and
ii) generating a report comprising the initial phenotype profile of the individual.
20. The method of claim 18 or 19, wherein providing the report comprises transmitting the report over a network.
21. The method of claim 18 or 19, wherein the report is provided in an encrypted manner.
22. The method of claim 18 or 19, wherein the report is provided in an unencrypted manner.
23. The method of claim 18 or 19, wherein the report is provided through an online portal.
24. The method of claim 18 or 19, wherein the report is provided on paper or by email.
25. The method of claim 17, wherein the new rule associates an unassociated genotype with a phenotype.
26. The method of claim 17, wherein the new rule associates an associated genotype with a phenotype not previously associated therewith in the rule set.
27. The method of claim 17, wherein the new rule changes a rule in the rule set.
28. The method of claim 17, wherein the new rule is generated by correlation of a genotype of the genomic profile from the individual and a predetermined phenotype of the individual.
29. The method of claim 17, wherein the rule associates a plurality of genotypes with a phenotype.
30. The method of claim 17, wherein applying the new rule further comprises determining the phenotype profile based at least in part on a characteristic of the individual selected from the group consisting of ethnicity, pedigree, geography, gender, age, family history, and a predetermined phenotype.
31. The method of claim 17, wherein the genotype comprises a nucleotide repeat, a nucleotide insertion, a nucleotide deletion, a chromosomal translocation, a chromosomal repeat, or a copy number variation.
32. The method of claim 31, wherein the copy number variation is a microsatellite repeat, a nucleotide repeat, a centromeric repeat or a telomeric repeat.
33. The method of claim 17, wherein the genotype comprises a single nucleotide polymorphism.
34. The method of claim 17, wherein the genotypes comprise a haplotype and a diplotype.
35. The method of claim 17, wherein the genotype comprises a genetic marker in linkage disequilibrium with a phenotype-associated single nucleotide polymorphism.
36. The method of claim 17, wherein the phenotype profile indicates the presence or risk of the quantitative trait.
37. The method of claim 17, wherein the phenotype profile indicates a probability that an individual having a genotype has or will have a phenotype.
38. The method of claim 37, wherein the probability is based on a GCI or GCI Plus score.
39. The method of claim 37, wherein the probability is an estimated lifetime risk.
40. The method of claim 17, wherein the correlation is validated.
41. The method of claim 17, wherein the rule set includes at least 20 rules.
42. The method of claim 17, wherein the rule set includes at least 50 rules.
43. The method of claim 17, wherein the rule set comprises rules based on the genotype correlations in table 1.
44. The method of claim 17, wherein the rule set comprises rules based on the genotype correlations in Figures 4, 5, 6, 22, or 25.
45. The method of claim 17, wherein the phenotype comprises a quantitative trait.
46. The method of claim 45, wherein the quantitative trait comprises a medical condition.
47. The method of claim 46, wherein said phenotype profile indicates the presence or absence of said medical condition, the risk of developing said medical condition, the prognosis of said medical condition, the effect of treatment of said medical condition, or the response to treatment of said medical condition.
48. The method of claim 45, wherein the quantitative trait comprises a phenotype of a non-medical condition.
49. The method of claim 45, wherein the quantitative trait is selected from the group consisting of a physical trait, a physiological trait, a mental trait, an emotional trait, ethnicity, pedigree, or age.
50. The method of claim 17, wherein the subject is a human.
51. The method of claim 17, wherein the subject is a non-human.
52. The method of claim 17, wherein the individual is a registered user.
53. The method of claim 17, wherein the individual is a non-registered user.
54. The method of claim 17, wherein the genomic profile comprises at least 100,000 genotypes.
55. The method of claim 17, wherein the genomic profile comprises at least 400,000 genotypes.
56. The method of claim 17, wherein the genomic profile comprises at least 900,000 genotypes.
57. The method of claim 17, wherein the genomic profile comprises at least 1,000,000 genotypes.
58. The method of claim 17, wherein the genomic profile comprises a substantially complete whole genome sequence.
59. The method of claim 17, wherein the data set comprises a plurality of data points, wherein each data point relates to an individual and comprises a plurality of data elements, wherein the data elements comprise at least one element selected from the group consisting of a unique identifier of the individual, genotype information, microarray SNP identification number, SNP rs number, chromosomal location, polymorphic nucleotides, quality metrics, raw data files, images, extracted intensity scores, physical data, medical data, ethnicity, pedigree, geography, gender, age, family history, known phenotype, demographic data, exposure data, lifestyle data, and behavioral data.
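The data point of claim 59 lends itself to a simple record structure. The Python sketch below is one possible layout, offered only as an assumption: the class names GenotypeCall and DataPoint, the field names, and the choice to hold the remaining data elements in a free-form dictionary are hypothetical and are not taken from the claims.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GenotypeCall:
    """Genotype information for one variant (field names are hypothetical)."""
    rs_number: str
    chromosome: str
    position: int
    nucleotides: str                 # polymorphic nucleotides observed, e.g. "AG"
    quality: Optional[float] = None  # quality metric, if available


@dataclass
class DataPoint:
    """One data point: everything stored about a single individual."""
    individual_id: str               # unique identifier of the individual
    genotypes: List[GenotypeCall] = field(default_factory=list)
    # Other data elements such as gender, age, or family history.
    attributes: Dict[str, str] = field(default_factory=dict)

    def genotype_count(self):
        """Number of genotype calls held for this individual."""
        return len(self.genotypes)


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    point = DataPoint("ind_1")
    point.genotypes.append(GenotypeCall("rs_placeholder", "1", 12345, "AG"))
    point.attributes["age"] = "52"
    print(point.genotype_count())
```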
60. The method of claim 17, wherein the periodic updating and applying occurs at least once a year.
61. The method of claim 17, wherein providing the data set comprises obtaining a genomic profile for each individual of a plurality of individuals by:
i) performing genetic analysis on a genetic sample obtained from said individual, and
ii) encoding the analysis in a computer readable form.
62. The method of claim 17, wherein said phenotype profile comprises a monogenic phenotype.
63. The method of claim 17, wherein said phenotype profile comprises a multigenic phenotype.
64. The method of claim 17, wherein the report includes an initial phenotype profile.
65. The method of claim 17, wherein the report comprises an updated phenotype profile.
66. The method of claim 17, wherein said report further comprises information about said phenotype of said phenotypic profile selected from one or more of the following: preventive countermeasures, health information, therapy, symptom recognition, early detection protocols, intervention protocols, and precise identification and subclassification of the phenotypes in the phenotype profile.
67. The method of claim 17, further comprising:
e) adding a new genomic profile of a new individual to the individual dataset;
f) applying the rule set to the genomic profile of the new individual; and
g) generating an initial report of the phenotype profile of the new individual.
68. The method of claim 17, comprising:
e) adding a new genomic profile of the individual;
f) applying the rule set to the new genomic profile of the individual; and
g) generating a new report of the phenotype profile of the individual.
69. A system for assessing genotype correlations of an individual, the system comprising:
a) means for storing a rule set comprising rules, each rule indicating a correlation between at least one genotype and at least one phenotype, wherein the genotype correlation is determined by comparing the genomic profile of the individual to a database of correlations of current human genotypes and phenotypes to determine a plurality of relative risk or odds ratios for a plurality of alleles, including risk alleles or non-risk alleles, of the individual for each phenotype of interest;
b) means for periodically updating said rule set with at least one new rule, wherein said at least one new rule indicates a correlation between genotypes and phenotypes not previously correlated with each other in said rule set;
c) means for generating a genomic profile of an individual, thereby obtaining a database comprising genomic profiles of a plurality of individuals;
d) means for applying the rule set to the genomic profile of an individual to determine a phenotypic profile of the individual; and
e) means for generating a report for each individual.
70. The system of claim 69, wherein the report is transmitted over a network.
71. The system of claim 69, wherein the report is provided in an encrypted manner.
72. The system of claim 69, wherein said report is provided in an unencrypted manner.
73. The system of claim 69, wherein said report is provided through an online portal.
74. The system of claim 69, wherein the report is provided via paper or email.
75. The system of claim 69, further comprising means for notifying said individual of a new or revised association.
76. The system of claim 69, further comprising code for notifying said individual of new or revised rules applicable to said genomic profile of said individual.
77. The system of claim 69, further comprising means for notifying said individual of new or revised prevention and health information regarding said phenotype of said phenotype profile of said individual.
78. A kit for performing the method of claim 1, the kit comprising:
a) at least one sample collection container;
b) instructions for obtaining a sample from an individual;
c) instructions for accessing a genomic profile of the individual obtained from the sample through an online portal;
d) instructions for accessing, via an online portal, a phenotype profile of the individual obtained from the sample; and
e) packaging for delivering the sample collection container to a sample processing facility.
Applications Claiming Priority (13)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US86806606P | 2006-11-30 | 2006-11-30 | |
US60/868,066 | 2006-11-30 | ||
US95112307P | 2007-07-20 | 2007-07-20 | |
US60/951,123 | 2007-07-20 | ||
US11/781,679 | 2007-07-23 | ||
US11/781,679 US20080131887A1 (en) | 2006-11-30 | 2007-07-23 | Genetic Analysis Systems and Methods |
US97219807P | 2007-09-13 | 2007-09-13 | |
US60/972,198 | 2007-09-13 | ||
US98562207P | 2007-11-05 | 2007-11-05 | |
US60/985,622 | 2007-11-05 | ||
US98968507P | 2007-11-21 | 2007-11-21 | |
US60/989,685 | 2007-11-21 | ||
PCT/US2007/086138 WO2008067551A2 (en) | 2006-11-30 | 2007-11-30 | Genetic analysis systems and methods |
Publications (2)
Publication Number | Publication Date |
---|---|
HK1139737A1 | 2010-09-24 |
HK1139737B | 2014-04-11 |
Also Published As
Publication number | Publication date |
---|---|
EP2102651A2 (en) | 2009-09-23 |
GB0723512D0 (en) | 2008-01-09 |
CA2671267A1 (en) | 2008-06-05 |
EP2102651A4 (en) | 2010-11-17 |
CN103642902B (en) | 2016-01-20 |
GB2444410B (en) | 2011-08-24 |
JP2014140387A (en) | 2014-08-07 |
AU2007325021B2 (en) | 2013-05-09 |
GB2444410A (en) | 2008-06-04 |
WO2008067551A3 (en) | 2008-12-11 |
KR20090105921A (en) | 2009-10-07 |
CN103642902A (en) | 2014-03-19 |
TW200847056A (en) | 2008-12-01 |
WO2008067551A2 (en) | 2008-06-05 |
JP2010522537A (en) | 2010-07-08 |
TWI363309B (en) | 2012-05-01 |
AU2007325021A1 (en) | 2008-06-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PC | Patent ceased (i.e. patent has lapsed due to the failure to pay the renewal fee) | Effective date: 20171130 |