Putting genome analysis to good use: Lessons from C-reactive protein and cardiovascular disease

Cleveland Clinic Journal of Medicine. 2012 March;79(3):182-191 | 10.3949/ccjm.79a.09169

ABSTRACT

New methods of studying the human genome offer novel ways to examine the relationship between biomarkers and common, chronic human diseases. As an example, we will review a large genomics study (Elliott et al, JAMA 2009; 302:37–48) that concluded that C-reactive protein (CRP) is likely not a cause of coronary heart disease, although it is a marker for it.

KEY POINTS

  • Genome-wide association studies can uncover associations between genetic markers and medical conditions, but they fall short of establishing causality or even clear biologic interactions between a genetic variant and a disease state.
  • Mendelian randomization is a method for addressing the relationship between genetic variants and disease, ie, whether a biomarker affected by the variant is a cause of the disease or merely a bystander.
  • CRP, an acute-phase reactant produced by the liver in response to inflammation, is one of many inflammatory markers whose levels correlate with coronary disease and which has been suggested to play a role in its pathogenesis.
  • The findings of Elliott et al suggest that therapies that specifically lower CRP levels are not likely to affect coronary artery disease.

Genomics research is paying off, not only by identifying people at risk of rare inherited diseases but also by clarifying the pathogenic mechanisms of important, common ones.

Thanks to advances in technology, we can now, at a reasonable cost, simultaneously screen for millions of genetic variants in thousands of people to find variants that are more common in people with a given disease than in people without it, a fruitful method called a genome-wide association study. Moreover, an epidemiologic method called mendelian randomization takes advantage of the natural reshuffling of the genetic deck that occurs with each generation to give an estimate of whether certain gene products are mediators of disease or merely markers of it.
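To make the case-control comparison concrete, here is a minimal Python sketch of the test that could be run for a single variant. It is an illustration under stated assumptions, not the method of any particular study: the function name `allelic_association` and the allele counts are hypothetical, and real genome-wide analyses typically also adjust for ancestry, covariates, and multiple testing.

```python
import math


def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Compare allele counts for one variant between cases and controls.

    Returns the odds ratio, the 1-degree-of-freedom chi-square statistic,
    and its two-sided p-value. Hypothetical helper for illustration only.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_totals = [case_alt + case_ref, control_alt + control_ref]
    col_totals = [case_alt + control_alt, case_ref + control_ref]

    # Pearson chi-square: sum of (observed - expected)^2 / expected
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Odds of carrying the variant allele in cases relative to controls
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)

    # For 1 degree of freedom, the chi-square tail equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical counts: 1,000 cases and 1,000 controls, two alleles each
odds_ratio, chi2, p_value = allelic_association(
    case_alt=600, case_ref=1400, control_alt=500, control_ref=1500
)
print(f"odds ratio = {odds_ratio:.2f}, chi-square = {chi2:.2f}, p = {p_value:.1e}")
```

In this made-up example the variant allele is more common in cases than in controls, so the odds ratio is above 1. A genome-wide study repeats a comparison of this kind for every variant screened.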

In a landmark study published in 2009, Elliott et al1 used mendelian randomization to evaluate the role of C-reactive protein (CRP) in coronary artery disease.
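The logic of mendelian randomization can be sketched in a few lines of Python. This is a simplified illustration, not the analysis performed by Elliott et al: the function names and all numbers are hypothetical, and the sketch assumes the variant influences disease only through the biomarker. If the biomarker were causal, a variant that raises its level should raise disease risk by a predictable amount; the ratio of the two genetic associations (often called a Wald ratio) then estimates the causal effect.

```python
def expected_variant_effect(variant_on_biomarker, biomarker_on_disease):
    """Variant-disease log odds ratio predicted if the biomarker were causal."""
    return variant_on_biomarker * biomarker_on_disease


def wald_ratio(variant_on_disease, variant_on_biomarker):
    """Causal effect of the biomarker on disease implied by the genetic data."""
    if variant_on_biomarker == 0:
        raise ValueError("the variant must be associated with the biomarker")
    return variant_on_disease / variant_on_biomarker


# Hypothetical inputs, not taken from any study
variant_on_biomarker = 0.20  # change in biomarker level per copy of the variant allele
biomarker_on_disease = 0.30  # observational log odds ratio of disease per unit of biomarker
variant_on_disease = 0.00    # observed log odds ratio of disease per copy of the variant allele

predicted = expected_variant_effect(variant_on_biomarker, biomarker_on_disease)
causal_estimate = wald_ratio(variant_on_disease, variant_on_biomarker)

print(f"predicted variant effect if causal: {predicted:.2f}")
print(f"observed variant effect:            {variant_on_disease:.2f}")
print(f"implied causal effect of biomarker: {causal_estimate:.2f}")
```

With these made-up numbers, the variant raises the biomarker but shows no association with disease, so the implied causal effect is zero. That is the pattern expected of a marker rather than a mediator.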

Here, we review the use of genetic tools in a clinical context, highlighting CRP to illustrate some of the potential uses and limitations of applied genomics in clinical investigation.

NATURE VS NURTURE: AN AGE-OLD DEBATE

The relative contributions of genetic and environmental factors to human health and disease (nature vs nurture) have long been debated, and interest in the question has been renewed in this era of intensive research in molecular genetics.

In the 19th century, Charles Darwin proposed that evolution proceeds through natural selection of variations in inherited traits. His contemporary, Gregor Mendel, showed that traits are inherited in discrete units, later named genes. Just what genes were and how they worked had to await the discovery of the structure of DNA in 1953, by Watson and Crick.2

Since then, progress has accelerated. Advances in recombinant DNA and DNA-sequencing technologies enabled sequencing of the entire human genome only 50 years later. More recently, we have seen automated rapid sequencing, the HapMap project (more on this below), and the advent of genome-wide association studies that uncover genetic variants correlated with or predisposing to common, complex human diseases.

Until recent years, medical genetics was mostly confined to the study of rare syndromes, such as Huntington disease, that are due either to a change in a single gene or to abnormal quantities of large swaths of chromosomes containing many genes. It had little application to most of the common disorders seen by primary care physicians. However, the genes and pathways implicated in rare monogenic disorders have provided key insights into common diseases. For example, defining the genes and mutations underlying familial hypercholesterolemia highlighted the role of low-density lipoprotein cholesterol (LDL-C) in the pathogenesis of atherosclerotic disease.

3.4 BILLION BASE PAIRS, 23,000 GENES

The DNA molecule consists of two strands of the nucleotides guanine (G), cytosine (C), thymine (T), and adenine (A). The human genome contains about 3.4 billion pairs of these nucleotides, called base pairs because G binds to C and A binds to T across the length of the double helix.
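The pairing rule can be expressed as a small lookup table. The Python sketch below is only an illustration; the names `COMPLEMENT`, `complementary_strand`, and `reverse_complement` and the example sequence are hypothetical.

```python
# Pairing rule from the text: G binds to C, and A binds to T
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand):
    """Return the bases that pair with each position of the given strand."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand):
    """Return the partner strand in its own reading direction.

    The two strands of the double helix run in opposite directions,
    so the complement is reversed.
    """
    return complementary_strand(strand)[::-1]


sequence = "GATTACA"  # hypothetical example sequence
print(complementary_strand(sequence))
print(reverse_complement(sequence))
```

Because each base has exactly one partner, knowing the sequence of one strand is enough to reconstruct the other.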

Only about 2% of these 3.4 billion base pairs make up genes, ie, sequences that are transcribed into RNA and then translated into protein. Humans have only about 23,000 genes, fewer than some plant species have.

What about the rest of the human genome, ie, most of it? Previously dismissed as “junk,” these regions likely possess more elusive regulatory functions, controlling gene expression (ultimately, the production of protein), which varies considerably from tissue to tissue and over a person’s lifetime.

It is the orchestration of gene expression over time and cell type that gives the human body its intricate complexity. The study of how all our genes and gene products interact is called genomics and is part of the larger topic of the network of protein interactions (proteomics) and of the integration of various protein pathways (metabolomics).

We are all 99% identical—or 12 million nucleotides different

Human genome sequences are 99% identical across populations. But the remaining 1% is still a big number: there are more than 12 million variants between any two individuals’ genomes. These variants include:

  • Single-nucleotide polymorphisms (SNPs), ie, single-nucleotide changes that are each present in at least 1% of the population
  • Copy number variants (CNVs), ie, a stretch of DNA that is either missing or duplicated
  • Repeating patterns of DNA that vary in the number of repeated sequences.

THE EVOLUTION OF GENOMICS RESEARCH

Much of the initial focus of research in the genomics era consisted of identifying these variants and discovering associations between them and particular human diseases or clinical outcomes. In this way, we uncovered a multitude of potential new biomarkers and therapeutic targets, requiring further investigation into the connection between the DNA variant and the clinical state.

At the close of the 20th century, genetic factors were correlated with human disease by linkage analysis (a method of mapping patterns of markers that congregate in relatively narrow regions of DNA in families with specific diseases) and by candidate gene approaches, in which genes were investigated on the basis of their postulated biology and of previous studies. These techniques were relatively low-yield and cumbersome; years of work uncovered only a handful of genes proven to be associated with diseases.

Newer tools can look at scores of genes linked to common diseases. Researchers now rely on sophisticated DNA sequencing tools and interpretation software to sift masses of data to find meaningful markers (DNA variants or mutations).

Genomics research in the past few years has been mostly hypothesis-independent. Investigators are no longer limited to the small cache of genes whose corresponding proteins are well characterized, but can instead probe the entire genome for connections between our DNA and our physiology.
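One practical consequence of probing the whole genome without a prior hypothesis is that many variants are tested at once, so some will look associated by chance alone. A common safeguard is to tighten the significance threshold in proportion to the number of tests (a Bonferroni correction). The Python sketch below is a minimal illustration with hypothetical variant names and p-values; the function `genome_wide_hits` is not from any library.

```python
def genome_wide_hits(p_values, alpha=0.05):
    """Return the corrected threshold and the variants that pass it.

    p_values maps a variant name to its association p-value.
    The threshold is alpha divided by the number of variants tested.
    """
    if not p_values:
        raise ValueError("at least one variant is required")
    threshold = alpha / len(p_values)
    hits = {name: p for name, p in p_values.items() if p < threshold}
    return threshold, hits


# Hypothetical p-values for four variants
p_values = {
    "variant_A": 1e-9,
    "variant_B": 0.03,  # below 0.05 but not below the corrected threshold
    "variant_C": 0.4,
    "variant_D": 2e-3,
}

threshold, hits = genome_wide_hits(p_values)
print(f"corrected threshold = {threshold}")
print(f"variants passing    = {sorted(hits)}")
```

Applied to one million tests, the same rule gives a threshold of 0.05 / 1,000,000 = 5 × 10⁻⁸, a value commonly used as the bar for genome-wide significance.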