Genetic and Environmental Effects on Gene Expression Signatures of Blood PressureNovelty and Significance
A Transcriptome-Wide Twin Study
Recently, 2 transcriptome-wide studies identified 40 genes that were differentially expressed in relation to blood pressure. However, to what extent these BP-related gene expression signatures and their associations with BP are driven by genetic or environmental factors has not been investigated. In this study of 391 twins (193 twin pairs and 5 singletons; age 55–69 years; 40% male; 57% monozygous) recruited from the Finnish Twin Cohort, transcriptome-wide data on peripheral leukocytes were obtained using the Illumina HT12 V4 array. Our transcriptome-wide analysis identified 1 gene (MOK [MAPK/MAK/MRK overlapping kinase], P=7.16×10−8) with its expression levels associated with systolic BP at the cutoff of false-discovery rate <0.05. This association was further replicated in the Framingham Heart Study (P=1.02×10−5). Out of the 40 genes previously reported, 12 genes could be replicated in the twin cohort with false-discovery rate <0.05 and consistent direction of effect. Univariate twin modeling showed that genetic factors contributed to the expression variations of 12 genes with heritability estimates ranging from 6% to 65%. Bivariate twin modeling showed that 53% of the phenotypic association between systolic BP and MOK expression, and 100% of the phenotypic association of systolic and diastolic BP with CD97 (cluster of differentiation 97), TIPARP (TCDD-inducible poly[ADP-ribose] polymerase), and TPPP3 expression could be explained by genetic factors shared in common. In this study of adult twins, we identified one more gene, MOK, with its expression level associated with BP, and replicated several previously identified signals. Our study further provides new insights into the genetic and environmental sources of BP-related gene expression signatures.
See Editorial Commentary, pp 406–408
High blood pressure (BP) is the leading risk factor for cardiovascular disease worldwide.1 Systolic BP (SBP) and diastolic BP (DBP) are complex physiological traits that are affected by both genetic and environmental factors.2,3 Recently, 2 genome-wide gene expression studies on peripheral leukocytes identified 40 genes that were differentially expressed in relation to BP.4,5 However, the causal nature of these associations remains unclear. In addition to the possibility of 1-way causation, their relationships can also be bidirectional. For example, the cellular processes underlying BP regulation may result in changes in gene expression. In turn, the gene expression or its downstream molecular products may increase the risk of hypertension. It is also possible that common genetic effects account for the observed associations between BP and gene expression. Similar to BP, the expression level of a gene is determined by both genetic and environmental factors,6 although the extent to which the BP-related gene expression signatures are driven by genetic factors has not been investigated. Discovery of a common genetic substrate may point toward a common pathogenic pathway and improve our understanding of the molecular mechanisms underlying BP regulation. In the present study of a transcriptome-wide analysis in leukocytes from 391 adult twins, we first identified new transcriptional signals associated with BP, next replicated previous signals associated with BP, third conducted twin modeling to estimate the heritability of gene expression signatures of BP, and finally assessed whether shared genetic effects play a role in the link between BP and gene expression.
The data that support the findings of this study are available from the corresponding author on reasonable request.
In 2011, a comprehensive questionnaire was sent to all the living twins of the Finnish Twin Cohort born between 1945 and 1957 (11 738 twins). A total of 8501 twins returned the questionnaire with a response rate of 72%.7 Three questions in this questionnaire are related to hypertension: (1) When was your BP last measured? (2) Have you ever been told that you have elevated BP or hypertension? and (3) On how many days in the past year have you used antihypertensive medication? On the basis of the questionnaire, twins were defined as hypertensive if they had been diagnosed with hypertension or had taken antihypertensive medication for at least 2 months, whereas twins were defined as normotensive if they had not been diagnosed with hypertension and did not take antihypertensive medication. On the basis of the replies, 330 same-sex twin pairs free of self-reported previous diagnosis of myocardial infarction, congestive heart failure, or stroke were defined as potentially discordant for hypertension. These twin pairs were phone interviewed and asked to participate in clinical assessments at the University of Helsinki during 2012 to 2015. A total of 222 twin pairs and 3 singletons (n=447) participated in the clinical protocol. Informed consent was obtained from each subject, and the study was approved by the Ethics Committee of the University Central Hospital of Helsinki.
During the clinical testing, the twins had a comprehensive physical examination, and their health history was recorded, including questions again about previous diagnoses of hypertension and use of antihypertensive medication. Resting BPs were measured 4 times during the visit, seated by a sphygmomanometer according to the JNC7 guidelines (Seventh Joint National Committee). The average of the last 2 readings of each measurement occasion was used to represent BP values. Information on medications was further complemented with data from community pharmacies. On the basis of the fact that only 50 twin pairs met the criteria for current discordance for hypertension (1 twin on antihypertensive medication or with SBP ≥140 mm Hg or DBP ≥90 mm Hg and his/her cotwin not on antihypertensive medication and with SBP <120 mm Hg and DBP <80 mm Hg), we included all the twins for the current analysis by using SBP and DBP as continuous variables. If antihypertensive medication was used, 15 and 10 mm Hg were added to the measured SBP and DBP levels, respectively.8 Fasting peripheral blood samples were obtained from 402 participants. Zygosity was determined by genotyping using Illumina HumanCoreExome BeadChip.
RNA Extraction and Transcriptome-Wide Gene Expression Assays
RNA samples were extracted from the peripheral leukocytes stored in the RNA cell protection reagents (Qiagen, Inc, Valencia, CA) using the miRNeasy Mini Kit (Qiagen, Inc). RNA concentration and purity were evaluated on a NanoDrop spectrophotometer 2000 (Thermo Scientific, Inc, Waltham, MA). RNA integrity was evaluated on a Bioanalyzer 2100 (Agilent, Inc, Santa Clara, CA), and samples with a RNA integrity score ≥8 qualified for gene expression analysis. Out of the 402 peripheral blood samples, high-quality RNA was obtained for 391 subjects (116 monozygotic, 77 dizygotic twin pairs, and 5 singletons).
Transcriptome-wide gene expression data were obtained using the Illumina HumanHT-12 v4 Expression BeadChip (Illumina, Inc, San Diego, CA). This chip targets >48 000 probes that provide transcriptome-wide coverage of well-characterized genes, gene candidates, and splice variants. A block design was used to keep the distributions of sex and zygosity similar across chips (12 samples/chip) with the cotwins assigned to the same chip. The twin pairs were randomly assigned to the 12 positions on each chip. The Genome-Studio Gene Expression Module (Illumina) was used for initial quantification and the lumi package9 for data preprocessing and quality control, which included the following key steps: (1) probes with detection P value <0.05 in >50% of the samples were defined as present; (2) log2 transformation and quantile normalization were applied to the gene expression data. There were 19 530 probes that passed the quality control steps, and they were used as indices of gene expression levels in further analyses.
The purposes of our analyses were (1) to identify novel and replicate previously identified differentially expressed genes associated with BP in the Finnish twin cohort4,5; (2) to disentangle genetic and environmental sources of expression variation of the BP-associated genes; and (3) to test whether common genetic or environmental effects account for the observed association between BP and gene expression. We used linear mixed model to answer the first aim and structural equation modeling to answer the other 2 aims of our study.
Linear Mixed Model
In the entire sample treating twins as individuals, mixed-effect linear regression models with intercept varying among chips and varying among twin pairs within chips to account for the block design and the relatedness of twins were performed to examine the relationship between BP (explanatory variables) and gene expressions (outcome variables), adjusting for age, sex, and body mass index. The differentially expressed genes identified for BP with a false-discovery rate (FDR) <0.05 were carried forward for replication in the Framingham Heart Study5 (n=3679, 42% males, aged 51±12). RNA was isolated from whole-blood samples that were collected in PaxGene tubes (PreAnalytiX, Hombrechtikon, Switzerland), and the Affymetrix Exon Array ST 1.0 was used in the Framingham Heart Study. Replication was defined as consistent direction of the β-coefficient and FDR <0.05. We also checked whether we could replicate any of the 40 genes identified by previous studies5,10 that showed association of their expression with BP. Replication was defined successful if the direction of the BP-associated gene expression (β-coefficient) was consistent and FDR <0.05.
Univariate Structural Equation Modeling
To estimate the relative contributions of genetic and environmental factors on the expression variation of the BP-associated genes, structural equation modeling was conducted using the R package OpenMx.11,12 Before analysis, age, sex, body mass index, and chip effects were regressed out, and the gene expression residuals were used in model fitting. The univariate model (Figure [A]) allows separation of the observed phenotypic variance into underlying additive genetic variance (A), common environmental variance shared by a twin pair (C), and unique environmental variance specific to individuals (E). Under this model, we assumed that the monozygotic twins share 100% of their genes, whereas dizygotic twins share 50% of their segregating genes on average. We define shared environmental factors to be those that are the same effects on the cotwins irrespective of zygosity, whereas unique environmental factors are by definition not shared by the cotwins, whether monozygotic or dizygotic. Significance tests of the individual path (A, C, or E) were conducted by constraining paths to zero and applying χ2 test (P<0.05).
Bivariate Structural Equation Modeling
A bivariate Cholesky decomposition was used to model the covariance between gene expression and BP. This model can determine to what extent the covariance can be explained by common genetic or environmental factors. Details of this model have been described previously.13 As shown in Figure [B], similar to the univariate model, the variation of gene expression and the variation of BP were decomposed into A, C, and E variance components. The bivariate model allows determination of the sources of the observed covariance between gene expression and BP by using a sequence of models that testing which paths (a21 for genetic covariance, c21 for shared environmental covariance, or e21 for covariance of unshared environmental effects) can be set to 0. For example, if a21 cannot be set to 0, it allows further determination of the amount of overlap between the genetic factors influencing gene expression and BP by calculating the genetic correlation between the traits: rg=COVA(gene expression and BP)/√(VA gene expression*VA BP). Shared and unique environmental correlations can be calculated in a similar fashion.
Gene set enrichment analysis14,15 was used to identify sets of genes representing biological processes and pathways associated with gene expression changes associated with BP. Gene set enrichment analysis was performed on an unfiltered ranked lists of genes (ranked by the P values), and a running-sum statistic was used to determine the enrichment of a prior defined gene sets (pathways) based on the gene ranks. Statistical significance of pathway enrichment score was ascertained by permutation testing over size-matched random gene sets, and multiple testing was controlled by FDR. An FDR <0.05 was considered as significant. We tested for enrichment of any gene ontology biological processes or KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways.
The general characteristics of participants are presented for all and by zygosity are presented in Table 1. We identified 1 gene (MOK [MAPK/MAK/MRK overlapping kinase]) that was differentially expressed in relation to SBP (Table 2) at FDR <0.05. For this gene, higher expression was associated with higher SBP levels. Similar association of this gene with DBP was also observed with a nominal P value of 1.34×10−5. Limiting the analysis to participants with no antihypertensive medication revealed the same results with higher MOK expression associated with higher BP levels (P=0.02 and 0.04 for SBP and DBP, respectively). The associations of MOK expression with SBP and DBP were further replicated in the Framingham Heart Study (P=1.02×10−5 and 6.11×10−4, respectively; Table 2). We did not identify any signals surviving multiple testing for DBP in the Finnish Twin Cohort.
Out of the 40 genes of which their expression levels were previously reported to be associated with BP, 12 genes could be replicated with consistent direction of effects and FDR <0.05 in our twin cohort. Table 3 lists the association of these 12 genes with both SBP and DBP in our cohort, although the original transcriptome-wide signals for certain genes were reported only for SBP (TAGLN2 and TAGAP) or DBP (S100A10). The higher expression levels of these 12 genes were associated with higher BP levels.
Twin correlations for the gene expression levels of the newly identified gene (MOK) and the 12 previously reported genes are shown in Table 4. With the exception of MYADM and TIPARP (TCDD-inducible poly[ADP-ribose] polymerase), twin correlations in monozygotic twin pairs were larger than those in dizygotic twin pairs, indicating genetic influences. For the expression of 7 genes including CRIP1, F12, LMNA, MOK, S100A10, TAGAP, and TSC22D3, the best fitting models are AE models, with heritability estimates ranging from 0.27 to 0.65. The remaining part of the variation for the expression of these 7 genes is explained by environmental influences that are unique to the individual. For the expression of MYADM, the best fitting model is the CE model, with the familiar aggregation best explained by the shared environmental factor that explains 47% of the expression variation of the MYADM gene. For the expression of the other 5 genes including CD97 (cluster of differentiation 97), SLC31A2, TAGLN2, TIPARP, and TPPP3, the best fitting models are ACE models. For these genes, the model assuming either the absence of A component or absence of C component fitted the data well, but the model assuming the absence of both components fitted the data significantly worse. This indicates that the expression of these genes did show familial aggregation, either because of genetic or because of shared environment influences, and their variations in the population could not be explained by unique environmental effects alone.
Next, we estimated the relative contributions of genetic and environmental factors to the association between gene expressions and BP. Because of the fact that the AE model has generally been the best fitting model in previous twin studies of BP,2,16 which was again confirmed in the current study (heritability was 45% for SBP and 40% for DBP, respectively), the bivariate modeling was only conducted on those genes with expression levels determined by ACE or AE models. The results are listed in Table 5. For the association of MOK expression with SBP, both a21 and e21 could not be set to 0, indicating the phenotypic correlation is determined by both shared genetic factors and shared unique environment with common genetic factors accounting for 53% of the phenotypic correlation. For the association of LMNA, SLC31A2, and TSC22D3, expression with BP and the association of TAGLN2 expression with SBP, a21 could but e21 could not be set to 0, indicating the phenotypic correlation is completely determined by shared unique environmental factors. On the contrary, the phenotypic correlations of CD97, TIPARP, and TPPP3 expression with both SBP and DBP were completely determined by shared genetic factors. For the associations of the other gene expressions with BP, the model assuming either the absence of a21 component or e21 component fitted the data well, but the model assuming the absence of both components fitted the data significantly worse, indicating a larger sample size is needed to increase the power to distinguish the relative contributions of genetic and environmental factors to the observed phenotypic correlations.
The pathway analyses yielded significant (FDR <0.05) enrichment of 16 KEGG and gene ontology biology process pathways for SBP-related gene expression changes and 18 for DBP-related gene expression changes (with 10 pathways overlapped between SBP and DBP) in peripheral leukocytes (Table S1 in the online-only Data Supplement). These pathways represent processes involved in inflammatory pathways (eg, antigen processing and presentation, innate immune response, defense response to virus, defense response to other organism, regulation of type I interferon, and mediated signaling pathway) and autoimmune response (eg, systemic lupus erythematosus, type 1 diabetes mellitus, allograft rejection, graft versus host disease, and autoimmune thyroid disease).
In this study of adult twins, we identified one more gene with its expression level associated with BP and replicated 12 previously identified signals. The newly identified signal is MOK gene, which is also known as RAGE-1 (renal tumor antigen-1). It was first identified in a renal carcinoma cell line17 and has been considered as a tumor-associated antigen for its wide expression in various tumors including renal carcinoma, melanoma, head and neck cancer, mesothelioma, hepatocellular carcinoma, and leukemia. Later studies found that it was also widely expressed in the cytoplasm of normal tissues including blood leukocytes.18 Although it is suspected that the signaling kinase that belongs to the MAPK (mitogen-activated kinase-like protein) superfamily may have important biological functions in different physiological processes and in disease, its biological roles are still largely unknown. We did not find literature support for a direct role of MOK in BP regulation. On the basis of the findings by Oehlrich et al19 that MOK could be recognized by cytotoxic T lymphocytes to induce immune response and its participation in the inflammation response in microglia by extracellular stimulation,20 we speculate that MOK may participate in BP regulation through inflammation pathway, a mechanism that has been implicated in the development of hypertension. Further experimental validation is needed.
As opposed to the instructions encoded in the genome sequence, actual gene expression (ie, the transcriptome) is tissue specific. The study of the transcriptome in disease requires focusing on the specific tissues involved in the pathogenesis of this disease. In addition to vasculature, kidney, adrenals, and central nervous system, which have long been implicated in the regulation of BP, recent evidence has also pointed to the involvement of oxidative stress and low-grade inflammation in the pathophysiology of hypertension and suggests that the immune system might be the missing mechanism that links these various organs.21 Both our study and the study from Huan et al5 obtained transcriptome-wide gene expression data from peripheral leukocytes, and the functional analysis of the differentially expressed BP genes from both studies yielded significant enrichment of pathways representing processes involved in inflammation and immune response, providing another strong piece of evidence supporting the involvement of inflammation pathways in the pathogenesis of hypertension. One previous study10 conducted a transcriptome profiling on kidney and identified 14 differentially expressed genes in renal medulla and 46 differentially expressed genes in renal cortex between 15 hypertensive patients and 7 normotensive controls. Out of these 60 genes, the expression data of 48 are available from the Illumina HumanHT-12 v4 Expression BeadChip. However, none of the associations observed in renal tissue could be replicated in the current study using peripheral leukocytes. In addition to the difference in tissues, this previous study was based on a small sample size and lacked a replication cohort. Nevertheless, we think that using similar approaches (large discovery panel plus replication) in tissues such as the kidney, which are potentially more relevant for BP other than peripheral leukocytes (ie, the most accessible cells), is a promising approach and may yield additional differentially expressed BP signature genes.
Another difference between the transcriptome and the genome is the plasticity of the transcriptome, that is, the findings may reflect the consequence rather than the cause of the disease. Theoretically, the cross-sectional association between BP and gene expression observed in the current study and the previous study can be explained in 4 ways. First, higher BP induces the changes in gene expression. Second, changes in gene expression induce higher BP. Third, the effect can be bidirectional. The cellular processes underlying BP regulation may change gene expression, whereas the latter or its downstream molecular products, in turn, may increase the risk of hypertension. Fourth, there could be a underlying factor that influences both BP levels and gene expressions profiles. The twin design of the current study allowed us to explore the last explanation. We first quantified the genetic and environmental sources of the variations of the 13 (1 novel and 12 known) BP-related gene expression signatures and observed that genetic factors contributed to the variance in expression of 12 out of the 13 genes, with the exception of MYADM. The heritability estimates of the 12 genes ranged from 6% to 65%. A genetic contribution was not identified for MYADM. The familial aggregation of its expression level was mainly determined by shared environmental factors. This is not uncommon, and a previous study6 has shown that as much as 32% of transcripts from human lymphoblastoid cell lines have a shared environmental component that explain >30% of the total variance. Because the previous twin studies2,16 and our current study did not find evidence for influence of shared environmental factors on BP variation, we examined the common pathology hypothesis for BP and the 12 genes (CD97, CRIP1, F12, LMNA, MOK, S100A10, SLC31A2, TAGAP, TAGLN2, TIPARP, TPPP3, and TSC22D3) with their expression variation determined at least partially by genetic factors. Significant genetic correlations were observed between SBP and MOK expression, as well as SBP/DBP and CD97, TIPARP, and TPPP3 gene expression. Up to 53% of the phenotypic association between SBP and MOK gene expression can be explained by common genetic factors, whereas this was 100% for CD97, TIPARP, and TPPP3 with both SBP and DBP. These results indicate that shared genes did contribute substantially to the association of BP with expression of certain genes (CD97, MOK, TIPARP, and TPP3 [tubulin polymerization-promoting protein family member 3]).
The identification of common genetic effects accounting for the association between BP and gene expression has important implications for gene-finding studies. In the current gene-finding efforts for BP, univariate analysis has been conducted with BP as the phenotype of interest. Even with the most recent findings from 3 large-scale genome-wide association studies22–24 in >300 000 subjects, which have expanded the list of genetic loci for BP to almost 400, adding all the loci together only explains 3% to 4% of BP variance, although the heritability calculated from all genotyped and imputed single-nucleotide polymorphisms is ≈17%.25 With the identification of common genetic effects, bivariate analysis with both BP and gene expression as the phenotypes of interests can be applied to identify the shared genetic variants. Such an analysis strategy might not only help to identify more BP-related loci but also has the potential to shed light on the underlying mechanisms.
Our study has several limitations that need to be recognized. First, the study was designed to have an oversampling of twins from pairs discordant for essential hypertension. However, the univariate twin modeling on BP observed results similar to previous twin studies with twins recruited from the general population, indicating this specific design did not lead to large biases in heritability estimates. Second, our study is cross-sectional, thus limited in the ability to discern the temporal order between BP and gene expression. However, the identification of common genetic precursors for BP and expression of CD97, MOK, TIPARP, and TPPP3 genes renders a cause–effect relationship for these genes unlikely. Third, the study only included white adults, thus one should be cautious extending our results to other ethnic or age groups. Fourth, although this study included almost 400 twins, an even larger sample size is required to tease out the relative contribution of genetic or environmental factor to the associations of BP with gene expression. Fifth, transcriptome data were obtained using traditional chip technology in this study. In comparison with the current state-of-the-art RNA-seq approach, some low abundance transcripts might have been missed.
In conclusion, we identified one more gene with its expression level associated with BP and replicated several previously identified signals. Our study further provides new insights into the genetic and environmental sources of BP-related gene expression signatures and their associations with BP. The identification of common genetic effects between BP and gene expression of CD97, MOK, TIPARP, and TPPP3 will aid in future BP gene-finding efforts and may ultimately point toward more effective prevention and treatment of hypertension.
Sources of Funding
This study was supported by the National Institutes of Health/National Heart, Lung, and Blood Institute (grant HL104125). J. Kaprio has been supported by the Academy of Finland (grant numbers 265240 and 263278).
The online-only Data Supplement is available with this article at http://hyper.ahajournals.org/lookup/suppl/doi:10.1161/HYPERTENSIONAHA.117.10527/-/DC1.
- Received October 25, 2017.
- Revision received November 12, 2017.
- Accepted December 1, 2017.
- © 2018 American Heart Association, Inc.
- Hao G,
- Wang X,
- Treiber FA,
- Harshfield G,
- Kapuku G,
- Su S
- Zeller T,
- Schurmann C,
- Schramm K,
- et al
- Marques FZ,
- Campain AE,
- Tomaszewski M,
- Zukowska-Szczechowska E,
- Yang YH,
- Charchar FJ,
- Morris BJ
- Neale MC,
- Hunter MD,
- Pritikin JN,
- Zahery M,
- Brick TR,
- Kirkpatrick RM,
- Estabrook R,
- Bates TC,
- Maes HH,
- Boker SM
- Wang X,
- Trivedi R,
- Treiber F,
- Snieder H
- Subramanian A,
- Tamayo P,
- Mootha VK,
- et al
- Leal-Lasarte MM,
- Franco JM,
- Labrador-Garrido A,
- Pozo D,
- Roodveldt C
- Nosalski R,
- McGinnigle E,
- Siedlinski M,
- Guzik TJ
- Warren HR,
- Evangelou E,
- Cabrera CP,
- et al
- Wain LV,
- Vaez A,
- Jansen R,
- et al
Novelty and Significance
What Is New?
Identify one more gene with its expression associated with blood pressure.
Discover that genetic factors contribute to the expression variations of 12 blood pressure (BP)–associated genes.
Discover that common genetic effects account for the observed associations between BP and the expression of 4 genes.
What Is Relevant?
Discovery of common genetic factors may point toward a common pathogenic pathway and has the potential to improve our understanding of the molecular mechanisms underlying BP regulation.
In this study of adult twins, 391 twins were recruited from the Finnish Twin Cohort. Their gene expression data in peripheral leukocytes were obtained using Illumina HT12 V4 array. Their comprehensive physical examination and health history as well as blood pressure were measured during clinical visit. Linear mixed model was used to identify the differentially expressed genes associated with blood pressure. Univariate structural equation modeling and bivariate structural equation modeling were used to further investigate to what extent the genetic factor and environmental factor influence gene expression and blood pressure. Our study identified one more gene (MOK [MAPK/MAK/MRK overlapping kinase]) with its expression level associated with BP and replicated 12 previously identified signals. Univariate twin modeling showed that genetic factors contributed to the expression variations of 12 genes with heritability estimates ranging from 6% to 65%. Bivariate twin modeling showed that 53% of the phenotypic association between systolic BP and MOK expression, and 100% of the phenotypic association of systolic and diastolic BP with CD97 (cluster of differentiation 97), TIPARP (TCDD-inducible poly[ADP-ribose] polymerase), and TPPP3 expression could be explained by genetic factors shared in common. Our study provides new insights into the genetic and environmental sources of BP-related gene expression signatures.