docs/simulation.rst
+89-8
@@ -15,9 +15,18 @@ To simulate data, please first create a directory to store the data:
15
15
16
16
``mkdir sim``
17
17
18
-
Now, we are going to simulate data for 3000 families genotyped at 1000 independent SNPs. We are going to simulate 20 generations of assortative mating with parental phenotype correlation 0.5.
18
+
Now, we are going to simulate data for 3000 families, each with two full-siblings, genotyped at 1000 independent SNPs.
19
+
We simulate a phenotype affected by direct genetic effects and assortative mating.
20
+
We are going to simulate 20 generations of assortative mating with parental phenotype correlation 0.5, reaching an approximate equilibrium.
It outputs the PGS to a :ref:`PGS file <pgs_file>`: direct_v1.pgs.txt.
98
+
It outputs the PGS to a :ref:`PGS file <pgs_file>`: direct_v1.pgs.txt. (Notice also that the inferred
99
+
correlation between parents' PGSs is lower than when using the true direct genetic effects as weights due to
100
+
estimation error in the weights.)
90
101
91
102
To estimate direct effect and average NTC of the PGS, use the following command:
92
103
@@ -100,12 +111,82 @@ from noisy weights (in direct_v1.1.effects.txt) will be smaller than the populat
100
111
This is because the PGS does not capture all of the heritability due to estimation error in the weights.
101
112
The PGS has its population effect inflated (relative to its
102
113
direct effect) by assortative mating, which induces a correlation of the PGS with the component of the heritability
103
-
not captured by the PGS due to estimation error. This inflation is not captured by the direct effect of the PGS
104
-
because chromosomes segregate independently during meiosis. (In this simulation, all causal SNPs segregate independently.)
105
-
Here, the ratio between direct and population effects of the PGS should be around 0.87.
114
+
not directly captured by the PGS due to estimation error. This inflation is not captured by the direct effect of the PGS
115
+
because the within-family variation used to estimate the direct effect is due to the random segregation of genetic material during meiosis.
116
+
Here, the ratio between direct and population effects of the PGS should be around 0.86.
106
117
107
118
One should also observe a statistically significant average parental NTC (in direct_v1.2.effects.txt) of the PGS from
108
119
the two-generation model despite the absence of parental indirect genetic effects in this simulation. Here,
109
120
the ratio between the average NTC and the direct effect should be around 0.15. This demonstrates
110
121
that statistically significant average NTC estimates cannot be interpreted as demonstrating
111
-
parental indirect genetic effects, especially for phenotypes affected by assortative mating.
122
+
parental indirect genetic effects, especially for phenotypes affected by assortative mating.
123
+
124
+
Adjusting for assortative mating
125
+
--------------------------------
126
+
127
+
We now show how to adjust two-generation PGI results for assortative mating
128
+
using the procedure outlined in `Estimation of indirect genetic effects and heritability under assortative mating <https://www.biorxiv.org/content/10.1101/2023.07.10.548458v1.abstract>`_.
129
+
The estimation procedure is summarized in this diagram:
130
+
131
+
.. image:: two_gen_estimation.png
132
+
   :scale: 30 %
133
+
   :align: center
134
+
   :alt: Two-generation estimation procedure accounting for assortative mating
135
+
136
+
The estimation requires as inputs: an estimate of the correlation between parents' scores, :math:`r_k`;
137
+
the regression coefficients from two-generation PGI analysis, (:math:`\delta_{\text{PGI}:k},\alpha_{\text{PGI}:k}`);
138
+
and a heritability estimate, :math:`h^2_f`, from MZ-DZ twin comparisons, `RDR <https://www.nature.com/articles/s41588-018-0178-9>`_, or sib-regression.
139
+
140
+
The estimation procedure outputs estimates of: :math:`k`, the fraction of heritability the PGI would explain in a random mating population;
141
+
:math:`r_\delta`, the correlation between parents' true direct genetic effect components;
142
+
:math:`h^2_\text{eq}`, the equilibrium heritability, adjusting for the downward bias in heritability estimates from
143
+
MZ-DZ comparisons, RDR, and sib-regression;
144
+
:math:`\alpha_\delta`, the indirect genetic effect of the true direct genetic effect PGI;
145
+
and :math:`v_{\eta:\delta}`, the fraction of phenotypic variance contributed by the indirect genetic effect component
146
+
that is correlated with the direct genetic effect component.
147
+
148
+
We can use *snipar* to compute the two-generation PGI estimates and the correlation between parents' scores,
149
+
and we can input a heritability estimate into the *pgs.py* script to complete the inputs, so that
150
+
*snipar* will perform the two-generation analysis adjusting for assortative mating.
151
+
152
+
To perform the estimation, we will combine the offspring and parental genotype files.
153
+
This enables us to estimate the correlation between parents' scores
154
+
using the observed parental genotypes. (This is better than using the sibling
155
+
genotypes because the correlation estimate from observed parental genotypes is uncorrelated with the PGS regression coefficients.)
This script will take the input heritability estimate (0.42) and the standard error of the estimate (here 0 since we used the true value)
185
+
to estimate the fraction of heritability the PGI would explain in a random mating population,
186
+
:math:`k`, which should be around 0.5; the correlation between parents' direct genetic effect components, :math:`r_\delta`,
187
+
which should be around 0.29; the equilibrium heritability, :math:`h^2_\text{eq}`, which should be around 0.59;
188
+
the ratio between direct and population effects that would be expected based on assortative mating alone, :math:`\rho_k`,
189
+
which should be around 0.86; the indirect genetic effect of the true direct genetic effect PGI, :math:`\alpha_\delta`, which should not be
190
+
statistically significantly different from zero (with high probability) because there are no parental indirect genetic effects in this simulation;
191
+
and :math:`v_{\eta:\delta}`, the contribution to the phenotypic variance from the indirect genetic effect component correlated with the direct genetic effect component,
192
+
which should also be statistically indistinguishable from zero (with high probability). These estimates are output to direct_v1_obs.am_adj_pars.txt.
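
As a rough consistency check on the values quoted above, the sketch below recovers the equilibrium heritability and the correlation between parents' direct genetic effect components from the input heritability (0.42) and the simulated parental phenotype correlation (0.5). It is not part of *snipar* and does not reproduce what the script does, which works from the PGI regression coefficients and the correlation between parents' scores. It assumes two equilibrium relationships that are not stated in this document: :math:`r_\delta = r h^2_\text{eq}` and :math:`h^2_f = (1 - r_\delta) h^2_\text{eq}`, where :math:`r` is the parental phenotype correlation. The function name ``equilibrium_from_h2f`` is hypothetical.

```python
import math


def equilibrium_from_h2f(h2_f, r):
    """Solve h2_f = (1 - r * h2_eq) * h2_eq for h2_eq.

    Both relationships used here are assumptions made for illustration:
      r_delta = r * h2_eq            (parental correlation of direct effect components)
      h2_f    = (1 - r_delta) * h2_eq
    Returns (h2_eq, r_delta).
    """
    if r == 0:
        # Random mating: no adjustment is needed under these assumptions.
        return h2_f, 0.0
    # Quadratic in h2_eq: r * h2_eq**2 - h2_eq + h2_f = 0
    disc = 1.0 - 4.0 * r * h2_f
    if disc < 0:
        raise ValueError("no real solution for these inputs")
    # Take the smaller root, which tends to h2_f as r tends to 0.
    h2_eq = (1.0 - math.sqrt(disc)) / (2.0 * r)
    r_delta = r * h2_eq
    return h2_eq, r_delta


# Values used in this simulation: h2_f = 0.42, parental phenotype correlation 0.5.
h2_eq, r_delta = equilibrium_from_h2f(0.42, 0.5)
print(f"h2_eq ~ {h2_eq:.2f}, r_delta ~ {r_delta:.2f}")
```

Under these assumptions the check gives values of about 0.60 and 0.30, close to the approximate 0.59 and 0.29 quoted above. Small differences are expected because the simulation only reaches an approximate equilibrium and the script's outputs are estimates.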