Two basic questions on coloc: (1) only P-value and MAF? (2) why proteome-wide MR still needs eQTL coloc? #175
Comments
1. No, beta and varbeta give a more accurate test.
2. If you are using single-variant MR, there is a possibility that your result is due to LD between your pQTL and the outcome's causal variant. Coloc would give H3 in this case.
On Wednesday, November 13, 2024 4:11:18 AM, Jie Huang wrote:
Hi, guys:
I have two fundamental questions on colocalization.
1. Below is Figure 1 of the original coloc paper. So, only the P-value is considered for the Bayesian test, not BETA and Variance?
image.png (view on web)<https://github.com/user-attachments/assets/59fb633e-86f2-45f7-b9e8-9c5b5cc461e7>
2. For proteome-wide analysis, we are already studying a protein (not an exposure such as BMI). Once we found a causal effect from a protein to an outcome through traditional MR, why do you still need a coloc analysis to test whether some eQTL is involved with this "protein --> outcome" relationship?
Your clarification/teaching would be greatly appreciated!
Jie
Dear Chris:
Thank you very much!
1. I would imagine that BETA and SE would of course offer something more than the P-value alone. But at least Figure 1 of your 2014 paper did not imply that BETA/SE is needed, correct? I just looked at your GitHub code, pasted below, and I did not see BETA/SE there.
image.png (view on web) <https://github.com/user-attachments/assets/c6ecb574-46d7-4a52-89a5-3f579ae75b73>
2. I feel that in population genetics LD could be blamed for everything while COLOC could save everything :-). In my view, coloc is basically like checking whether two kids have similar daily regimens (e.g., the time of getting up, taking the school bus, watching TV, walking the dog, etc.), in order to determine whether they were born to the same parents or at least live in the same neighbourhood.
For proteome-wide MR, people usually do NOT use a single variant as the instrumental variable. Nevertheless, a Lancet 2012 paper did use a single variant within the LIPG gene (https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(12)60312-2/fulltext) to test the causality of HDL on myocardial infarction (MI). So, you mean this type of MR could be confounded by LD?
Now if I run coloc on LIPG pQTL --> MI, I am testing H3. Instead, if I run coloc on LIPG eQTL --> MI, I am testing H4? I feel this is hard to understand, if it is true. After all, pQTL is the downstream product of eQTL. Furthermore, pQTL is more accurate than eQTL, because eQTL is fake data (from a remote GTEx project) while pQTL is real data (measured on the same individuals as the disease phenotype study).
Your clarification/teaching would be greatly appreciated!
Best regards,
Jie
1. If you give just p-values and MAF, we try to back-calculate the (unsigned) beta and SE. So beta/SE are not needed, but they are better.
2. I think what you describe is like correlation. Coloc is based on fine-mapping methods originally derived here: https://pmc.ncbi.nlm.nih.gov/articles/PMC3791416/. It is about causality: it tests whether two traits share a causal variant (H4) or have different causal variants (H3). It does not attempt inference on whether one trait causes the other.
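To make the H3 versus H4 distinction concrete, here is a minimal Python sketch of how per-SNP log approximate Bayes factors (lABFs) for two traits can be combined into posterior probabilities for H0 to H4 under a single-causal-variant-per-trait assumption. It is an illustration only, not code from the coloc package: the function names are hypothetical, the prior values `p1`, `p2` and `p12` are assumed for the example, and the lABF inputs are made up.

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def logdiffexp(a, b):
    """log(exp(a) - exp(b)); requires a > b."""
    return a + math.log1p(-math.exp(b - a))

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] from per-SNP lABFs.

    labf1, labf2: lABFs for trait 1 and trait 2 over the same SNPs
    (at least two SNPs, otherwise H3 is undefined).
    p1, p2, p12: assumed prior probabilities that a SNP is causal for
    trait 1 only, trait 2 only, or both.
    """
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = logsumexp(labf1), logsumexp(labf2), logsumexp(both)
    log_h = [
        0.0,                                                     # H0: no association
        math.log(p1) + s1,                                       # H1: trait 1 only
        math.log(p2) + s2,                                       # H2: trait 2 only
        math.log(p1) + math.log(p2) + logdiffexp(s1 + s2, s12),  # H3: two distinct variants
        math.log(p12) + s12,                                     # H4: one shared variant
    ]
    norm = logsumexp(log_h)
    return [math.exp(x - norm) for x in log_h]

# Made-up region of five SNPs. Both traits peak at the same SNP:
same = coloc_posteriors([0, 20, 0, 0, 0], [0, 20, 0, 0, 0])
# The traits peak at different SNPs (e.g. two signals that are merely nearby):
diff = coloc_posteriors([0, 20, 0, 0, 0], [0, 0, 0, 20, 0])
```

In this toy input the shared peak puts almost all posterior mass on H4, while the separated peaks put it on H3. Neither result says anything about whether one trait causes the other.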
Dear Chris: Thank you very much again for clarification!
Best regards,
Confirmed. As I said, we can use p-values, but beta and SE (z and V) are more accurate.
On 14/11/2024 15:25, Jie Huang wrote:
Dear Chris:
Thank you very much again for clarification!
1. Can you please confirm that https://github.com/chr1swallace/coloc/blob/main/R/claudia.R is the source code that runs when I call coloc.abf? I did see both approx.bf.p and approx.bf.estimates. The former uses P while the latter uses z and V.
2. The example I gave does NOT describe correlation, but causation. Based on two kids' daily regimens, I am testing whether a common parent / family caused them. I am not testing whether one child is correlated with the other.
Best regards,
Jie
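The difference between the two routes discussed above (P-value plus MAF versus z and V) can be sketched in Python using Wakefield's approximate Bayes factor. This is an illustrative sketch, not a port of `approx.bf.p` or `approx.bf.estimates`: the function names are hypothetical, the prior standard deviation of 0.15 is an assumed value, and the variance formula assumes a quantitative trait scaled to unit variance.

```python
import math
from statistics import NormalDist

def labf_from_z(z, V, sd_prior=0.15):
    """Wakefield log ABF from a z-score and the variance V of the effect estimate."""
    r = sd_prior ** 2 / (sd_prior ** 2 + V)  # shrinkage factor
    return 0.5 * (math.log(1 - r) + r * z ** 2)

def labf_from_estimates(beta, varbeta, sd_prior=0.15):
    """Route 1: beta and its variance are supplied directly."""
    z = beta / math.sqrt(varbeta)
    return labf_from_z(z, varbeta, sd_prior)

def labf_from_p(p, maf, n, sd_prior=0.15):
    """Route 2: only a two-sided p-value, MAF and sample size are supplied.

    The z-score is recovered without its sign, and V is approximated
    from MAF and n (assumption: quantitative trait with unit variance).
    """
    z = -NormalDist().inv_cdf(p / 2)        # unsigned z-score
    V = 1.0 / (2 * n * maf * (1 - maf))     # approximate variance of beta
    return labf_from_z(z, V, sd_prior)

# Made-up example: when the true variance happens to equal the
# MAF-based approximation, both routes give the same lABF.
n, maf = 10_000, 0.3
V = 1.0 / (2 * n * maf * (1 - maf))
beta = 0.08
z = beta / math.sqrt(V)
p = math.erfc(abs(z) / math.sqrt(2))        # two-sided p-value for z
labf_est = labf_from_estimates(beta, V)
labf_p = labf_from_p(p, maf, n)
```

The two routes agree here only because V was constructed to match the approximation. With real data the reported standard error generally differs from the MAF-based guess, which is one reason supplying beta and its variance is more accurate. Note also that the lABF depends on z only through z squared, so losing the sign of beta does not change this quantity.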