Two basic questions on coloc: (1) only P-value and MAF? (2) why proteome-wide MR still needs eQTL coloc? #175

Open
jielab opened this issue Nov 13, 2024 · 5 comments

jielab commented Nov 13, 2024

Hi, guys:

I have two fundamental questions about colocalization.

  1. Below is Figure 1 of the original coloc paper. So, only the P-value is considered in the Bayesian test, not BETA and its variance?
    [image: Figure 1 of the original coloc paper]

  2. For proteome-wide analysis, we are already studying a protein (not an exposure such as BMI). Once traditional MR has found a causal effect of a protein on an outcome, why is a coloc analysis still needed to test whether some eQTL is involved in this "protein --> outcome" relationship?

Your clarification/teaching would be greatly appreciated!

Jie

chr1swallace (Owner) commented Nov 13, 2024 via email

jielab (Author) commented Nov 13, 2024

Dear Chris:

Thank you very much!

1. I would imagine that BETA and SE would of course offer something more than the P-value alone. But at least Figure 1 of your 2014 paper did not imply that BETA/SE is needed, correct?

I just looked at your GitHub code, pasted below, and I did not see BETA/SE there.

[image: screenshot of the coloc source code]

2. I feel that in population genetics LD can be blamed for everything while COLOC can save everything :-). In my view, coloc is basically like checking whether two kids have similar daily regimens (e.g., the time of getting up, taking the school bus, watching TV, walking the dog, etc.) in order to determine whether they were born to the same parents or at least live in the same neighbourhood.

For proteome-wide MR, people usually do NOT use a single variant as the instrumental variable. Nevertheless, a Lancet 2012 paper did use a single variant within the LIPG gene (https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(12)60312-2/fulltext) to test the causality of HDL on myocardial infarction (MI). So, do you mean that this type of MR could be confounded by LD?

Now if I run coloc on LIPG pQTL --> MI, I am testing H3. If I instead run coloc on LIPG eQTL --> MI, am I testing H4? If so, I find this hard to understand. After all, pQTL is the downstream product of eQTL. Furthermore, pQTL is more accurate than eQTL, because eQTL is fake data (from the remote GTEx project) while pQTL is real data (measured on the same individuals as the disease phenotype study).
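To make H3 and H4 concrete: in the coloc framework a single run weighs all five hypotheses (H0 to H4) at once, so it is not that one dataset "tests H3" and another "tests H4". Below is a minimal Python sketch, not the coloc source (which is R), of how per-variant log approximate Bayes factors for two traits could be combined into posterior probabilities, following the formulas in the 2014 coloc paper. The function names are hypothetical, and the priors p1 = 1e-4, p2 = 1e-4, p12 = 1e-5 are assumed here for illustration.

```python
import math

def logsumexp(xs):
    """log(sum(exp(x))) computed stably."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def logdiffexp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    d = -math.expm1(b - a)  # equals 1 - exp(b - a)
    if d <= 0.0:
        return float("-inf")
    return a + math.log(d)

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] from per-variant log ABFs.

    labf1, labf2: log approximate Bayes factors for the same variants,
    in the same order, for trait 1 and trait 2.
    """
    s1 = logsumexp(labf1)
    s2 = logsumexp(labf2)
    s12 = logsumexp([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                     # H0: no association
        math.log(p1) + s1,                                       # H1: trait 1 only
        math.log(p2) + s2,                                       # H2: trait 2 only
        math.log(p1) + math.log(p2) + logdiffexp(s1 + s2, s12),  # H3: two distinct causal variants
        math.log(p12) + s12,                                     # H4: one shared causal variant
    ]
    total = logsumexp(log_h)
    return [math.exp(h - total) for h in log_h]

# Both traits peak at the same variant: most posterior mass goes to H4.
same_peak = coloc_posteriors([20.0, 0.0, 0.0], [20.0, 0.0, 0.0])
# The traits peak at different variants: most posterior mass goes to H3.
different_peaks = coloc_posteriors([20.0, 0.0, 0.0], [0.0, 20.0, 0.0])
```

The H3 term subtracts the "same variant" configurations from all pairs of variants, which is why two strong signals at different variants in LD can look like support for MR while coloc assigns them to H3.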

Your clarification/teaching would be greatly appreciated!

Best regards,
Jie

chr1swallace (Owner) commented Nov 13, 2024 via email

jielab (Author) commented Nov 14, 2024

Dear Chris:

Thank you very much again for clarification!

  1. Can you please confirm that https://github.com/chr1swallace/coloc/blob/main/R/claudia.R is the source code that runs when I call coloc.abf? I did see both approx.bf.p and approx.bf.estimates. The former uses P while the latter uses z and V.

  2. The example I gave does NOT describe correlation, but causation. Based on the two kids' daily regimens, I am testing whether a common parent / family caused them. I am not testing whether one kid is correlated with the other.
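On point 1, a minimal Python sketch of Wakefield's approximate Bayes factor may help show why a P-value plus MAF and sample size can stand in for BETA and its variance. This is not the coloc source (which is R). The function names, the prior standard deviation of 0.15, and the approximation V = 1 / (2 N f (1 - f)) for a quantitative trait with unit variance are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def log_abf(z, V, prior_sd=0.15):
    """Wakefield log approximate Bayes factor from a z-score and the
    variance V of the effect estimate (prior_sd is an assumed value)."""
    r = prior_sd ** 2 / (prior_sd ** 2 + V)
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)

def log_abf_from_estimates(beta, se, prior_sd=0.15):
    """Route 1: BETA and SE are available, so z and V follow directly."""
    return log_abf(beta / se, se ** 2, prior_sd)

def approx_variance_quant(maf, n):
    """Assumed approximation of var(beta) for a unit-variance quantitative trait."""
    return 1.0 / (2.0 * n * maf * (1.0 - maf))

def log_abf_from_p(p, maf, n, prior_sd=0.15):
    """Route 2: only a two-sided P-value, MAF and sample size are available.
    |z| is recovered from P, and V is approximated from MAF and N."""
    z = -NormalDist().inv_cdf(p / 2.0)
    return log_abf(z, approx_variance_quant(maf, n), prior_sd)

# Example: a variant with MAF 0.25 in 5000 samples and a z-score of 5.
se = math.sqrt(approx_variance_quant(0.25, 5000))
beta = 5.0 * se
p = 2.0 * NormalDist().cdf(-5.0)
from_estimates = log_abf_from_estimates(beta, se)
from_p = log_abf_from_p(p, 0.25, 5000)
```

Both routes feed the same formula, and the Bayes factor depends on z only through z squared, so the sign of BETA is not needed. The two routes agree only to the extent that the MAF-based approximation of V matches the reported SE squared.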

Best regards,
Jie

chr1swallace (Owner) commented Nov 14, 2024 via email
