Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add offsides data from nsides #110

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

apoorvasrinivasan26
Copy link
Contributor

No description provided.

description: Standard error of the PRR estimate
type: continuous
names:
- Proportional reporting ratio error
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think this is something the model should be able to predict? (I'm just curious)

@kjappelbaum
Copy link
Collaborator

Overall, looks quite good to me. Your PR does not cause the pre-commit errors.
I'll check tomorrow more carefully, but I think we'll merge it in tomorrow. Thanks 💯

@apoorvasrinivasan26
Copy link
Contributor Author

No, I don't think the error can be predicted but it can be calculated based on other columns in the dataset like columns A, B C, and D.

@apoorvasrinivasan26
Copy link
Contributor Author

Great! Lmk if something doesn't work. I have another similar dataset in the works once that I will commit after this is merged

@kjappelbaum
Copy link
Collaborator

No, I don't think the error can be predicted but it can be calculated based on other columns in the dataset like columns A, B C, and D.

in this case, I would consider removing this column as it might add more confusion than signal to the model

@kjappelbaum
Copy link
Collaborator

Has this dataset been used in some benchmarks/papers?

Thanks again for your contribution 💯

Copy link
Collaborator

@kjappelbaum kjappelbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for your contribution; it would be great if we could address the comments/clarifications. Let me know if you want a discussion or need a hand

@apoorvasrinivasan26
Copy link
Contributor Author

@kjappelbaum I've removed those columns and yes, the dataset has been used in the following paper: https://pubmed.ncbi.nlm.nih.gov/22422992/

pls lmk if anything is unclear or if you'd like me to make further changes

@kjappelbaum
Copy link
Collaborator

Thanks a lot! Can you still remove the column from the target list in the .yml file? Thanks!

@apoorvasrinivasan26
Copy link
Contributor Author

Done! Lmk if I missed anything else.

Comment on lines 17 to 21
- id: mean_reporting_frequency
description: Proportion of reports for the drug that report the side effect
type: continuous
names:
- mean reporting frequency
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the absolute number something a model should be able to predict?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, according to the paper, a model should be able to predict it

Comment on lines 23 to 28
- id: drug_concept_name
description: RxNorm name string for the drug
type: categorical
- id: condition_concept_name
description: MedDRA identifier for the side effect
type: categorical
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need both of them simultaneously for the ratio to be meaningful?
That is, a correct prompt would ask the model something like

"What is the proportional reporting ratio for for "

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, so the prompt would be "what is the PRR of <condition_concept_name> for the <drug_concept_name>?". higher PRR means higher reported side effect for that particular drug.

Comment on lines 36 to 39
bibtex: "\n @article{Tatonetti2012,\n author = {Tatonetti, Nicholas P. and Ye, Peter P. and Daneshjou, Roxana and Altman, Russ B.},\n \
\ title = {Data-driven prediction of drug effects and interactions},\n journal = {Sci Transl Med},\n volume = {4},\n number\
\ = {125},\n pages = {125ra31},\n year = {2012},\n doi = {10.1126/scitranslmed.3003377},\n pmid = {22422992},\n pmcid\
\ = {PMC3382018}\n }\n "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove those newlines somehow and just have a multiline string? I can help with that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure! i'll let you take care of it if thats ok

@kjappelbaum
Copy link
Collaborator

Sorry for being unclear with my last reviews. I have more suggestions, but I can also take care of them if you prefer. Thanks for your contribution!

@MicPie MicPie requested a review from kjappelbaum May 3, 2023 14:19
data/offsides/meta.yaml Outdated Show resolved Hide resolved
data/offsides/meta.yaml Outdated Show resolved Hide resolved
Copy link
Contributor

@MicPie MicPie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, I just added some minor changes to the text

Comment on lines +29 to +34
- id: drug_concept_name
description: RxNorm name string for the drug
type: categorical
- id: condition_concept_name
description: MedDRA identifier for the side effect
type: categorical
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will need to use prompt templates, I guess, because they are not independent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants