Add ability to spawn code notebooks from cBioPortal queries #4856

Merged
merged 33 commits into from
Aug 19, 2024

Conversation

gautamsarawagi
Contributor

Jupyter Notebook Lite in the frontend for this issue

Describe changes proposed in this pull request:

  • Added the Jupyter Notebook Lite in the frontend code as follows:
    image
    image
  • Also sending the study data to the Jupyter Notebook page, which we can use to prepare our models. These are as follows:
    image
  • Removed the console output from the committed code, though.

Checks

  • Has tests or has a separate issue that describes the types of test that should be created. If no test is included it should explicitly be mentioned in the PR why there is no test.
  • The commit log is comprehensible. It follows 7 rules of great commit messages. For most PRs a single commit should suffice, in some cases multiple topical commits can be useful. During review it is ok to see tiny commits (e.g. Fix reviewer comments), but right before the code gets merged to master or rc branch, any such commits should be squashed since they are useless to the other developers. Definitely avoid merge commits, use rebase instead.
  • Is this PR adding logic based on one or more clinical attributes? If yes, please make sure validation for this attribute is also present in the data validation / data loading layers (in backend repo) and documented in File-Formats Clinical data section!

Any screenshots or GIFs?

If this is a new visual feature please add a before/after screenshot or gif here with e.g. Giphy CAPTURE or Peek

Notify reviewers

Read our Pull request merging policy. It can help to figure out who worked on the file before you. Please use git blame <filename> to determine that and notify them either through Slack or by assigning them as a reviewer on the PR.


netlify bot commented Feb 26, 2024

Deploy Preview for cbioportalfrontend ready!

Name Link
🔨 Latest commit 52c65d1
🔍 Latest deploy log https://app.netlify.com/sites/cbioportalfrontend/deploys/66c357019856fe0008158437
😎 Deploy Preview https://deploy-preview-4856--cbioportalfrontend.netlify.app

@inodb inodb requested a review from alisman February 27, 2024 17:06
@alisman
Collaborator

alisman commented Feb 28, 2024

Nice and quick work @gautamsarawagi! A couple of things:

  1. Can you make a screenshot showing a simple example analysis of the passed data (instead of dummy data)?
  2. Right now we are passing the data in the URL, which will most likely run up against browser limits quickly. JupyterLite offers a way of passing the data using post message, I think, but we'll probably need to host JupyterLite ourselves in our codebase, so we should go ahead and do that.
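
A minimal sketch of the post message idea from item 2, assuming the notebook runs in an iframe and the browser's `window.postMessage` is used to hand over the data. The message shape, the `cbioportal-file` type string, and the function names are hypothetical and not taken from this PR; JupyterLite would still need an extension on its side to receive the message.

```typescript
// Hypothetical message shape; the real extension may expect something else.
interface NotebookFileMessage {
    type: 'cbioportal-file';
    filename: string;
    content: string;
}

// The one method of Window this sketch relies on. Declaring it structurally
// lets the function run outside a browser.
interface MessageTarget {
    postMessage(message: unknown, targetOrigin: string): void;
}

// Send the CSV text to the notebook frame instead of packing it into the URL.
function sendFileToNotebook(
    target: MessageTarget,
    filename: string,
    content: string,
    targetOrigin: string
): NotebookFileMessage {
    const message: NotebookFileMessage = {
        type: 'cbioportal-file',
        filename,
        content,
    };
    // Pass an explicit origin rather than '*' so only the expected page
    // receives the data.
    target.postMessage(message, targetOrigin);
    return message;
}

// For comparison: length of the same text once percent-encoded for a
// query string.
function urlEncodedLength(content: string): number {
    return encodeURIComponent(content).length;
}
```

In a browser, `target` would typically be the iframe's `contentWindow`. Percent-encoding inflates commas and newlines to three characters each, which is one reason CSV data outgrows URL limits quickly.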

@gautamsarawagi
Contributor Author

@alisman You are absolutely right. We will need to host JupyterLite ourselves to pass the data to it.

Have got the direction now. Will keep you updated on progress.

@gautamsarawagi
Contributor Author

gautamsarawagi commented Mar 8, 2024

@alisman I have read all the docs over the last week and tried out different cases:

  1. First I tried to run the algorithm on the passed data, using my own deployed JupyterLite website just as you suggested, and here is the result:
    image
    As you can see, we are trying to access the data from the file, and that is what I am working on now.

  2. I was trying to make a connection between our hosted site and the JupyterLite instance and tried to execute this:
    code.
    I made some progress with this code: I was able to manage the themes from the hosted site as described in this document, which works like this:
    image

  3. Just as we can manage themes, I am trying to manage the files as well.

  4. We will be passing the data from the site to the JupyterLite notebook instance and updating it there every time.

  5. That would be really efficient in my opinion. For that, I found this document

I would love to hear your thoughts on this and how I should move further.

@alisman
Collaborator

alisman commented Mar 8, 2024

@gautamsarawagi. Looking good! I like seeing the real data in the interface.
And yes, it seems the extension is the right way to go to allow direct communication between portal and notebook.

Perhaps we can schedule a Zoom session to meet each other and talk about next steps.

Thank you for your work! Are you hoping to do GSOC this year?

--Aaron

@gautamsarawagi
Contributor Author

gautamsarawagi commented Mar 8, 2024

@alisman, thanks for the feedback! The extension does seem like the way to go for smooth communication between the portal and the notebook.

A Zoom session is a great idea to meet each other and discuss the next steps. Please do let me know your availability and I'll make sure to accommodate accordingly.

And yes, I'm hoping to do GSOC this year! This project is right up my alley, so I'd love to contribute more if selected.

Appreciate you taking the time to review and provide helpful comments. Looking forward to our discussion and potential collaboration over the summer!

Regards,
Gautam

@gautamsarawagi
Contributor Author

gautamsarawagi commented Mar 12, 2024

Hey @alisman, based on our discussion yesterday I came across this extension from JupyterLab itself. And you were right: they are using the ServiceManager APIs (index.ts). This might be a good resource. I also came across this discussion, which gives a good understanding of how the files are stored internally.

I am working on this and will update you once the integration is successful.

@alisman
Collaborator

alisman commented Mar 15, 2024

@gautamsarawagi do you want me to look at anything in particular yet?

@gautamsarawagi
Contributor Author

Based on our discussion I came up with some similar available resources, so I wanted to share those.

@alisman
Collaborator

alisman commented Mar 15, 2024

@gautamsarawagi I think it would be good to include the notebook code itself in the PR. We can figure out a way to get the JS loaded into an HTML page on the cbioportal.org domain. That will make progress and review quicker. Ultimately, we may choose for security reasons to host the notebook on a different domain. But the JS itself can still live in this repository for convenience.

@gautamsarawagi
Contributor Author

gautamsarawagi commented Mar 15, 2024

@alisman that's actually a good idea. It would be good to add the notebook code in the PR itself.

@gautamsarawagi
Contributor Author

gautamsarawagi commented Mar 17, 2024

@alisman I have added the notebook in the root directory of the code itself and have also included a Readme.md explaining how to use it.

However, I was not able to reference the index.html from the root directory in JupyterNotebookTool.tsx.

I would also like to discuss this with you. I even tried adding html-loaders in the webpack.config.js file, but it didn't work.

So finally I deployed it for use:
image

If we can reference the notebook's index.html in the iframe, then there won't be a need for the deployment, which will help us tackle the security concerns that are coming up right now.

Also, I am now moving on to the file modification work.

Please do let me know if any further modifications are needed.

@alisman
Collaborator

alisman commented Mar 22, 2024

@gautamsarawagi I tested this and it looks good. Here are some next steps:

  • Give the output.csv a better name based on the query. I believe the download option provides a name, so you can get it from there.
  • Seed it with a notebook file (.ipynb) which imports the file and does some very simple analysis on it, like listing the column headers.
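
One way the seeding step could look, as a rough sketch only: the portal builds a one-cell notebook in the nbformat 4 JSON layout whose Python source loads the CSV and lists its column headers. The function names are hypothetical, and the sketch assumes the CSV is comma-separated, sits next to the notebook, and that pandas is available in the notebook's Python kernel.

```typescript
// Subset of the nbformat 4 structure needed for a single code cell.
interface CodeCell {
    cell_type: 'code';
    execution_count: null;
    metadata: Record<string, unknown>;
    outputs: unknown[];
    source: string[];
}

interface Notebook {
    cells: CodeCell[];
    metadata: Record<string, unknown>;
    nbformat: number;
    nbformat_minor: number;
}

// Build a seed notebook whose only cell loads the CSV and lists its columns.
function buildSeedNotebook(csvFilename: string): Notebook {
    // JSON.stringify yields a double-quoted, escaped literal that is also a
    // valid Python string for ordinary filenames.
    const quotedName = JSON.stringify(csvFilename);
    return {
        cells: [
            {
                cell_type: 'code',
                execution_count: null,
                metadata: {},
                outputs: [],
                source: [
                    'import pandas as pd\n',
                    `df = pd.read_csv(${quotedName})\n`,
                    'list(df.columns)',
                ],
            },
        ],
        metadata: {},
        nbformat: 4,
        nbformat_minor: 4,
    };
}

// Serialized form, ready to be stored as a .ipynb file.
function serializeNotebook(notebook: Notebook): string {
    return JSON.stringify(notebook, null, 1);
}
```

Passing the query-based CSV name into `buildSeedNotebook` keeps the notebook and the data file in sync when the file is renamed.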

@gautamsarawagi
Contributor Author

Sure @alisman. First of all, thanks for the feedback.

I will start working on a simple Python file that loads the data file and runs on it.

Will share the progress with you.

@gautamsarawagi
Contributor Author

Hey @alisman

  • Updated the name of the file.
  • Attempting to auto-execute the ipynb file. I am facing some issues with that but am working on it.
    • So far, the file that needs to be auto-executed is getting auto-opened.
    • I just need to figure out how to execute its cells.

Will update you once the auto-execution part is done.

PFA: The changes in the extension created for the file-communication: link

fieldsToKeep
);

this.jupyterFileContent = allGenesMutationsCsv;
Collaborator

You should delete this jupyterFileContent when the modal is closed.
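
A simplified sketch of the cleanup suggested here, not the code from the PR. Only `jupyterFileContent` comes from the diff; the class, `isOpen`, `open`, and `close` are hypothetical stand-ins for whatever component owns the modal.

```typescript
// Simplified stand-in for the component that holds the CSV text.
class NotebookModalState {
    jupyterFileContent: string | undefined = undefined;
    isOpen = false;

    // Store the CSV when the modal is opened.
    open(csv: string): void {
        this.jupyterFileContent = csv;
        this.isOpen = true;
    }

    // Drop the reference on close so the CSV text can be garbage collected
    // and stale data is not picked up by the next query.
    close(): void {
        this.isOpen = false;
        this.jupyterFileContent = undefined;
    }
}
```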

Contributor Author

@alisman I have updated the code.

notebook/READEME.md (outdated)
@alisman
Collaborator

alisman commented Aug 19, 2024

image

Add this HTML to button:

image

@gautamsarawagi
Contributor Author

image

Add this HTML to button:

image

Added the beta! text

@alisman alisman merged commit 50adab2 into cBioPortal:master Aug 19, 2024
11 of 14 checks passed
@inodb inodb added the gsoc label Sep 23, 2024