Creating patches and extracting features for [4096 x 4096] #49
Hi @nam1410, if you just want to get the 4K features, you can follow this notebook: https://github.com/mahmoodlab/HIPT/blob/master/HIPT_4K/HIPT_4K%20Inference%20%2B%20Attention%20Visualization.ipynb. Basically, you will need the 4096 x 4096 image regions as input and extract the corresponding 192-dim embedding from ViT_256-4096. This is my understanding of the HIPT 4K feature-extraction process; @Richarizardd, please correct me if I am wrong.

For the 4K model, start with a 3 x 4096 x 4096 (RGB) region. Convert this into a sequence of 256 x 256 patches by reshaping it as 3 x 16 (w_256) x 16 (h_256) x 256 x 256, which can be written as B x 3 x 256 x 256, where B = 1 x 16 (w_256) x 16 (h_256). So B here should be read as the number of patches. Each of these B = 256 patches is passed into ViT_16-256, which yields an embedding of dimension 384, so for the entire 4096 x 4096 region you end up with an embedding tensor of shape [256, 384]. This can then be reshaped to 1 x 384 x 16 (w_256) x 16 (h_256), which is the input to ViT_256-4096. The output is an embedding tensor of shape [1, 192].
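To make those shapes concrete, here is a minimal sketch of that two-stage forward pass. This is not the repo's actual code: `vit_16_256` and `vit_256_4096` are placeholders for the two pretrained HIPT models (treated as black boxes returning 384-dim and 192-dim embeddings respectively), and `einops` is assumed for the reshapes.

```python
import torch
from einops import rearrange

def extract_region_embedding(region, vit_16_256, vit_256_4096):
    """region: [1, 3, 4096, 4096] RGB tensor; returns a [1, 192] region embedding."""
    # 1. Unroll the region into a batch of B = 16 x 16 = 256 non-overlapping 256 x 256 patches.
    patches = rearrange(region, 'b c (h p1) (w p2) -> (b h w) c p1 p2', p1=256, p2=256)
    # patches: [256, 3, 256, 256]

    # 2. Encode every 256 x 256 patch with ViT_16-256 -> one 384-dim embedding each.
    with torch.no_grad():
        patch_tokens = vit_16_256(patches)            # [256, 384]

    # 3. Fold the 256 embeddings back into a 16 x 16 grid of 384-dim "tokens".
    token_grid = rearrange(patch_tokens, '(h w) d -> 1 d h w', h=16, w=16)
    # token_grid: [1, 384, 16, 16]

    # 4. Encode the token grid with ViT_256-4096 -> a single 192-dim region embedding.
    with torch.no_grad():
        region_embedding = vit_256_4096(token_grid)   # [1, 192]
    return region_embedding
```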
" Basically, you will need the 4096 x 4096 image regions as input...." |
they use CLAM preprocessing pipeline to extract (4096, 4096) regions |
@Richarizardd @faisalml - I appreciate your intuitive work. I have been using CLAM for quite some time, but I have encountered an obstacle as follows:
[Preface] - I use an in-house dataset, and CLAM works fine. I recently read your paper and was curious to generate the hierarchical attention maps for the custom dataset. I have the splits and features for [256 x 256] patches, but how do I connect the existing [256 x 256] to the newly extracted [4096 x 4096] features? I have read the open and closed issues. However, I am not finding a lucid explanation.
Consider a WSI with ~20000 [256 x 256] patches, and I have Resnet50 features already extracted and stored on my disk using CLAM's scripts. @Richarizardd has mentioned that I have to change [256 x 256] to [4096 x 4096] while creating patches and extracting the features. In doing this, is the hierarchy still preserved? For example, if I extract a [4096 x 4096] patch hp1, how do I correlate it with the existing [256 x 256] patches in my directory? Is it using the [x,y] coordinates? Is the trajectory of my understanding of the pre-processing reasonable? Am I missing something?
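To illustrate what I mean by correlating via the [x, y] coordinates, here is a minimal sketch of the bookkeeping I have in mind. It assumes CLAM stores the top-left level-0 [x, y] coordinate of each patch and that both patchings are aligned, non-overlapping grids at the same magnification; the function name is purely illustrative.

```python
def patches_256_inside_region(region_xy, patch_coords_256, region_size=4096, patch_size=256):
    """Return the stored [256 x 256] coordinates that fall inside one [4096 x 4096] region."""
    rx, ry = region_xy
    inside = [
        (x, y)
        for (x, y) in patch_coords_256
        if rx <= x < rx + region_size and ry <= y < ry + region_size
    ]
    # With aligned, non-overlapping grids, each region should cover at most
    # (4096 // 256) ** 2 = 256 of the existing patch coordinates.
    assert len(inside) <= (region_size // patch_size) ** 2
    return inside
```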
In addition to this, where do I find the "ViT-16 features pretrained on TCGA" (ref)? Is it from HIPT/1-Hierarchical-Pretraining/From ViT-16 to ViT-256.ipynb (Line 29 in b5f4844)? If so, do I use this instead of resnet_custom in the feature extraction? Or is it from HIPT/HIPT_4K/hipt_4k.py (Line 67 in b5f4844)?
Please correct me if I am wrong @Richarizardd @faisalml. Thank you.
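In case it helps pin down what I am asking about replacing resnet_custom: below is the generic DINO-style loading pattern I would expect to apply to the pretrained ViT-16 weights. This is only a sketch under assumptions, not code from the repo; the module name vision_transformer, the checkpoint filename, and the key-stripping are guesses based on how DINO checkpoints are usually packaged, so please point me to the intended loader if this is off.

```python
import torch
import vision_transformer as vits  # DINO-style ViT code shipped in HIPT/HIPT_4K

# Checkpoint path is an assumption -- adjust to wherever the repo keeps the
# DINO-pretrained ViT-16 weights.
ckpt_path = 'Checkpoints/vit256_small_dino.pth'

model = vits.vit_small(patch_size=16)                  # ViT-S/16, 384-dim output
state_dict = torch.load(ckpt_path, map_location='cpu')
# DINO checkpoints sometimes nest the weights under 'teacher' and prefix keys
# with 'module.' / 'backbone.'; strip those if present.
if isinstance(state_dict, dict) and 'teacher' in state_dict:
    state_dict = state_dict['teacher']
state_dict = {k.replace('module.', '').replace('backbone.', ''): v
              for k, v in state_dict.items()}
model.load_state_dict(state_dict, strict=False)
model.eval()

with torch.no_grad():
    patch = torch.randn(1, 3, 256, 256)
    feature = model(patch)                             # expected shape: [1, 384]
```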