
For OCGNN models, update hypersphere radius, and filter training data with active_mask #95

Merged
merged 10 commits (Jan 3, 2024)

Conversation

ahmed3amerai
Contributor

This pull request follows up on the discussion about updating the hypersphere radius. The modifications can be summarized as follows.

  • Set the default warmup value to math.inf, so that the hypersphere radius and centre are updated in every epoch.
  • Added an is_train flag parameter so that the radius is updated during training only.
  • Updated the centre c of the hypersphere before the first epoch, instead of initialising it with zeros. My experiments show that if the initial features are not centred around 0, initialising with zeros hurts performance, since the centre of the hypersphere starts far from the majority of data points.
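The centre and radius logic described above can be sketched in plain Python. This is a simplified illustration, not PyGOD's actual implementation: `init_center` and `update_radius` are hypothetical helper names, and the quantile-style radius update follows the Deep SVDD soft-boundary formulation that OCGNN builds on.

```python
import math

def init_center(embeddings, eps=0.001):
    """Initialise the centre c as the mean of the initial embeddings,
    rather than a zero vector (sketch of the PR's change)."""
    dim = len(embeddings[0])
    c = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    # nudge coordinates too close to zero up to eps so they stay trainable
    # (simplified version of the usual Deep SVDD-style safeguard)
    return [eps if abs(x) < eps else x for x in c]

def update_radius(distances, nu=0.1):
    """Set the radius to the (1 - nu)-quantile of the distances to the
    centre; with warmup = math.inf this runs in every training epoch."""
    srt = sorted(distances)
    idx = min(len(srt) - 1, math.floor((1 - nu) * len(srt)))
    return srt[idx]
```

With a centre initialised from the data, the hypersphere starts near the bulk of the embeddings, so the radius estimate is meaningful from the first epoch onward.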

Moreover, it follows up on the discussion about allowing the use of training/validation/testing masks.

  • Modified the script to filter the data with an active_mask variable, so that only the active nodes are considered in decision_score_, decision_function, and loss_func.
  • To use it, set data.active_mask = data.train_mask before training and data.active_mask = data.test_mask before testing.
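The filtering described above amounts to something like the following. This is a minimal list-based sketch with a hypothetical `masked_scores` helper; the real code operates on torch tensors, where the same effect is boolean-mask indexing.

```python
def masked_scores(scores, active_mask):
    """Keep only the entries whose mask flag is True, mirroring how
    active_mask restricts decision_score_, decision_function, and
    loss_func to the active nodes (list-based sketch)."""
    return [s for s, keep in zip(scores, active_mask) if keep]
```

With tensors this reduces to `scores[data.active_mask]`, after assigning `data.active_mask = data.train_mask` before training and `data.active_mask = data.test_mask` before testing, as described above.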

Regards,
Ahmed Aly

@ahmed3amerai
Contributor Author

  • I fixed all models to run smoothly; however, I am unsure whether models other than OCGNN require updating the hypersphere radius, since I was focusing mainly on OCGNN. Theoretically, any model that depends on hypersphere learning should require similar handling.

  • To enable active_mask to filter train/validation/test data, I enabled it in OCGNN. Similarly, I enabled it in models built on DeepDetector, namely AdONE, AnomalyDAE, CONAD, DOMINANT, GUIDE, GAE, and DONE, by filtering all vectors passed to loss_func with active_mask. (Please confirm that logic.)

  • The other models need further study of their algorithms before active_mask can be enabled.

@kayzliu
Member

kayzliu commented Dec 19, 2023

Hi Ahmed, thanks a lot for your contributions. However, I have some concerns.

  • active_mask: as discussed in #50 (How do you use different masks in PyGOD datasets?), we are focusing on unsupervised settings so far. Thus, there is no need to mask any node in the training data, and we do not encourage users to do so.
  • is_train: I am a little confused by "training only." For the transductive setting, the score is ready after training. For the inductive setting, warmup is not used when predict is called.
  • c: we are happy to take this change if it helps.

@ahmed3amerai
Contributor Author

ahmed3amerai commented Dec 19, 2023

It's my pleasure to contribute. Regarding your concerns:

  • For active_mask: I understand that some models do not require masking training nodes in unsupervised settings. However, since OCGNN is a one-class classifier, it requires training on only one class of data, as described in the original paper, and testing on both classes. To achieve that in a transductive setting, I used masks.
  • I used is_train to avoid updating the hypersphere radius or centre when predict is called. If we initialise warmup=math.inf, the condition self.warmup > 0 remains true even when predict is called. I am happy to drop it if you see a better way to ensure that without the is_train flag.

@kayzliu
Member

kayzliu commented Dec 21, 2023

Thanks for your reply. That makes sense. I am still considering whether we should add this mask for the only one-class model in our library. This one-class setting is actually very strict: the training data cannot contain any outliers, which may be too idealistic for real-world applications.

For the is_train flag, a more elegant approach is to set warmup to the number of training steps, which can be done by passing a parameter. So there is no need to change this part.
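The counter-based alternative can be sketched as follows. `RadiusUpdater` is a hypothetical class, and the quantile-style update is again the Deep SVDD-style formulation assumed above: once the warmup counter is exhausted, later calls (e.g. from predict) leave the radius untouched, so no is_train flag is needed.

```python
class RadiusUpdater:
    """Decrement a warmup counter instead of passing an is_train flag:
    the radius is updated only for the first `warmup` calls (the
    training epochs), then frozen for any later calls such as predict."""

    def __init__(self, warmup):
        self.warmup = warmup  # set to the number of training epochs
        self.r = 0.0

    def step(self, distances, nu=0.1):
        if self.warmup > 0:
            # (1 - nu)-quantile of the distances to the centre
            srt = sorted(distances)
            idx = min(len(srt) - 1, int((1 - nu) * len(srt)))
            self.r = srt[idx]
            self.warmup -= 1
        return self.r
```

Setting warmup to the epoch count makes `self.warmup > 0` false by the time predict runs, which achieves the same guard as is_train without an extra parameter in the loss function's signature.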

@ahmed3amerai
Contributor Author

Just to confirm: are there any tasks you need from my side for this pull request?

@@ -108,17 +108,20 @@ def loss_func(self, emb):
score : torch.Tensor
Outlier scores of shape :math:`N` with gradients.
"""
if is_train:
if self.warmup > 0:
Member

We can merge this part without the is_train flag.

@kayzliu kayzliu left a comment

Can we keep only the part I marked and revert the rest?

@ahmed3amerai
Contributor Author

I kept the part you marked and reverted the rest of the changes.

@kayzliu
Member

kayzliu commented Jan 3, 2024

Thanks for reverting. But I still see that several other files were changed, including the cache files.

@ahmed3amerai
Contributor Author

Yes, those were cache files and trailing whitespace. It should be clean now.

@coveralls

coveralls commented Jan 3, 2024

Pull Request Test Coverage Report for Build 7393446504

  • 6 of 6 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.001%) to 99.573%

Totals (Coverage Status)
  • Change from base Build 7283849275: +0.001%
  • Covered lines: 3029
  • Relevant lines: 3042

💛 - Coveralls

@kayzliu kayzliu self-requested a review January 3, 2024 05:15
@kayzliu kayzliu left a comment

LGTM

@kayzliu kayzliu merged commit 3342243 into pygod-team:main Jan 3, 2024
14 checks passed
@kayzliu
Member

kayzliu commented Jan 3, 2024

Thanks for your effort. PR Merged.
