
[ENH] pytorch interface points and suggested refactors #6641

Open · 11 tasks
fnhirwa opened this issue Jun 19, 2024 · 4 comments
Labels: enhancement · module:base-framework · module:deep-learning&networks

Comments

fnhirwa (Member) commented Jun 19, 2024

A thorough review of the PyTorch interfacing classes

  • NeuralForecast

    NeuralForecast interfaces PyTorch indirectly, through its neural network dependency.

  • PyTorch

    The PyTorch adapter is implemented as an abstraction over the BaseForecaster class for building deep neural networks; it enables sktime to build neural networks through torch.nn.

    Classes that use the BaseDeepNetworkPyTorch abstract class:

  • HF TransformersForecaster

    The implementation of HF TransformersForecaster interfaces torch directly.

    Suggested changes
    The PyTorchDataset class used by the transformers forecaster is nearly identical to PyTorchTrainDataset in the PyTorch adapter. We could unify these PyTorch-facing datasets into a single class that covers all the required logic and is imported in both places; see the dataset sketch after this list.

  • PyKANForecaster

    Suggested changes
    Unify its PyTorchTrainDataset with the one in the PyTorch adapter (the same dataset sketch below applies).

  • Networks

    • LTSF-Linear models
      Directly interface torch and implement the BaseDeepNetworkPyTorch abstract class.
      Implementation of the classes
    • Conditional Invertible Neural Network (cINN)
      Directly interfaces torch, but its implementation does not use BaseDeepNetworkPyTorch as a base class; in my opinion, we can unify this behaviour.
      Implementation of cINN
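
As a concrete illustration of the unification suggested above, here is a minimal sketch of what a shared dataset class could look like (the class name and constructor are hypothetical, assuming a sliding-window (past, future) layout; this is not the current sktime implementation):

```python
import torch
from torch.utils.data import Dataset


class UnifiedPyTorchDataset(Dataset):
    """Hypothetical unified dataset: windows a series into (context, horizon) pairs."""

    def __init__(self, y, seq_len, fh):
        # y: 1D array-like target series; seq_len: context window length;
        # fh: number of future steps to predict
        self.y = torch.tensor(y, dtype=torch.float32)
        self.seq_len = seq_len
        self.fh = fh

    def __len__(self):
        # number of complete (context, horizon) windows in the series
        return max(len(self.y) - self.seq_len - self.fh + 1, 0)

    def __getitem__(self, i):
        past = self.y[i : i + self.seq_len]
        future = self.y[i + self.seq_len : i + self.seq_len + self.fh]
        return past, future
```

If both PyTorchDataset and PyTorchTrainDataset reduce to this windowing pattern, unifying them is mostly a matter of agreeing on constructor parameters.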

Design proposal

  • Renaming the BaseDeepNetwork class implemented in TensorFlow to BaseDeepNetworkTF, to clarify that it is a TensorFlow-dependent class.

  • Restructuring the file organization of the deep learning classes in sktime
    Currently, the classes that interface PyTorch are scattered. My idea is to move all of them to the networks directory and have two distinct subdirectories, namely torch and tensorflow, for distinction. The networks directory should also be renamed to neural_networks; a possible layout is sketched after this list. If this is acceptable, I am ready to raise a PR.

  • Rewriting the current network classes using PyTorch
    Depending on users' needs when implementing estimators with deep learning networks, one may need the torch implementation, e.g., for the torch.compile functionality, so we should not limit some classes to TensorFlow-only implementations (see the compile sketch after this list). I understand that this might sound like doubling the work, but it is worth a shot, as it would give the capability to build an estimator using the framework of choice. Ideas and suggestions are welcome on this topic.
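
To make the restructuring concrete, one possible layout under this proposal (paths and file names are hypothetical illustrations, not final choices):

```
sktime/neural_networks/          # renamed from sktime/networks/
├── torch/
│   ├── base.py                  # BaseDeepNetworkPyTorch
│   ├── datasets.py              # unified PyTorch dataset classes
│   ├── ltsf.py                  # LTSF-Linear networks
│   └── cinn.py                  # cINN network
└── tensorflow/
    └── base.py                  # BaseDeepNetworkTF (renamed from BaseDeepNetwork)
```

The torch.compile argument in the last bullet reduces to a one-liner available for any torch.nn.Module-based network (illustrative sketch; MyTorchNetwork is a placeholder):

```python
import torch

net = MyTorchNetwork()    # any torch.nn.Module subclass
net = torch.compile(net)  # JIT-compiles the forward pass (PyTorch >= 2.0)
```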

Suggestion for possible deep learning models to add to sktime

This is intended as a growing list; anyone is welcome to follow this issue and suggest an estimator that should be added. We are compiling all feature requests that reference the need for deep learning estimators.

  • GRU (Cho, 2014) Paper
  • GRU-FCN (Elsayed, 2018) Paper
  • TabTransformer (Huang, 2020) Paper
  • LSTM-FCN (Karim, 2017) Paper
  • MLSTM-FCN - Multivariate LSTM-FCN (Karim, 2019) Paper
  • TST - Time Series Transformer (Zerveas, 2020) Paper
fnhirwa added the enhancement label Jun 19, 2024
fkiraly changed the title from [ENH] Review of PyTorch Interfacing classes in sktime to [ENH] pytorch interface points and suggested refactors Jun 19, 2024
fkiraly added the module:base-framework label Jun 19, 2024
fkiraly moved this to In Progress in 2024 May-Sep workstreams Jun 20, 2024
geetu040 (Contributor) commented

Based on PR [ENH] Implements LTSFTransformer #6202, certain changes are required in the BaseDeepNetworkPyTorch class.

  1. There should be a boolean parameter scale that fits a scaler on y during fit, uses it to transform new data in predict, and returns inversely transformed predictions.
  2. The build_pytorch_train_dataloader and build_pytorch_pred_dataloader functions, as well as the dataset classes, should be created at the interface class; alternatively, similar private methods should be called at the interface method to initialise dataset objects.
  3. Update the abstract _build_network function to take more params than just fh; transformers need information about the data to lay out the architecture. (A rough sketch of these points follows below.)
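
A rough sketch of how points 2 and 3 could land on the base class (method names are taken from the list above; the extended signature and the private helpers are hypothetical illustrations, not the actual sktime code):

```python
from torch.utils.data import DataLoader


class BaseDeepNetworkPyTorch:
    """Sketch of the proposed interface points only."""

    def _build_network(self, fh, y=None, X=None):
        # point 3: pass the data alongside fh so transformer-style networks
        # can size their layers from the input dimensionality
        raise NotImplementedError

    def build_pytorch_train_dataloader(self, y):
        # point 2: dataset construction lives at the interface class
        dataset = self._build_train_dataset(y)  # hypothetical private helper
        return DataLoader(dataset, batch_size=self.batch_size, shuffle=True)

    def build_pytorch_pred_dataloader(self, y):
        dataset = self._build_pred_dataset(y)  # hypothetical private helper
        return DataLoader(dataset, batch_size=self.batch_size, shuffle=False)
```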

fkiraly (Collaborator) commented Jun 21, 2024

Some questions, @geetu040:

  1. There should be a boolean parameter scale that fits a scaler on y during fit, uses it to transform new data in predict, and returns inversely transformed predictions.

Why would we do that in the class, instead of pipelining with StandardScaler?
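
For reference, the pipelining alternative would look roughly like this (a sketch using sktime's `*` composition dunder; NaiveForecaster stands in for any torch-based forecaster):

```python
from sklearn.preprocessing import StandardScaler
from sktime.datasets import load_airline
from sktime.forecasting.naive import NaiveForecaster
from sktime.transformations.series.adapt import TabularToSeriesAdaptor

y = load_airline()

# scale y before fitting; predictions are inverse-transformed by the pipeline
pipe = TabularToSeriesAdaptor(StandardScaler()) * NaiveForecaster()
pipe.fit(y, fh=[1, 2, 3])
y_pred = pipe.predict()
```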

  3. Update the abstract _build_network function to take more params than just fh; transformers need information about the data to lay out the architecture.

Should this not be able to look at self parameters, and use anything there?

geetu040 (Contributor) commented

Why would we do that in the class, instead of pipelining with StandardScaler?

You are right, it can be; rather, it should be done like this.

Should this not be able to look at self parameters, and use anything there?

It can use self._y and self._fh to access them, but is that good code practice? If it is, then why do we pass y and fh to other class methods as arguments when they are accessible via self._y and self._fh, even though both variables are exactly the same in those contexts?

fkiraly (Collaborator) commented Jun 25, 2024

@geetu040, not sure if I'm parsing your question right - are you asking why in some cases we pass y, fh as args and in some cases we get them from self?

I think there is a mix of factors:

  • inconsistencies related to legacy code structure
  • some methods are private and localized, where self access seems more idiomatic
  • some methods are extender-facing, i.e., aimed at power users who write their own estimators. For them, it is much clearer to deal only with coerced arguments that come with guarantees than to have to read from self. The flow is also easier to control.
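
Concretely, the extender-facing pattern looks like this in sktime's extension contract (a simplified sketch of the pattern, not the actual base class code):

```python
from sktime.forecasting.base import BaseForecaster


class MyForecaster(BaseForecaster):
    """Extenders implement only the private method."""

    def _fit(self, y, X=None, fh=None):
        # y and fh arrive here already coerced and checked by the public
        # fit(), so an extender relies on those guarantees instead of
        # reading self._y / self._fh directly
        return self
```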

geetu040 moved this from In Progress to Todo in 2024 May-Sep workstreams Jul 3, 2024
fkiraly added the module:deep-learning&networks label Jul 20, 2024