
Conversation

@radka-j
Member

@radka-j radka-j commented Aug 15, 2025

Relates to #748 (to be closed once #632 is merged).

The HistoryMatchingWorkflow object now expects a Result rather than an Emulator object. I wasn't sure how else to make this work. It means that one needs to create an Emulator through AutoEmulate to be able to use this class (although they could also wrap it in a Result object manually if really motivated). I don't love this limitation, but we have found repeatedly that re-initialising when refitting on new data is important for overall performance.


@radka-j radka-j requested a review from sgreenbury August 18, 2025 12:24
@radka-j
Member Author

radka-j commented Aug 18, 2025

@sgreenbury once we decide what the best way to handle refitting emulators is in history matching, we should look at implementing something similar in active learning too

@sgreenbury
Collaborator

sgreenbury commented Aug 18, 2025

The HistoryMatchingWorkflow object now expects a Result rather than an Emulator object. I wasn't sure how else to make this work. It means that one needs to create an Emulator through AutoEmulate to be able to use this class (although they could also wrap it in a Result object manually if really motivated). I don't love this limitation, but we have found repeatedly that re-initialising when refitting on new data is important for overall performance.

Yeah, agreed. It would be great not to need to have fit the Emulator with the AutoEmulate object, so that this can be flexible regarding workflow.

Could it work to add a method to the TransformedEmulator that allows it to be reinitialized directly? E.g. the init params of the TransformedEmulator could be cached as ._init_params, and then a method like:

from typing import Self  # Python 3.11+

def fit_from_reinitialized(self, x, y) -> Self:
    em = TransformedEmulator(x, y, **self._init_params)
    em.fit(x, y)
    return em

It seems like one limitation might be that train_x and train_y are optional at the moment so it would not be possible to reinitialize if these weren't available.

It also refocuses the API around the TransformedEmulator, and it might remain preferable to support any ProbabilisticEmulator? Another option could be to introduce a class for the model params required to reconstruct the emulator, without the additional attributes of Result, but including a method on Result to easily construct this:

class ModelParamsAndEmulator:
    model_class: Emulator
    x_transforms: None | list[Transform]
    y_transforms: None | list[Transform]
    params: ModelParams

class Result:
    ...
    def to_model_params_and_emulator(self) -> ModelParamsAndEmulator:
        return ModelParamsAndEmulator(
            model_class=self.model_class,
            x_transforms=self.x_transforms,
            y_transforms=self.y_transforms,
            params=self.params,
        )
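As written, the sketch above relies on the classes having constructors that accept these attributes; with dataclasses it becomes directly usable. A minimal runnable version, using toy stand-in types rather than the real AutoEmulate classes:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any

# Toy stand-ins for the real AutoEmulate types; illustrative only.
class Transform: ...
class LinearEmulator: ...

@dataclass
class ModelParamsAndEmulator:
    model_class: type
    x_transforms: list[Transform] | None
    y_transforms: list[Transform] | None
    params: dict[str, Any]

@dataclass
class Result:
    model_class: type
    x_transforms: list[Transform] | None
    y_transforms: list[Transform] | None
    params: dict[str, Any]

    def to_model_params_and_emulator(self) -> ModelParamsAndEmulator:
        # Keep only what is needed to reconstruct the emulator,
        # dropping any extra Result attributes.
        return ModelParamsAndEmulator(
            model_class=self.model_class,
            x_transforms=self.x_transforms,
            y_transforms=self.y_transforms,
            params=self.params,
        )

res = Result(LinearEmulator, None, None, {"lr": 0.1})
mp = res.to_model_params_and_emulator()
print(mp.model_class is LinearEmulator, mp.params)  # True {'lr': 0.1}
```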

@radka-j
Member Author

radka-j commented Aug 18, 2025

The one thing that we need is to store the init_params in the Emulator; if we have those, we don't need the Result object. From there it's easy enough to check whether we are dealing with a TransformedEmulator or not and handle things (model, x_transforms, y_transforms) accordingly. I think that's the cleanest way forward. What do you think?

@sgreenbury
Collaborator

sgreenbury commented Aug 18, 2025

The one thing that we need is to store the init_params in the Emulator; if we have those, we don't need the Result object. From there it's easy enough to check whether we are dealing with a TransformedEmulator or not and handle things (model, x_transforms, y_transforms) accordingly. I think that's the cleanest way forward. What do you think?

Yeah agreed, I like this option. One potential issue is that I think we would need to call this from each subclass's init, so it becomes a subclassing responsibility? Perhaps this could be fixed by a method on the base Emulator that inspects the signature and caches the params that are not x and y, though? Perhaps something like:

class Emulator:
    def __init__(self, x, y, **kwargs):
        self._init_params = kwargs

class Subclass(Emulator):
    def __init__(self, x, y, **kwargs):
        ...
        Emulator.__init__(self, x, y, **kwargs)
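A self-contained toy of that pattern, showing how cached init params allow a refit-from-scratch helper on the base class (hypothetical names, plain lists instead of tensors; not the real AutoEmulate API):

```python
class Emulator:
    """Toy base class: caches non-data init params for later rebuilds."""

    def __init__(self, x, y, **kwargs):
        self._init_params = kwargs

    def fit(self, x, y):
        raise NotImplementedError

    def refit_from_reinitialized(self, x, y):
        # Hypothetical helper: build a fresh instance of the same subclass
        # with the cached hyperparameters, then fit it on the new data.
        em = self.__class__(x, y, **self._init_params)
        em.fit(x, y)
        return em


class ToyEmulator(Emulator):
    def __init__(self, x, y, scale=1.0):
        super().__init__(x, y, scale=scale)
        self.scale = scale

    def fit(self, x, y):
        # Trivial "fit": scaled mean of the targets.
        self.mean_ = self.scale * sum(y) / len(y)


em = ToyEmulator([1, 2], [2.0, 4.0], scale=2.0)
em.fit([1, 2], [2.0, 4.0])
new_em = em.refit_from_reinitialized([1, 2, 3], [2.0, 4.0, 6.0])
print(new_em.scale, new_em.mean_)  # 2.0 8.0
```

The key point is that the refit builds `self.__class__` rather than a hard-coded class, so the helper works for any subclass that forwards its hyperparameters to the base init.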

@radka-j
Member Author

radka-j commented Aug 18, 2025

At the moment all subclasses mostly define their parameters as args with default values. Is there a way to easily turn those into a kwargs-like dictionary to pass to the Emulator.__init__ method, like in the above example, or do we still need to do that manually? Something like:

class GaussianProcess(Emulator):
    ...

    def __init__(self, standardize_x=True, ..., **kwargs):
        gp_args = {
            "standardize_x": standardize_x,
            "standardize_y": standardize_y,
            "likelihood_cls": likelihood_cls,
            "mean_module_fn": mean_module_fn,
            ...
        }
        Emulator.__init__(self, **gp_args, **kwargs)

@sgreenbury
Collaborator

I think we can do inspect.signature(self.__class__.__init__) (see e.g. this approach used for serializing a Transform to a dict):

# Get initialization parameters from the __init__ signature
init_signature = inspect.signature(self.__class__.__init__)
init_params = {}
for param_name, param in init_signature.parameters.items():
    # Skip 'self' parameter
    if param_name == "self":
        continue
    # Get the value from the instance if it exists
    if hasattr(self, param_name):
        value = getattr(self, param_name)
        # Handle special cases for serialization
        if isinstance(value, torch.Tensor):
            # Convert tensors to lists for JSON serialization
            init_params[param_name] = value.tolist()
        elif isinstance(value, torch.device):
            # Convert device to string
            init_params[param_name] = str(value)
        else:
            init_params[param_name] = value
    elif param.default is not inspect.Parameter.empty:
        # Use default value if attribute doesn't exist
        init_params[param_name] = param.default
return {transform_name: init_params}
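Packaged as a base-class method, the same signature-inspection idea could look like the sketch below (a minimal runnable version; names are hypothetical, not the real AutoEmulate API, and the tensor/device special cases are omitted):

```python
import inspect


class Emulator:
    """Sketch: cache init params automatically via signature inspection."""

    def _cache_init_params(self):
        # Inspect the subclass's __init__ and record every hyperparameter,
        # skipping the data arguments x and y.
        init_signature = inspect.signature(self.__class__.__init__)
        params = {}
        for name, param in init_signature.parameters.items():
            if name in ("self", "x", "y"):
                continue
            if hasattr(self, name):
                # Prefer the value actually stored on the instance.
                params[name] = getattr(self, name)
            elif param.default is not inspect.Parameter.empty:
                # Fall back to the signature default.
                params[name] = param.default
        self._init_params = params


class GaussianProcessLike(Emulator):
    def __init__(self, x, y, standardize_x=True, lr=0.01):
        self.standardize_x = standardize_x
        self.lr = lr
        self._cache_init_params()


gp = GaussianProcessLike([1.0], [2.0], standardize_x=False)
print(gp._init_params)  # {'standardize_x': False, 'lr': 0.01}
```

Subclasses then only need one `_cache_init_params()` call at the end of their init, rather than manually assembling a kwargs dict as in the GaussianProcess example above.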

@radka-j
Member Author

radka-j commented Aug 19, 2025

@sgreenbury that looks good! I'll try implementing something along those lines and then re-request a review when ready. How does that sound?

@sgreenbury
Collaborator

Sounds great, nice one!

@radka-j radka-j requested a review from crangelsmith August 19, 2025 13:23
@crangelsmith crangelsmith merged commit 1e4c50d into naghavi Aug 19, 2025
1 check passed
@radka-j radka-j deleted the hm_improve_em_fitting branch August 22, 2025 11:31