Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested config values are not updated with provided value in --config file #152

Closed
gbmarc1 opened this issue Nov 10, 2021 · 20 comments · Fixed by #160, #175 or #181
Closed

Nested config values are not updated with provided value in --config file #152

gbmarc1 opened this issue Nov 10, 2021 · 20 comments · Fixed by #160, #175 or #181
Assignees
Labels
bug Something isn't working

Comments

@gbmarc1
Copy link
Contributor

gbmarc1 commented Nov 10, 2021

Describe the bug
The value provided in a config file is not propagated to the nested instances of the config. However, when provided directly via command line, it does work.

To Reproduce
This is a simplified config and code:

from spock.builder import ConfigArgBuilder
from spock.config import spock

@spock
class TrainProcess:
    epochs: int = 100

@spock
class Train:
    train_process: TrainProcess = TrainProcess()



def main():
    attrs_class = ConfigArgBuilder(
        Train, TrainProcess,
        desc='',
    ).generate()
    print(attrs_class)


if __name__ == '__main__':
    main()

The provided command is:
train --config best_params.yaml

with best_params.yaml

epochs: 1

Expected behavior
--config values should be propagated to nested config instances like values from command line

Screenshots
Snipping of debug pane for attrs_class object
image

Additional context
spock-config==2.2.6

@gbmarc1 gbmarc1 changed the title Nested config values are not updated Nested config values are not updated with provided value in --config file Nov 10, 2021
@gbmarc1
Copy link
Contributor Author

gbmarc1 commented Nov 12, 2021

For people like me that did not understood the Nested stuff. You need to have the following in the config files:

TrainProcess:
    epochs: 1

@gbmarc1 gbmarc1 closed this as completed Nov 12, 2021
@ncilfone
Copy link
Contributor

Apologies for the slow follow up as I was away for the weekend. Yeah this is buried in the docs here and not super visible. Another pass through the docs for clarity is on my radar.

@gbmarc1 gbmarc1 reopened this Nov 17, 2021
@gbmarc1
Copy link
Contributor Author

gbmarc1 commented Nov 17, 2021

Hi, sorry so by testing a bit more again, it seems it does not work with 2 levels of nested config.

from spock.builder import ConfigArgBuilder
from spock.config import spock

@spock
class AnotherNested:
    something: int = 5

@spock
class TrainProcess:
    epochs: int = 100
    nest: AnotherNested = AnotherNested()

@spock
class Train:
    train_process: TrainProcess = TrainProcess()



def main():
    attrs_class = ConfigArgBuilder(
        Train, TrainProcess, AnotherNested
        desc='',
    ).generate()
    print(attrs_class)


if __name__ == '__main__':
    main()

The --config

AnotherNested:
    something: 1

The config arg builder will not update the nested AnotherNested config within Train->TrainProcess. But the config at the TOP level will be updated a per the --config value

@ncilfone
Copy link
Contributor

@gbmarc1 Looking at this now... seems like a mapping issue within the spock backend since the child object seems to get updated with the --config defined value but not the parent

@gbmarc1
Copy link
Contributor Author

gbmarc1 commented Nov 17, 2021

Hi thank you for looking at this.

I can't seem to find the recursivity in the code where config are instantiated with the --config values

ncilfone added a commit that referenced this issue Nov 17, 2021
…ng recursed to set config arg values within nested classes.
@ncilfone ncilfone self-assigned this Nov 17, 2021
@ncilfone ncilfone added the bug Something isn't working label Nov 17, 2021
@ncilfone ncilfone linked a pull request Nov 17, 2021 that will close this issue
@ncilfone
Copy link
Contributor

Hi @gbmarc1. I think #160 should fix this. Weird edge case that was happening that wouldn't call the recursive checking. Thanks for finding!

I'll pull it into master if you want to check and see if it works on your end, and then I can cut a minor release bump so there is a pip wheel...

ncilfone added a commit that referenced this issue Nov 17, 2021
* fixes issues wrt more than 2 levels of class nesting references. was ssuming to fall back on defaults instead of recursing the config space to set the correct parameters.

* linted

* updated docs for cmd line overrides on nested classes

* fixed edge case from #152 where the default attrs object wasn't getting recursed to set config arg values within nested classes.

* adding test conf file
@gbmarc1
Copy link
Contributor Author

gbmarc1 commented Nov 17, 2021

I still have the problem after installing
pip install git+https://github.com/fidelity/spock.git@master

@ncilfone ncilfone reopened this Nov 17, 2021
@ncilfone
Copy link
Contributor

Running the following on @master

# -*- coding: utf-8 -*-
from spock.builder import ConfigArgBuilder
from spock.config import spock


@spock
class AnotherNested:
    something: int = 5


@spock
class TrainProcess:
    epochs: int = 100
    nest: AnotherNested = AnotherNested()


@spock
class Train:
    train_process: TrainProcess = TrainProcess()


def main():
    attrs_class = ConfigArgBuilder(
        Train, TrainProcess, AnotherNested,
        desc='',
    ).generate()
    print(attrs_class)


if __name__ == '__main__':
    main()
AnotherNested:
    something: 1

Produces this:

AnotherNested: !!python/object:spock.backend.config.AnotherNested
  something: 1
Train: !!python/object:spock.backend.config.Train
  train_process: !!python/object:spock.backend.config.TrainProcess
    epochs: 100
    nest: !!python/object:spock.backend.config.AnotherNested
      something: 1
TrainProcess: !!python/object:spock.backend.config.TrainProcess
  epochs: 100
  nest: !!python/object:spock.backend.config.AnotherNested
    something: 1

@gbmarc1
Copy link
Contributor Author

gbmarc1 commented Nov 17, 2021

I agree the previous code works. But one of my test fails

image

@ncilfone
Copy link
Contributor

Do these all come in from default or via the config file?

@gbmarc1
Copy link
Contributor Author

gbmarc1 commented Nov 17, 2021

class MyEnum(Enum):
    value_1 = "value_1"
    value_2 = "value_2"


@spock
class AnotherNestedConfig:
    something: int = 5


@spock
class NestedConfig:
    integer: int = 1
    my_enum: MyEnum = "value_1"
    other_nest: AnotherNestedConfig = AnotherNestedConfig()


@spock
class DuplicatedConfig:
    flag_3: bool = False


@spock
class MakeDatasetConfig:
    flag1: bool = False
    flag_2: bool = False
    nested_config: NestedConfig = NestedConfig()
    duplicated_config: DuplicatedConfig = DuplicatedConfig()


@spock
class BuildFeatureConfig:
    duplicated_config: DuplicatedConfig = DuplicatedConfig()

# Write spock-config config file
with open("config.yaml", "w") as f:
    yaml.dump(
        {"NestedConfig": {"integer": 2}, "AnotherNestedConfig": {"something": 1}},
        f,
    )
 sys.argv = [".py", "--config", "config.yaml"]

@ncilfone
Copy link
Contributor

ncilfone commented Nov 17, 2021

Running this:

# -*- coding: utf-8 -*-
from spock.builder import ConfigArgBuilder
from spock.config import spock
from enum import Enum


class MyEnum(Enum):
    value_1 = "value_1"
    value_2 = "value_2"


@spock
class AnotherNestedConfig:
    something: int = 5


@spock
class NestedConfig:
    integer: int = 1
    my_enum: MyEnum = "value_1"
    other_nest: AnotherNestedConfig = AnotherNestedConfig()


@spock
class DuplicatedConfig:
    flag_3: bool = False


@spock
class MakeDatasetConfig:
    flag1: bool = False
    flag_2: bool = False
    nested_config: NestedConfig = NestedConfig()
    duplicated_config: DuplicatedConfig = DuplicatedConfig()


@spock
class BuildFeatureConfig:
    duplicated_config: DuplicatedConfig = DuplicatedConfig()




# @spock
# class AnotherNested:
#     something: int = 5
#
#
# @spock
# class NestedConfig:
#     integer: int = 2
#     other_nest: AnotherNested = AnotherNested()
#
#
# @spock
# class TrainProcess:
#     epochs: int = 100
#     nest: NestedConfig = NestedConfig()
#
#
# @spock
# class Train:
#     train_process: TrainProcess = TrainProcess()


def main():
    attrs_class = ConfigArgBuilder(
        BuildFeatureConfig, MakeDatasetConfig, DuplicatedConfig, NestedConfig, AnotherNestedConfig,
        desc='',
    ).generate()
    print(attrs_class)
    #
    # attrs_class = ConfigArgBuilder(
    #     Train, TrainProcess, AnotherNested, NestedConfig,
    #     desc='',
    # ).generate()
    # print(attrs_class)


if __name__ == '__main__':
    main()

and the --config yaml:

AnotherNestedConfig:
    something: 1

Leads to this...

AnotherNestedConfig: !!python/object:spock.backend.config.AnotherNestedConfig
  something: 1
BuildFeatureConfig: !!python/object:spock.backend.config.BuildFeatureConfig
  duplicated_config: !!python/object:spock.backend.config.DuplicatedConfig
    flag_3: false
DuplicatedConfig: !!python/object:spock.backend.config.DuplicatedConfig
  flag_3: false
MakeDatasetConfig: !!python/object:spock.backend.config.MakeDatasetConfig
  duplicated_config: !!python/object:spock.backend.config.DuplicatedConfig
    flag_3: false
  flag1: false
  flag_2: false
  nested_config: !!python/object:spock.backend.config.NestedConfig
    integer: 1
    my_enum: value_1
    other_nest: !!python/object:spock.backend.config.AnotherNestedConfig
      something: 1
NestedConfig: !!python/object:spock.backend.config.NestedConfig
  integer: 1
  my_enum: value_1
  other_nest: !!python/object:spock.backend.config.AnotherNestedConfig
    something: 1

Am I missing something?

@gbmarc1
Copy link
Contributor Author

gbmarc1 commented Nov 17, 2021

# -*- coding: utf-8 -*-
from spock.builder import ConfigArgBuilder
from spock.config import spock
from enum import Enum


class MyEnum(Enum):
    value_1 = "value_1"
    value_2 = "value_2"


@spock
class AnotherNestedConfig:
    something: int = 5


@spock
class NestedConfig:
    integer: int = 1
    my_enum: MyEnum = "value_1"
    other_nest: AnotherNestedConfig = AnotherNestedConfig()


@spock
class DuplicatedConfig:
    flag_3: bool = False


@spock
class MakeDatasetConfig:
    flag1: bool = False
    flag_2: bool = False
    nested_config: NestedConfig = NestedConfig()
    duplicated_config: DuplicatedConfig = DuplicatedConfig()


@spock
class BuildFeatureConfig:
    duplicated_config: DuplicatedConfig = DuplicatedConfig()


def main():
    attrs_class = ConfigArgBuilder(
        BuildFeatureConfig, MakeDatasetConfig, DuplicatedConfig, NestedConfig, AnotherNestedConfig,
        desc='',
    ).generate()
    print(attrs_class)

import sys
from ruamel.yaml import YAML
if __name__ == '__main__':
    sys.argv = [".py", "--config", "config.yaml"]
    yaml = YAML(typ="safe")
    with open("config.yaml", "w") as f:
        yaml.dump(
            {"NestedConfig": {"integer": 2}, "AnotherNestedConfig": {"something": 1}},
            f,
        )
    main()

I get this

AnotherNestedConfig: !!python/object:spock.backend.config.AnotherNestedConfig
  something: 1
BuildFeatureConfig: !!python/object:spock.backend.config.BuildFeatureConfig
  duplicated_config: !!python/object:spock.backend.config.DuplicatedConfig
    flag_3: false
DuplicatedConfig: !!python/object:spock.backend.config.DuplicatedConfig
  flag_3: false
MakeDatasetConfig: !!python/object:spock.backend.config.MakeDatasetConfig
  duplicated_config: !!python/object:spock.backend.config.DuplicatedConfig
    flag_3: false
  flag1: false
  flag_2: false
  nested_config: !!python/object:spock.backend.config.NestedConfig
    integer: 2
    my_enum: value_1
    other_nest: !!python/object:spock.backend.config.AnotherNestedConfig
      something: 5
NestedConfig: !!python/object:spock.backend.config.NestedConfig
  integer: 2
  my_enum: value_1
  other_nest: !!python/object:spock.backend.config.AnotherNestedConfig
    something: 1


Process finished with exit code 0

I get a something = 5

@ncilfone
Copy link
Contributor

ahhhh ok got it now... seems like there is still a bug in the recursivity that is overwriting some of the config based values. Investigating now

@ncilfone
Copy link
Contributor

@gbmarc1 I'm going to be out for a few weeks so resolution might not come swift on this :-/

Can you just avoid some of the nesting for now? you can always just pass around multiple Spockspaces albeit a little more verbose... I'll fix it when I'm back in early Dec.

If your interested, I think the solution here is to be more pedantic when dealing with nesting by building a directed graph of parent -> child relations such that the overall config can be built recursively by a DFS pattern... This in theory is what it's doing with the recursive calls it's just that it's not directly factored into a graph first. I think this should solve all these problems since it will be much more traceable

@gbmarc1
Copy link
Contributor Author

gbmarc1 commented Nov 17, 2021

@ncilfone I can try to resolve the problem if you point me in the right direction. Where do you think the problem lies in the code?
I will make a pull request that you will be able to agree at your return.

@ncilfone
Copy link
Contributor

@gbmarc1 I might have a really ugly temp fix that would be a bandaid in the meantime... I'll leave it on a branch for a PR but not rope it into master in case it causes chaos elsewhere....

hang tight...

@ncilfone
Copy link
Contributor

@gbmarc1 Also happy to have you contribute as well! This is where all the 'late defaults' get set... aka classes that come in from config files that might need recursing:

self._handle_late_defaults(args, fields, input_class)

The _handle_late_defaults function is where there is just a bunch of spaghetti boolean logic to catch all these edge cases.

My thought is that here:

def generate(self, dict_args):

Before starting to iterate through all the input classes (the spock objects that are defined but not instantiated yet with the default values or values from the config/cmd-line) build a directed graph of all classes. Then you should be just be able to DFS through that and set the correct values from the dict_args value in the child classes and then use those as the values of the dictionary (**kwargs that get passed to the actual instantiation of the object) for the parents working up the graph...

But this is just a half-baked thought to start with... I think that this or something else graph/chil-parent based will be much less spaghetti boolean recursive logic

ncilfone added a commit that referenced this issue Nov 17, 2021
…ctor should be thought about to handle all the parent/child relations in a cleaner manner
@ncilfone
Copy link
Contributor

Try this branch that has the ugly patch:

https://github.com/fidelity/spock/tree/nested_fix

ncilfone added a commit that referenced this issue Dec 13, 2021
* fixes issues wrt more than 2 levels of class nesting references. was ssuming to fall back on defaults instead of recursing the config space to set the correct parameters.

* linted

* updated docs for cmd line overrides on nested classes

* fixed edge case from #152 where the default attrs object wasn't getting recursed to set config arg values within nested classes.

* adding test conf file

* wonky patch to deal with the monster under the bed that is #152. refactor should be thought about to handle all the parent/child relations in a cleaner manner

* WIP: Changing all the nesting deps to be handle via a DAG. Breaks most tests.

* fixes optional class refs.

* fix signature to deal with tuner -- should resolve all tests in /tune. Probably a bunch of vestigial code still to remove -- base tests still only 43/48

* fixing some tests

* some interesting work but breaking the tests

* test_command_line pass

* some cleaning and refactoring

* fix when optional nested value is None

* clearner refactor to handle type optionals

* "config" cannot be a general argument

* fix some more tests and windows path compatible

* s3 tests pass on windows

* all tests now are passing

* dont need the graph in builde space

* renamed everything to builder_space
and doc string

* some cleaning

* removing networkx dep

* linted

* removed dataclasses dep by swapping to a namedtuple

* fix-up of some tests. some were missing correct spock classes as *args. Some now raise a different exception with the new refactor

* moved help functionality to separate file for readability of the builder class

* documentation. almost finished

* finished doc strings. linted

* files for extra tests

* fixing issues with python3.6 which doesn't have the typing alias 'list' yet only 'typing.List'

* linted

Co-authored-by: mbelanger_explorance <[email protected]>
ncilfone added a commit that referenced this issue Dec 13, 2021
* Nested values edge case fix (#160)

* fixes issues wrt more than 2 levels of class nesting references. was ssuming to fall back on defaults instead of recursing the config space to set the correct parameters.

* linted

* updated docs for cmd line overrides on nested classes

* fixed edge case from #152 where the default attrs object wasn't getting recursed to set config arg values within nested classes.

* adding test conf file

* fixing some tests

* some interesting work but breaking the tests

* test_command_line pass

* some cleaning and refactoring

* fix when optional nested value is None

* clearner refactor to handle type optionals

* "config" cannot be a general argument

* fix some more tests and windows path compatible

* s3 tests pass on windows

* all tests now are passing

* dont need the graph in builde space

* renamed everything to builder_space
and doc string

* some cleaning

* removing networkx dep

* linted

* removed dataclasses dep by swapping to a namedtuple

* fix-up of some tests. some were missing correct spock classes as *args. Some now raise a different exception with the new refactor

* moved help functionality to separate file for readability of the builder class

* documentation. almost finished

* finished doc strings. linted

* files for extra tests

* fixing issues with python3.6 which doesn't have the typing alias 'list' yet only 'typing.List'

* linted

Co-authored-by: mbelanger_explorance <[email protected]>
@ncilfone
Copy link
Contributor

Actually closed in #181 to give attribution.

ncilfone added a commit that referenced this issue Dec 13, 2021
* fixes issues wrt more than 2 levels of class nesting references. was ssuming to fall back on defaults instead of recursing the config space to set the correct parameters.

* linted

* updated docs for cmd line overrides on nested classes

* fixed edge case from #152 where the default attrs object wasn't getting recursed to set config arg values within nested classes.

* adding test conf file

* wonky patch to deal with the monster under the bed that is #152. refactor should be thought about to handle all the parent/child relations in a cleaner manner

* WIP: Changing all the nesting deps to be handle via a DAG. Breaks most tests.

* fixes optional class refs.

* fix signature to deal with tuner -- should resolve all tests in /tune. Probably a bunch of vestigial code still to remove -- base tests still only 43/48

* fixing some tests (#163)

* fixing some tests

* some interesting work but breaking the tests

* test_command_line pass

* some cleaning and refactoring

* fix when optional nested value is None

* clearner refactor to handle type optionals

* "config" cannot be a general argument

* fix some more tests and windows path compatible

* s3 tests pass on windows

* all tests now are passing

* dont need the graph in builde space

* renamed everything to builder_space
and doc string

* some cleaning

Co-authored-by: mbelanger_explorance <[email protected]>

* Re-Improve handling of nested dependencies (#180)

* Nested values edge case fix (#160)

* fixes issues wrt more than 2 levels of class nesting references. was ssuming to fall back on defaults instead of recursing the config space to set the correct parameters.

* linted

* updated docs for cmd line overrides on nested classes

* fixed edge case from #152 where the default attrs object wasn't getting recursed to set config arg values within nested classes.

* adding test conf file

* fixing some tests

* some interesting work but breaking the tests

* test_command_line pass

* some cleaning and refactoring

* fix when optional nested value is None

* clearner refactor to handle type optionals

* "config" cannot be a general argument

* fix some more tests and windows path compatible

* s3 tests pass on windows

* all tests now are passing

* dont need the graph in builde space

* renamed everything to builder_space
and doc string

* some cleaning

* removing networkx dep

* linted

* removed dataclasses dep by swapping to a namedtuple

* fix-up of some tests. some were missing correct spock classes as *args. Some now raise a different exception with the new refactor

* moved help functionality to separate file for readability of the builder class

* documentation. almost finished

* finished doc strings. linted

* files for extra tests

* fixing issues with python3.6 which doesn't have the typing alias 'list' yet only 'typing.List'

* linted

Co-authored-by: mbelanger_explorance <[email protected]>

* updated readme

Co-authored-by: gbmarc1 <[email protected]>
Co-authored-by: mbelanger_explorance <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
2 participants