[Feature Request] Override loaded config file through a command line flag #386

Closed
tamuhey opened this issue Jan 20, 2020 · 22 comments · Fixed by #604
Labels
enhancement Enhancement request

@tamuhey

tamuhey commented Jan 20, 2020

🚀 Feature Request

I want to pass a YAML file, instead of parameters like foo.bar=1 ..., to override the default configuration:

$ python app.py --yaml foo.yaml
@tamuhey tamuhey added the enhancement Enhancement request label Jan 20, 2020
@omry
Collaborator

omry commented Jan 20, 2020

Can this file be inside the search path (in the same directory as the config file you specify in @hydra.main() )?

If so, try:

conf/
  config.yaml
  experiment/
      exp1.yaml

and from the command line:

$ python foo.py experiment=exp1

This will override your config file with exp1.

@tamuhey
Author

tamuhey commented Jan 20, 2020

Can this file be inside the search path (in the same directory as the config file you specify in @hydra.main() )?

No.
I'm creating a CLI with Hydra, and this use case is for the end user, so the config file cannot be placed inside the conf directory.
The set of possible modifications is too large to define in advance.

@tamuhey
Author

tamuhey commented Jan 20, 2020

The reason I use Hydra is that my CLI's configuration is somewhat complex, and Hydra suits it well.
The app is closed source for now but will be opened before long. (I'll let you know once it is.)

@omry omry added this to the 1.1.0 milestone Jan 31, 2020
@EtienneDavid

Yes, it would definitely be helpful to be able to change the config_name from the CLI. Hydra sounds super helpful for simplifying the architecture of some of my modules, but the user should be able to redefine the config from scratch...

@omry
Collaborator

omry commented Feb 11, 2020

This is more involved than it sounds.
config_path currently describes both the config search path, relative to the Python file containing hydra.main, and optionally a config file to load from that search path.
There are a few related open issues about allowing some control over the search path from the command line, as well as this one.
When a config file is loaded through a command line override, what should that do (if anything) to the search path?

This is also a user-facing interface, which means it does not tolerate many changes. I want to get this right the first time.

There are other higher priority issues I am working on. I will get to it.

@omry
Collaborator

omry commented Feb 21, 2020

This is a clean workaround that addresses most of the use cases:

In conf, create a directory called experiment:

conf/
  experiment/
    exp1.yaml
    exp2.yaml

Each one can specialize the generated config, for example exp1.yaml can be:

exp1.yaml:

learning_rate: 0.1
batch_size: 32

Now from the command line:
$ python foo.py experiment=exp1

There are some use cases where an external config file can be useful, but this should cover a large number of the use cases.

@omry omry changed the title from "[Feature Request] Pass yaml file in cli" to "[Feature Request] Override loaded config file through a command line flag" Feb 21, 2020
@jbohnslav

jbohnslav commented Apr 15, 2020

Hi there,

I just wanted to +1 for this feature request. My use case is as follows. I have a complex configuration file with lots of details the user will not need to interact with, e.g. optimizers. There are elements, particularly regarding the data and augmentations, that all users will have to customize.

Default conf/config.yaml:

train:
  lr: 0.0001
  optimizer: adam
dataset:
  name: imagenet
  classes:
    - tench
    - goldfish
    - great white shark
    - ... etc
augs:
  resize: 224
  flip_lr: true
  flip_ud: false

user_config.yaml:

dataset:
  name: not hotdog
  classes:
    - hotdog
    - not hotdog
augs:
  resize: 64
  flip_lr: true
  flip_ud: true

What I want is to be able to use the following syntax and have Hydra load the default configuration file first, and optionally override with the user's supplied config file:
python train.py --config user_config.yaml

Ideally, the user could also supply command line arguments which would take precedence over all. I'm not sure if this is too unique to my use case. I want the priority to be command line args > user defaults from user_config.yaml > conf/config.yaml.
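
The precedence described here (command line arguments > user_config.yaml > conf/config.yaml) can be sketched with plain dictionaries. This is only an illustration under stated assumptions: `deep_merge` and `dotlist_to_dict` are hypothetical helpers rather than Hydra or OmegaConf APIs, lists are replaced wholesale instead of being concatenated, and command line values are kept as strings.

```python
from copy import deepcopy
from typing import Any, Dict, List


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which `override` wins over `base`, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced, not combined.
            merged[key] = deepcopy(value)
    return merged


def dotlist_to_dict(overrides: List[str]) -> Dict[str, Any]:
    """Turn ["augs.resize=128"] into {"augs": {"resize": "128"}} (values stay strings)."""
    result: Dict[str, Any] = {}
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


# Stand-ins for conf/config.yaml and user_config.yaml after loading.
defaults = {
    "train": {"lr": 0.0001, "optimizer": "adam"},
    "dataset": {"name": "imagenet", "classes": ["tench", "goldfish"]},
    "augs": {"resize": 224, "flip_lr": True, "flip_ud": False},
}
user_config = {
    "dataset": {"name": "not hotdog", "classes": ["hotdog", "not hotdog"]},
    "augs": {"resize": 64, "flip_lr": True, "flip_ud": True},
}
cli_overrides = ["augs.resize=128"]  # hypothetical command line override

# Lowest priority first: defaults, then the user file, then the command line.
config = deep_merge(deep_merge(defaults, user_config), dotlist_to_dict(cli_overrides))
print(config["train"]["optimizer"], config["dataset"]["name"], config["augs"]["resize"])
```

Keys the user never mentions (here the whole `train` section) fall through from the defaults, which is the behaviour the request asks for.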

Is there a way to do this currently with hydra?

@omry
Collaborator

omry commented Apr 15, 2020

Please reach out on the chat.

@SunQpark

I'm currently using a workaround for a similar case.

Default conf/config.yaml:

user_config: # path to config file to update.
train:
  lr: 0.0001
  optimizer: adam
dataset:
  name: imagenet
  classes: 
    - tench
    ...

Then, in train.py, you can override the Hydra config object with OmegaConf.merge when user_config is given.

if config.user_config is not None:
    user_config = OmegaConf.load(config.user_config)
    config = OmegaConf.merge(config, user_config)

Now, you can use python train.py user_config=user_config.yaml

@gunthergl

This comment follows the direction of @jbohnslav's. I think @tamuhey's question may be related to it but does not have to be, because here I assume the user config can live inside the search path.

├── conf
│   ├── config_default.yaml
│   ├── cfgs
│   │   ├── base.yaml
│   │   └── userconfig.yaml
└── my_app.py

config_default.yaml:

defaults:
  - cfgs: base
  - cfgs: userconfig

base.yaml:

db:
  first_param: 1

userconfig.yaml:

db:
  userdefined_param: 2

my_app.py

import hydra
from omegaconf import DictConfig


@hydra.main(config_path="conf/config_default.yaml")
def my_app(cfg: DictConfig) -> None:
    print(cfg.pretty())

if __name__ == "__main__":
    my_app()

And the output is:

db:
  first_param: 1
  userdefined_param: 2

For multirun, I added userconfig2.yaml inside cfgs:

db:
  anotherParam: 2

Then start via:
python my_app.py cfgs=userconfig,userconfig2 -m

[2020-05-16 12:04:38,165][HYDRA] Sweep output dir : multirun/2020-05-16/12-04-38
[2020-05-16 12:04:38,166][HYDRA] Launching 2 jobs locally
[2020-05-16 12:04:38,166][HYDRA] #0 : cfgs=userconfig
db:
  first_param: 1
  userdefined_param: 2

[2020-05-16 12:04:38,246][HYDRA] #1 : cfgs=userconfig2
db:
  anotherParam: 2
  first_param: 1

I do not know exactly what happens here, but as long as it works...
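
As a rough model of what the -m run above does: each comma-separated override value becomes its own job, and several swept keys combine as a Cartesian product. The sketch below is a simplified illustration, not Hydra's sweeper code, and `expand_sweep` is a hypothetical helper name.

```python
from itertools import product
from typing import List


def expand_sweep(overrides: List[str]) -> List[List[str]]:
    """Expand comma-separated override values into one override list per job."""
    choices = []
    for item in overrides:
        key, values = item.split("=", 1)
        # "cfgs=a,b" contributes the alternatives ["cfgs=a", "cfgs=b"].
        choices.append([f"{key}={value}" for value in values.split(",")])
    # One job per combination of alternatives.
    return [list(combo) for combo in product(*choices)]


jobs = expand_sweep(["cfgs=userconfig,userconfig2"])
for index, job in enumerate(jobs):
    print(f"#{index} : {' '.join(job)}")
```

Each job then composes its config on its own, which is why the two outputs in the log differ only in the parameter contributed by the selected cfgs file.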

@gunthergl

I just found that it is sufficient to have the following main config file config_default.yaml:

defaults:
  - cfgs: base
  - cfgs: base

Then the default is base, and you can override it on the command line.

Still, I do not know if there are unwanted side effects.

@omry
Collaborator

omry commented May 16, 2020

@gunthergl, I mentioned this approach in multiple comments above.
This is also documented in the tutorial to some extent here.

@moinfar

moinfar commented May 20, 2020

Feature Request

I want to pass yaml file instead of parameters like foo.bar=1 ... to override default configuration:

$ python app.py --yaml foo.yaml

Hi,
I believe the requested feature is a must-have.

By the way, I suggest this workaround:

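import sys

import hydra
from omegaconf import DictConfig


def main(cfg: DictConfig) -> None:
    print(cfg.pretty())

if __name__ == "__main__":
    config_path = "./conf/default.yaml"

    # Consume a leading config=<path> argument before Hydra parses sys.argv.
    if len(sys.argv) > 1 and sys.argv[1].startswith("config="):
        # Split on the first "=" only, so paths containing "=" stay intact.
        config_path = sys.argv[1].split("=", 1)[1]
        sys.argv.pop(1)

    main_wrapper = hydra.main(config_path, strict=True)
    main_wrapper(main)()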

This way you can specify your config file by adding config=somefile.yaml at the beginning of your command.
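
The workaround only recognises config= as the first argument. A slightly more forgiving variant, sketched below with the standard library only, accepts it anywhere on the command line and returns the remaining arguments for Hydra to parse. `pop_config_arg` is a hypothetical helper name, and the "last occurrence wins" rule is an assumption made for this sketch.

```python
from typing import List, Tuple


def pop_config_arg(
    argv: List[str], default: str, key: str = "config"
) -> Tuple[str, List[str]]:
    """Return (config_path, remaining_argv) without mutating `argv`."""
    prefix = f"{key}="
    config_path = default
    remaining = [argv[0]] if argv else []
    for arg in argv[1:]:
        if arg.startswith(prefix):
            # Slice off the prefix so paths that contain "=" survive intact.
            config_path = arg[len(prefix):]
        else:
            remaining.append(arg)
    return config_path, remaining


path, rest = pop_config_arg(
    ["app.py", "db.port=3307", "config=/tmp/run=1/user.yaml"],
    default="./conf/default.yaml",
)
print(path)
print(rest)
```

In a real script the result would replace sys.argv (for example `sys.argv[:] = rest`) before the hydra.main wrapper is invoked.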

@omry
Collaborator

omry commented May 20, 2020

This requested feature is also a will-have.
Patience.

@omry omry changed the milestone from 1.1.0 to 1.0.0 May 24, 2020
@omry
Collaborator

omry commented May 24, 2020

This is coming, and by "this", I mean what I am ready to support at this stage:
Hydra 1.0 splits config_path into config_path and config_name; you can learn about it here.

This change opens the door to overriding the config name and the config path individually through simple command line flags.

Limitations

  1. config_name is relative to the search path. It will not work for files in arbitrary locations on the file system.
  2. config_path overrides the one in the Python file. If you override it, you will lose access to configs in the config_path specified in the Python file.

I am not planning on doing more than this for 1.0.
If there are concrete use cases where this is insufficient, please open a new issue.

@omry
Collaborator

omry commented Aug 13, 2020

#874 is coming in Hydra 1.0.0rc3 and will probably address most uncovered use cases.

@r9y9

r9y9 commented Aug 14, 2020

Wow, --config-dir option is exactly what I wanted. It worked perfectly in my use case, where I wanted to allow hydra app users to have custom configurations outside the package path. Looking forward to v1.0.0 release 💯

@omry
Collaborator

omry commented Aug 14, 2020

Hydra 1.0.0 already has rc2 released which you can try today (pip install hydra-core --pre --upgrade).
You will be able to try --config-dir now by installing from master.

@npuichigo

npuichigo commented Nov 5, 2020

@omry I think a use case is to reload the configs in .hydra if overriding happened during previous runs.
For example

config
├── dataset
│   ├── cifar10.yaml
│   └── imagenet.yaml
├── eval.yaml
├── model
│   ├── alexnet.yaml
│   └── vanilla.yaml
├── train.yaml
└── trainer
    └── default.yaml

python train.py model.num_units=256

Then I get the overridden configs in .hydra:

.hydra
├── config.yaml
├── hydra.yaml
└── overrides.yaml

# overrides.yaml
- model.num_units=256

Now, I can reproduce my training with the following command line:

python train.py --config-path=output/xxxx-xx-xx/.hydra --config-name=config

That's great.

However, how can I explicitly combine the overrides.yaml with the primary configs in config directory? For example, here I want to reload my model for inference. What I want is:

# eval.py use config-path=config and config-name=eval, which is eval.yaml
python eval.py --override-yaml=output/xxxx-xx-xx/.hydra/overrides.yaml

# it's equivalent to
python eval.py --config-path=config --config-name=eval model.num_units=256

Maybe splitting the config into train.yaml and eval.yaml is ad hoc. Here's another use case.

If I want users to save their own override configs in their workspace, those overrides need to be combined with the primary config.

egs
└── neural_network
    ├── config
    │   ├── large_network.yaml
    │   ├── medium_network.yaml
    │   └── small_network.yaml
    └── run.sh

# large_network.yaml
- model.num_classes=1024

# medium_network.yaml
- model.num_classes=512

# small_network.yaml
- model.num_classes=256

Here, the user may only want to override the model's num_classes:

python train.py --override-yaml=egs/neural_network/config/large_network.yaml

# it's equivalent to
python train.py --config-path=config --config-name=train model.num_classes=1024

Of course, the workaround is to use command-line arguments directly, but I'm looking for a more elegant way.

egs
└── neural_network
    ├── run_large.sh
    ├── run_medium.sh
    └── run_small.sh
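
One way to approximate the proposed --override-yaml behaviour without per-size shell scripts is a small wrapper that reads the saved list and passes each entry back as an ordinary command line override. The sketch below is a hedged illustration: --override-yaml is the flag proposed in this comment, not an existing Hydra flag, `parse_overrides` and `build_command` are hypothetical helpers, and the parser only handles the flat, unquoted `- key=value` lists shown above (a real YAML parser would be needed for anything richer).

```python
import shlex
import sys
from typing import List


def parse_overrides(text: str) -> List[str]:
    """Parse a flat YAML list of `- key=value` items, skipping blanks and comments."""
    overrides = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if not line.startswith("- "):
            raise ValueError(f"expected a '- key=value' list item, got: {raw_line!r}")
        overrides.append(line[2:].strip())
    return overrides


def build_command(script: str, overrides: List[str]) -> List[str]:
    """Build the argv that passes the saved overrides back on the command line."""
    return [sys.executable, script, *overrides]


# Inline stand-in for the contents of .hydra/overrides.yaml.
saved = """\
# overrides.yaml
- model.num_units=256
"""
command = build_command("eval.py", parse_overrides(saved))
print(shlex.join(command[1:]))
```

The resulting list can be handed to `subprocess.run(command, check=True)`; it is built as a list rather than a string so that no shell quoting is involved.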

@npuichigo

@omry

@omry
Collaborator

omry commented Nov 24, 2020

@npuichigo, please open a separate feature request.

@dmarx

dmarx commented Feb 17, 2022

Another workaround option for anyone who needs it: I'm invoking a plugin that adds the user's current working directory to hydra's search path. Depending on how you integrate this into your application, it probably won't override defaults. At least in my use case, it's still preferable to requiring the users to provide the --config-dir flag (for now).

import logging
import os

from hydra.core.config_search_path import ConfigSearchPath
from hydra.plugins.search_path_plugin import SearchPathPlugin

logger = logging.getLogger(__name__)

# See also:
# https://hydra.cc/docs/advanced/search_path/#
# https://github.com/facebookresearch/hydra/issues/763

class PyttiLocalConfigSearchPathPlugin(SearchPathPlugin):
    def manipulate_search_path(self, search_path: ConfigSearchPath) -> None:
        # Append ./config/ under the current working directory to the search path.
        local_path = f"{os.getcwd()}/config/"
        logger.debug(local_path)
        search_path.append(
            provider="myframework", path=f"file://{local_path}"
        )
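
The caveat that this "probably won't override defaults" is what one would expect if the search path is consulted in order and the first match wins, so an appended directory can add new configs but cannot shadow ones found earlier. The following standard-library sketch models that lookup rule only; it is an assumption-laden illustration rather than Hydra's implementation, and `find_config` and the directory names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def find_config(search_path: List[Path], name: str) -> Optional[Path]:
    """Return the first `<name>.yaml` found while walking the search path in order."""
    for directory in search_path:
        candidate = directory / f"{name}.yaml"
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    packaged = Path(tmp) / "pkg_conf"      # configs shipped with the application
    local = Path(tmp) / "cwd" / "config"   # the user's ./config/ directory
    for directory in (packaged, local):
        directory.mkdir(parents=True)
    (packaged / "config.yaml").write_text("lr: 0.1\n")
    (local / "config.yaml").write_text("lr: 0.5\n")
    (local / "extra.yaml").write_text("debug: true\n")

    search_path = [packaged]
    search_path.append(local)  # appended, so it is consulted last

    shadowed = find_config(search_path, "config")
    added = find_config(search_path, "extra")
    print(shadowed.parent.name, added.parent.name)
```

Under this model, a user file that should take priority over a packaged config of the same name would have to be placed earlier in the search path rather than appended.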
