
Fix: Removal of transformers logger and addition of python native logger#270

Merged
Abhishek-TAMU merged 41 commits into foundation-model-stack:main from Abhishek-TAMU:fix_logging
Aug 9, 2024

Conversation

Collaborator

@Abhishek-TAMU Abhishek-TAMU commented Jul 30, 2024

Description of the change

The aim was to move away from using the logger modules from transformers and use the Python native logger instead. Now, the log level is set either by the CLI argument --log_level or by the environment variable TRANSFORMERS_VERBOSITY or LOG_LEVEL. The chosen level applies to both the Python native logger and the transformers logger.
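As a rough sketch of that precedence (the helper name resolve_log_level is illustrative, not the PR's actual function), the logic looks like:

```python
import logging
import os

def resolve_log_level(cli_log_level=None):
    """Illustrative precedence: the --log_level CLI flag first,
    then the LOG_LEVEL env variable, then the default of WARNING."""
    if cli_log_level and cli_log_level != "passive":
        level = cli_log_level
    else:
        level = os.environ.get("LOG_LEVEL", "warning")
    return level.upper()

# Apply the resolved level to the Python native root logger.
logging.basicConfig(level=resolve_log_level())
```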

Related issue number

Closes #243
Closes #195
Closes #194
Closes #215

#994

How to verify the PR

  • Set the environment variable LOG_LEVEL or TRANSFORMERS_VERBOSITY to the desired log level, then run a tuning job for a small model like Maykeye/TinyLLama-v0 and check that the logs match the level passed in the env variable.

Command:

python tuning/sft_trainer.py \
--model_name_or_path Maykeye/TinyLLama-v0  \
--training_data_path tests/data/twitter_complaints_small.json  \
--output_dir outputs/full-tuning  \
--num_train_epochs 5  \
--per_device_train_batch_size 2  \
--gradient_accumulation_steps 1  \
--learning_rate 1e-5  \
--response_template "\n### Label:"  \
--dataset_text_field "output" \
--use_flash_attn false \
--torch_dtype "float32"
  • Pass log_level via the CLI argument, then run a tuning job for a small model like Maykeye/TinyLLama-v0 and check that the logs match the passed level.

Command:

python tuning/sft_trainer.py \
--model_name_or_path Maykeye/TinyLLama-v0  \
--training_data_path tests/data/twitter_complaints_small.json  \
--output_dir outputs/full-tuning  \
--num_train_epochs 5  \
--per_device_train_batch_size 2  \
--gradient_accumulation_steps 1  \
--learning_rate 1e-5  \
--response_template "\n### Label:"  \
--dataset_text_field "output" \
--use_flash_attn false \
--torch_dtype "float32" \
--log_level "info"

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass

Abhishek-TAMU and others added 3 commits July 30, 2024 11:21
Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
…thon logger

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
…del-stack#258)

* initial code for deleting lm_head

Signed-off-by: Anh-Uong <anh.uong@ibm.com>

* fix logic for copying checkpoint

Signed-off-by: Anh-Uong <anh.uong@ibm.com>

* fix check that embed_tokens and lm_head weights are the same

Signed-off-by: Anh-Uong <anh.uong@ibm.com>

* fix warning assertion

Signed-off-by: Anh-Uong <anh.uong@ibm.com>

* fix lm_head check, remove test

Signed-off-by: Anh-Uong <anh.uong@ibm.com>

* small fixes from code review

Signed-off-by: Anh-Uong <anh.uong@ibm.com>

* fmt

Signed-off-by: Anh-Uong <anh.uong@ibm.com>

---------

Signed-off-by: Anh-Uong <anh.uong@ibm.com>
Co-authored-by: Anh-Uong <anh.uong@ibm.com>
Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
Collaborator

@anhuong anhuong left a comment

Thanks Abhishek for the logging update, I think this will help a lot. A few questions on how the logging works and the desired logic

)

# Configure log_level of python native logger.
LOGLEVEL = None
Collaborator

nit: I had used an uppercase variable for this since it was pulling from the environment vars. But uppercase variables are usually used for constants, and since this isn't one, we can change it to a lowercase log_level

Collaborator Author

Thank you for the PR review. This is done!

Comment on lines 82 to 87
if "log_level" in job_config and job_config["log_level"]:
LOGLEVEL = job_config["log_level"].upper()
logging.basicConfig(level=LOGLEVEL)
else:
LOGLEVEL = os.environ.get("LOG_LEVEL", "WARNING").upper()
logging.basicConfig(level=LOGLEVEL)
Collaborator

small refactor you can do:

      log_level = job_config.get("log_level")    # this will be set to either the value found or None
      if not log_level:   # if log level not set by job_config aka by JSON, set it via env var or set default
            log_level = os.environ.get("LOG_LEVEL", "WARNING")
      log_level = log_level.upper()
      logging.basicConfig(level=log_level)
      os.environ["LOG_LEVEL"] = log_level  # nice to move this up so all the parts related to log level are together

Collaborator Author

@Abhishek-TAMU Abhishek-TAMU Aug 2, 2024

This is done! Thank you for the input!

default="passive",
metadata={
"help": "The log level to adopt during training. \
Possible values are 'debug', 'info', 'warning', 'error' and 'critical'"
Collaborator

Add note on what "passive" value means

plus a ‘passive’ level which doesn’t set anything and keeps the current log level for the Transformers library (which will be "warning" by default)

Collaborator Author

Added note in metadata!

# Assign value of either env var LOG_LEVEL or warning
LOGLEVEL = os.environ.get("LOG_LEVEL", "WARNING").upper()
logging.basicConfig(level=LOGLEVEL)
train_logger = logging.getLogger()
Collaborator

Since you call this on line 497, you don't need to call it within this block as well

Collaborator Author

This is done!

logger.debug(

# Function to set log level for python native logger and transformers training logger
training_args, _ = set_log_level(training_args)
Collaborator

Why are you not using the logging.getLogger() that you returned? Since you are using the root logger, running

logging.warning("Warning")
# above is the same as below
root_logger = logging.getLogger()
root_logger.warning("Warning")

The two are equivalent, so if you are not using the returned logger, you shouldn't return it.

But it looks like it is best practice to use getLogger, per the python docs and the python logging howto.
So then instead of running logging.warning() you'd run logger.warning(), in which case you would also call logging.getLogger() in train(). It looks like it's best practice to create module-level loggers, so you could do logging.getLogger("sft_trainer_main") and logging.getLogger("sft_trainer_train"), and then the log messages would show like:

# named logger if you do getLogger("sft_trainer_train")
WARNING:sft_trainer_train:My warning

# root logger if you do getLogger()
WARNING:root:My warning

Collaborator Author

@Abhishek-TAMU Abhishek-TAMU Aug 2, 2024

I returned the logger on line 497 just so the unit test functions can get a logger; returning it within the same file wouldn't make sense, since we can set the log level in the same py file without returning the logger.

But ya that's a good idea to use logging.getLogger. I have made the changes.

self.logger = logging.get_logger("file_logging_tracker")
# Configure log level
LOGLEVEL = os.environ.get("LOG_LEVEL", "WARNING").upper()
logging.basicConfig(level=LOGLEVEL)
Collaborator

logging.basicConfig() sets the log level for the root logger. But you can set individual loggers level by running self.logger.setLevel(level)
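A quick standalone illustration of that difference (the logger name mirrors the tracker's, but the snippet is generic):

```python
import logging

logging.basicConfig(level=logging.WARNING)  # sets the root logger's level

tracker_logger = logging.getLogger("file_logging_tracker")
tracker_logger.setLevel(logging.DEBUG)      # overrides the level for this logger only

print(tracker_logger.getEffectiveLevel())       # 10 (DEBUG)
print(logging.getLogger().getEffectiveLevel())  # 30 (WARNING), root unchanged
```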

Collaborator

There are a few of these to fix, not going to comment in each file

Collaborator

I don't think you even have to set log level for every file, you only have to set it once in the sft_trainer.py after parsing user's input right?

In each file, you just need to get a logger by doing:

import logging
logger = logging.getLogger(__name__)

and every time we log something, we just use that logger:

logger.info("Log this info")

Collaborator Author

@Abhishek-TAMU Abhishek-TAMU Aug 2, 2024

@aluu317 Yes, I think it makes sense. But I am not sure whether any of the files other than sft_trainer.py are used standalone, where they would need to set their own log level. Maybe @anhuong can comment better.

Collaborator

Having each file set its own logging level helps ensure the level is set if a file is called outside of sft_trainer.py. But using logging.getLogger(__name__) is best practice, and it is unlikely these files will be called outside of sft_trainer.py, so I think going with Angel's suggestion works.


# Check if env var TRANSFORMERS_VERBOSITY is not set.
# Else if env var is already set then, log level of transformers is automatically set.
if os.environ.get("TRANSFORMERS_VERBOSITY") is None:
Collaborator

nit: can do

Suggested change
if os.environ.get("TRANSFORMERS_VERBOSITY") is None:
if not os.environ.get("TRANSFORMERS_VERBOSITY"):

Collaborator

Although do we even need the code block below? What use case is it covering? Do we have to check TRANSFORMERS_VERBOSITY at all?

is this trying to do

if os.getenv("TRANSFORMERS_VERBOSITY"):
   parsed_training_args.log_level = os.getenv("TRANSFORMERS_VERBOSITY").lower()

Collaborator Author

I don't think we are required to check "TRANSFORMERS_VERBOSITY". Hence I haven't added this code in the final commit.

Comment on lines 478 to 484
if parsed_training_args.log_level != "passive":
logging.basicConfig(level=parsed_training_args.log_level.upper())
else:
# Assign value of either env var LOG_LEVEL or warning
LOGLEVEL = os.environ.get("LOG_LEVEL", "WARNING").upper()
logging.basicConfig(level=LOGLEVEL)
train_logger = logging.getLogger()
Collaborator

Could this work to attain the logic we're looking for?
Goal for logic:

  • TRANSFORMERS_VERBOSITY ---> should only set transformers logging, this is different than setting --log_level
  • --log_level --> sets both python and transformers logging
  • For consistency with the flag, LOG_LEVEL also sets python and transformers logging
    if parsed_training_args.log_level != "passive":
        logging.basicConfig(level=parsed_training_args.log_level.upper())
    elif os.environ.get("LOG_LEVEL"):
        # Assign value of either env var LOG_LEVEL or warning
        LOGLEVEL = os.environ.get("LOG_LEVEL")
        logging.basicConfig(level=LOGLEVEL)
       # Set log_level in TrainingArguments
       parsed_training_args.log_level = LOGLEVEL.lower()

  train_logger = logging.getLogger()

Collaborator Author

@Abhishek-TAMU Abhishek-TAMU Aug 2, 2024

I didn't add the line parsed_training_args.log_level = LOGLEVEL.lower() in the same if-else because if the env variable TRANSFORMERS_VERBOSITY is set, it would get overwritten. Hence I wrote this line in a separate block below, guarded by if not os.environ.get("TRANSFORMERS_VERBOSITY"):

Also, I think in your code above we can add logging.basicConfig(level="WARNING") as the default in the else branch.
Thanks for the optimized code idea. Taking the above points into consideration, I have reworked the code to follow our intended logic. Feel free to comment on it.

log_level = None
if parsed_training_args.log_level != "passive":
    log_level = parsed_training_args.log_level
elif os.environ.get("LOG_LEVEL"):
    log_level = os.environ.get("LOG_LEVEL")
    parsed_training_args.log_level = (
        log_level.lower()
        if not os.environ.get("TRANSFORMERS_VERBOSITY")
        else os.environ.get("TRANSFORMERS_VERBOSITY")
    )
else:
    log_level = "WARNING"

logging.basicConfig(level=log_level.upper())
train_logger = logging.getLogger()

Collaborator

@anhuong anhuong Aug 5, 2024

Let's chat about this logic to clarify -- update: done!

logging.basicConfig(level=parsed_training_args.log_level.upper())
else:
# Assign value of either env var LOG_LEVEL or warning
LOGLEVEL = os.environ.get("LOG_LEVEL", "WARNING").upper()
Collaborator

Suggested change
LOGLEVEL = os.environ.get("LOG_LEVEL", "WARNING").upper()
LOGLEVEL = os.environ.get("LOG_LEVEL", logging.WARNING).upper()

Collaborator Author

Here, if LOG_LEVEL is not set, it would default to logging.WARNING, which is an integer constant from the logging module (its value is 30). The use of .upper() would then raise an error, because .upper() is a string method and cannot be called on an integer.
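A small demonstration of the point (standalone sketch):

```python
import logging

print(logging.WARNING)    # 30 -- an int constant, not the string "WARNING"
print("warning".upper())  # "WARNING" -- .upper() is a str method

try:
    logging.WARNING.upper()  # fails: ints have no .upper()
except AttributeError as err:
    print(err)
```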

self.logger = logging.get_logger("file_logging_tracker")
# Configure log level
LOGLEVEL = os.environ.get("LOG_LEVEL", "WARNING").upper()
logging.basicConfig(level=LOGLEVEL)
Collaborator

It'd be helpful also if in each file, you could do:

logger = logging.getLogger(__name__)

This will specify in the logs which module the log came from, example:

INFO:__main__:Started
INFO:mylib:Doing something
INFO:__main__:Finished

Collaborator Author

Yea that's a good idea. I will do that.


Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
assert training_args.log_level == "info"


@mock.patch.dict(os.environ, {}, clear=True)
Collaborator

Should include TRANSFORMERS_VERBOSITY here to set up the env, so you don't have to set it inside the test and worry about removing it

Suggested change
@mock.patch.dict(os.environ, {}, clear=True)
@mock.patch.dict(os.environ, {"TRANSFORMERS_VERBOSITY": "info"}, clear=True)
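For context, a minimal standalone sketch of how mock.patch.dict seeds and isolates the environment for a test (a generic example, not the PR's actual test code):

```python
import os
from unittest import mock

@mock.patch.dict(os.environ, {"TRANSFORMERS_VERBOSITY": "info"}, clear=True)
def read_verbosity():
    # Inside the decorated call, os.environ holds exactly the patched vars;
    # the original environment is restored automatically afterwards.
    return os.environ.get("TRANSFORMERS_VERBOSITY"), len(os.environ)

value, env_size = read_verbosity()
print(value, env_size)  # info 1
```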

# Local
from tuning import sft_trainer
from tuning.config import configs, peft_config
from tuning.utils.logging import set_log_level
Collaborator

Since you moved the function that you are testing, you should also move the tests. They should be in /tests/utils/test_logging.py

assert training_args.log_level == "error"


@mock.patch.dict(os.environ, {}, clear=True)
Collaborator

similarly, want to include LOG_LEVEL here

Suggested change
@mock.patch.dict(os.environ, {}, clear=True)
@mock.patch.dict(os.environ, {"LOG_LEVEL": "info"}, clear=True)

assert training_args.log_level == "passive"


@mock.patch.dict(os.environ, {}, clear=True)
Collaborator

Want to include LOG_LEVEL here to setup env

Suggested change
@mock.patch.dict(os.environ, {}, clear=True)
@mock.patch.dict(os.environ, {"LOG_LEVEL": "info"}, clear=True)

from tuning.trainercontroller.controllermetrics.metricshandler import MetricHandler

logger = logging.get_logger(__name__)
# Configure log level
Collaborator

remove comment, no longer relevant

Collaborator Author

@anhuong Thanks for the quick review. I have made all changes mentioned above and pushed it again. Thank you!
As discussed, I can also wait for Dushyant's comment!

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
from tuning.config import configs
from tuning.utils.logging import set_log_level

TRAIN_ARGS = configs.TrainingArguments(
Collaborator

Since this is same as the TRAIN_ARGS in sft_trainer, you can import this directly from the main test file instead of redeclaration, please see -

from tests.test_sft_trainer import (

Collaborator Author

Thank you for this input. This is done!

super().__init__(name="aim", tracker_config=tracker_config)
self.logger = logging.get_logger("aimstack_tracker")
# Get logger with root log level
self.logger = logging.getLogger()
Collaborator

@Abhishek-TAMU any particular reason to remove the module name from the logger? I had kept it to segregate logs from the various subcomponents.

Collaborator

Same comment for file_logging_tracker

Collaborator

Explained in the comment above; no need for further explanation.

Collaborator Author

Cool. Thank you!

Collaborator

I still don't quite understand this. You're setting log_level at runtime to restrict the amount of logs a user sees (debug vs info etc.). But if you don't do logging.get_logger("aimstack_tracker") and so on in each file, all the logs will say they come from root rather than a specific module, which is helpful but not as helpful as when the module is included.
Maybe it'd be clearer if you could share an example run of what the logs look like when a user sets debug or error, so we can see them.

@@ -0,0 +1,46 @@
# Standard
Collaborator

Needs a code ownership declaration similar to this -

# Copyright The FMS HF Tuning Authors

Collaborator Author

This is done!

Collaborator Author

Abhishek-TAMU commented Aug 7, 2024

@anhuong Thank you again for all the reviews on this PR. I have pushed the final changes. Feel free to have a look whenever possible. Thanks!

anhuong
anhuong previously approved these changes Aug 8, 2024
Collaborator

@anhuong anhuong left a comment

Awesome work Abhishek! This will help a lot, many thanks!

Collaborator

@aluu317 aluu317 left a comment

@Abhishek-TAMU Some comments inline. Thank you!

or 'SFT_TRAINER_CONFIG_JSON_ENV_VAR'."
)

# Configure log_level of python native logger.
Collaborator

Let's also add a comment here that the CLI arg takes precedence over the env var, and that if neither is set, we use the default "WARNING".

Collaborator Author

Thank you! This is done

def test_set_log_level_for_logger_with_env_var_and_cli():
"""
Ensure that the correct log level is being set for python native logger and
transformers logger when env var LOG_LEVEL is used and CLI flag is passed
Collaborator

Can you add a quick comment that CLI arg takes precedence and we ignore env var LOG_LEVEL in this case?

Collaborator Author

This is done!

def set_log_level(train_args, logger_name=None):
"""Set log level of python native logger and TF logger via argument from CLI or env variable.

Args:
Collaborator

Could you add an arg for logger_name here?

Collaborator Author

This is done!


Collaborator Author

Abhishek-TAMU commented Aug 8, 2024

@aluu317 The reason behind removing logging.get_logger("aimstack_tracker") or logging.get_logger(__name__) was that it doesn't instantiate the logger with a level equivalent to the root log level; it creates a new logger with the default level, whereas logging.getLogger() returns the root logger with the root log level.

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
Collaborator Author

Abhishek-TAMU commented Aug 9, 2024

We have used just logging.getLogger() so that the logger is automatically created with the root log level in all files, but that means the logs look like this, with every entry mentioning root:

[screenshot: log lines prefixed with WARNING:root:]

Hence, I have now added a format argument to the Python native root logger which logs the module name for each logging statement (the function name could be added too), without us having to manually define a logger name per module. The same logs now look like:

[screenshot: log lines including the module name]

Thank you for the input!
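The format-string approach described above can be sketched like this (the exact format string used in the PR may differ):

```python
import logging

# A formatter that includes the module name, so root-logger records
# still identify where they came from.
fmt = logging.Formatter("%(levelname)s:%(module)s:%(message)s")

# A synthetic record, as if emitted from sft_trainer.py.
record = logging.LogRecord(
    name="root", level=logging.WARNING, pathname="sft_trainer.py",
    lineno=1, msg="My warning", args=None, exc_info=None,
)
print(fmt.format(record))  # WARNING:sft_trainer:My warning

# In practice this is wired up once at startup, e.g.:
# logging.basicConfig(level="WARNING", format="%(levelname)s:%(module)s:%(message)s")
```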

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
Collaborator

aluu317 commented Aug 9, 2024

Thank you @Abhishek-TAMU. This works!
