diff --git a/docs/source/testing.rst b/docs/source/testing.rst
index 0a9d3d525bfa..49f9765d1bcf 100644
--- a/docs/source/testing.rst
+++ b/docs/source/testing.rst
@@ -700,11 +700,11 @@ Temporary files and directories
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Using unique temporary files and directories are essential for parallel test running, so that the tests won't overwrite
-each other's data. Also we want to get the temp files and directories removed at the end of each test that created
+each other's data. Also we want to get the temporary files and directories removed at the end of each test that created
 them. Therefore, using packages like ``tempfile``, which address these needs is essential.

-However, when debugging tests, you need to be able to see what goes into the temp file or directory and you want to
-know it's exact path and not having it randomized on every test re-run.
+However, when debugging tests, you need to be able to see what goes into the temporary file or directory and you want
+to know it's exact path and not having it randomized on every test re-run.

 A helper class :obj:`transformers.test_utils.TestCasePlus` is best used for such purposes. It's a sub-class of
 :obj:`unittest.TestCase`, so we can easily inherit from it in the test modules.
@@ -720,32 +720,33 @@ Here is an example of its usage:

 This code creates a unique temporary directory, and sets :obj:`tmp_dir` to its location.

-In this and all the following scenarios the temporary directory will be auto-removed at the end of test, unless
-``after=False`` is passed to the helper function.
-
-* Create a temporary directory of my choice and delete it at the end - useful for debugging when you want to monitor a
-  specific directory:
+* Create a unique temporary dir:

 .. code-block:: python

     def test_whatever(self):
-        tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test")
+        tmp_dir = self.get_auto_remove_tmp_dir()
+
+``tmp_dir`` will contain the path to the created temporary dir. It will be automatically removed at the end of the
+test.

-* Create a temporary directory of my choice and do not delete it at the end---useful for when you want to look at the
-  temp results:
+* Create a temporary dir of my choice, ensure it's empty before the test starts and don't empty it after the test.

 .. code-block:: python

     def test_whatever(self):
-        tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", after=False)
+        tmp_dir = self.get_auto_remove_tmp_dir("./xxx")

-* Create a temporary directory of my choice and ensure to delete it right away---useful for when you disabled deletion
-  in the previous test run and want to make sure the that temporary directory is empty before the new test is run:
+This is useful for debug when you want to monitor a specific directory and want to make sure the previous tests didn't
+leave any data in there.

-.. code-block:: python
+* You can override the default behavior by directly overriding the ``before`` and ``after`` args, leading to one of the
+  following behaviors:

-    def test_whatever(self):
-        tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", before=True)
+    - ``before=True``: the temporary dir will always be cleared at the beginning of the test.
+    - ``before=False``: if the temporary dir already existed, any existing files will remain there.
+    - ``after=True``: the temporary dir will always be deleted at the end of the test.
+    - ``after=False``: the temporary dir will always be left intact at the end of the test.

 .. note::
     In order to run the equivalent of ``rm -r`` safely, only subdirs of the project repository checkout are allowed if
@@ -799,7 +800,7 @@ or the ``xfail`` way:
     @pytest.mark.xfail
     def test_feature_x():

-Here is how to skip a test based on some internal check inside the test:
+- Here is how to skip a test based on some internal check inside the test:

 .. code-block:: python

@@ -822,7 +823,7 @@ or the ``xfail`` way:
     def test_feature_x():
         pytest.xfail("expected to fail until bug XYZ is fixed")

-Here is how to skip all tests in a module if some import is missing:
+- Here is how to skip all tests in a module if some import is missing:

 .. code-block:: python

diff --git a/src/transformers/testing_utils.py b/src/transformers/testing_utils.py
index 02998bcfd656..657ac8a3ce50 100644
--- a/src/transformers/testing_utils.py
+++ b/src/transformers/testing_utils.py
@@ -516,45 +516,47 @@ class solves this problem by sorting out all the basic paths and provides easy a
     - ``repo_root_dir_str``
     - ``src_dir_str``

-    Feature 2: Flexible auto-removable temp dirs which are guaranteed to get removed at the end of test.
+    Feature 2: Flexible auto-removable temporary dirs which are guaranteed to get removed at the end of test.

-    In all the following scenarios the temp dir will be auto-removed at the end of test, unless `after=False`.
-
-    # 1. create a unique temp dir, `tmp_dir` will contain the path to the created temp dir
+    1. Create a unique temporary dir:

     ::

         def test_whatever(self):
             tmp_dir = self.get_auto_remove_tmp_dir()

-    # 2. create a temp dir of my choice and delete it at the end - useful for debug when you want to # monitor a
-    specific directory
+    ``tmp_dir`` will contain the path to the created temporary dir. It will be automatically removed at the end of the
+    test.
+
+
+    2. Create a temporary dir of my choice, ensure it's empty before the test starts and don't
+    empty it after the test.

     ::

         def test_whatever(self):
-            tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test")
+            tmp_dir = self.get_auto_remove_tmp_dir("./xxx")

-    # 3. create a temp dir of my choice and do not delete it at the end - useful for when you want # to look at the
-    temp results
+    This is useful for debug when you want to monitor a specific directory and want to make sure the previous tests
+    didn't leave any data in there.

-    ::
-        def test_whatever(self):
-            tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", after=False)
+    3. You can override the first two options by directly overriding the ``before`` and ``after`` args, leading to the
+    following behavior:

-    # 4. create a temp dir of my choice and ensure to delete it right away - useful for when you # disabled deletion in
-    the previous test run and want to make sure the that tmp dir is empty # before the new test is run
+    ``before=True``: the temporary dir will always be cleared at the beginning of the test.

-    ::
+    ``before=False``: if the temporary dir already existed, any existing files will remain there.

-        def test_whatever(self):
-            tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", before=True)
+    ``after=True``: the temporary dir will always be deleted at the end of the test.
+
+    ``after=False``: the temporary dir will always be left intact at the end of the test.

-    Note 1: In order to run the equivalent of `rm -r` safely, only subdirs of the project repository checkout are
-    allowed if an explicit `tmp_dir` is used, so that by mistake no `/tmp` or similar important part of the filesystem
-    will get nuked. i.e. please always pass paths that start with `./`
+    Note 1: In order to run the equivalent of ``rm -r`` safely, only subdirs of the project repository checkout are
+    allowed if an explicit ``tmp_dir`` is used, so that by mistake no ``/tmp`` or similar important part of the
+    filesystem will get nuked. i.e. please always pass paths that start with ``./``

-    Note 2: Each test can register multiple temp dirs and they all will get auto-removed, unless requested otherwise.
+    Note 2: Each test can register multiple temporary dirs and they all will get auto-removed, unless requested
+    otherwise.

     Feature 3: Get a copy of the ``os.environ`` object that sets up ``PYTHONPATH`` specific to the current test suite.
     This is useful for invoking external programs from the test suite - e.g. distributed training.
@@ -567,6 +569,7 @@ def test_whatever(self):
     """

     def setUp(self):
+        # get_auto_remove_tmp_dir feature:
         self.teardown_tmp_dirs = []

         # figure out the resolved paths for repo_root, tests, examples, etc.
@@ -654,21 +657,42 @@ def get_env(self):
         env["PYTHONPATH"] = ":".join(paths)
         return env

-    def get_auto_remove_tmp_dir(self, tmp_dir=None, after=True, before=False):
+    def get_auto_remove_tmp_dir(self, tmp_dir=None, before=None, after=None):
         """
         Args:
             tmp_dir (:obj:`string`, `optional`):
-                use this path, if None a unique path will be assigned
-            before (:obj:`bool`, `optional`, defaults to :obj:`False`):
-                if `True` and tmp dir already exists make sure to empty it right away
-            after (:obj:`bool`, `optional`, defaults to :obj:`True`):
-                delete the tmp dir at the end of the test
+                if :obj:`None`:
+
+                   - a unique temporary path will be created
+                   - sets ``before=True`` if ``before`` is :obj:`None`
+                   - sets ``after=True`` if ``after`` is :obj:`None`
+                else:
+
+                   - :obj:`tmp_dir` will be created
+                   - sets ``before=True`` if ``before`` is :obj:`None`
+                   - sets ``after=False`` if ``after`` is :obj:`None`
+            before (:obj:`bool`, `optional`):
+                If :obj:`True` and the :obj:`tmp_dir` already exists, make sure to empty it right away if :obj:`False`
+                and the :obj:`tmp_dir` already exists, any existing files will remain there.
+            after (:obj:`bool`, `optional`):
+                If :obj:`True`, delete the :obj:`tmp_dir` at the end of the test if :obj:`False`, leave the
+                :obj:`tmp_dir` and its contents intact at the end of the test.

         Returns:
-            tmp_dir(:obj:`string`): either the same value as passed via `tmp_dir` or the path to the auto-created tmp
+            tmp_dir(:obj:`string`): either the same value as passed via `tmp_dir` or the path to the auto-selected tmp
             dir
         """
         if tmp_dir is not None:
+
+            # defining the most likely desired behavior for when a custom path is provided.
+            # this most likely indicates the debug mode where we want an easily locatable dir that:
+            # 1. gets cleared out before the test (if it already exists)
+            # 2. is left intact after the test
+            if before is None:
+                before = True
+            if after is None:
+                after = False
+
             # using provided path
             path = Path(tmp_dir).resolve()

@@ -685,6 +709,15 @@ def get_auto_remove_tmp_dir(self, tmp_dir=None, after=True, before=False):
             path.mkdir(parents=True, exist_ok=True)

         else:
+            # defining the most likely desired behavior for when a unique tmp path is auto generated
+            # (not a debug mode), here we require a unique tmp dir that:
+            # 1. is empty before the test (it will be empty in this situation anyway)
+            # 2. gets fully removed after the test
+            if before is None:
+                before = True
+            if after is None:
+                after = True
+
             # using unique tmp dir (always empty, regardless of `before`)
             tmp_dir = tempfile.mkdtemp()

@@ -695,7 +728,8 @@ def get_auto_remove_tmp_dir(self, tmp_dir=None, after=True, before=False):
         return tmp_dir

     def tearDown(self):
-        # remove registered temp dirs
+
+        # get_auto_remove_tmp_dir feature: remove registered temp dirs
         for path in self.teardown_tmp_dirs:
             shutil.rmtree(path, ignore_errors=True)
         self.teardown_tmp_dirs = []
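
The patch changes how ``before`` and ``after`` default when they are left as ``None``, and the resulting behavior depends on whether an explicit ``tmp_dir`` is passed. The snippet below is a minimal standalone sketch of only that default resolution, as read from the diff above. The helper name `resolve_tmp_dir_flags` is hypothetical and is not part of `transformers`; the real method additionally creates the directory and registers it for cleanup in `tearDown`.

```python
def resolve_tmp_dir_flags(tmp_dir=None, before=None, after=None):
    """Hypothetical helper (an assumption, not in transformers) mirroring the
    default resolution shown in the diff. Returns the effective (before, after)."""
    if tmp_dir is not None:
        # Explicit path: most likely a debug run, so clear the dir before the
        # test and leave it intact afterwards, unless the caller says otherwise.
        if before is None:
            before = True
        if after is None:
            after = False
    else:
        # Auto-generated unique path: it starts empty and is removed at the
        # end of the test, unless the caller says otherwise.
        if before is None:
            before = True
        if after is None:
            after = True
    return before, after


if __name__ == "__main__":
    print(resolve_tmp_dir_flags())                       # (True, True)
    print(resolve_tmp_dir_flags("./xxx"))                # (True, False)
    print(resolve_tmp_dir_flags("./xxx", after=True))    # (True, True)
    print(resolve_tmp_dir_flags("./xxx", before=False))  # (False, False)
```

An explicitly passed flag always wins. Only the `None` placeholders are filled in, which is why the signature in the patch moves from boolean defaults to `None`.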