-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fail DB::Open() if logger cannot be created #9984
Fail DB::Open() if logger cannot be created #9984
Conversation
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
I think that would be good. Also consider requiring failure for OPTIONS file errors - there is currently a fail_if_options_file_error flag that is false by default. |
+1 for failing the open if Logger creation failed. |
8a74d2d
to
7fefa5e
Compare
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
At this point, I expect there will be a lot of errors due to a bug related to non-empty |
We can explore this in another PR since it's another behavior change. |
7fefa5e
to
379d20d
Compare
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
46a2bee
to
d8535a9
Compare
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Before reviewing this PR, a bigger question: for non-read-only instance, should we return error and refuse to open DB if Logger creation in `SanitizeOptions()` fails? Our current code allows it, but it is really difficult to debug because there will be no LOG files. Summary: Arena::AllocateAligned(size_t bytes, size_t huge_page_size, Logger* logger) has an assertion as the following: ```cpp \#ifdef MAP_HUGETLB if (huge_page_size > 0 && bytes > 0) { assert(logger != nullptr); } \#endif ``` This assertion is based on wrong assumption. The logger comes from ConcurrentArena::AllocateAligned(..., Logger* logger = nullptr) and can be nullptr by default. Tracing back further, logger comes from immutable_db_options.info_log which can be nullptr if `CreateLoggerFromOptions()` return non-ok status. In this case, RocksDB currently still proceeds. ```cpp if (result.info_log == nullptr && !read_only) { Status s = CreateLoggerFromOptions(...); if (!s.ok()) { result.info_log = nullptr; } } ``` An actual occurence of the backtrace will be provided Test Plan: make check
d8535a9
to
394e2e4
Compare
@riversand963 has updated the pull request. You must reimport the pull request before landing. |
@riversand963 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
I remembered realizing a problem with this last year - #8275 (comment) |
Summary: After #9984, BackupEngineTest.Concurrency becomes flaky. During DB::Open(), someone else can rename/remove the LOG file, causing this thread's `CreateLoggerFromOptions()` to fail. The reason is that the operation sequence of "FileExists -> Rename" is not atomic. It's possible that a FileExists() returns OK, but the file gets deleted before Rename(), causing the latter to return IOError with PathNotFound subcode. Although it's not encouraged to concurrently modify the contents of the directories managed by the database instance in this case, we can still perform some simple handling to make DB::Open() more robust. In this case, we can check if a racing thread has deleted the original LOG file, we can allow this thread to continue creating a new LOG file. Pull Request resolved: #10069 Test Plan: ~/gtest-parallel/gtest-parallel -r 100 ./backup_engine_test --gtest_filter=BackupEngineTest.Concurrency Reviewed By: jay-zhuang Differential Revision: D36736913 Pulled By: riversand963 fbshipit-source-id: 3cbe92d77ca175e55e586bdb1a32ac8107217ae6
Summary: #9984 changes the behavior of RocksDB: if logger creation failed during `SanitizeOptions()`, `DB::Open()` will fail. However, since `SanitizeOptions()` is called in `DBImpl::DBImpl()`, we cannot directly expose the error to caller without some additional work. This is a first version proposal which: - Adds a new member `init_logger_creation_s` to `DBImpl` to store the result of init logger creation - Checks the error during `DB::Open()` and return it to caller if non-ok This is not very ideal. We can alternatively move the logger creation logic out of the `SanitizeOptions()`. Since `SanitizeOptions()` is used in other places, we need to check whether this change breaks anything in case other callers of `SanitizeOptions()` assumes that a logger should be created. Pull Request resolved: #10223 Test Plan: make check Reviewed By: pdillinger Differential Revision: D37321717 Pulled By: riversand963 fbshipit-source-id: 58042358a86369d606549dd9938933dd47591c4b
Summary: facebook#9984 changes the behavior of RocksDB: if logger creation failed during `SanitizeOptions()`, `DB::Open()` will fail. However, since `SanitizeOptions()` is called in `DBImpl::DBImpl()`, we cannot directly expose the error to caller without some additional work. This is a first version proposal which: - Adds a new member `init_logger_creation_s` to `DBImpl` to store the result of init logger creation - Checks the error during `DB::Open()` and return it to caller if non-ok This is not very ideal. We can alternatively move the logger creation logic out of the `SanitizeOptions()`. Since `SanitizeOptions()` is used in other places, we need to check whether this change breaks anything in case other callers of `SanitizeOptions()` assumes that a logger should be created. Pull Request resolved: facebook#10223 Test Plan: make check Reviewed By: pdillinger Differential Revision: D37321717 Pulled By: riversand963 fbshipit-source-id: 58042358a86369d606549dd9938933dd47591c4b
For regular db instance and secondary instance, we return error and refuse to open DB if Logger creation fails.
Our current code allows it, but it is really difficult to debug because
there will be no LOG files. The same for OPTIONS file, which will be explored in another PR.
Furthermore, Arena::AllocateAligned(size_t bytes, size_t huge_page_size, Logger* logger) has an
assertion as the following:
It can be removed.
Test Plan:
make check